Inequality test.
// Scala:
df.select( df("colA") !== df("colB") )
df.select( !(df("colA") === df("colB")) )
// Java:
import static org.apache.spark.sql.functions.*;
df.filter( col("colA").notEqual(col("colB")) );
Modulo (a.k.a. remainder) expression.
Boolean AND.
// Scala: The following selects people that are in school and employed at the same time.
people.select( people("inSchool") && people("isEmployed") )
// Java:
people.select( people.col("inSchool").and(people.col("isEmployed")) );
Multiplication of this expression and another expression.
// Scala: The following multiplies a person's height by their weight.
people.select( people("height") * people("weight") )
// Java:
people.select( people.col("height").multiply(people.col("weight")) );
Sum of this expression and another expression.
// Scala: The following selects the sum of a person's height and weight.
people.select( people("height") + people("weight") )
// Java:
people.select( people.col("height").plus(people.col("weight")) );
Subtraction. Subtract the other expression from this expression.
// Scala: The following selects the difference between people's height and their weight.
people.select( people("height") - people("weight") )
// Java:
people.select( people.col("height").minus(people.col("weight")) );
Division of this expression by another expression.
// Scala: The following divides a person's height by their weight.
people.select( people("height") / people("weight") )
// Java:
people.select( people.col("height").divide(people.col("weight")) );
Less than.
// Scala: The following selects people younger than 21.
people.select( people("age") < 21 )
// Java:
people.select( people.col("age").lt(21) );
Less than or equal to.
// Scala: The following selects people aged 21 or younger.
people.select( people("age") <= 21 )
// Java:
people.select( people.col("age").leq(21) );
Equality test that is safe for null values.
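To make the contrast with the ordinary equality test concrete, here is a minimal plain-Java sketch of the usual SQL semantics, with a Java null standing in for SQL NULL. It is an illustrative model only, not Spark code: the class and method names (NullSafeEqualitySketch, sqlEquals, nullSafeEquals) are hypothetical. It assumes that ordinary equality yields NULL when either operand is NULL, while the null-safe test always yields true or false.

```java
import java.util.Objects;

/** Hypothetical model of SQL equality semantics; not part of the Spark API. */
public class NullSafeEqualitySketch {

    /** Ordinary SQL equality: the result is unknown (null) if either operand is null. */
    public static Boolean sqlEquals(Object a, Object b) {
        if (a == null || b == null) {
            return null;
        }
        return a.equals(b);
    }

    /** Null-safe equality: two nulls compare as equal and the result is never null. */
    public static boolean nullSafeEquals(Object a, Object b) {
        return Objects.equals(a, b);
    }

    public static void main(String[] args) {
        System.out.println(sqlEquals(1, null));         // unknown result, modeled as null
        System.out.println(nullSafeEquals(1, null));    // a definite boolean
        System.out.println(nullSafeEquals(null, null)); // a definite boolean
    }
}
```

Under this model, a filter built on the ordinary test would drop rows where either side is NULL, whereas the null-safe test can match two NULL values to each other.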
Equality test.
// Scala:
df.filter( df("colA") === df("colB") )
// Java:
import static org.apache.spark.sql.functions.*;
df.filter( col("colA").equalTo(col("colB")) );
Greater than.
// Scala: The following selects people older than 21.
people.select( people("age") > 21 )
// Java:
import static org.apache.spark.sql.functions.*;
people.select( people.col("age").gt(21) );
Greater than or equal to an expression.
// Scala: The following selects people aged 21 or older.
people.select( people("age") >= 21 )
// Java:
people.select( people.col("age").geq(21) );
Boolean AND.
// Scala: The following selects people that are in school and employed at the same time.
people.select( people("inSchool") && people("isEmployed") )
// Java:
people.select( people.col("inSchool").and(people.col("isEmployed")) );
Creates a new AttributeReference of type array
Gives the column an alias.
// Renames colA to colB in select output.
df.select($"colA".as('colB))
Gives the column an alias.
// Renames colA to colB in select output.
df.select($"colA".as("colB"))
Returns an ordering used in sorting.
// Scala: sort a DataFrame by age column in ascending order.
df.sort(df("age").asc)
// Java:
df.sort(df.col("age").asc());
Creates a new AttributeReference of type binary
Creates a new AttributeReference of type boolean
Creates a new AttributeReference of type byte
Casts the column to a different data type, using the canonical string representation of the type. The supported types are: string, boolean, byte, short, int, long, double, decimal, date, timestamp.
// Casts colA to integer.
df.select(df("colA").cast("int"))
Casts the column to a different data type.
// Casts colA to IntegerType.
import org.apache.spark.sql.types.IntegerType
df.select(df("colA").cast(IntegerType))
// equivalent to
df.select(df("colA").cast("int"))
Contains the other element.
Creates a new AttributeReference of type date
Creates a new AttributeReference of type decimal
Creates a new AttributeReference of type decimal
Returns an ordering used in sorting.
// Scala: sort a DataFrame by age column in descending order.
df.sort(df("age").desc)
// Java:
df.sort(df.col("age").desc());
Division of this expression by another expression.
// Scala: The following divides a person's height by their weight.
people.select( people("height") / people("weight") )
// Java:
people.select( people.col("height").divide(people.col("weight")) );
Creates a new AttributeReference of type double
String ends with another string literal.
String ends with.
Equality test that is safe for null values.
Equality test.
// Scala:
df.filter( df("colA") === df("colB") )
// Java:
import static org.apache.spark.sql.functions.*;
df.filter( col("colA").equalTo(col("colB")) );
Prints the expression to the console for debugging purposes.
Creates a new AttributeReference of type float
Greater than or equal to an expression.
// Scala: The following selects people aged 21 or older.
people.select( people("age") >= 21 )
// Java:
people.select( people.col("age").geq(21) );
An expression that gets a field by name in a StructType.
An expression that gets an item at position ordinal out of an array.
Greater than.
// Scala: The following selects people older than 21.
people.select( people("age") > lit(21) )
// Java:
import static org.apache.spark.sql.functions.*;
people.select( people.col("age").gt(21) );
A boolean expression that evaluates to true if the value of this expression is contained in the evaluated values of the arguments.
Creates a new AttributeReference of type int
True if the current expression is NOT null.
True if the current expression is null.
Less than or equal to.
// Scala: The following selects people aged 21 or younger.
people.select( people("age") <= 21 )
// Java:
people.select( people.col("age").leq(21) );
SQL like expression.
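As a rough illustration of what a SQL LIKE pattern means, the following plain-Java sketch translates a pattern into a regular expression. It assumes the conventional wildcards, where % matches any sequence of characters and _ matches exactly one character, and it deliberately ignores escape characters. The class and method names (LikePatternSketch, like) are hypothetical and are not part of the Spark API.

```java
import java.util.regex.Pattern;

/** Hypothetical model of SQL LIKE matching; escape characters are not handled. */
public class LikePatternSketch {

    /** Returns true if the whole value matches the LIKE pattern. */
    public static boolean like(String value, String pattern) {
        StringBuilder regex = new StringBuilder();
        for (char c : pattern.toCharArray()) {
            if (c == '%') {
                regex.append(".*");   // any sequence of characters, possibly empty
            } else if (c == '_') {
                regex.append('.');    // exactly one character
            } else {
                // Every other character is matched literally.
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        // DOTALL lets the wildcards match line breaks as well.
        return Pattern.compile(regex.toString(), Pattern.DOTALL)
                      .matcher(value)
                      .matches();
    }

    public static void main(String[] args) {
        System.out.println(like("Alice", "A%"));
        System.out.println(like("Alice", "_lice"));
    }
}
```

Note that the pattern must match the entire value, and that a dot in the pattern is an ordinary character here, unlike in the regex-based RLIKE.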
Creates a new AttributeReference of type long
Less than.
// Scala: The following selects people younger than 21.
people.select( people("age") < 21 )
// Java:
people.select( people.col("age").lt(21) );
Creates a new AttributeReference of type map
Subtraction. Subtract the other expression from this expression.
// Scala: The following selects the difference between people's height and their weight.
people.select( people("height") - people("weight") )
// Java:
people.select( people.col("height").minus(people.col("weight")) );
Modulo (a.k.a. remainder) expression.
Multiplication of this expression and another expression.
// Scala: The following multiplies a person's height by their weight.
people.select( people("height") * people("weight") )
// Java:
people.select( people.col("height").multiply(people.col("weight")) );
Inequality test.
// Scala:
df.select( df("colA") !== df("colB") )
df.select( !(df("colA") === df("colB")) )
// Java:
import static org.apache.spark.sql.functions.*;
df.filter( col("colA").notEqual(col("colB")) );
Boolean OR.
// Scala: The following selects people that are in school or employed.
people.filter( people("inSchool") || people("isEmployed") )
// Java:
people.filter( people.col("inSchool").or(people.col("isEmployed")) );
Sum of this expression and another expression.
// Scala: The following selects the sum of a person's height and weight.
people.select( people("height") + people("weight") )
// Java:
people.select( people.col("height").plus(people.col("weight")) );
SQL RLIKE expression (LIKE with Regex).
Creates a new AttributeReference of type short
String starts with another string literal.
String starts with.
Creates a new AttributeReference of type string
Creates a new AttributeReference of type struct
An expression that returns a substring, given the starting position and the length of the substring.
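The role of the two arguments can be sketched in plain Java. This model assumes the SQL convention that the starting position is 1-based, and it only covers positive starting positions; behavior for zero or negative positions is not modeled. The class and method names (SubstrSketch, substr) are hypothetical and are not part of the Spark API.

```java
/** Hypothetical model of a 1-based substring; not part of the Spark API. */
public class SubstrSketch {

    /**
     * Returns up to len characters of s, starting at the 1-based position startPos.
     * The result is clipped at the end of the string.
     */
    public static String substr(String s, int startPos, int len) {
        if (startPos < 1 || len < 0) {
            throw new IllegalArgumentException("this sketch needs startPos >= 1 and len >= 0");
        }
        // Convert the 1-based position to Java's 0-based index.
        int begin = Math.min(startPos - 1, s.length());
        // Use long arithmetic so that a very large len cannot overflow.
        int end = (int) Math.min((long) begin + len, (long) s.length());
        return s.substring(begin, end);
    }

    public static void main(String[] args) {
        System.out.println(substr("Spark SQL", 1, 5));
        System.out.println(substr("Spark SQL", 7, 3));
    }
}
```

In this model, position 1 refers to the first character, so a caller used to Java's 0-based String.substring needs to shift the starting index by one.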
An expression that returns a substring, given an expression for the starting position and an expression for the length of the substring.
Creates a new AttributeReference of type timestamp
Inversion of boolean expression, i.e. NOT.
// Scala: select rows that are not active (isActive === false)
df.filter( !df("isActive") )
// Java:
import static org.apache.spark.sql.functions.*;
df.filter( not(df.col("isActive")) );
Unary minus, i.e. negate the expression.
// Scala: select the amount column and negate all values.
df.select( -df("amount") )
// Java:
import static org.apache.spark.sql.functions.*;
df.select( negate(col("amount")) );
Boolean OR.
// Scala: The following selects people that are in school or employed.
people.filter( people("inSchool") || people("isEmployed") )
// Java:
people.filter( people.col("inSchool").or(people.col("isEmployed")) );
:: Experimental :: A convenient class used for constructing schema.