package expressions
 
Type Members
- abstract class Aggregator[-IN, BUF, OUT] extends Serializable

A base class for user-defined aggregations, which can be used in Dataset operations to take all of the elements of a group and reduce them to a single value.

For example, the following aggregator extracts an Int from each instance of a specific class and adds them up:

    import org.apache.spark.sql.{Dataset, Encoder, Encoders}
    import org.apache.spark.sql.expressions.Aggregator

    case class Data(i: Int)

    val customSummer = new Aggregator[Data, Int, Int] {
      // Initial value of the intermediate buffer.
      def zero: Int = 0
      // Fold one input element into the buffer.
      def reduce(b: Int, a: Data): Int = b + a.i
      // Combine two partial buffers.
      def merge(b1: Int, b2: Int): Int = b1 + b2
      // Turn the final buffer into the output value.
      def finish(r: Int): Int = r
      def bufferEncoder: Encoder[Int] = Encoders.scalaInt
      def outputEncoder: Encoder[Int] = Encoders.scalaInt
    }.toColumn

    val ds: Dataset[Data] = ...
    val aggregated = ds.select(customSummer)

Based loosely on Aggregator from Algebird: https://github.com/twitter/algebird
- IN
 The input type for the aggregation.
- BUF
 The type of the intermediate value of the reduction.
- OUT
 The type of the final output result.
- Since
 1.6.0
- abstract class MutableAggregationBuffer extends Row

A Row representing a mutable aggregation buffer.

This is not meant to be extended outside of Spark.
- Annotations
 - @Stable()
 - Since
 1.5.0
- sealed abstract class UserDefinedFunction extends AnyRef

A user-defined function. To create one, use the udf functions in functions.

As an example:

    import org.apache.spark.sql.functions.udf

    // Define a UDF that returns true or false based on some numeric score.
    val predict = udf((score: Double) => score > 0.5)

    // Project a prediction column computed from the score column.
    df.select(predict(df("score")))

- Annotations
 - @Stable()
 - Since
 1.3.0
- class Window extends AnyRef

Utility functions for defining windows in DataFrames.

    import org.apache.spark.sql.expressions.Window

    // PARTITION BY country ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    Window.partitionBy("country").orderBy("date")
      .rowsBetween(Window.unboundedPreceding, Window.currentRow)

    // PARTITION BY country ORDER BY date ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING
    Window.partitionBy("country").orderBy("date").rowsBetween(-3, 3)

- Annotations
 - @Stable()
 - Since
 1.4.0
- class WindowSpec extends AnyRef

A window specification that defines the partitioning, ordering, and frame boundaries.
Use the static methods in Window to create a WindowSpec.
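
As a minimal sketch of that workflow, the following builds a WindowSpec from the static methods in Window and applies it to a window function through Column.over. The local SparkSession, the sample rows, and the names WindowSpecExample, sales, byCountry, and ranked are assumptions made for illustration; they do not come from this page.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.{Window, WindowSpec}
import org.apache.spark.sql.functions.row_number

object WindowSpecExample {
  // Local session for illustration only (assumption, not part of the API page).
  val spark: SparkSession = SparkSession.builder()
    .master("local[1]")
    .appName("window-spec-sketch")
    .getOrCreate()
  import spark.implicits._

  // Hypothetical sample data: (country, date, amount).
  val sales = Seq(
    ("US", "2024-01-01", 10),
    ("US", "2024-01-02", 20),
    ("DE", "2024-01-01", 5)
  ).toDF("country", "date", "amount")

  // The static methods on Window return a WindowSpec.
  val byCountry: WindowSpec = Window.partitionBy("country").orderBy("date")

  // Apply the specification to a window function with Column.over.
  val ranked = sales.withColumn("rn", row_number().over(byCountry))
}
```

Because a WindowSpec is a plain value, the same specification can be reused across several window functions in one select.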
- Annotations
 - @Stable()
 - Since
 1.4.0
- abstract class UserDefinedAggregateFunction extends Serializable

The base class for implementing user-defined aggregate functions (UDAF).
- Annotations
 - @Stable() @deprecated
 - Deprecated
 (Since version 3.0.0)
- Since
 1.5.0
 
Value Members
- object Window

Utility functions for defining windows in DataFrames.

    import org.apache.spark.sql.expressions.Window

    // PARTITION BY country ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    Window.partitionBy("country").orderBy("date")
      .rowsBetween(Window.unboundedPreceding, Window.currentRow)

    // PARTITION BY country ORDER BY date ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING
    Window.partitionBy("country").orderBy("date").rowsBetween(-3, 3)

- Annotations
 - @Stable()
 - Since
 1.4.0
- Note
 When ordering is not defined, an unbounded window frame (rowFrame, unboundedPreceding, unboundedFollowing) is used by default. When ordering is defined, a growing window frame (rangeFrame, unboundedPreceding, currentRow) is used by default.
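
The effect of these defaults can be seen by running the same aggregate over an unordered and an ordered window. The sketch below is illustrative only: the local SparkSession, the sample rows, and the names DefaultFrameExample, unordered, ordered, and result are assumptions, not part of this page.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, sum}

object DefaultFrameExample {
  // Local session for illustration only (assumption, not part of the API page).
  val spark: SparkSession = SparkSession.builder()
    .master("local[1]")
    .appName("default-frame-sketch")
    .getOrCreate()
  import spark.implicits._

  // Hypothetical data: one partition key "a" with distinct ordering values.
  val df = Seq(("a", 1, 10), ("a", 2, 20), ("a", 3, 30)).toDF("k", "ord", "v")

  // No ordering: the default frame covers the whole partition.
  val unordered = Window.partitionBy("k")

  // Ordering defined: the default frame grows from the partition start
  // to the current row.
  val ordered = Window.partitionBy("k").orderBy("ord")

  val result = df.select(
    col("ord"),
    sum("v").over(unordered).as("total"),
    sum("v").over(ordered).as("running")
  )
}
```

With this data, total is 60 on every row, while running is 10, 30, and 60 in ord order. Since the ordered default is a range frame, rows that share the same ordering value are peers and are included together in the frame.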