DataFrameNaFunctions

abstract class DataFrameNaFunctions extends AnyRef

Functionality for working with missing data in DataFrames.

Annotations: @Stable()
Source: DataFrameNaFunctions.scala
Since: 1.3.1

Linear Supertypes

AnyRef, Any

Instance Constructors

new DataFrameNaFunctions()

Abstract Value Members

abstract def drop(minNonNulls: Option[Int], cols: Seq[String]): DataFrame
Attributes
protected
abstract def drop(minNonNulls: Option[Int]): DataFrame
Attributes
protected
abstract def fill(value: Boolean, cols: Seq[String]): DataFrame
(Scala-specific) Returns a new DataFrame that replaces null values in specified boolean columns. If a specified column is not a boolean column, it is ignored.
Since
2.3.0
abstract def fill(value: Boolean): DataFrame
Returns a new DataFrame that replaces null values in boolean columns with value.
Since
2.3.0
abstract def fill(value: String, cols: Seq[String]): DataFrame
(Scala-specific) Returns a new DataFrame that replaces null values in specified string columns. If a specified column is not a string column, it is ignored.
Since
1.3.1
abstract def fill(value: Double, cols: Seq[String]): DataFrame
(Scala-specific) Returns a new DataFrame that replaces null or NaN values in specified numeric columns. If a specified column is not a numeric column, it is ignored.
Since
1.3.1
abstract def fill(value: Long, cols: Seq[String]): DataFrame
(Scala-specific) Returns a new DataFrame that replaces null or NaN values in specified numeric columns. If a specified column is not a numeric column, it is ignored.
Since
2.2.0
abstract def fill(value: String): DataFrame
Returns a new DataFrame that replaces null values in string columns with value.
Since
1.3.1
abstract def fill(value: Double): DataFrame
Returns a new DataFrame that replaces null or NaN values in numeric columns with value.
Since
1.3.1
abstract def fill(value: Long): DataFrame
Returns a new DataFrame that replaces null or NaN values in numeric columns with value.
Since
2.2.0
abstract def fillMap(values: Seq[(String, Any)]): DataFrame
Attributes
protected
abstract def replace[T](cols: Seq[String], replacement: Map[T, T]): DataFrame
(Scala-specific) Replaces values matching keys in replacement map.
```
// Replaces all occurrences of 1.0 with 2.0 in columns "height" and "weight".
df.na.replace("height" :: "weight" :: Nil, Map(1.0 -> 2.0))

// Replaces all occurrences of "UNKNOWN" with "unnamed" in columns "firstname" and "lastname".
df.na.replace("firstname" :: "lastname" :: Nil, Map("UNKNOWN" -> "unnamed"))
```
cols
list of columns to apply the value replacement. If col is "*", replacement is applied on all string, numeric or boolean columns.
replacement
value replacement map. Key and value of replacement map must have the same type, and can only be doubles, strings or booleans. The map value can have nulls.
Since
1.3.1
abstract def replace[T](col: String, replacement: Map[T, T]): DataFrame
(Scala-specific) Replaces values matching keys in replacement map.
```
// Replaces all occurrences of 1.0 with 2.0 in column "height".
df.na.replace("height", Map(1.0 -> 2.0))

// Replaces all occurrences of "UNKNOWN" with "unnamed" in column "name".
df.na.replace("name", Map("UNKNOWN" -> "unnamed"))

// Replaces all occurrences of "UNKNOWN" with "unnamed" in all string columns.
df.na.replace("*", Map("UNKNOWN" -> "unnamed"))
```
col
name of the column to apply the value replacement. If col is "*", replacement is applied on all string, numeric or boolean columns.
replacement
value replacement map. Key and value of replacement map must have the same type, and can only be doubles, strings or booleans. The map value can have nulls.
Since
1.3.1

Concrete Value Members

final def !=(arg0: Any): Boolean
Definition Classes
AnyRef → Any
final def ##: Int
Definition Classes
AnyRef → Any
final def ==(arg0: Any): Boolean
Definition Classes
AnyRef → Any
final def asInstanceOf[T0]: T0
Definition Classes
Any
def clone(): AnyRef
Attributes
protected[lang]
Definition Classes
AnyRef
Annotations
@throws(classOf[java.lang.CloneNotSupportedException]) @IntrinsicCandidate() @native()
def drop(minNonNulls: Int, cols: Seq[String]): DataFrame
(Scala-specific) Returns a new DataFrame that drops rows containing fewer than minNonNulls non-null and non-NaN values in the specified columns.
Since
1.3.1
def drop(minNonNulls: Int, cols: Array[String]): DataFrame
Returns a new DataFrame that drops rows containing fewer than minNonNulls non-null and non-NaN values in the specified columns.
Since
1.3.1
def drop(minNonNulls: Int): DataFrame
Returns a new DataFrame that drops rows containing fewer than minNonNulls non-null and non-NaN values.
Since
1.3.1
def drop(how: String, cols: Seq[String]): DataFrame
(Scala-specific) Returns a new DataFrame that drops rows containing null or NaN values in the specified columns.
If how is "any", then drop rows containing any null or NaN values in the specified columns. If how is "all", then drop rows only if every specified column is null or NaN for that row.
Since
1.3.1
def drop(how: String, cols: Array[String]): DataFrame
Returns a new DataFrame that drops rows containing null or NaN values in the specified columns.
If how is "any", then drop rows containing any null or NaN values in the specified columns. If how is "all", then drop rows only if every specified column is null or NaN for that row.
Since
1.3.1
def drop(cols: Seq[String]): DataFrame
(Scala-specific) Returns a new DataFrame that drops rows containing any null or NaN values in the specified columns.
Since
1.3.1
def drop(cols: Array[String]): DataFrame
Returns a new DataFrame that drops rows containing any null or NaN values in the specified columns.
Since
1.3.1
def drop(how: String): DataFrame
Returns a new DataFrame that drops rows containing null or NaN values.
If how is "any", then drop rows containing any null or NaN values. If how is "all", then drop rows only if every column is null or NaN for that row.
Since
1.3.1
def drop(): DataFrame
Returns a new DataFrame that drops rows containing any null or NaN values.
Since
1.3.1
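The drop overloads above differ only in how many missing values a row may contain before it is removed. The following is a minimal sketch of that difference, written as a Scala script and assuming Spark SQL is on the classpath with a local master; the `people` DataFrame, its column names, and its values are hypothetical sample data, not part of the API.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local Spark session is acceptable for this illustration.
val spark = SparkSession.builder().master("local[1]").appName("na-drop-sketch").getOrCreate()
import spark.implicits._

// Hypothetical sample data: None becomes null; Double.NaN also counts as missing.
val people = Seq[(Option[String], Option[Double], Option[Double])](
  (Some("alice"), Some(1.70), Some(60.0)),
  (Some("bob"), None, Some(Double.NaN)),
  (Some("carol"), Some(1.65), None),
  (None, None, None)
).toDF("name", "height", "weight")

// Drops every row that has a null or NaN in any column: only alice remains.
val completeRows = people.na.drop()

// Drops a row only if every column is null or NaN: only the last row is removed.
val nonEmptyRows = people.na.drop("all")

// Keeps rows with at least 2 non-null, non-NaN values: alice and carol remain.
val atLeastTwo = people.na.drop(2)

// Considers only the "height" column: alice and carol remain.
val withHeight = people.na.drop(Seq("height"))
```

Each call returns a new DataFrame and leaves `people` unchanged, so the variants can be compared side by side.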
final def eq(arg0: AnyRef): Boolean
Definition Classes
AnyRef
def equals(arg0: AnyRef): Boolean
Definition Classes
AnyRef → Any
def fill(valueMap: Map[String, Any]): DataFrame
(Scala-specific) Returns a new DataFrame that replaces null values.
The key of the map is the column name, and the value of the map is the replacement value. The value must be one of the following types: Int, Long, Float, Double, String, Boolean. Replacement values are cast to the column data type.
For example, the following replaces null values in column "A" with string "unknown", and null values in column "B" with numeric value 1.0.
```
df.na.fill(Map(
  "A" -> "unknown",
  "B" -> 1.0
))
```
Since
1.3.1
def fill(valueMap: java.util.Map[String, Any]): DataFrame
Returns a new DataFrame that replaces null values.
The key of the map is the column name, and the value of the map is the replacement value. The value must be one of the following types: Integer, Long, Float, Double, String, Boolean. Replacement values are cast to the column data type.
For example, the following replaces null values in column "A" with string "unknown", and null values in column "B" with numeric value 1.0.
```
import com.google.common.collect.ImmutableMap;
df.na().fill(ImmutableMap.of("A", "unknown", "B", 1.0));
```
Since
1.3.1
def fill(value: Boolean, cols: Array[String]): DataFrame
Returns a new DataFrame that replaces null values in specified boolean columns. If a specified column is not a boolean column, it is ignored.
Since
2.3.0
def fill(value: String, cols: Array[String]): DataFrame
Returns a new DataFrame that replaces null values in specified string columns. If a specified column is not a string column, it is ignored.
Since
1.3.1
def fill(value: Double, cols: Array[String]): DataFrame
Returns a new DataFrame that replaces null or NaN values in specified numeric columns. If a specified column is not a numeric column, it is ignored.
Since
1.3.1
def fill(value: Long, cols: Array[String]): DataFrame
Returns a new DataFrame that replaces null or NaN values in specified numeric columns. If a specified column is not a numeric column, it is ignored.
Since
2.2.0
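The typed fill overloads only touch columns whose type matches the replacement value, and the numeric variants treat NaN as missing as well as null. The sketch below illustrates both points as a Scala script, assuming Spark SQL is on the classpath with a local master; the `users` DataFrame and its columns are hypothetical sample data.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local Spark session is acceptable for this illustration.
val spark = SparkSession.builder().master("local[1]").appName("na-fill-sketch").getOrCreate()
import spark.implicits._

// Hypothetical sample data with one string, one numeric and one boolean column.
val users = Seq[(Option[String], Option[Double], Option[Boolean])](
  (Some("alice"), Some(1.70), Some(true)),
  (None, Some(Double.NaN), None),
  (Some("carol"), None, Some(false))
).toDF("name", "height", "active")

// Numeric fill: replaces both the null and the NaN in "height";
// the string and boolean columns are left as they are.
val filledNumeric = users.na.fill(0.0)

// String fill restricted to two columns: "height" is not a string column,
// so it is ignored and keeps its null and NaN values.
val filledStrings = users.na.fill("unknown", Seq("name", "height"))

// Boolean fill on a single column.
val filledFlags = users.na.fill(true, Seq("active"))
```

To fill several columns of different types in one call, the Map-based fill overload is the more direct choice.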
final def getClass(): Class[_ <: AnyRef]
Definition Classes
AnyRef → Any
Annotations
@IntrinsicCandidate() @native()
def hashCode(): Int
Definition Classes
AnyRef → Any
Annotations
@IntrinsicCandidate() @native()
final def isInstanceOf[T0]: Boolean
Definition Classes
Any
final def ne(arg0: AnyRef): Boolean
Definition Classes
AnyRef
final def notify(): Unit
Definition Classes
AnyRef
Annotations
@IntrinsicCandidate() @native()
final def notifyAll(): Unit
Definition Classes
AnyRef
Annotations
@IntrinsicCandidate() @native()
def replace[T](cols: Array[String], replacement: java.util.Map[T, T]): DataFrame
Replaces values matching keys in replacement map with the corresponding values.
```
import com.google.common.collect.ImmutableMap;

// Replaces all occurrences of 1.0 with 2.0 in columns "height" and "weight".
df.na().replace(new String[] {"height", "weight"}, ImmutableMap.of(1.0, 2.0));

// Replaces all occurrences of "UNKNOWN" with "unnamed" in columns "firstname" and "lastname".
df.na().replace(new String[] {"firstname", "lastname"}, ImmutableMap.of("UNKNOWN", "unnamed"));
```
cols
list of columns to apply the value replacement. If col is "*", replacement is applied on all string, numeric or boolean columns.
replacement
value replacement map. Key and value of replacement map must have the same type, and can only be doubles, strings or booleans. The map value can have nulls.
Since
1.3.1
def replace[T](col: String, replacement: java.util.Map[T, T]): DataFrame
Replaces values matching keys in replacement map with the corresponding values.
```
import com.google.common.collect.ImmutableMap;

// Replaces all occurrences of 1.0 with 2.0 in column "height".
df.na().replace("height", ImmutableMap.of(1.0, 2.0));

// Replaces all occurrences of "UNKNOWN" with "unnamed" in column "name".
df.na().replace("name", ImmutableMap.of("UNKNOWN", "unnamed"));

// Replaces all occurrences of "UNKNOWN" with "unnamed" in all string columns.
df.na().replace("*", ImmutableMap.of("UNKNOWN", "unnamed"));
```
col
name of the column to apply the value replacement. If col is "*", replacement is applied on all string, numeric or boolean columns.
replacement
value replacement map. Key and value of replacement map must have the same type, and can only be doubles, strings or booleans. The map value can have nulls.
Since
1.3.1
final def synchronized[T0](arg0: => T0): T0
Definition Classes
AnyRef
def toString(): String
Definition Classes
AnyRef → Any
final def wait(arg0: Long, arg1: Int): Unit
Definition Classes
AnyRef
Annotations
@throws(classOf[java.lang.InterruptedException])
final def wait(arg0: Long): Unit
Definition Classes
AnyRef
Annotations
@throws(classOf[java.lang.InterruptedException]) @native()
final def wait(): Unit
Definition Classes
AnyRef
Annotations
@throws(classOf[java.lang.InterruptedException])

Deprecated Value Members

def finalize(): Unit
Attributes
protected[lang]
Definition Classes
AnyRef
Annotations
@throws(classOf[java.lang.Throwable]) @Deprecated
Deprecated
(Since version 9)
