Object

org.apache.spark.sql.util.SchemaUtils

public class SchemaUtils extends Object

Utils for handling schemas.

TODO: Merge this file with org.apache.spark.ml.util.SchemaUtils.

Constructor Summary

Constructors

Constructor

Description

SchemaUtils()

Method Summary

Modifier and Type

Method

Description

static void

checkColumnNameDuplication(scala.collection.Seq<String> columnNames, boolean caseSensitiveAnalysis)

Checks if input column names have duplicate identifiers.

static void

checkColumnNameDuplication(scala.collection.Seq<String> columnNames, scala.Function2<String,String,Object> resolver)

Checks if input column names have duplicate identifiers.

static void

checkSchemaColumnNameDuplication(DataType schema, boolean caseSensitiveAnalysis)

Checks if an input schema has duplicate column names.

static void

checkSchemaColumnNameDuplication(StructType schema, scala.Function2<String,String,Object> resolver)

Checks if an input schema has duplicate column names.

static void

checkTransformDuplication(scala.collection.Seq<Transform> transforms, String checkType, boolean isCaseSensitive)

Checks whether any partitioning transform is duplicated.

static String

escapeMetaCharacters(String str)

static scala.collection.Seq<String>

explodeNestedFieldNames(StructType schema)

Returns all column names in this schema as a flat list.

static scala.collection.Seq<Object>

findColumnPosition(scala.collection.Seq<String> column, StructType schema, scala.Function2<String,String,Object> resolver)

Returns the given column's ordinal within the given schema.

static scala.collection.Seq<String>

getColumnName(scala.collection.Seq<Object> position, StructType schema)

Gets the name of the column in the given position.

static scala.collection.Seq<org.apache.spark.sql.catalyst.expressions.NamedExpression>

restoreOriginalOutputNames(scala.collection.Seq<org.apache.spark.sql.catalyst.expressions.NamedExpression> projectList, scala.collection.Seq<String> originalNames)

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Details
- SchemaUtils
  
  public SchemaUtils()
Method Details
- checkSchemaColumnNameDuplication
  
  public static void checkSchemaColumnNameDuplication(DataType schema, boolean caseSensitiveAnalysis)
  
  Checks if an input schema has duplicate column names. Throws an exception if a duplicate is found.
  
  Parameters:
  
  schema - schema to check
  
  caseSensitiveAnalysis - whether the duplication check is case-sensitive
- checkSchemaColumnNameDuplication
  
  public static void checkSchemaColumnNameDuplication(StructType schema, scala.Function2<String,String,Object> resolver)
  
  Checks if an input schema has duplicate column names. Throws an exception if a duplicate is found.
  
  Parameters:
  
  schema - schema to check
  
  resolver - resolver used to determine if two identifiers are equal
- checkColumnNameDuplication
  
  public static void checkColumnNameDuplication(scala.collection.Seq<String> columnNames, scala.Function2<String,String,Object> resolver)
  
  Checks if input column names have duplicate identifiers. Throws an exception if a duplicate is found.
  
  Parameters:
  
  columnNames - column names to check
  
  resolver - resolver used to determine if two identifiers are equal
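A resolver-based check can be pictured as a pairwise comparison in which the resolver decides whether two identifiers count as equal. The sketch below is a standalone illustration and not Spark's implementation: the class `ResolverDuplicateCheckSketch`, the method `checkNoDuplicates`, the use of `java.util.function.BiPredicate` in place of `scala.Function2`, and the `IllegalArgumentException` are all assumptions made for the example.

```java
import java.util.List;
import java.util.function.BiPredicate;

// Standalone illustration; NOT Spark's implementation.
final class ResolverDuplicateCheckSketch {

    // Hypothetical helper: throws if the resolver treats any two names as equal.
    static void checkNoDuplicates(List<String> columnNames,
                                  BiPredicate<String, String> resolver) {
        for (int i = 0; i < columnNames.size(); i++) {
            for (int j = i + 1; j < columnNames.size(); j++) {
                if (resolver.test(columnNames.get(i), columnNames.get(j))) {
                    // Stand-in exception type; the page does not name the real one.
                    throw new IllegalArgumentException(
                        "Duplicate column name: " + columnNames.get(j));
                }
            }
        }
    }

    public static void main(String[] args) {
        // An exact-match resolver accepts names that differ only by case.
        checkNoDuplicates(List.of("id", "ID"), String::equals);
        try {
            // A case-insensitive resolver rejects the same input.
            checkNoDuplicates(List.of("id", "ID"), String::equalsIgnoreCase);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Passing the comparison in as a function keeps the duplicate check independent of any particular case-sensitivity rule.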
- checkColumnNameDuplication
  
  public static void checkColumnNameDuplication(scala.collection.Seq<String> columnNames, boolean caseSensitiveAnalysis)
  
  Checks if input column names have duplicate identifiers. Throws an exception if a duplicate is found.
  
  Parameters:
  
  columnNames - column names to check
  
  caseSensitiveAnalysis - whether the duplication check is case-sensitive
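To make the effect of the case-sensitivity flag concrete, here is a minimal standalone sketch. It is not Spark's implementation: the class `DuplicateNameCheckSketch`, the method `checkNoDuplicates`, the lower-casing strategy, and the `IllegalArgumentException` are assumptions chosen for illustration.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Standalone illustration; NOT Spark's implementation.
final class DuplicateNameCheckSketch {

    // Hypothetical helper: throws if a name occurs twice under the chosen case rule.
    static void checkNoDuplicates(List<String> columnNames, boolean caseSensitive) {
        Set<String> seen = new HashSet<>();
        for (String name : columnNames) {
            // Normalize to lower case only when the comparison ignores case.
            String key = caseSensitive ? name : name.toLowerCase(Locale.ROOT);
            if (!seen.add(key)) {
                // Stand-in exception type; the page does not name the real one.
                throw new IllegalArgumentException("Duplicate column name: " + name);
            }
        }
    }

    public static void main(String[] args) {
        // Case-sensitive: "id" and "ID" are distinct, so this passes.
        checkNoDuplicates(List.of("id", "ID"), true);
        try {
            // Case-insensitive: the same two names now collide.
            checkNoDuplicates(List.of("id", "ID"), false);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```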
- explodeNestedFieldNames
  
  public static scala.collection.Seq<String> explodeNestedFieldNames(StructType schema)
  
  Returns all column names in this schema as a flat list. For example, a schema like:
  
    | - a
    | | - 1
    | | - 2
    | - b
    | - c
    | | - nest
    |   | - 3
  
  will get flattened to: "a", "a.1", "a.2", "b", "c", "c.nest", "c.nest.3"
  
  Parameters:
  
  schema - the schema whose column names are listed
  
  Returns:
  
  the column names, with each nested field written as a dot-separated path
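The flattening described for explodeNestedFieldNames amounts to a pre-order walk that joins each field name to its parent's path with a dot. The following sketch reproduces the example from the description without depending on Spark. The `Field` class is a hypothetical stand-in for a struct field, and `explode` and `exampleSchema` are names invented for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Standalone illustration; NOT Spark's implementation.
final class ExplodeFieldNamesSketch {

    // Hypothetical stand-in for a struct field: a name plus nested fields.
    static final class Field {
        final String name;
        final List<Field> children;

        Field(String name, Field... children) {
            this.name = name;
            this.children = List.of(children);
        }
    }

    // Returns every field path in pre-order, parents before their children.
    static List<String> explode(List<Field> fields) {
        List<String> out = new ArrayList<>();
        collect(fields, "", out);
        return out;
    }

    private static void collect(List<Field> fields, String prefix, List<String> out) {
        for (Field f : fields) {
            String path = prefix.isEmpty() ? f.name : prefix + "." + f.name;
            out.add(path);
            collect(f.children, path, out);
        }
    }

    // The example schema from the method description.
    static List<Field> exampleSchema() {
        return List.of(
            new Field("a", new Field("1"), new Field("2")),
            new Field("b"),
            new Field("c", new Field("nest", new Field("3"))));
    }

    public static void main(String[] args) {
        // Prints [a, a.1, a.2, b, c, c.nest, c.nest.3]
        System.out.println(explode(exampleSchema()));
    }
}
```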
- checkTransformDuplication
  
  public static void checkTransformDuplication(scala.collection.Seq<Transform> transforms, String checkType, boolean isCaseSensitive)
  
  Checks whether any partitioning transform is duplicated. Throws an exception if a duplicate is found.
  
  Parameters:
  
  transforms - the partitioning transforms to check for duplicates
  
  checkType - contextual information around the check, used in an exception message
  
  isCaseSensitive - whether column names are compared case-sensitively
- findColumnPosition
  
  public static scala.collection.Seq<Object> findColumnPosition(scala.collection.Seq<String> column, StructType schema, scala.Function2<String,String,Object> resolver)
  
  Returns the given column's ordinal within the given schema. The returned position has one entry per nesting level of the column.
  
  Parameters:
  
  column - The column to search for in the given struct. If column has more than one element, the lookup descends into nested fields.
  
  schema - The current struct we are looking at.
  
  resolver - The resolver to find the column.
  
  Returns:
  
  the ordinal of the column at each nesting level
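As a rough picture of how a multi-part column name maps to a position, the sketch below resolves one name part per nesting level and records the ordinal found at each level. It is a simplified illustration covering nested structs only, not Spark's implementation. The `Field` class, `findPosition`, `exampleSchema`, the `BiPredicate` resolver, and the exception type are assumptions for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

// Standalone illustration; NOT Spark's implementation.
final class FindColumnPositionSketch {

    // Hypothetical stand-in for a struct field: a name plus nested fields.
    static final class Field {
        final String name;
        final List<Field> children;

        Field(String name, Field... children) {
            this.name = name;
            this.children = List.of(children);
        }
    }

    // Returns one ordinal per name part, descending one level per part.
    static List<Integer> findPosition(List<String> column, List<Field> fields,
                                      BiPredicate<String, String> resolver) {
        List<Integer> position = new ArrayList<>();
        List<Field> current = fields;
        for (String part : column) {
            int idx = -1;
            for (int i = 0; i < current.size(); i++) {
                if (resolver.test(current.get(i).name, part)) {
                    idx = i;
                    break;
                }
            }
            if (idx < 0) {
                // Stand-in exception type for a column that cannot be resolved.
                throw new IllegalArgumentException(
                    "Column not found: " + String.join(".", column));
            }
            position.add(idx);
            current = current.get(idx).children;
        }
        return position;
    }

    // Schema: a(1, 2), b, c(nest(3)).
    static List<Field> exampleSchema() {
        return List.of(
            new Field("a", new Field("1"), new Field("2")),
            new Field("b"),
            new Field("c", new Field("nest", new Field("3"))));
    }

    public static void main(String[] args) {
        // Prints [2, 0, 0]: "c" is field 2, "nest" is field 0 of c, "3" is field 0 of nest.
        System.out.println(
            findPosition(List.of("c", "nest", "3"), exampleSchema(), String::equals));
    }
}
```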
- getColumnName
  
  public static scala.collection.Seq<String> getColumnName(scala.collection.Seq<Object> position, StructType schema)
  
  Gets the name of the column in the given position.
  
  Parameters:
  
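  position - the position of the column, one ordinal per nesting level
  
  schema - the schema in which to look up the position
  
  Returns:
  
  the name of the column, one element per nesting level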
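Reading the signature, getColumnName looks like the inverse of findColumnPosition: it takes a sequence of ordinals and returns a sequence of names. The sketch below shows that reading under the assumption that each ordinal selects a field at one nesting level. It handles nested structs only and is not Spark's implementation. `Field`, `nameAt`, and `exampleSchema` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Standalone illustration; NOT Spark's implementation.
final class GetColumnNameSketch {

    // Hypothetical stand-in for a struct field: a name plus nested fields.
    static final class Field {
        final String name;
        final List<Field> children;

        Field(String name, Field... children) {
            this.name = name;
            this.children = List.of(children);
        }
    }

    // Follows one ordinal per nesting level and collects the field names met.
    static List<String> nameAt(List<Integer> position, List<Field> fields) {
        List<String> name = new ArrayList<>();
        List<Field> current = fields;
        for (int ordinal : position) {
            Field f = current.get(ordinal); // throws IndexOutOfBoundsException if invalid
            name.add(f.name);
            current = f.children;
        }
        return name;
    }

    // Schema: a(1, 2), b, c(nest(3)).
    static List<Field> exampleSchema() {
        return List.of(
            new Field("a", new Field("1"), new Field("2")),
            new Field("b"),
            new Field("c", new Field("nest", new Field("3"))));
    }

    public static void main(String[] args) {
        // Prints [c, nest, 3]
        System.out.println(nameAt(List.of(2, 0, 0), exampleSchema()));
    }
}
```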
- restoreOriginalOutputNames
  
  public static scala.collection.Seq<org.apache.spark.sql.catalyst.expressions.NamedExpression> restoreOriginalOutputNames(scala.collection.Seq<org.apache.spark.sql.catalyst.expressions.NamedExpression> projectList, scala.collection.Seq<String> originalNames)
- escapeMetaCharacters
  
  public static String escapeMetaCharacters(String str)
  
  Parameters:
  
  str - The string to be escaped.
  
  Returns:
  
  The escaped string.
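This page does not say which characters escapeMetaCharacters treats as meta-characters. As a hedged illustration of the general idea, the sketch below assumes that a few common control characters (newline, carriage return, tab) are replaced with their visible backslash forms. The set Spark handles may differ, and `EscapeMetaCharactersSketch` and `escape` are hypothetical names.

```java
// Standalone illustration; NOT Spark's implementation.
final class EscapeMetaCharactersSketch {

    // Replaces an assumed set of control characters with visible escape sequences.
    static String escape(String str) {
        StringBuilder sb = new StringBuilder(str.length());
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            switch (c) {
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The tab and newline are printed as the two-character sequences \t and \n.
        System.out.println(escape("col\tname\n"));
    }
}
```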
