column_avro_functions {SparkR}    R Documentation

Avro processing functions for Column operations

Description

Avro processing functions defined for Column.

Usage

from_avro(x, ...)

to_avro(x, ...)

## S4 method for signature 'characterOrColumn'
from_avro(x, jsonFormatSchema, ...)

## S4 method for signature 'characterOrColumn'
to_avro(x, jsonFormatSchema = NULL)

Arguments

x

Column to compute on.

...

additional argument(s) passed as parser options.

jsonFormatSchema

character. Avro schema in JSON string format. Optional for to_avro, where it defaults to NULL.

Details

from_avro Converts a binary column in Avro format into its corresponding Catalyst value. The specified schema must match the data being read; otherwise the behavior is undefined: it may fail or return an arbitrary result. To deserialize the data with a compatible, evolved schema, set the expected Avro schema via the avroSchema option.

to_avro Converts a column into binary Avro format.
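
As a rough illustration of how the two functions fit together, the sketch below round-trips one column through Avro and passes an extra named argument as a parser option. It is not taken from the package: the `spark-avro` coordinate in `sparkPackages`, the record name `Eruption`, and the choice of the `mode` option (as described in the Apache Avro Data Source Guide) are assumptions to adapt to your own deployment.

```r
library(SparkR)

# Assumption: the spark-avro artifact matching your Spark/Scala build is
# fetched at session start; adjust the coordinate to your deployment.
sparkR.session(sparkPackages = "org.apache.spark:spark-avro_2.12:3.1.2")

# A one-column DataFrame built from a base R dataset (double values).
df <- createDataFrame(data.frame(eruptions = faithful$eruptions))

# Hypothetical record schema shared by the writer and the reader.
schema <- paste0(
  '{"type": "record", "name": "Eruption", "fields": [',
  '{"type": ["double", "null"], "name": "eruptions"}]}'
)

# Serialize with an explicit schema so the bytes match what is read back.
serialized <- select(
  df,
  alias(to_avro(struct(column("eruptions")), schema), "payload")
)

# Extra named arguments are forwarded as parser options.
deserialized <- select(
  serialized,
  alias(from_avro(serialized$payload, schema, mode = "PERMISSIVE"), "record")
)

head(select(deserialized, "record.eruptions"))
```

Passing the same schema to both calls keeps the written and read data in agreement, which avoids the undefined behavior described above.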

Note

Avro has been a built-in but external data source module since Spark 2.4. Please deploy the application as described in the deployment section of the "Apache Avro Data Source Guide".

from_avro since 3.1.0

to_avro since 3.1.0

Examples

## Not run: 
df <- createDataFrame(iris)

# Avro schema for the record; field names match the SparkR column names.
schema <- paste(
  c(
    '{"type": "record", "namespace": "example.avro", "name": "Iris", "fields": [',
    '{"type": ["double", "null"], "name": "Sepal_Length"},',
    '{"type": ["double", "null"], "name": "Sepal_Width"},',
    '{"type": ["double", "null"], "name": "Petal_Length"},',
    '{"type": ["double", "null"], "name": "Petal_Width"},',
    '{"type": ["string", "null"], "name": "Species"}]}'
  ),
  collapse = "\n"
)

# Pack all columns into one struct and serialize it to Avro binary.
df_serialized <- select(
  df,
  alias(to_avro(alias(struct(column("*")), "fields")), "payload")
)

# Decode the binary payload back into a struct using the schema above.
df_deserialized <- select(
  df_serialized,
  from_avro(df_serialized$payload, schema)
)

head(df_deserialized)
## End(Not run)

[Package SparkR version 3.1.2 Index]