Interface PartitionReaderFactory
- All Superinterfaces:
Serializable
- All Known Subinterfaces:
ContinuousPartitionReaderFactory
A factory used to create PartitionReader instances.
If any method of an implementation of this interface, or of the returned PartitionReader, throws an exception, the corresponding Spark task fails and is retried until the maximum number of retries is reached.
- Since:
- 3.0.0
-
Method Summary
Modifier and Type, Method, Description:
- default PartitionReader<ColumnarBatch> createColumnarReader(InputPartition partition): Returns a columnar partition reader to read data from the given InputPartition.
- PartitionReader<org.apache.spark.sql.catalyst.InternalRow> createReader(InputPartition partition): Returns a row-based partition reader to read data from the given InputPartition.
- default boolean supportColumnarReads(InputPartition partition): Returns true if the given InputPartition should be read by Spark in a columnar way.
-
Method Details
-
createReader
Returns a row-based partition reader to read data from the given InputPartition.
Implementations probably need to cast the input partition to the concrete InputPartition class defined for the data source.
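The cast described for createReader can be sketched as follows. This is a minimal, self-contained illustration rather than Spark's own code. The three interfaces at the top are simplified stand-ins for the types in org.apache.spark.sql.connector.read (the real PartitionReader extends Closeable and its next() may throw IOException), rows are modelled as Object[] instead of InternalRow, and RangePartition and RangeReaderFactory are hypothetical names.

```java
import java.io.Serializable;

// Simplified stand-ins for Spark's connector types, declared here only so the
// sketch compiles without Spark on the classpath. They are NOT the real API.
interface InputPartition extends Serializable {}

interface PartitionReader<T> {
    boolean next();   // advance to the next record; false when exhausted
    T get();          // the current record
    void close();
}

interface PartitionReaderFactory extends Serializable {
    PartitionReader<Object[]> createReader(InputPartition partition);
}

// Hypothetical partition: the half-open integer range [start, end).
final class RangePartition implements InputPartition {
    final int start;
    final int end;

    RangePartition(int start, int end) {
        this.start = start;
        this.end = end;
    }
}

// Hypothetical factory belonging to the data source that defines RangePartition.
final class RangeReaderFactory implements PartitionReaderFactory {
    @Override
    public PartitionReader<Object[]> createReader(InputPartition partition) {
        // The method receives the generic type, so cast to the concrete class
        // and fail loudly if some other partition type is passed in.
        if (!(partition instanceof RangePartition)) {
            throw new IllegalArgumentException("Unexpected partition type: " + partition);
        }
        final RangePartition range = (RangePartition) partition;

        return new PartitionReader<Object[]>() {
            private int current = range.start - 1;

            @Override
            public boolean next() {
                current++;
                return current < range.end;
            }

            @Override
            public Object[] get() {
                return new Object[] { current };  // one single-column row
            }

            @Override
            public void close() { }
        };
    }
}
```

Throwing on an unexpected partition type is one possible design choice: per the class description, an exception here would fail the task and trigger a retry, which surfaces the mismatch instead of silently reading nothing.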
createColumnarReader
Returns a columnar partition reader to read data from the given InputPartition.
Implementations probably need to cast the input partition to the concrete InputPartition class defined for the data source.
supportColumnarReads
Returns true if the given InputPartition should be read by Spark in a columnar way. This means that implementations must also implement createColumnarReader(InputPartition) for the input partitions for which this method returns true.
As of Spark 2.4, Spark can read either all input partitions in a columnar way or none of them. A data source cannot mix columnar and row-based partitions. This may be relaxed in future versions.
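The all-or-nothing rule for columnar reads can be illustrated with a small sketch. It is an illustration under stated assumptions, not Spark code. The interfaces are simplified stand-ins for Spark's types, a columnar batch is modelled as int[] rather than ColumnarBatch, and the stand-in defaults (row-based only, columnar reader unsupported) are assumptions suggested by the "default" modifiers in the method summary. FilePartition, FileReaderFactory and ColumnarChecks.isUniform are hypothetical names.

```java
import java.io.Serializable;
import java.util.List;

// Simplified stand-ins for Spark's connector types (NOT the real API).
interface InputPartition extends Serializable {}

interface PartitionReader<T> {
    boolean next();
    T get();
    void close();
}

interface PartitionReaderFactory extends Serializable {
    PartitionReader<Object[]> createReader(InputPartition partition);

    // Assumed default behaviour: columnar reads are unsupported unless overridden.
    default PartitionReader<int[]> createColumnarReader(InputPartition partition) {
        throw new UnsupportedOperationException("Cannot create columnar reader.");
    }

    default boolean supportColumnarReads(InputPartition partition) {
        return false;
    }
}

// Hypothetical partition identified by a file path.
final class FilePartition implements InputPartition {
    final String path;

    FilePartition(String path) {
        this.path = path;
    }
}

// Hypothetical factory whose columnar decision is one flag fixed at
// construction, so every partition receives the same answer.
final class FileReaderFactory implements PartitionReaderFactory {
    private final boolean columnar;

    FileReaderFactory(boolean columnar) {
        this.columnar = columnar;
    }

    @Override
    public boolean supportColumnarReads(InputPartition partition) {
        return columnar;
    }

    @Override
    public PartitionReader<Object[]> createReader(InputPartition partition) {
        FilePartition file = (FilePartition) partition;
        return singleValueReader(new Object[] { file.path });
    }

    @Override
    public PartitionReader<int[]> createColumnarReader(InputPartition partition) {
        if (!columnar) {
            throw new UnsupportedOperationException("This factory is row-based only.");
        }
        FilePartition file = (FilePartition) partition;
        return singleValueReader(new int[] { file.path.length() });
    }

    // Reader that yields exactly one value and is then exhausted.
    private static <T> PartitionReader<T> singleValueReader(final T value) {
        return new PartitionReader<T>() {
            private boolean consumed = false;

            @Override
            public boolean next() {
                if (consumed) {
                    return false;
                }
                consumed = true;
                return true;
            }

            @Override
            public T get() {
                return value;
            }

            @Override
            public void close() { }
        };
    }
}

// Hypothetical test helper: true when the factory answers supportColumnarReads
// identically for every partition (all columnar or all row-based).
final class ColumnarChecks {
    static boolean isUniform(PartitionReaderFactory factory,
                             List<? extends InputPartition> partitions) {
        int columnarCount = 0;
        for (InputPartition partition : partitions) {
            if (factory.supportColumnarReads(partition)) {
                columnarCount++;
            }
        }
        return columnarCount == 0 || columnarCount == partitions.size();
    }
}
```

A check like isUniform could run in a data source's own unit tests to catch a factory that decides per partition, for example by file extension, and so mixes the two modes.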