@InterfaceStability.Evolving
public interface SupportsScanColumnarBatch extends DataSourceReader

A mix-in interface for DataSourceReader. Data source readers can implement this interface to output ColumnarBatch and make the scan faster.

| Modifier and Type | Method and Description |
|---|---|
| `default boolean` | `enableBatchRead()` Returns true if the concrete data source reader can read data in batch according to the scan properties like required columns, pushed filters, etc. |
| `java.util.List<InputPartition<ColumnarBatch>>` | `planBatchInputPartitions()` Similar to `DataSourceReader.planInputPartitions()`, but returns columnar data in batches. |
| `default java.util.List<InputPartition<org.apache.spark.sql.catalyst.InternalRow>>` | `planInputPartitions()` Returns a list of `InputPartition`s. |

Methods inherited from interface DataSourceReader: `readSchema`
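For orientation, a minimal sketch of a reader mixing in this interface might look like the following. The class name, schema, and `MyBatchPartition` helper are hypothetical, not part of the API (a fuller `MyBatchPartition` sketch appears after `enableBatchRead()` below):

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.sources.v2.reader.InputPartition;
import org.apache.spark.sql.sources.v2.reader.SupportsScanColumnarBatch;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;
import org.apache.spark.sql.vectorized.ColumnarBatch;

// Hypothetical reader: mixes in SupportsScanColumnarBatch so the scan
// produces ColumnarBatch objects instead of one InternalRow at a time.
public class MyColumnarReader implements SupportsScanColumnarBatch {

  @Override
  public StructType readSchema() {
    // Hypothetical fixed schema: a single int column.
    return new StructType().add("id", DataTypes.IntegerType);
  }

  @Override
  public List<InputPartition<ColumnarBatch>> planBatchInputPartitions() {
    // One batch-producing partition per chunk of data;
    // MyBatchPartition is a hypothetical InputPartition<ColumnarBatch>.
    return Arrays.asList(new MyBatchPartition(), new MyBatchPartition());
  }
}
```

Because `enableBatchRead()` defaults to true, a reader like this sketch serves the whole scan through `planBatchInputPartitions()`.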
default java.util.List<InputPartition<org.apache.spark.sql.catalyst.InternalRow>> planInputPartitions()

Description copied from interface: DataSourceReader

Returns a list of `InputPartition`s. Each `InputPartition` is responsible for creating a data reader to output data of one RDD partition. The number of input partitions returned here is the same as the number of RDD partitions this scan outputs.

Note that this may not be a full scan if the data source reader mixes in other optimization interfaces like column pruning, filter push-down, etc. These optimizations are applied before Spark issues the scan request.

If this method fails (by throwing an exception), the action will fail and no Spark job will be submitted.

Specified by: `planInputPartitions` in interface `DataSourceReader`
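To make the contract concrete, the sketch below shows a hypothetical row-based partition and its reader: the `InputPartition` carries only serializable planning metadata (here, a row range), and the `InputPartitionReader` it creates emits the rows of one RDD partition. All names and the generated data are illustrative assumptions:

```java
import java.io.IOException;

import org.apache.spark.sql.catalyst.InternalRow;
import org.apache.spark.sql.catalyst.expressions.GenericInternalRow;
import org.apache.spark.sql.sources.v2.reader.InputPartition;
import org.apache.spark.sql.sources.v2.reader.InputPartitionReader;

// Hypothetical partition: holds the planning metadata for one RDD
// partition and is shipped to an executor, where it creates the reader.
public class MyRowPartition implements InputPartition<InternalRow> {
  private final int start;
  private final int end;

  public MyRowPartition(int start, int end) {
    this.start = start;
    this.end = end;
  }

  @Override
  public InputPartitionReader<InternalRow> createPartitionReader() {
    return new InputPartitionReader<InternalRow>() {
      private int current = start - 1;

      @Override
      public boolean next() throws IOException {
        current++;
        return current < end;  // false ends this RDD partition
      }

      @Override
      public InternalRow get() {
        // Emit a single-column row holding the current value.
        return new GenericInternalRow(new Object[]{current});
      }

      @Override
      public void close() throws IOException {
        // Nothing to release in this in-memory sketch.
      }
    };
  }
}
```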
java.util.List<InputPartition<ColumnarBatch>> planBatchInputPartitions()

Similar to DataSourceReader.planInputPartitions(), but returns columnar data in batches.

default boolean enableBatchRead()
Returns true if the concrete data source reader can read data in batch according to the scan properties like required columns, pushed filters, etc. Spark will call this method and `planInputPartitions()` to fall back to the normal read path under some conditions.
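As an illustration of that fallback contract, the hypothetical reader below reports batch support only when every required column is an integer; for any other schema Spark takes the row-based path. The names, the type check, and the two-row batch are assumptions for the sketch, and `OnHeapColumnVector` is an internal Spark class used here only to build a self-contained example:

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.execution.vectorized.OnHeapColumnVector;
import org.apache.spark.sql.sources.v2.reader.InputPartition;
import org.apache.spark.sql.sources.v2.reader.InputPartitionReader;
import org.apache.spark.sql.sources.v2.reader.SupportsScanColumnarBatch;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;
import org.apache.spark.sql.vectorized.ColumnVector;
import org.apache.spark.sql.vectorized.ColumnarBatch;

public class MyAdaptiveReader implements SupportsScanColumnarBatch {

  @Override
  public StructType readSchema() {
    return new StructType().add("id", DataTypes.IntegerType);
  }

  @Override
  public boolean enableBatchRead() {
    // Hypothetical condition: vectorize only all-int schemas; otherwise
    // Spark falls back to planInputPartitions(), the normal read path.
    for (StructField f : readSchema().fields()) {
      if (!f.dataType().equals(DataTypes.IntegerType)) {
        return false;
      }
    }
    return true;
  }

  @Override
  public List<InputPartition<ColumnarBatch>> planBatchInputPartitions() {
    return Arrays.asList(new MyBatchPartition());
  }

  // Hypothetical batch partition: produces one two-row ColumnarBatch.
  static class MyBatchPartition implements InputPartition<ColumnarBatch> {
    @Override
    public InputPartitionReader<ColumnarBatch> createPartitionReader() {
      return new InputPartitionReader<ColumnarBatch>() {
        private boolean consumed = false;

        @Override
        public boolean next() {
          if (consumed) {
            return false;
          }
          consumed = true;  // exactly one batch in this sketch
          return true;
        }

        @Override
        public ColumnarBatch get() {
          // Build a tiny single-column batch with an on-heap vector.
          OnHeapColumnVector col = new OnHeapColumnVector(2, DataTypes.IntegerType);
          col.putInt(0, 1);
          col.putInt(1, 2);
          ColumnarBatch batch = new ColumnarBatch(new ColumnVector[]{col});
          batch.setNumRows(2);
          return batch;
        }

        @Override
        public void close() {
          // In a real reader, release the column vectors here.
        }
      };
    }
  }
}
```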