@Experimental public interface SupportsRuntimeV2Filtering extends Scan
A mix-in interface for Scan. Data sources can implement this interface if they can filter initially planned InputPartitions using predicates Spark infers at runtime.
This interface is very similar to SupportsRuntimeFiltering except it uses data source V2 Predicate instead of data source V1 Filter.
SupportsRuntimeV2Filtering is preferred over SupportsRuntimeFiltering, and a data source should implement only one of them.
Note that Spark will push runtime filters only if they are beneficial.
Modifier and Type | Method and Description |
---|---|
void | filter(Predicate[] predicates): Filters this scan using runtime predicates. |
NamedReference[] | filterAttributes(): Returns attributes this scan can be filtered by at runtime. |
Methods inherited from interface Scan: description, readSchema, reportDriverMetrics, supportedCustomMetrics, toBatch, toContinuousStream, toMicroBatchStream
NamedReference[] filterAttributes()
Spark will call filter(Predicate[]) if it can derive a runtime predicate for any of the filter attributes.
void filter(Predicate[] predicates)
The provided expressions must be interpreted as a set of predicates that are ANDed together.
Implementations may use the predicates to prune initially planned InputPartitions.
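To illustrate the AND semantics, here is a minimal sketch in plain Java. It does not use Spark's classes: PlannedPartition and ConjunctivePruning are hypothetical names, and java.util.function.Predicate is an assumption standing in for the data source V2 Predicate. A partition survives only if every runtime predicate accepts it.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical stand-in for a planned partition; this is not Spark's InputPartition.
final class PlannedPartition {
    final String name;
    final int minId; // smallest id stored in the partition
    final int maxId; // largest id stored in the partition

    PlannedPartition(String name, int minId, int maxId) {
        this.name = name;
        this.minId = minId;
        this.maxId = maxId;
    }
}

final class ConjunctivePruning {
    // Keeps a partition only if every runtime predicate accepts it (AND semantics).
    // java.util.function.Predicate is an assumption standing in for the V2 Predicate.
    static List<PlannedPartition> prune(
            List<PlannedPartition> planned,
            List<Predicate<PlannedPartition>> runtimePredicates) {
        // Fold all predicates into one conjunction; an empty list keeps every partition.
        Predicate<PlannedPartition> all =
            runtimePredicates.stream().reduce(p -> true, Predicate::and);
        return planned.stream().filter(all).collect(Collectors.toList());
    }
}
```

With predicates such as maxId >= 10 and minId <= 19, only a partition whose id range overlaps 10 to 19 is kept. Satisfying just one of the two is not enough.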
If the scan also implements SupportsReportPartitioning, it must preserve the originally reported partitioning during runtime filtering. While applying runtime predicates, the scan may detect that some InputPartitions have no matching data. It can omit such partitions entirely only if it does not report a specific partitioning. Otherwise, the scan can replace the initially planned InputPartitions that have no matching data with empty InputPartitions but must preserve the overall number of partitions.
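The partitioning rule above can be sketched in plain Java as follows. This is an illustration under stated assumptions, not Spark code: Split, PartitionPreservingFilter, and the reportsPartitioning flag are hypothetical names, and a java.util.function.Predicate over integer keys stands in for the runtime predicates.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in for a planned partition; this is not Spark's InputPartition.
final class Split {
    final List<Integer> keys;

    Split(List<Integer> keys) {
        this.keys = keys;
    }

    boolean isEmpty() {
        return keys.isEmpty();
    }
}

final class PartitionPreservingFilter {
    // reportsPartitioning is an assumed flag meaning "this scan reports a specific partitioning".
    static List<Split> apply(
            List<Split> planned,
            Predicate<Integer> runtimePredicate,
            boolean reportsPartitioning) {
        List<Split> result = new ArrayList<>();
        for (Split split : planned) {
            boolean hasMatch = split.keys.stream().anyMatch(runtimePredicate);
            if (hasMatch) {
                // The partition may contain matching data, so keep it as planned.
                result.add(split);
            } else if (reportsPartitioning) {
                // No matching data: substitute an empty partition so the count is unchanged.
                result.add(new Split(Collections.emptyList()));
            }
            // No matching data and no reported partitioning: the partition is omitted.
        }
        return result;
    }
}
```

When reportsPartitioning is true, the result always has the same size as the planned list. When it is false, the result may be shorter.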
Note that Spark will call Scan.toBatch() again after filtering the scan at runtime.
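One likely consequence, sketched below in plain Java, is that a toBatch() call made after filtering should reflect the filtered state rather than a cached pre-filter plan. The class and its members are hypothetical stand-ins: a list of partition names replaces Spark's Batch and InputPartition, and java.util.function.Predicate replaces the V2 Predicate. Treat it as an illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical scan; partition names stand in for Spark's InputPartition objects.
final class RuntimeFilterableScan {
    private List<String> partitions;

    RuntimeFilterableScan(List<String> planned) {
        this.partitions = new ArrayList<>(planned);
    }

    // Narrows the scan's own state; stands in for filter(Predicate[]).
    void filter(Predicate<String> runtimePredicate) {
        partitions = partitions.stream().filter(runtimePredicate).collect(Collectors.toList());
    }

    // Stands in for toBatch(): builds a fresh snapshot from the current state on every call,
    // so a call made after filter() sees only the remaining partitions.
    List<String> toBatch() {
        return new ArrayList<>(partitions);
    }
}
```

This sketch omits non-matching partitions outright, which corresponds to a scan that does not report a specific partitioning.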
Parameters:
predicates - data source V2 predicates used to filter the scan at runtime