@Evolving public interface Partitioning
Represents the output data partitioning of a data source, as returned by SupportsReportPartitioning.outputPartitioning(). Note that this should work like a snapshot: once created, it should be deterministic and always report the same number of partitions and the same "satisfy" result for a given distribution.
|Modifier and Type|Method and Description|
|int|numPartitions() Returns the number of partitions (i.e., InputPartitions) the data source outputs.|
|boolean|satisfy(Distribution distribution) Returns true if this partitioning can satisfy the given distribution, which means Spark does not need to shuffle the output data of this data source for certain operations.|
boolean satisfy(Distribution distribution)
Note that Spark may add new concrete implementations of Distribution in new releases. This method should account for that and always return false for unrecognized distributions. It is recommended to check each new Spark release and support new distributions where possible, so that Spark can avoid a shuffle in more cases.
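The contract described above can be illustrated with a minimal sketch. So that it compiles without Spark on the classpath, the `Distribution`, `ClusteredDistribution`, and `Partitioning` types below are local stand-ins that only mirror the shape of the Spark interfaces; `FutureDistribution` and `KeyClusteredPartitioning` are hypothetical names invented for this example. The sketch assumes a source whose rows with the same key column value always land in the same partition.

```java
import java.util.Arrays;

// Local stand-ins mirroring the shape of the Spark interfaces so the sketch
// compiles on its own. They are NOT the real Spark types.
interface Distribution {}

final class ClusteredDistribution implements Distribution {
    final String[] clusteredColumns;

    ClusteredDistribution(String[] clusteredColumns) {
        this.clusteredColumns = clusteredColumns;
    }
}

// Hypothetical distribution standing in for one added by a later release.
final class FutureDistribution implements Distribution {}

interface Partitioning {
    int numPartitions();

    boolean satisfy(Distribution distribution);
}

// Hypothetical partitioning: rows with the same value of keyColumn are
// always in the same partition. All state is final, so the object behaves
// like a snapshot and gives the same answers every time.
final class KeyClusteredPartitioning implements Partitioning {
    private final int numPartitions;
    private final String keyColumn;

    KeyClusteredPartitioning(int numPartitions, String keyColumn) {
        this.numPartitions = numPartitions;
        this.keyColumn = keyColumn;
    }

    @Override
    public int numPartitions() {
        return numPartitions;
    }

    @Override
    public boolean satisfy(Distribution distribution) {
        if (distribution instanceof ClusteredDistribution) {
            String[] cols = ((ClusteredDistribution) distribution).clusteredColumns;
            // Rows that agree on every clustered column also agree on
            // keyColumn when it is one of them, so they are co-located.
            return Arrays.asList(cols).contains(keyColumn);
        }
        // Unrecognized distribution: report false so Spark shuffles.
        return false;
    }
}
```

Returning false in the fallback branch is the conservative choice: the cost is an extra shuffle, whereas returning true for a distribution the source does not understand could lead Spark to skip a shuffle it actually needs.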