org.apache.spark.sql.connector.read.streaming (Spark 4.1.0-preview1 JavaDoc)

package org.apache.spark.sql.connector.read.streaming

Related Packages

Package

org.apache.spark.sql.connector.read

org.apache.spark.sql.connector.read.colstats

org.apache.spark.sql.connector.read.partitioning

Class

Description

AcceptsLatestSeenOffset

Indicates that the source accepts the latest seen offset, which requires the streaming execution to provide that offset when the streaming query restarts from a checkpoint.

CompositeReadLimit

Represents a ReadLimit where the MicroBatchStream should scan approximately the given maximum number of rows, with at least the given minimum number of rows.

ContinuousPartitionReader<T>

A variation on PartitionReader for use with continuous streaming processing.

ContinuousPartitionReaderFactory

A variation on PartitionReaderFactory that returns ContinuousPartitionReader instead of PartitionReader.

ContinuousStream

A SparkDataStream for streaming queries with continuous mode.

MicroBatchStream

A SparkDataStream for streaming queries with micro-batch mode.

Offset

An abstract representation of progress through a MicroBatchStream or ContinuousStream.

PartitionOffset

Used for per-partition offsets in continuous processing.

ReadAllAvailable

Represents a ReadLimit where the MicroBatchStream must scan all the data available at the streaming source.

ReadLimit

Interface representing limits on how much to read from a MicroBatchStream when it implements SupportsAdmissionControl.

ReadMaxBytes

Represents a ReadLimit where the MicroBatchStream should scan files whose total size does not exceed the given maximum total size.

ReadMaxFiles

Represents a ReadLimit where the MicroBatchStream should scan approximately the given maximum number of files.

ReadMaxRows

Represents a ReadLimit where the MicroBatchStream should scan approximately the given maximum number of rows.

ReadMinRows

Represents a ReadLimit where the MicroBatchStream should scan approximately at least the given minimum number of rows.

ReportsSinkMetrics

A mix-in interface for streaming sinks to signal that they can report metrics.

ReportsSourceMetrics

A mix-in interface for SparkDataStream streaming sources to signal that they can report metrics.

SparkDataStream

The base interface representing a readable data stream in a Spark streaming query.

SupportsAdmissionControl

A mix-in interface for SparkDataStream streaming sources to signal that they can control the rate of data ingested into the system.

SupportsTriggerAvailableNow

An interface for streaming sources that support running in Trigger.AvailableNow mode, which processes all the data available at the start of the query, possibly across multiple batches.
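
The relationship between Offset, ReadLimit, and SupportsAdmissionControl described in the table above can be illustrated with a minimal, self-contained sketch. It does not use the Spark classes: Limit, RowOffset, and latestOffset below are hypothetical stand-ins invented for illustration, and the source is assumed to be a single log of numbered rows. The sketch caps each micro-batch at exactly maxRows, whereas the real ReadMaxRows limit is documented as approximate.

```java
// Illustrative sketch only: these are NOT the Spark classes, just simplified
// stand-ins that mimic the idea of a read limit bounding each micro-batch.
public class AdmissionControlSketch {

    /** Hypothetical stand-in for a read limit: "all available" or "at most maxRows". */
    static final class Limit {
        final long maxRows; // Long.MAX_VALUE means "read everything available"

        private Limit(long maxRows) {
            this.maxRows = maxRows;
        }

        static Limit allAvailable() {
            return new Limit(Long.MAX_VALUE);
        }

        static Limit maxRows(long rows) {
            if (rows <= 0) {
                throw new IllegalArgumentException("rows must be positive");
            }
            return new Limit(rows);
        }
    }

    /** Hypothetical stand-in for an offset: a position in one row-numbered log. */
    static final class RowOffset {
        final long position;

        RowOffset(long position) {
            this.position = position;
        }

        /** Serializes the offset so that it could be stored in a checkpoint. */
        String json() {
            return "{\"position\":" + position + "}";
        }
    }

    /**
     * Computes the end offset of the next micro-batch: never beyond the rows
     * currently available, and never more than limit.maxRows past the start.
     */
    static RowOffset latestOffset(RowOffset start, long available, Limit limit) {
        long remaining = Math.max(available - start.position, 0L);
        long take = Math.min(remaining, limit.maxRows);
        return new RowOffset(start.position + take);
    }

    public static void main(String[] args) {
        RowOffset current = new RowOffset(0);
        long available = 250;              // rows currently in the assumed source
        Limit limit = Limit.maxRows(100);  // cap each batch at 100 rows

        // Plan batches until the end offset stops advancing.
        while (true) {
            RowOffset end = latestOffset(current, available, limit);
            if (end.position == current.position) {
                break;
            }
            System.out.println(current.json() + " -> " + end.json());
            current = end;
        }
    }
}
```

With 250 available rows and a cap of 100, the loop plans three batches (0 to 100, 100 to 200, 200 to 250). Swapping in Limit.allAvailable() would plan a single batch covering everything, which mirrors the intent of ReadAllAvailable.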
