Interface SparkDataStream

All Known Subinterfaces:
AcceptsLatestSeenOffset, ContinuousStream, MicroBatchStream, ReportsSourceMetrics, SupportsAdmissionControl, SupportsTriggerAvailableNow

@Evolving public interface SparkDataStream
The base interface representing a readable data stream in a Spark streaming query. It is responsible for managing the offsets of the streaming source in the streaming query.

Data sources should implement concrete data stream interfaces: MicroBatchStream and ContinuousStream.

Since:
3.0.0
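
A minimal sketch of the offset-management contract is shown below, using a hypothetical record-index offset. A usable source would implement MicroBatchStream or ContinuousStream (both extend SparkDataStream) and add their planning methods on top of these four; the class and field names here are illustrative only, not part of the API.

    import org.apache.spark.sql.connector.read.streaming.Offset;
    import org.apache.spark.sql.connector.read.streaming.SparkDataStream;

    // Hypothetical offset type: a single monotonically increasing record index.
    class RecordIndexOffset extends Offset {
        private final long index;

        RecordIndexOffset(long index) {
            this.index = index;
        }

        long index() {
            return index;
        }

        @Override
        public String json() {
            // Any stable JSON encoding works; Spark stores this string in the
            // checkpoint and hands it back to deserializeOffset on recovery.
            return Long.toString(index);
        }
    }

    // Hypothetical stream that manages offsets for an in-memory source.
    class InMemoryDataStream implements SparkDataStream {

        @Override
        public Offset initialOffset() {
            // Used only when the query has no checkpoint to restart from.
            return new RecordIndexOffset(0L);
        }

        @Override
        public Offset deserializeOffset(String json) {
            try {
                return new RecordIndexOffset(Long.parseLong(json.trim()));
            } catch (NumberFormatException e) {
                throw new IllegalArgumentException("Invalid offset JSON: " + json, e);
            }
        }

        @Override
        public void commit(Offset end) {
            // Offsets up to and including `end` will never be requested again,
            // so any data buffered for them could be released here.
        }

        @Override
        public void stop() {
            // Free connections, threads, or buffers held by the source.
        }
    }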
  • Method Summary

    Modifier and Type  Method                          Description
    void               commit(Offset end)              Informs the source that Spark has completed processing all data for offsets less than or equal to `end` and will only request offsets greater than `end` in the future.
    Offset             deserializeOffset(String json)  Deserialize a JSON string into an Offset of the implementation-defined offset type.
    Offset             initialOffset()                 Returns the initial offset for a streaming query to start reading from.
    void               stop()                          Stop this source and free any resources it has allocated.
  • Method Details

    • initialOffset

      Offset initialOffset()
      Returns the initial offset for a streaming query to start reading from. Note that the streaming data source should not assume that it will start reading from its initial offset: if Spark is restarting an existing query, it will restart from the checkpointed offset rather than the initial one.
    • deserializeOffset

      Offset deserializeOffset(String json)
      Deserialize a JSON string into an Offset of the implementation-defined offset type.
      Throws:
      IllegalArgumentException - if the JSON does not encode a valid offset for this reader
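
      The JSON encoding itself is implementation-defined; the contract is essentially a round trip with Offset.json(). Below is a hypothetical sketch of an offset that tracks per-partition positions, serialized with Jackson (assumed to be on the classpath); the PartitionMapOffset class and its fromJson helper are illustrative, not part of the API.

      import java.util.Map;
      import java.util.TreeMap;

      import com.fasterxml.jackson.core.type.TypeReference;
      import com.fasterxml.jackson.databind.ObjectMapper;

      import org.apache.spark.sql.connector.read.streaming.Offset;

      // Hypothetical offset spanning several partitions of an external system.
      class PartitionMapOffset extends Offset {
          private static final ObjectMapper MAPPER = new ObjectMapper();
          private final Map<String, Long> partitionOffsets;

          PartitionMapOffset(Map<String, Long> partitionOffsets) {
              // TreeMap keeps the JSON output stable across runs.
              this.partitionOffsets = new TreeMap<>(partitionOffsets);
          }

          @Override
          public String json() {
              try {
                  return MAPPER.writeValueAsString(partitionOffsets);
              } catch (Exception e) {
                  throw new IllegalStateException("Could not serialize offset", e);
              }
          }

          // A deserializeOffset(String) implementation would simply delegate here.
          static PartitionMapOffset fromJson(String json) {
              try {
                  Map<String, Long> parsed =
                      MAPPER.readValue(json, new TypeReference<Map<String, Long>>() {});
                  return new PartitionMapOffset(parsed);
              } catch (Exception e) {
                  // Matches the documented contract: invalid JSON yields IllegalArgumentException.
                  throw new IllegalArgumentException("Invalid offset JSON: " + json, e);
              }
          }
      }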
    • commit

      void commit(Offset end)
      Informs the source that Spark has completed processing all data for offsets less than or equal to `end` and will only request offsets greater than `end` in the future.
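
      Commit is the natural place to release data the query will never request again. Below is a hedged sketch assuming a source that buffers unread records in memory, keyed by the record index from the earlier sketch; the CommitAwareBuffer class is hypothetical.

      import java.util.SortedMap;
      import java.util.TreeMap;

      // Hypothetical in-memory buffer for records the query has not yet committed.
      class CommitAwareBuffer {
          private final SortedMap<Long, String> recordsByIndex = new TreeMap<>();

          synchronized void add(long index, String record) {
              recordsByIndex.put(index, record);
          }

          // Called from commit(Offset end): offsets up to and including the
          // committed index will never be requested again, so drop those records.
          synchronized void pruneUpTo(long committedIndex) {
              recordsByIndex.headMap(committedIndex + 1).clear();
          }
      }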
    • stop

      void stop()
      Stop this source and free any resources it has allocated.