Package org.apache.spark.sql.pipelines.graph
package org.apache.spark.sql.pipelines.graph
-
ClassDescriptionUsed in full graph update to select all flows.Used in full graph updates to select all tables.A
Flowthat reads source[s] completely and appends data to the target, just once.A `FlowExecution` that writes a batch `DataFrame` to a `Table`.Raised when there's a circular dependency in the current pipeline.AFlowthat declares exactly what data should be in the target table.Processor that is responsible for analyzing each flow and sort the nodes in topological orderDataflowGraph represents the core graph structure for Spark declarative pipelines.Resolves theDataflowGraphby processing each node in the graph.Exception thrown when transforming a node in the graph fails with a non-retryable error.Exception thrown when transforming a node in the graph fails because at least one of its dependencies weren't yet transformed.DatasetManageris responsible for materializing tables in the catalog based on the given graph.Wraps table materialization exceptions.A flow's execution may complete for two reasons: 1.Indicates that there was a failure while stopping the flow.Abstract class used to identify failures related to failures stopping an operation/timeouts.AFlowis a node of data transformation in a dataflow graph.A `FlowExecution` specifies how to execute a flow and manages its execution.Specifies how we should filter Flows.A wrapper for the lambda function that defines aFlow.Holds the DataFrame returned by aFlowFunctionalong with the inputs used to construct it.param: identifier The identifier of the flow.Plans execution ofFlows in aDataflowGraphby convertingFlows into 'FlowExecution's.Used in partial graph updates to select flows that flow to "selectedTables".An element in aDataflowGraph.Collection of errors that can be thrown during graph resolution / analysis.Represents the reason why a flow execution should be stopped.Indicates that the flow execution should be retried.Indicates that the flow execution should be stopped with a specific reason.GraphFilter<E>Specifies how we should filter Graph elements.Responsible for properly qualify the identifiers for datasets inside or referenced by the dataflow graph.Represents the identifier for a dataset that is defined or referenced in a pipeline.Represents the identifier for a dataset that is external to the current pipeline.Represents the identifier for a dataset that is defined by the current pipeline.A mutable context for registering tables, views, and flows in a dataflow graph.Validations performed on a `DataflowGraph`.Specifies an input that can be referenced by another Dataset's query.Exception raised when a flow fails to read from a table defined within the pipelineUsed to specify that no flows should be refreshed.Used to select no tables.Represents a node in aDataflowGraphthat can be written to by aFlow.Representing a persistedViewin aDataflowGraph.Executes aDataflowGraphby resolving the graph, materializing datasets, and running the flows.Interface for validating and accessing Pipeline-specific table properties.An implementation of the PipelineUpdateContext trait used in production.Contains the catalog and database context information for query execution.Indicates that run has failed due to a query execution failure.Records information used to track the provenance of a given query to user code.AFlowwhose flow function has been invoked, meaning either: - Its output schema and dependencies are known.AFlowwhose flow function has failed to resolve.AFlowwhose flow function has successfully resolved.A wrapper for a resolved internal input that includes the alias provided by the user.Indicates that a triggered run has successfully completed execution.Indicates that an run entered the failed state..Helper exception class that indicates that a run has to be terminated and tracks the associated termination reason.Used in partial graph updates to select "selectedTables".SQL statement processor context.Class that holds the logical plan and query origin parsed from a SQL statement.Data class for all state that is accumulated while processing a particularSqlGraphRegistrationContext.AFlowthat represents stateful movement of data to some target.A 'FlowExecution' that processes data statefully using Structured Streaming.A `StreamingFlowExecution` that writes a streaming `DataFrame` to a `Table`.A table representing a materialized dataset in aDataflowGraph.Specifies how we should filter Tables.A type ofInputwhere data is loaded from a table.Representing a temporaryViewin aDataflowGraph.Executes all of the flows in the given graph in topological order.Uncaught exception handler which first calls the delegate and then calls the OnFailure function with the uncaught exception.Run could not be associated with a proper root cause.Returns a flow filter that is a union of two flow filtersException raised when a flow tries to read from a dataset that exists but is unresolvedAFlowwhose output schema and dependencies aren't known.Exception raised when a pipeline has one or more flows that cannot be resolvedRepresenting a view in theDataflowGraph.A type ofTableInputthat returns data from a specified schema or from the inferredFlows that write to the table.