Class GraphExecution
Object
org.apache.spark.sql.pipelines.graph.GraphExecution
- All Implemented Interfaces:
org.apache.spark.internal.Logging
- Direct Known Subclasses:
TriggeredGraphExecution
-
Nested Class Summary
Nested ClassesModifier and TypeClassDescriptionstatic interfacestatic interfaceRepresents the reason why a flow execution should be stopped.static classIndicates that the flow execution should be retried.static classIndicates that the flow execution should be stopped with a specific reason.static classNested classes/interfaces inherited from interface org.apache.spark.internal.Logging
org.apache.spark.internal.Logging.LogStringContext, org.apache.spark.internal.Logging.SparkShellLoggingFilter -
Constructor Summary
Constructors -
Method Summary
Modifier and TypeMethodDescriptionabstract voidBlocks the current thread while any flows are queued or running.determineFlowExecutionActionFromError(scala.Function0<Throwable> ex, scala.Function0<String> flowDisplayName, scala.Function0<Object> currentNumTries, scala.Function0<Object> maxAllowedRetries) Analyze the exception thrown by flow execution and figure out if we should retry the execution, or we need to reanalyze the flow entirely to resolve issues like schema changes.scala.collection.concurrent.TrieMap<org.apache.spark.sql.catalyst.TableIdentifier,FlowExecution> FlowExecutions currently being executed and tracked by the graph execution.abstract RunTerminationReasonReturns the reason why this flow execution has terminated.static org.apache.spark.internal.Logging.LogStringContextLogStringContext(scala.StringContext sc) intmaxRetryAttemptsForFlow(org.apache.spark.sql.catalyst.TableIdentifier flowName) static org.slf4j.Loggerstatic voidorg$apache$spark$internal$Logging$$log__$eq(org.slf4j.Logger x$1) scala.Option<FlowExecution>planAndStartFlow(ResolvedFlow flow) Plans the logicalResolvedFlowinto aFlowExecutionand then starts executing it.voidstart()Starts the execution of flows ingraphForExecution.voidstop()Stops this execution by stopping all streams and terminating any other resources.voidStops execution of a `FlowExecution`.voidstopThread(Thread thread) Stop a thread timeout.abstract TriggerstreamTrigger(Flow flow) The `Trigger` configuration for a streaming flow.Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, waitMethods inherited from interface org.apache.spark.internal.Logging
initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logBasedOnLevel, logDebug, logDebug, logDebug, logDebug, logError, logError, logError, logError, logInfo, logInfo, logInfo, logInfo, logName, LogStringContext, logTrace, logTrace, logTrace, logTrace, logWarning, logWarning, logWarning, logWarning, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq, withLogContext
-
Constructor Details
-
GraphExecution
-
-
Method Details
-
determineFlowExecutionActionFromError
public static GraphExecution.FlowExecutionAction determineFlowExecutionActionFromError(scala.Function0<Throwable> ex, scala.Function0<String> flowDisplayName, scala.Function0<Object> currentNumTries, scala.Function0<Object> maxAllowedRetries) Analyze the exception thrown by flow execution and figure out if we should retry the execution, or we need to reanalyze the flow entirely to resolve issues like schema changes. This should be the narrow waist for all exception analysis in flow execution. TODO: currently it only handles schema change and max retries, we should aim to extend this to include other non-retryable exception as well so we can have a single SoT for all these error matching logic.- Parameters:
ex- Exception to analyze.flowDisplayName- The user facing flow name with the error.currentNumTries- Number of times the flow has been tried.maxAllowedRetries- Maximum number of retries allowed for the flow.- Returns:
- (undocumented)
-
org$apache$spark$internal$Logging$$log_
public static org.slf4j.Logger org$apache$spark$internal$Logging$$log_() -
org$apache$spark$internal$Logging$$log__$eq
public static void org$apache$spark$internal$Logging$$log__$eq(org.slf4j.Logger x$1) -
LogStringContext
public static org.apache.spark.internal.Logging.LogStringContext LogStringContext(scala.StringContext sc) -
graphForExecution
-
streamTrigger
The `Trigger` configuration for a streaming flow. -
flowExecutions
public scala.collection.concurrent.TrieMap<org.apache.spark.sql.catalyst.TableIdentifier,FlowExecution> flowExecutions()FlowExecutions currently being executed and tracked by the graph execution.- Returns:
- (undocumented)
-
planAndStartFlow
Plans the logicalResolvedFlowinto aFlowExecutionand then starts executing it. Implementation note: Thread safe- Parameters:
flow- (undocumented)- Returns:
- None if the flow planner decided that there is no actual update required here. Otherwise returns the corresponding physical flow.
-
start
public void start()Starts the execution of flows ingraphForExecution. Does not block. -
stop
public void stop()Stops this execution by stopping all streams and terminating any other resources.This method may be called multiple times due to race conditions and must be idempotent.
-
stopFlow
Stops execution of a `FlowExecution`. -
awaitCompletion
public abstract void awaitCompletion()Blocks the current thread while any flows are queued or running. Returns when all flows that could be run have completed. When this returns, all flows are either SUCCESSFUL, TERMINATED_WITH_ERROR, SKIPPED, CANCELED, or EXCLUDED. -
getRunTerminationReason
Returns the reason why this flow execution has terminated. If the function is called before the flow has not terminated yet, the behavior is undefined, and may returnUnexpectedRunFailure.- Returns:
- (undocumented)
-
maxRetryAttemptsForFlow
public int maxRetryAttemptsForFlow(org.apache.spark.sql.catalyst.TableIdentifier flowName) -
stopThread
Stop a thread timeout.- Parameters:
thread- (undocumented)
-