Class BarrierTaskContext
- All Implemented Interfaces:
Serializable, org.apache.spark.internal.Logging
A TaskContext with extra contextual info and tooling for tasks in a barrier stage. Use get() to obtain the barrier context for a running barrier task.
-
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging
org.apache.spark.internal.Logging.SparkShellLoggingFilter
-
Method Summary
BarrierTaskContext addTaskCompletionListener(TaskCompletionListener listener)
Adds a (Java friendly) listener to be executed on task completion.

BarrierTaskContext addTaskFailureListener(TaskFailureListener listener)
Adds a listener to be executed on task failure (which includes completion listener failure, if the task body did not already fail).

String[] allGather(String message)
:: Experimental :: Blocks until all tasks in the same stage have reached this routine.

int attemptNumber()
How many times this task has been attempted.

void barrier()
:: Experimental :: Sets a global barrier and waits until all tasks in this stage hit this barrier.

int cpus()
CPUs allocated to the task.

static BarrierTaskContext get()
:: Experimental :: Returns the currently active BarrierTaskContext.

String getLocalProperty(String key)
Get a local property set upstream in the driver, or null if it is missing.

scala.collection.Seq<Source> getMetricsSources(String sourceName)
:: DeveloperApi :: Returns all metrics sources with the given name which are associated with the instance that runs the task.

BarrierTaskInfo[] getTaskInfos()
:: Experimental :: Returns BarrierTaskInfo for all tasks in this barrier stage, ordered by partition ID.

boolean isCompleted()
Returns true if the task has completed.

boolean isFailed()
Returns true if the task has failed.

boolean isInterrupted()
Returns true if the task has been killed.

int numPartitions()
Total number of partitions in the stage that this task belongs to.

int partitionId()
The ID of the RDD partition that is computed by this task.

scala.collection.immutable.Map<String,ResourceInformation> resources()
Resources allocated to the task.

java.util.Map<String,ResourceInformation> resourcesJMap()
(java-specific) Resources allocated to the task.

int stageAttemptNumber()
How many times the stage that this task belongs to has been attempted.

int stageId()
The ID of the stage that this task belongs to.

long taskAttemptId()
An ID that is unique to this task attempt (within the same SparkContext, no two task attempts will share the same attempt ID).

org.apache.spark.executor.TaskMetrics taskMetrics()
Methods inherited from class org.apache.spark.TaskContext
addTaskCompletionListener, addTaskFailureListener, getPartitionId
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.spark.internal.Logging
initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq
-
Method Details
-
get
public static BarrierTaskContext get()
:: Experimental :: Returns the currently active BarrierTaskContext. This can be called inside of user functions to access contextual information about running barrier tasks.
- Returns:
- (undocumented)
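For illustration, a minimal sketch (assuming a barrier-capable RDD named rdd) of obtaining the context inside a barrier task:

    rdd.barrier().mapPartitions { iter =>
      val context = BarrierTaskContext.get()
      // Contextual info about the running barrier task.
      println(s"partition ${context.partitionId()} of ${context.numPartitions()}")
      iter
    }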
-
barrier
public void barrier()
:: Experimental :: Sets a global barrier and waits until all tasks in this stage hit this barrier. Similar to the MPI_Barrier function in MPI, the barrier() call blocks until all tasks in the same stage have reached this routine.

CAUTION! In a barrier stage, each task must make the same number of barrier() calls, in all possible code branches. Otherwise, the job may hang, or a SparkException may be thrown after a timeout. Two examples of misuse are listed below.

1. Calling barrier() in only a subset of the tasks in the same barrier stage leads to a timeout of the call, because the remaining tasks never reach the barrier:

    rdd.barrier().mapPartitions { iter =>
      val context = BarrierTaskContext.get()
      if (context.partitionId() == 0) {
        // Do nothing.
      } else {
        context.barrier()
      }
      iter
    }

2. Wrapping barrier() in a try-catch block may lead to a timeout of the second call: a task that throws inside the try block skips the first barrier() and proceeds to the second one, while the tasks that succeeded are still waiting at the first:

    rdd.barrier().mapPartitions { iter =>
      val context = BarrierTaskContext.get()
      try {
        // Do something that might throw an Exception.
        doSomething()
        context.barrier()
      } catch {
        case e: Exception => logWarning("...", e)
      }
      context.barrier()
      iter
    }
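By contrast, here is a minimal sketch of a correct usage, in which every task reaches the same single barrier() call on every code path (the partition-materialization step is illustrative):

    rdd.barrier().mapPartitions { iter =>
      val context = BarrierTaskContext.get()
      // Phase 1: every task materializes its partition.
      val rows = iter.toList
      // Every task calls barrier() exactly once; all tasks block here.
      context.barrier()
      // Phase 2: all partitions are known to be materialized.
      rows.iterator
    }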
-
allGather
public String[] allGather(String message)
:: Experimental :: Blocks until all tasks in the same stage have reached this routine. Each task passes in a message and returns with a list of all the messages passed in by each of those tasks.

CAUTION! The allGather method requires the same precautions as the barrier method.

The message is of type String rather than Array[Byte] because it is more convenient for the user, at the cost of worse performance.
- Parameters:
message - (undocumented)
- Returns:
- (undocumented)
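A minimal sketch (assuming a barrier RDD named rdd): each task contributes a short status message and receives the messages of all tasks, ordered by partition ID:

    rdd.barrier().mapPartitions { iter =>
      val context = BarrierTaskContext.get()
      val messages = context.allGather(s"partition ${context.partitionId()} ready")
      // Every task sees the same array of messages.
      messages.foreach(println)
      iter
    }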
-
getTaskInfos
public BarrierTaskInfo[] getTaskInfos()
:: Experimental :: Returns BarrierTaskInfo for all tasks in this barrier stage, ordered by partition ID.
- Returns:
- (undocumented)
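A minimal sketch (assuming a barrier RDD named rdd) that prints the address of each task in the barrier stage:

    rdd.barrier().mapPartitions { iter =>
      val context = BarrierTaskContext.get()
      // One BarrierTaskInfo per task, ordered by partition ID.
      context.getTaskInfos().foreach(info => println(info.address))
      iter
    }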
-
isCompleted
public boolean isCompleted()
Description copied from class: TaskContext
Returns true if the task has completed.
- Specified by:
isCompleted in class TaskContext
- Returns:
- (undocumented)
-
isFailed
public boolean isFailed()
Description copied from class: TaskContext
Returns true if the task has failed.
- Specified by:
isFailed in class TaskContext
- Returns:
- (undocumented)
-
isInterrupted
public boolean isInterrupted()
Description copied from class: TaskContext
Returns true if the task has been killed.
- Specified by:
isInterrupted in class TaskContext
- Returns:
- (undocumented)
-
addTaskCompletionListener
public BarrierTaskContext addTaskCompletionListener(TaskCompletionListener listener)
Description copied from class: TaskContext
Adds a (Java friendly) listener to be executed on task completion. This will be called in all situations - success, failure, or cancellation. Adding a listener to an already completed task will result in that listener being called immediately.

Two listeners registered in the same thread will be invoked in reverse order of registration if the task completes after both are registered. There are no ordering guarantees for listeners registered in different threads, or for listeners registered after the task completes. Listeners are guaranteed to execute sequentially.

An example use is for HadoopRDD to register a callback to close the input stream.

Exceptions thrown by the listener will result in failure of the task.
- Specified by:
addTaskCompletionListener in class TaskContext
- Parameters:
listener - (undocumented)
- Returns:
- (undocumented)
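A minimal sketch of the typical cleanup pattern (acquireResource and its close() are hypothetical stand-ins for any per-task resource):

    rdd.mapPartitions { iter =>
      val context = TaskContext.get()
      val resource = acquireResource()  // hypothetical per-task resource
      context.addTaskCompletionListener[Unit] { _ =>
        resource.close()  // runs on success, failure, or cancellation
      }
      iter
    }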
-
addTaskFailureListener
public BarrierTaskContext addTaskFailureListener(TaskFailureListener listener)
Description copied from class: TaskContext
Adds a listener to be executed on task failure (which includes completion listener failure, if the task body did not already fail). Adding a listener to an already failed task will result in that listener being called immediately.

Note: Prior to Spark 3.4.0, failure listeners were only invoked if the main task body failed.
- Specified by:
addTaskFailureListener in class TaskContext
- Parameters:
listener - (undocumented)
- Returns:
- (undocumented)
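A minimal sketch that reports the failure reason (the println is a stand-in for real error handling):

    rdd.mapPartitions { iter =>
      val context = TaskContext.get()
      context.addTaskFailureListener { (ctx, error) =>
        System.err.println(s"Task attempt ${ctx.taskAttemptId()} failed: ${error.getMessage}")
      }
      iter
    }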
-
stageId
public int stageId()
Description copied from class: TaskContext
The ID of the stage that this task belongs to.
- Specified by:
stageId in class TaskContext
- Returns:
- (undocumented)
-
stageAttemptNumber
public int stageAttemptNumber()
Description copied from class: TaskContext
How many times the stage that this task belongs to has been attempted. The first stage attempt will be assigned stageAttemptNumber = 0, and subsequent attempts will have increasing attempt numbers.
- Specified by:
stageAttemptNumber in class TaskContext
- Returns:
- (undocumented)
-
partitionId
public int partitionId()
Description copied from class: TaskContext
The ID of the RDD partition that is computed by this task.
- Specified by:
partitionId in class TaskContext
- Returns:
- (undocumented)
-
numPartitions
public int numPartitions()
Description copied from class: TaskContext
Total number of partitions in the stage that this task belongs to.
- Specified by:
numPartitions in class TaskContext
- Returns:
- (undocumented)
-
attemptNumber
public int attemptNumber()
Description copied from class: TaskContext
How many times this task has been attempted. The first task attempt will be assigned attemptNumber = 0, and subsequent attempts will have increasing attempt numbers.
- Specified by:
attemptNumber in class TaskContext
- Returns:
- (undocumented)
-
taskAttemptId
public long taskAttemptId()
Description copied from class: TaskContext
An ID that is unique to this task attempt (within the same SparkContext, no two task attempts will share the same attempt ID). This is roughly equivalent to Hadoop's TaskAttemptID.
- Specified by:
taskAttemptId in class TaskContext
- Returns:
- (undocumented)
-
getLocalProperty
public String getLocalProperty(String key)
Description copied from class: TaskContext
Get a local property set upstream in the driver, or null if it is missing. See also org.apache.spark.SparkContext.setLocalProperty.
- Specified by:
getLocalProperty in class TaskContext
- Parameters:
key - (undocumented)
- Returns:
- (undocumented)
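A minimal sketch, assuming a SparkContext named sc; the property key "job.tag" is illustrative:

    // On the driver, before submitting the job:
    sc.setLocalProperty("job.tag", "nightly-run")

    // Inside a task launched from the same driver thread:
    rdd.mapPartitions { iter =>
      val tag = TaskContext.get().getLocalProperty("job.tag")  // "nightly-run"
      iter
    }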
-
taskMetrics
public org.apache.spark.executor.TaskMetrics taskMetrics()
- Specified by:
taskMetrics in class TaskContext
-
getMetricsSources
public scala.collection.Seq<Source> getMetricsSources(String sourceName)
Description copied from class: TaskContext
:: DeveloperApi :: Returns all metrics sources with the given name which are associated with the instance that runs the task. For more information see org.apache.spark.metrics.MetricsSystem.
- Specified by:
getMetricsSources in class TaskContext
- Parameters:
sourceName - (undocumented)
- Returns:
- (undocumented)
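A minimal sketch; the source name "JvmSource" is illustrative:

    val context = TaskContext.get()
    // Look up all metrics sources registered under the given name.
    context.getMetricsSources("JvmSource").foreach { source =>
      println(source.sourceName)
    }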
-
cpus
public int cpus()
Description copied from class: TaskContext
CPUs allocated to the task.
- Specified by:
cpus in class TaskContext
- Returns:
- (undocumented)
-
resources
public scala.collection.immutable.Map<String,ResourceInformation> resources()
Description copied from class: TaskContext
Resources allocated to the task. The key is the resource name and the value is information about the resource. Please refer to ResourceInformation for specifics.
- Specified by:
resources in class TaskContext
- Returns:
- (undocumented)
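A minimal sketch, assuming the application requested GPUs per task (for example via spark.task.resource.gpu.amount=1):

    rdd.mapPartitions { iter =>
      val context = TaskContext.get()
      // Option[ResourceInformation] for the "gpu" resource, if allocated.
      context.resources().get("gpu").foreach { gpu =>
        println(s"GPU addresses: ${gpu.addresses.mkString(",")}")
      }
      iter
    }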
-
resourcesJMap
public java.util.Map<String,ResourceInformation> resourcesJMap()
Description copied from class: TaskContext
(java-specific) Resources allocated to the task. The key is the resource name and the value is information about the resource. Please refer to ResourceInformation for specifics.
- Specified by:
resourcesJMap in class TaskContext
- Returns:
- (undocumented)
-