Class BarrierTaskContext
- All Implemented Interfaces:
Serializable, org.apache.spark.internal.Logging
A TaskContext with extra contextual info and tooling for tasks in a barrier stage. Use get() to obtain the barrier context for a running barrier task.
-
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging
org.apache.spark.internal.Logging.SparkShellLoggingFilter
-
Method Summary
BarrierTaskContext addTaskCompletionListener(TaskCompletionListener listener)
Adds a (Java friendly) listener to be executed on task completion.

BarrierTaskContext addTaskFailureListener(TaskFailureListener listener)
Adds a listener to be executed on task failure (which includes completion listener failure, if the task body did not already fail).

String[] allGather(String message)
:: Experimental :: Blocks until all tasks in the same stage have reached this routine.

int attemptNumber()
How many times this task has been attempted.

void barrier()
:: Experimental :: Sets a global barrier and waits until all tasks in this stage hit this barrier.

int cpus()
CPUs allocated to the task.

static BarrierTaskContext get()
:: Experimental :: Returns the currently active BarrierTaskContext.

String getLocalProperty(String key)
Get a local property set upstream in the driver, or null if it is missing.

scala.collection.Seq<Source> getMetricsSources(String sourceName)
:: DeveloperApi :: Returns all metrics sources with the given name which are associated with the instance that runs the task.

BarrierTaskInfo[] getTaskInfos()
:: Experimental :: Returns BarrierTaskInfo for all tasks in this barrier stage, ordered by partition ID.

boolean isCompleted()
Returns true if the task has completed.

boolean isFailed()
Returns true if the task has failed.

boolean isInterrupted()
Returns true if the task has been killed.

int numPartitions()
Total number of partitions in the stage that this task belongs to.

int partitionId()
The ID of the RDD partition that is computed by this task.

scala.collection.immutable.Map<String,ResourceInformation> resources()
Resources allocated to the task.

java.util.Map<String,ResourceInformation> resourcesJMap()
(java-specific) Resources allocated to the task.

int stageAttemptNumber()
How many times the stage that this task belongs to has been attempted.

int stageId()
The ID of the stage that this task belongs to.

long taskAttemptId()
An ID that is unique to this task attempt (within the same SparkContext, no two task attempts will share the same attempt ID).

org.apache.spark.executor.TaskMetrics taskMetrics()
Methods inherited from class org.apache.spark.TaskContext
addTaskCompletionListener, addTaskFailureListener, getPartitionId
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.spark.internal.Logging
initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq
-
Method Details
-
get
public static BarrierTaskContext get()
:: Experimental :: Returns the currently active BarrierTaskContext. This can be called inside of user functions to access contextual information about running barrier tasks.
- Returns:
- (undocumented)
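For illustration, a minimal sketch (assuming a barrier-capable RDD named rdd) of obtaining the context inside a barrier task:

    rdd.barrier().mapPartitions { iter =>
      val context = BarrierTaskContext.get()
      // Contextual info about the running barrier task.
      println(s"partition ${context.partitionId()} of ${context.numPartitions()}")
      iter
    }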
-
barrier
public void barrier()
:: Experimental :: Sets a global barrier and waits until all tasks in this stage hit this barrier. Similar to the MPI_Barrier function in MPI, the barrier() call blocks until all tasks in the same stage have reached this routine.

CAUTION! In a barrier stage, each task must make the same number of barrier() calls, in all possible code branches. Otherwise, the job may hang, or a SparkException may be thrown after a timeout. Two examples of misuse are listed below.

1. Calling barrier() in only a subset of the tasks in the same barrier stage leads to a timeout of the call, because the remaining tasks never reach the barrier:

    rdd.barrier().mapPartitions { iter =>
      val context = BarrierTaskContext.get()
      if (context.partitionId() == 0) {
        // Do nothing.
      } else {
        context.barrier()
      }
      iter
    }

2. Wrapping barrier() in a try-catch block may lead to a timeout of the second call: a task that throws inside the try block skips the first barrier() and proceeds to the second one, while the tasks that succeeded are still waiting at the first:

    rdd.barrier().mapPartitions { iter =>
      val context = BarrierTaskContext.get()
      try {
        // Do something that might throw an Exception.
        doSomething()
        context.barrier()
      } catch {
        case e: Exception => logWarning("...", e)
      }
      context.barrier()
      iter
    }
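By contrast, here is a minimal sketch of a correct usage, in which every task reaches the same single barrier() call on every code path (the partition-materialization step is illustrative):

    rdd.barrier().mapPartitions { iter =>
      val context = BarrierTaskContext.get()
      // Phase 1: every task materializes its partition.
      val rows = iter.toList
      // Every task calls barrier() exactly once; all tasks block here.
      context.barrier()
      // Phase 2: all partitions are known to be materialized.
      rows.iterator
    }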
-
allGather
public String[] allGather(String message)
:: Experimental :: Blocks until all tasks in the same stage have reached this routine. Each task passes in a message and returns with a list of all the messages passed in by each of those tasks.

CAUTION! The allGather method requires the same precautions as the barrier method.

The message is of type String rather than Array[Byte] because it is more convenient for the user, at the cost of worse performance.
- Parameters:
message - (undocumented)
- Returns:
- (undocumented)
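A minimal sketch (assuming a barrier RDD named rdd): each task contributes a short status message and receives the messages of all tasks, ordered by partition ID:

    rdd.barrier().mapPartitions { iter =>
      val context = BarrierTaskContext.get()
      val messages = context.allGather(s"partition ${context.partitionId()} ready")
      // Every task sees the same array of messages.
      messages.foreach(println)
      iter
    }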
-
getTaskInfos
public BarrierTaskInfo[] getTaskInfos()
:: Experimental :: Returns BarrierTaskInfo for all tasks in this barrier stage, ordered by partition ID.
- Returns:
- (undocumented)
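A minimal sketch (assuming a barrier RDD named rdd) that prints the address of each task in the barrier stage:

    rdd.barrier().mapPartitions { iter =>
      val context = BarrierTaskContext.get()
      // One BarrierTaskInfo per task, ordered by partition ID.
      context.getTaskInfos().foreach(info => println(info.address))
      iter
    }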
-
isCompleted
public boolean isCompleted()
Description copied from class: TaskContext
Returns true if the task has completed.
- Specified by:
isCompleted in class TaskContext
- Returns:
- (undocumented)
-
isFailed
public boolean isFailed()
Description copied from class: TaskContext
Returns true if the task has failed.
- Specified by:
isFailed in class TaskContext
- Returns:
- (undocumented)
-
isInterrupted
public boolean isInterrupted()
Description copied from class: TaskContext
Returns true if the task has been killed.
- Specified by:
isInterrupted in class TaskContext
- Returns:
- (undocumented)
-
addTaskCompletionListener
public BarrierTaskContext addTaskCompletionListener(TaskCompletionListener listener)
Description copied from class: TaskContext
Adds a (Java friendly) listener to be executed on task completion. This will be called in all situations - success, failure, or cancellation. Adding a listener to an already completed task will result in that listener being called immediately.

Two listeners registered in the same thread will be invoked in reverse order of registration if the task completes after both are registered. There are no ordering guarantees for listeners registered in different threads, or for listeners registered after the task completes. Listeners are guaranteed to execute sequentially.

An example use is for HadoopRDD to register a callback to close the input stream.

Exceptions thrown by the listener will result in failure of the task.
- Specified by:
addTaskCompletionListener in class TaskContext
- Parameters:
listener - (undocumented)
- Returns:
- (undocumented)
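A minimal sketch of the typical cleanup pattern (acquireResource and its close() are hypothetical stand-ins for any per-task resource):

    rdd.mapPartitions { iter =>
      val context = TaskContext.get()
      val resource = acquireResource()  // hypothetical per-task resource
      context.addTaskCompletionListener[Unit] { _ =>
        resource.close()  // runs on success, failure, or cancellation
      }
      iter
    }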
-
addTaskFailureListener
public BarrierTaskContext addTaskFailureListener(TaskFailureListener listener)
Description copied from class: TaskContext
Adds a listener to be executed on task failure (which includes completion listener failure, if the task body did not already fail). Adding a listener to an already failed task will result in that listener being called immediately.

Note: Prior to Spark 3.4.0, failure listeners were only invoked if the main task body failed.
- Specified by:
addTaskFailureListener in class TaskContext
- Parameters:
listener - (undocumented)
- Returns:
- (undocumented)
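A minimal sketch that reports the failure reason (the println is a stand-in for real error handling):

    rdd.mapPartitions { iter =>
      val context = TaskContext.get()
      context.addTaskFailureListener { (ctx, error) =>
        System.err.println(s"Task attempt ${ctx.taskAttemptId()} failed: ${error.getMessage}")
      }
      iter
    }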
-
stageId
public int stageId()
Description copied from class: TaskContext
The ID of the stage that this task belongs to.
- Specified by:
stageId in class TaskContext
- Returns:
- (undocumented)
-
stageAttemptNumber
public int stageAttemptNumber()
Description copied from class: TaskContext
How many times the stage that this task belongs to has been attempted. The first stage attempt will be assigned stageAttemptNumber = 0, and subsequent attempts will have increasing attempt numbers.
- Specified by:
stageAttemptNumber in class TaskContext
- Returns:
- (undocumented)
-
partitionId
public int partitionId()
Description copied from class: TaskContext
The ID of the RDD partition that is computed by this task.
- Specified by:
partitionId in class TaskContext
- Returns:
- (undocumented)
-
numPartitions
public int numPartitions()
Description copied from class: TaskContext
Total number of partitions in the stage that this task belongs to.
- Specified by:
numPartitions in class TaskContext
- Returns:
- (undocumented)
-
attemptNumber
public int attemptNumber()
Description copied from class: TaskContext
How many times this task has been attempted. The first task attempt will be assigned attemptNumber = 0, and subsequent attempts will have increasing attempt numbers.
- Specified by:
attemptNumber in class TaskContext
- Returns:
- (undocumented)
-
taskAttemptId
public long taskAttemptId()
Description copied from class: TaskContext
An ID that is unique to this task attempt (within the same SparkContext, no two task attempts will share the same attempt ID). This is roughly equivalent to Hadoop's TaskAttemptID.
- Specified by:
taskAttemptId in class TaskContext
- Returns:
- (undocumented)
-
getLocalProperty
public String getLocalProperty(String key)
Description copied from class: TaskContext
Get a local property set upstream in the driver, or null if it is missing. See also org.apache.spark.SparkContext.setLocalProperty.
- Specified by:
getLocalProperty in class TaskContext
- Parameters:
key - (undocumented)
- Returns:
- (undocumented)
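A minimal sketch, assuming a SparkContext named sc; the property key "job.tag" is illustrative:

    // On the driver, before submitting the job:
    sc.setLocalProperty("job.tag", "nightly-run")

    // Inside a task launched from the same driver thread:
    rdd.mapPartitions { iter =>
      val tag = TaskContext.get().getLocalProperty("job.tag")  // "nightly-run"
      iter
    }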
-
taskMetrics
public org.apache.spark.executor.TaskMetrics taskMetrics()
- Specified by:
taskMetrics in class TaskContext
-
getMetricsSources
public scala.collection.Seq<Source> getMetricsSources(String sourceName)
Description copied from class: TaskContext
:: DeveloperApi :: Returns all metrics sources with the given name which are associated with the instance that runs the task. For more information see org.apache.spark.metrics.MetricsSystem.
- Specified by:
getMetricsSources in class TaskContext
- Parameters:
sourceName - (undocumented)
- Returns:
- (undocumented)
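A minimal sketch; the source name "JvmSource" is illustrative:

    val context = TaskContext.get()
    // Look up all metrics sources registered under the given name.
    context.getMetricsSources("JvmSource").foreach { source =>
      println(source.sourceName)
    }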
-
cpus
public int cpus()
Description copied from class: TaskContext
CPUs allocated to the task.
- Specified by:
cpus in class TaskContext
- Returns:
- (undocumented)
-
resources
public scala.collection.immutable.Map<String,ResourceInformation> resources()
Description copied from class: TaskContext
Resources allocated to the task. The key is the resource name and the value is information about the resource. Please refer to ResourceInformation for specifics.
- Specified by:
resources in class TaskContext
- Returns:
- (undocumented)
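A minimal sketch, assuming the application requested GPUs per task (for example via spark.task.resource.gpu.amount=1):

    rdd.mapPartitions { iter =>
      val context = TaskContext.get()
      // Option[ResourceInformation] for the "gpu" resource, if allocated.
      context.resources().get("gpu").foreach { gpu =>
        println(s"GPU addresses: ${gpu.addresses.mkString(",")}")
      }
      iter
    }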
-
resourcesJMap
public java.util.Map<String,ResourceInformation> resourcesJMap()
Description copied from class: TaskContext
(java-specific) Resources allocated to the task. The key is the resource name and the value is information about the resource. Please refer to ResourceInformation for specifics.
- Specified by:
resourcesJMap in class TaskContext
- Returns:
- (undocumented)
-