Class BasicBlockReplicationPolicy

Object
org.apache.spark.storage.BasicBlockReplicationPolicy
All Implemented Interfaces:
org.apache.spark.internal.Logging, BlockReplicationPolicy

public class BasicBlockReplicationPolicy extends Object implements BlockReplicationPolicy, org.apache.spark.internal.Logging
  • Nested Class Summary

    Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging

    org.apache.spark.internal.Logging.SparkShellLoggingFilter
  • Constructor Summary

    Constructors
    Constructor
    Description
     
  • Method Summary

    Modifier and Type
    Method
    Description
    scala.collection.immutable.List<BlockManagerId>
    prioritize(BlockManagerId blockManagerId, scala.collection.Seq<BlockManagerId> peers, scala.collection.mutable.HashSet<BlockManagerId> peersReplicatedTo, BlockId blockId, int numReplicas)
    Method to prioritize a bunch of candidate peers of a block manager.

    Methods inherited from class java.lang.Object

    equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

    Methods inherited from interface org.apache.spark.internal.Logging

    initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq
  • Constructor Details

    • BasicBlockReplicationPolicy

      public BasicBlockReplicationPolicy()
  • Method Details

    • prioritize

      public scala.collection.immutable.List<BlockManagerId> prioritize(BlockManagerId blockManagerId, scala.collection.Seq<BlockManagerId> peers, scala.collection.mutable.HashSet<BlockManagerId> peersReplicatedTo, BlockId blockId, int numReplicas)
      Method to prioritize a bunch of candidate peers of a block manager. This implementation replicates the behavior of block replication in HDFS. For a given number of replicas needed, we choose a peer within the rack, one outside and remaining blockmanagers are chosen at random, in that order till we meet the number of replicas needed. This works best with a total replication factor of 3, like HDFS.

      Specified by:
      prioritize in interface BlockReplicationPolicy
      Parameters:
      blockManagerId - Id of the current BlockManager for self identification
      peers - A list of peers of a BlockManager
      peersReplicatedTo - Set of peers already replicated to
      blockId - BlockId of the block being replicated. This can be used as a source of randomness if needed.
      numReplicas - Number of peers we need to replicate to
      Returns:
      A prioritized list of peers. Lower the index of a peer, higher its priority