Class SparkConf

Object
org.apache.spark.SparkConf
All Implemented Interfaces:
Serializable, Cloneable, org.apache.spark.internal.Logging, scala.Cloneable, scala.Serializable

public class SparkConf extends Object implements scala.Cloneable, org.apache.spark.internal.Logging, scala.Serializable
Configuration for a Spark application. Used to set various Spark parameters as key-value pairs.

Most of the time, you would create a SparkConf object with new SparkConf(), which will load values from any spark.* Java system properties set in your application as well. In this case, parameters you set directly on the SparkConf object take priority over system properties.

For unit tests, you can also call new SparkConf(false) to skip loading external settings and get the same configuration no matter what the system properties are.

All setter methods in this class support chaining. For example, you can write new SparkConf().setMaster("local").setAppName("My app").
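
For example, a minimal Java sketch of the construction and chaining described above ("spark.ui.enabled" is just an illustrative key):

    SparkConf conf = new SparkConf()           // also loads spark.* system properties
        .setMaster("local[4]")                 // run locally with 4 threads
        .setAppName("My app")
        .set("spark.ui.enabled", "false");     // direct settings take priority over system properties

    // For unit tests: skip loading external settings entirely
    SparkConf testConf = new SparkConf(false);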

param: loadDefaults whether to also load values from Java system properties

Note:
Once a SparkConf object is passed to Spark, it is cloned and can no longer be modified by the user. Spark does not support modifying the configuration at runtime.
  • Nested Class Summary

    Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging

    org.apache.spark.internal.Logging.SparkShellLoggingFilter
  • Constructor Summary

    Constructors
    SparkConf()
        Create a SparkConf that loads defaults from system properties and the classpath
    SparkConf(boolean loadDefaults)
        Create a SparkConf, specifying via loadDefaults whether to also load values from Java system properties
  • Method Summary

    SparkConf clone()
        Copy this object
    boolean contains(String key)
        Does the configuration contain a given parameter?
    String get(String key)
        Get a parameter; throws a NoSuchElementException if it's not set
    String get(String key, String defaultValue)
        Get a parameter, falling back to a default if not set
    scala.Tuple2<String,String>[] getAll()
        Get all parameters as a list of pairs
    scala.Tuple2<String,String>[] getAllWithPrefix(String prefix)
        Get all parameters that start with prefix
    String getAppId()
        Returns the Spark application id, valid in the Driver after TaskScheduler registration and from the start in the Executor.
    scala.collection.immutable.Map<Object,String> getAvroSchema()
        Gets all the Avro schemas in the configuration used in the generic Avro record serializer
    boolean getBoolean(String key, boolean defaultValue)
        Get a parameter as a boolean, falling back to a default if not set
    static scala.Option<String> getDeprecatedConfig(String key, Map<String,String> conf)
        Looks for available deprecated keys for the given config option, and returns the first value available.
    double getDouble(String key, double defaultValue)
        Get a parameter as a double, falling back to a default if not set
    scala.collection.Seq<scala.Tuple2<String,String>> getExecutorEnv()
        Get all executor environment variables set on this SparkConf
    int getInt(String key, int defaultValue)
        Get a parameter as an integer, falling back to a default if not set
    long getLong(String key, long defaultValue)
        Get a parameter as a long, falling back to a default if not set
    scala.Option<String> getOption(String key)
        Get a parameter as an Option
    long getSizeAsBytes(String key)
        Get a size parameter as bytes; throws a NoSuchElementException if it's not set.
    long getSizeAsBytes(String key, long defaultValue)
        Get a size parameter as bytes, falling back to a default if not set.
    long getSizeAsBytes(String key, String defaultValue)
        Get a size parameter as bytes, falling back to a default if not set.
    long getSizeAsGb(String key)
        Get a size parameter as Gibibytes; throws a NoSuchElementException if it's not set.
    long getSizeAsGb(String key, String defaultValue)
        Get a size parameter as Gibibytes, falling back to a default if not set.
    long getSizeAsKb(String key)
        Get a size parameter as Kibibytes; throws a NoSuchElementException if it's not set.
    long getSizeAsKb(String key, String defaultValue)
        Get a size parameter as Kibibytes, falling back to a default if not set.
    long getSizeAsMb(String key)
        Get a size parameter as Mebibytes; throws a NoSuchElementException if it's not set.
    long getSizeAsMb(String key, String defaultValue)
        Get a size parameter as Mebibytes, falling back to a default if not set.
    long getTimeAsMs(String key)
        Get a time parameter as milliseconds; throws a NoSuchElementException if it's not set.
    long getTimeAsMs(String key, String defaultValue)
        Get a time parameter as milliseconds, falling back to a default if not set.
    long getTimeAsSeconds(String key)
        Get a time parameter as seconds; throws a NoSuchElementException if it's not set.
    long getTimeAsSeconds(String key, String defaultValue)
        Get a time parameter as seconds, falling back to a default if not set.
    static boolean isExecutorStartupConf(String name)
        Return whether the given config should be passed to an executor on start-up.
    static boolean isSparkPortConf(String name)
        Return true if the given config matches either spark.*.port or spark.port.*
    static void logDeprecationWarning(String key)
        Logs a warning message if the given config key is deprecated.
    static org.slf4j.Logger org$apache$spark$internal$Logging$$log_()
    static void org$apache$spark$internal$Logging$$log__$eq(org.slf4j.Logger x$1)
    SparkConf registerAvroSchemas(scala.collection.Seq<org.apache.avro.Schema> schemas)
        Use Kryo serialization and register the given set of Avro schemas so that the generic record serializer can decrease network IO
    SparkConf registerKryoClasses(Class<?>[] classes)
        Use Kryo serialization and register the given set of classes with Kryo.
    SparkConf remove(String key)
        Remove a parameter from the configuration
    SparkConf set(String key, String value)
        Set a configuration variable.
    SparkConf setAll(scala.collection.Iterable<scala.Tuple2<String,String>> settings)
        Set multiple parameters together
    SparkConf setAppName(String name)
        Set a name for your application.
    SparkConf setExecutorEnv(String variable, String value)
        Set an environment variable to be used when launching executors for this application.
    SparkConf setExecutorEnv(scala.collection.Seq<scala.Tuple2<String,String>> variables)
        Set multiple environment variables to be used when launching executors.
    SparkConf setExecutorEnv(scala.Tuple2<String,String>[] variables)
        Set multiple environment variables to be used when launching executors.
    SparkConf setIfMissing(String key, String value)
        Set a parameter if it isn't already configured
    SparkConf setJars(String[] jars)
        Set JAR files to distribute to the cluster.
    SparkConf setJars(scala.collection.Seq<String> jars)
        Set JAR files to distribute to the cluster.
    SparkConf setMaster(String master)
        The master URL to connect to, such as "local" to run locally with one thread, "local[4]" to run locally with 4 cores, or "spark://master:7077" to run on a Spark standalone cluster.
    SparkConf setSparkHome(String home)
        Set the location where Spark is installed on worker nodes.
    String toDebugString()
        Return a string listing all keys and values, one per line.

    Methods inherited from class java.lang.Object

    equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

    Methods inherited from interface org.apache.spark.internal.Logging

    initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq
  • Constructor Details

    • SparkConf

      public SparkConf(boolean loadDefaults)
    • SparkConf

      public SparkConf()
      Create a SparkConf that loads defaults from system properties and the classpath
  • Method Details

    • isExecutorStartupConf

      public static boolean isExecutorStartupConf(String name)
      Return whether the given config should be passed to an executor on start-up.

      Certain authentication configs are required from the executor when it connects to the scheduler, while the rest of the spark configs can be inherited from the driver later.

      Parameters:
      name - (undocumented)
      Returns:
      (undocumented)
    • isSparkPortConf

      public static boolean isSparkPortConf(String name)
      Return true if the given config matches either spark.*.port or spark.port.*.
      Parameters:
      name - (undocumented)
      Returns:
      (undocumented)
    • getDeprecatedConfig

      public static scala.Option<String> getDeprecatedConfig(String key, Map<String,String> conf)
      Looks for available deprecated keys for the given config option, and returns the first value available.
      Parameters:
      key - (undocumented)
      conf - (undocumented)
      Returns:
      (undocumented)
    • logDeprecationWarning

      public static void logDeprecationWarning(String key)
      Logs a warning message if the given config key is deprecated.
      Parameters:
      key - (undocumented)
    • org$apache$spark$internal$Logging$$log_

      public static org.slf4j.Logger org$apache$spark$internal$Logging$$log_()
    • org$apache$spark$internal$Logging$$log__$eq

      public static void org$apache$spark$internal$Logging$$log__$eq(org.slf4j.Logger x$1)
    • set

      public SparkConf set(String key, String value)
      Set a configuration variable.
    • setMaster

      public SparkConf setMaster(String master)
      The master URL to connect to, such as "local" to run locally with one thread, "local[4]" to run locally with 4 cores, or "spark://master:7077" to run on a Spark standalone cluster.
      Parameters:
      master - (undocumented)
      Returns:
      (undocumented)
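
      For example, the master URL forms named above (an illustrative sketch; "master:7077" is a placeholder host):

          conf.setMaster("local");                // run locally with one thread
          conf.setMaster("local[4]");             // run locally with 4 cores
          conf.setMaster("spark://master:7077");  // Spark standalone cluster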
    • setAppName

      public SparkConf setAppName(String name)
      Set a name for your application. Shown in the Spark web UI.
    • setJars

      public SparkConf setJars(scala.collection.Seq<String> jars)
      Set JAR files to distribute to the cluster.
    • setJars

      public SparkConf setJars(String[] jars)
      Set JAR files to distribute to the cluster. (Java-friendly version.)
    • setExecutorEnv

      public SparkConf setExecutorEnv(String variable, String value)
      Set an environment variable to be used when launching executors for this application. These variables are stored as properties of the form spark.executorEnv.VAR_NAME (for example spark.executorEnv.PATH) but this method makes them easier to set.
      Parameters:
      variable - (undocumented)
      value - (undocumented)
      Returns:
      (undocumented)
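
      For example, a sketch of the equivalence described above:

          conf.setExecutorEnv("PATH", "/opt/bin");
          // ...is the same as setting the underlying property directly:
          conf.set("spark.executorEnv.PATH", "/opt/bin");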
    • setExecutorEnv

      public SparkConf setExecutorEnv(scala.collection.Seq<scala.Tuple2<String,String>> variables)
      Set multiple environment variables to be used when launching executors. These variables are stored as properties of the form spark.executorEnv.VAR_NAME (for example spark.executorEnv.PATH) but this method makes them easier to set.
      Parameters:
      variables - (undocumented)
      Returns:
      (undocumented)
    • setExecutorEnv

      public SparkConf setExecutorEnv(scala.Tuple2<String,String>[] variables)
      Set multiple environment variables to be used when launching executors. (Java-friendly version.)
      Parameters:
      variables - (undocumented)
      Returns:
      (undocumented)
    • setSparkHome

      public SparkConf setSparkHome(String home)
      Set the location where Spark is installed on worker nodes.
      Parameters:
      home - (undocumented)
      Returns:
      (undocumented)
    • setAll

      public SparkConf setAll(scala.collection.Iterable<scala.Tuple2<String,String>> settings)
      Set multiple parameters together
    • setIfMissing

      public SparkConf setIfMissing(String key, String value)
      Set a parameter if it isn't already configured
    • registerKryoClasses

      public SparkConf registerKryoClasses(Class<?>[] classes)
      Use Kryo serialization and register the given set of classes with Kryo. If called multiple times, this will append the classes from all calls together.
      Parameters:
      classes - (undocumented)
      Returns:
      (undocumented)
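
      For example, a minimal sketch (MyClass and MyOtherClass are hypothetical application classes):

          conf.registerKryoClasses(new Class<?>[] { MyClass.class, MyOtherClass.class });
          // A later call appends to, rather than replaces, the registered set:
          conf.registerKryoClasses(new Class<?>[] { MyThirdClass.class });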
    • registerAvroSchemas

      public SparkConf registerAvroSchemas(scala.collection.Seq<org.apache.avro.Schema> schemas)
      Use Kryo serialization and register the given set of Avro schemas so that the generic record serializer can decrease network IO
      Parameters:
      schemas - (undocumented)
      Returns:
      (undocumented)
    • getAvroSchema

      public scala.collection.immutable.Map<Object,String> getAvroSchema()
      Gets all the Avro schemas in the configuration used in the generic Avro record serializer
    • remove

      public SparkConf remove(String key)
      Remove a parameter from the configuration
    • get

      public String get(String key)
      Get a parameter; throws a NoSuchElementException if it's not set
    • get

      public String get(String key, String defaultValue)
      Get a parameter, falling back to a default if not set
    • getTimeAsSeconds

      public long getTimeAsSeconds(String key)
      Get a time parameter as seconds; throws a NoSuchElementException if it's not set. If no suffix is provided then seconds are assumed.
      Parameters:
      key - (undocumented)
      Returns:
      (undocumented)
      Throws:
      NoSuchElementException - If the time parameter is not set
      NumberFormatException - If the value cannot be interpreted as seconds
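
      For example, a sketch assuming Spark's usual time-string suffixes ("spark.network.timeout" is a standard time setting):

          conf.set("spark.network.timeout", "120s");
          conf.getTimeAsSeconds("spark.network.timeout");  // 120

          conf.set("spark.network.timeout", "2min");
          conf.getTimeAsSeconds("spark.network.timeout");  // 120

          conf.set("spark.network.timeout", "30");         // no suffix: seconds assumed
          conf.getTimeAsSeconds("spark.network.timeout");  // 30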
    • getTimeAsSeconds

      public long getTimeAsSeconds(String key, String defaultValue)
      Get a time parameter as seconds, falling back to a default if not set. If no suffix is provided then seconds are assumed.
      Parameters:
      key - (undocumented)
      defaultValue - (undocumented)
      Returns:
      (undocumented)
      Throws:
      NumberFormatException - If the value cannot be interpreted as seconds
    • getTimeAsMs

      public long getTimeAsMs(String key)
      Get a time parameter as milliseconds; throws a NoSuchElementException if it's not set. If no suffix is provided then milliseconds are assumed.
      Parameters:
      key - (undocumented)
      Returns:
      (undocumented)
      Throws:
      NoSuchElementException - If the time parameter is not set
      NumberFormatException - If the value cannot be interpreted as milliseconds
    • getTimeAsMs

      public long getTimeAsMs(String key, String defaultValue)
      Get a time parameter as milliseconds, falling back to a default if not set. If no suffix is provided then milliseconds are assumed.
      Parameters:
      key - (undocumented)
      defaultValue - (undocumented)
      Returns:
      (undocumented)
      Throws:
      NumberFormatException - If the value cannot be interpreted as milliseconds
    • getSizeAsBytes

      public long getSizeAsBytes(String key)
      Get a size parameter as bytes; throws a NoSuchElementException if it's not set. If no suffix is provided then bytes are assumed.
      Parameters:
      key - (undocumented)
      Returns:
      (undocumented)
      Throws:
      NoSuchElementException - If the size parameter is not set
      NumberFormatException - If the value cannot be interpreted as bytes
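
      For example, a sketch assuming Spark's usual size-string suffixes, which are binary units as the Kibibyte/Mebibyte/Gibibyte naming here suggests ("spark.driver.maxResultSize" is a standard size setting):

          conf.set("spark.driver.maxResultSize", "2g");
          conf.getSizeAsBytes("spark.driver.maxResultSize");  // 2147483648L (2 * 1024^3)

          conf.set("spark.driver.maxResultSize", "512");      // no suffix: bytes assumed
          conf.getSizeAsBytes("spark.driver.maxResultSize");  // 512L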
    • getSizeAsBytes

      public long getSizeAsBytes(String key, String defaultValue)
      Get a size parameter as bytes, falling back to a default if not set. If no suffix is provided then bytes are assumed.
      Parameters:
      key - (undocumented)
      defaultValue - (undocumented)
      Returns:
      (undocumented)
      Throws:
      NumberFormatException - If the value cannot be interpreted as bytes
    • getSizeAsBytes

      public long getSizeAsBytes(String key, long defaultValue)
      Get a size parameter as bytes, falling back to a default if not set.
      Parameters:
      key - (undocumented)
      defaultValue - (undocumented)
      Returns:
      (undocumented)
      Throws:
      NumberFormatException - If the value cannot be interpreted as bytes
    • getSizeAsKb

      public long getSizeAsKb(String key)
      Get a size parameter as Kibibytes; throws a NoSuchElementException if it's not set. If no suffix is provided then Kibibytes are assumed.
      Parameters:
      key - (undocumented)
      Returns:
      (undocumented)
      Throws:
      NoSuchElementException - If the size parameter is not set
      NumberFormatException - If the value cannot be interpreted as Kibibytes
    • getSizeAsKb

      public long getSizeAsKb(String key, String defaultValue)
      Get a size parameter as Kibibytes, falling back to a default if not set. If no suffix is provided then Kibibytes are assumed.
      Parameters:
      key - (undocumented)
      defaultValue - (undocumented)
      Returns:
      (undocumented)
      Throws:
      NumberFormatException - If the value cannot be interpreted as Kibibytes
    • getSizeAsMb

      public long getSizeAsMb(String key)
      Get a size parameter as Mebibytes; throws a NoSuchElementException if it's not set. If no suffix is provided then Mebibytes are assumed.
      Parameters:
      key - (undocumented)
      Returns:
      (undocumented)
      Throws:
      NoSuchElementException - If the size parameter is not set
      NumberFormatException - If the value cannot be interpreted as Mebibytes
    • getSizeAsMb

      public long getSizeAsMb(String key, String defaultValue)
      Get a size parameter as Mebibytes, falling back to a default if not set. If no suffix is provided then Mebibytes are assumed.
      Parameters:
      key - (undocumented)
      defaultValue - (undocumented)
      Returns:
      (undocumented)
      Throws:
      NumberFormatException - If the value cannot be interpreted as Mebibytes
    • getSizeAsGb

      public long getSizeAsGb(String key)
      Get a size parameter as Gibibytes; throws a NoSuchElementException if it's not set. If no suffix is provided then Gibibytes are assumed.
      Parameters:
      key - (undocumented)
      Returns:
      (undocumented)
      Throws:
      NoSuchElementException - If the size parameter is not set
      NumberFormatException - If the value cannot be interpreted as Gibibytes
    • getSizeAsGb

      public long getSizeAsGb(String key, String defaultValue)
      Get a size parameter as Gibibytes, falling back to a default if not set. If no suffix is provided then Gibibytes are assumed.
      Parameters:
      key - (undocumented)
      defaultValue - (undocumented)
      Returns:
      (undocumented)
      Throws:
      NumberFormatException - If the value cannot be interpreted as Gibibytes
    • getOption

      public scala.Option<String> getOption(String key)
      Get a parameter as an Option
    • getAll

      public scala.Tuple2<String,String>[] getAll()
      Get all parameters as a list of pairs
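
      For example, consuming the returned pairs from Java (a sketch; Scala Tuple2 values are read with _1() and _2()):

          for (scala.Tuple2<String, String> kv : conf.getAll()) {
              System.out.println(kv._1() + " = " + kv._2());
          }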
    • getAllWithPrefix

      public scala.Tuple2<String,String>[] getAllWithPrefix(String prefix)
      Get all parameters that start with prefix
      Parameters:
      prefix - (undocumented)
      Returns:
      (undocumented)
    • getInt

      public int getInt(String key, int defaultValue)
      Get a parameter as an integer, falling back to a default if not set
      Parameters:
      key - (undocumented)
      defaultValue - (undocumented)
      Returns:
      (undocumented)
      Throws:
      NumberFormatException - If the value cannot be interpreted as an integer
    • getLong

      public long getLong(String key, long defaultValue)
      Get a parameter as a long, falling back to a default if not set
      Parameters:
      key - (undocumented)
      defaultValue - (undocumented)
      Returns:
      (undocumented)
      Throws:
      NumberFormatException - If the value cannot be interpreted as a long
    • getDouble

      public double getDouble(String key, double defaultValue)
      Get a parameter as a double, falling back to a default if not set
      Parameters:
      key - (undocumented)
      defaultValue - (undocumented)
      Returns:
      (undocumented)
      Throws:
      NumberFormatException - If the value cannot be interpreted as a double
    • getBoolean

      public boolean getBoolean(String key, boolean defaultValue)
      Get a parameter as a boolean, falling back to a default if not set
      Parameters:
      key - (undocumented)
      defaultValue - (undocumented)
      Returns:
      (undocumented)
      Throws:
      IllegalArgumentException - If the value cannot be interpreted as a boolean
    • getExecutorEnv

      public scala.collection.Seq<scala.Tuple2<String,String>> getExecutorEnv()
      Get all executor environment variables set on this SparkConf
    • getAppId

      public String getAppId()
      Returns the Spark application id, valid in the Driver after TaskScheduler registration and from the start in the Executor.
      Returns:
      (undocumented)
    • contains

      public boolean contains(String key)
      Does the configuration contain a given parameter?
    • clone

      public SparkConf clone()
      Copy this object
    • toDebugString

      public String toDebugString()
      Return a string listing all keys and values, one per line. This is useful to print the configuration out for debugging.
      Returns:
      (undocumented)
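
      For example, the debugging use mentioned above (a sketch):

          // Print the effective configuration, one key=value pair per line
          System.out.println(conf.toDebugString());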