org.apache.spark.rdd

SequenceFileRDDFunctions

class SequenceFileRDDFunctions[K, V] extends Logging with Serializable

Extra functions available on RDDs of (key, value) pairs to create a Hadoop SequenceFile, through an implicit conversion.

Source
SequenceFileRDDFunctions.scala
Note

This can't be part of PairRDDFunctions because we need more implicit parameters to convert our keys and values to Writable.

Linear Supertypes
Serializable, Serializable, Logging, AnyRef, Any

Instance Constructors

  1. new SequenceFileRDDFunctions(self: RDD[(K, V)], _keyWritableClass: Class[_ <: Writable], _valueWritableClass: Class[_ <: Writable])(implicit arg0: IsWritable[K], arg1: ClassTag[K], arg2: IsWritable[V], arg3: ClassTag[V])

Value Members

  1. def saveAsSequenceFile(path: String, codec: Option[Class[_ <: CompressionCodec]] = None): Unit

    Output the RDD as a Hadoop SequenceFile using the Writable types we infer from the RDD's key and value types. If the key or value are Writable, then we use their classes directly; otherwise we map primitive types such as Int and Double to IntWritable, DoubleWritable, etc., byte arrays to BytesWritable, and Strings to Text. The path can be on any Hadoop-supported file system.
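A minimal usage sketch of the implicit conversion described above (assuming Spark and Hadoop are on the classpath and a local master; the paths and app name are illustrative):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.hadoop.io.compress.GzipCodec

object SequenceFileExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("seqfile-demo").setMaster("local[*]"))

    // (Int, String) pairs: keys are mapped to IntWritable, values to Text.
    val pairs = sc.parallelize(Seq(1 -> "one", 2 -> "two", 3 -> "three"))

    // The implicit conversion to SequenceFileRDDFunctions makes
    // saveAsSequenceFile available on any RDD[(K, V)] whose K and V
    // are convertible to Writable.
    pairs.saveAsSequenceFile("/tmp/seqfile-demo")

    // The optional codec argument enables compression via a Hadoop codec.
    pairs.saveAsSequenceFile("/tmp/seqfile-demo-gz", Some(classOf[GzipCodec]))

    // Reading back: sequenceFile infers the Writable types
    // from the requested key and value type parameters.
    val back = sc.sequenceFile[Int, String]("/tmp/seqfile-demo").collect()

    sc.stop()
  }
}
```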