public class NewHadoopRDD<K,V> extends RDD<scala.Tuple2<K,V>> implements org.apache.spark.internal.Logging
:: DeveloperApi ::
An RDD that provides core functionality for reading data stored in Hadoop (e.g., files in HDFS, sources in HBase, or S3), using the new MapReduce API (org.apache.hadoop.mapreduce).

param: sc The SparkContext to associate the RDD with.
param: inputFormatClass Storage format of the data to be read.
param: keyClass Class of the key associated with the inputFormatClass.
param: valueClass Class of the value associated with the inputFormatClass.
See Also:
org.apache.spark.SparkContext.newAPIHadoopRDD()

| Modifier and Type | Class and Description |
|---|---|
| static class | NewHadoopRDD.NewHadoopMapPartitionsWithSplitRDD$ |
| Constructor and Description |
|---|
| NewHadoopRDD(SparkContext sc, Class<? extends org.apache.hadoop.mapreduce.InputFormat<K,V>> inputFormatClass, Class<K> keyClass, Class<V> valueClass, org.apache.hadoop.conf.Configuration _conf) |
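For orientation, here is a minimal sketch of building this RDD, both through the constructor above and through the SparkContext.newAPIHadoopRDD entry point referenced under "See Also". The SparkContext `sc`, the input path, and the choice of TextInputFormat are illustrative assumptions rather than anything prescribed by this page.

```scala
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.Job
import org.apache.hadoop.mapreduce.lib.input.{FileInputFormat, TextInputFormat}
import org.apache.spark.rdd.NewHadoopRDD

// A Job is used here only as a convenient builder for the Hadoop Configuration.
val job = Job.getInstance(sc.hadoopConfiguration)                    // `sc`: an existing SparkContext (assumed)
FileInputFormat.setInputPaths(job, new Path("hdfs:///data/events"))  // hypothetical input path

// Direct use of the constructor documented above.
val hadoopRdd = new NewHadoopRDD[LongWritable, Text](
  sc,
  classOf[TextInputFormat],   // inputFormatClass: storage format of the data to be read
  classOf[LongWritable],      // keyClass
  classOf[Text],              // valueClass
  job.getConfiguration)       // _conf

// The same data through the public SparkContext API (usually the preferred route).
val viaContext = sc.newAPIHadoopRDD(job.getConfiguration,
  classOf[TextInputFormat], classOf[LongWritable], classOf[Text])
```

Constructing the RDD directly is mainly useful when the NewHadoopRDD-specific methods below (for example mapPartitionsWithInputSplit or getConf) are needed; newAPIHadoopRDD is typed as a plain RDD of tuples.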
| Modifier and Type | Method and Description |
|---|---|
| InterruptibleIterator<scala.Tuple2<K,V>> | compute(Partition theSplit, TaskContext context) :: DeveloperApi :: Implemented by subclasses to compute a given partition. |
| static Object | CONFIGURATION_INSTANTIATION_LOCK() Configuration's constructor is not threadsafe (see SPARK-1097 and HADOOP-10456). |
| org.apache.hadoop.conf.Configuration | getConf() |
| Partition[] | getPartitions() Implemented by subclasses to return the set of partitions in this RDD. |
| scala.collection.Seq<String> | getPreferredLocations(Partition hsplit) Optionally overridden by subclasses to specify placement preferences. |
| <U> RDD<U> | mapPartitionsWithInputSplit(scala.Function2<org.apache.hadoop.mapreduce.InputSplit,scala.collection.Iterator<scala.Tuple2<K,V>>,scala.collection.Iterator<U>> f, boolean preservesPartitioning, scala.reflect.ClassTag<U> evidence$1) Maps over a partition, providing the InputSplit that was used as the base of the partition. |
| NewHadoopRDD<K,V> | persist(StorageLevel storageLevel) Set this RDD's storage level to persist its values across operations after the first time it is computed. |
Methods inherited from class org.apache.spark.rdd.RDD:
aggregate, barrier, cache, cartesian, checkpoint, cleanShuffleDependencies, coalesce, collect, collect, context, count, countApprox, countApproxDistinct, countApproxDistinct, countByValue, countByValueApprox, dependencies, distinct, distinct, doubleRDDToDoubleRDDFunctions, filter, first, flatMap, fold, foreach, foreachPartition, getCheckpointFile, getNumPartitions, getResourceProfile, getStorageLevel, glom, groupBy, groupBy, groupBy, id, intersection, intersection, intersection, isCheckpointed, isEmpty, iterator, keyBy, localCheckpoint, map, mapPartitions, mapPartitionsWithEvaluator, mapPartitionsWithIndex, max, min, name, numericRDDToDoubleRDDFunctions, partitioner, partitions, persist, pipe, pipe, pipe, preferredLocations, randomSplit, rddToAsyncRDDActions, rddToOrderedRDDFunctions, rddToPairRDDFunctions, rddToSequenceFileRDDFunctions, reduce, repartition, sample, saveAsObjectFile, saveAsTextFile, saveAsTextFile, setName, sortBy, sparkContext, subtract, subtract, subtract, take, takeOrdered, takeSample, toDebugString, toJavaRDD, toLocalIterator, top, toString, treeAggregate, treeAggregate, treeReduce, union, unpersist, withResources, zip, zipPartitions, zipPartitions, zipPartitions, zipPartitions, zipPartitions, zipPartitions, zipPartitionsWithEvaluator, zipWithIndex, zipWithUniqueId

Methods inherited from interface org.apache.spark.internal.Logging:
$init$, initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, initLock, isTraceEnabled, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning, org$apache$spark$internal$Logging$$log__$eq, org$apache$spark$internal$Logging$$log_, uninitialize

public NewHadoopRDD(SparkContext sc, Class<? extends org.apache.hadoop.mapreduce.InputFormat<K,V>> inputFormatClass, Class<K> keyClass, Class<V> valueClass, org.apache.hadoop.conf.Configuration _conf)
public static Object CONFIGURATION_INSTANTIATION_LOCK()
public org.apache.hadoop.conf.Configuration getConf()
public Partition[] getPartitions()
Specified by: getPartitions in class RDD
The partitions in this array must satisfy the following property:
   rdd.partitions.zipWithIndex.forall { case (partition, index) => partition.index == index }
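As a quick illustration (reusing the hypothetical `hadoopRdd` from the constructor sketch above), each partition in this array is backed by one Hadoop InputSplit, and the invariant quoted above can be checked directly:

```scala
// Number of partitions == number of InputSplits produced by the configured InputFormat.
println(s"partitions: ${hadoopRdd.partitions.length}")

// The invariant from the RDD contract: the i-th partition must report index i.
assert(hadoopRdd.partitions.zipWithIndex.forall { case (p, i) => p.index == i })
```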
public InterruptibleIterator<scala.Tuple2<K,V>> compute(Partition theSplit, TaskContext context)
:: DeveloperApi ::
Implemented by subclasses to compute a given partition.
Specified by: compute in class RDD

public <U> RDD<U> mapPartitionsWithInputSplit(scala.Function2<org.apache.hadoop.mapreduce.InputSplit,scala.collection.Iterator<scala.Tuple2<K,V>>,scala.collection.Iterator<U>> f, boolean preservesPartitioning, scala.reflect.ClassTag<U> evidence$1)
Maps over a partition, providing the InputSplit that was used as the base of the partition.
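As an example of what this enables (a sketch reusing the hypothetical `hadoopRdd` from above, and assuming a file-based input format so the split can be downcast to FileSplit), each record can be tagged with the file it came from:

```scala
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.InputSplit
import org.apache.hadoop.mapreduce.lib.input.FileSplit

val linesWithSource = hadoopRdd.mapPartitionsWithInputSplit(
  (split: InputSplit, iter: Iterator[(LongWritable, Text)]) => {
    // File-based InputFormats back each partition with a FileSplit that knows its path.
    val file = split.asInstanceOf[FileSplit].getPath.toString
    iter.map { case (_, line) => (file, line.toString) }
  },
  preservesPartitioning = false)
```

The ClassTag parameter shown in the Java signature is supplied implicitly when calling from Scala.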
public scala.collection.Seq<String> getPreferredLocations(Partition hsplit)
Overrides: getPreferredLocations in class RDD
Parameters: hsplit - (undocumented)

public NewHadoopRDD<K,V> persist(StorageLevel storageLevel)
Overrides: persist in class RDD
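One practical note when persisting (standard Hadoop behavior rather than anything stated on this page): RecordReaders typically reuse a single Writable instance for every record, so caching the raw key/value tuples can leave the cache full of references to one repeatedly mutated object. A common sketch is to copy the values into plain Scala types before persisting:

```scala
import org.apache.spark.storage.StorageLevel

// Copy the mutable Writables out of the reader before caching; `hadoopRdd` is the
// hypothetical NewHadoopRDD[LongWritable, Text] from the constructor sketch above.
val cached = hadoopRdd
  .map { case (offset, line) => (offset.get(), line.toString) }
  .persist(StorageLevel.MEMORY_AND_DISK)
```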