Package org.apache.spark.api.r

Class RRDD<T>

java.lang.Object
  org.apache.spark.rdd.RDD<U>
    org.apache.spark.api.r.BaseRRDD<T,byte[]>
      org.apache.spark.api.r.RRDD<T>

All Implemented Interfaces:
Serializable, org.apache.spark.internal.Logging

public class RRDD<T> extends BaseRRDD<T,byte[]>

An RDD that stores serialized R objects as Array[Byte].
Nested Class Summary

Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging:
org.apache.spark.internal.Logging.LogStringContext, org.apache.spark.internal.Logging.SparkShellLoggingFilter
Constructor Summary

Constructors:
RRDD
Method Summary

Modifier and Type          Method and Description
JavaRDD<byte[]>            asJavaRDD()
static JavaRDD<byte[]>     createRDDFromArray(JavaSparkContext jsc, byte[][] arr)
                           Create an RRDD given a sequence of byte arrays.
static JavaRDD<byte[]>     createRDDFromFile(JavaSparkContext jsc, String fileName, int parallelism)
                           Create an RRDD given a temporary file name.
static JavaSparkContext    createSparkContext(String master, String appName, String sparkHome, String[] jars, Map<Object,Object> sparkEnvirMap, Map<Object,Object> sparkExecutorEnvMap)

Methods inherited from class org.apache.spark.api.r.BaseRRDD:
compute, getPartitions

Methods inherited from class org.apache.spark.rdd.RDD:
aggregate, barrier, cache, cartesian, checkpoint, cleanShuffleDependencies, coalesce, collect, collect, context, count, countApprox, countApproxDistinct, countApproxDistinct, countByValue, countByValueApprox, dependencies, distinct, distinct, doubleRDDToDoubleRDDFunctions, filter, first, flatMap, fold, foreach, foreachPartition, getCheckpointFile, getNumPartitions, getResourceProfile, getStorageLevel, glom, groupBy, groupBy, groupBy, id, intersection, intersection, intersection, isCheckpointed, isEmpty, iterator, keyBy, localCheckpoint, map, mapPartitions, mapPartitionsWithEvaluator, mapPartitionsWithIndex, max, min, name, numericRDDToDoubleRDDFunctions, partitioner, partitions, persist, persist, pipe, pipe, pipe, preferredLocations, randomSplit, rddToAsyncRDDActions, rddToOrderedRDDFunctions, rddToPairRDDFunctions, rddToSequenceFileRDDFunctions, reduce, repartition, sample, saveAsObjectFile, saveAsTextFile, saveAsTextFile, setName, sortBy, sparkContext, subtract, subtract, subtract, take, takeOrdered, takeSample, toDebugString, toJavaRDD, toLocalIterator, top, toString, treeAggregate, treeAggregate, treeReduce, union, unpersist, withResources, zip, zipPartitions, zipPartitions, zipPartitions, zipPartitions, zipPartitions, zipPartitions, zipPartitionsWithEvaluator, zipWithIndex, zipWithUniqueId

Methods inherited from class java.lang.Object:
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait

Methods inherited from interface org.apache.spark.internal.Logging:
initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logDebug, logDebug, logDebug, logDebug, logError, logError, logError, logError, logInfo, logInfo, logInfo, logInfo, logName, LogStringContext, logTrace, logTrace, logTrace, logTrace, logWarning, logWarning, logWarning, logWarning, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq, withLogContext
Constructor Details

RRDD
Method Details

createSparkContext

public static JavaSparkContext createSparkContext(String master, String appName, String sparkHome, String[] jars, Map<Object,Object> sparkEnvirMap, Map<Object,Object> sparkExecutorEnvMap)
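A minimal sketch of calling this entry point directly from Java. The master URL, application name, and configuration keys below are illustrative assumptions; in normal operation the SparkR backend invokes this method when an R session starts, rather than user code.

    import java.util.HashMap;

    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.api.r.RRDD;

    public class CreateSparkContextSketch {
      public static void main(String[] args) {
        // Hypothetical environment maps; the keys and values are assumptions.
        HashMap<Object, Object> sparkEnvirMap = new HashMap<>();
        sparkEnvirMap.put("spark.executor.memory", "1g");
        HashMap<Object, Object> sparkExecutorEnvMap = new HashMap<>();

        JavaSparkContext jsc = RRDD.createSparkContext(
            "local[2]",        // master
            "rrdd-sketch",     // appName
            "",                // sparkHome (assumed resolvable from the environment)
            new String[0],     // jars
            sparkEnvirMap,
            sparkExecutorEnvMap);

        System.out.println("application: " + jsc.appName());
        jsc.stop();
      }
    }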
createRDDFromArray

public static JavaRDD<byte[]> createRDDFromArray(JavaSparkContext jsc, byte[][] arr)

Create an RRDD given a sequence of byte arrays. Used to create an RRDD when parallelize is called from R.

Parameters:
jsc - (undocumented)
arr - (undocumented)
Returns:
(undocumented)
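A hedged sketch of creating an RRDD from in-memory byte arrays. Real callers sit on the R side of the bridge, where each element is an R object already serialized by SparkR; the UTF-8 strings below are stand-in payloads, not genuine R serialization.

    import java.nio.charset.StandardCharsets;

    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.api.r.RRDD;

    public class CreateRDDFromArraySketch {
      public static void main(String[] args) {
        JavaSparkContext jsc = new JavaSparkContext("local[2]", "rrdd-from-array");

        // Stand-ins for R objects serialized to bytes by SparkR.
        byte[][] arr = {
            "serialized-r-object-1".getBytes(StandardCharsets.UTF_8),
            "serialized-r-object-2".getBytes(StandardCharsets.UTF_8)
        };

        JavaRDD<byte[]> rdd = RRDD.createRDDFromArray(jsc, arr);
        System.out.println("elements: " + rdd.count());

        jsc.stop();
      }
    }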
createRDDFromFile

public static JavaRDD<byte[]> createRDDFromFile(JavaSparkContext jsc, String fileName, int parallelism)

Create an RRDD given a temporary file name. This is used to create an RRDD when parallelize is called on large R objects.

Parameters:
jsc - (undocumented)
fileName - name of temporary file on the driver machine
parallelism - number of slices; defaults to 4
Returns:
(undocumented)
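A sketch under the assumption that the driver has already written the serialized R objects to a temporary file. The path /tmp/sparkr-objects.bin is hypothetical, and the file's binary layout must match what SparkR's parallelize writes for the resulting records to be meaningful.

    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.api.r.RRDD;

    public class CreateRDDFromFileSketch {
      public static void main(String[] args) {
        JavaSparkContext jsc = new JavaSparkContext("local[2]", "rrdd-from-file");

        // Hypothetical temporary file produced by the SparkR driver.
        JavaRDD<byte[]> rdd =
            RRDD.createRDDFromFile(jsc, "/tmp/sparkr-objects.bin", 4);

        System.out.println("slices: " + rdd.getNumPartitions());
        jsc.stop();
      }
    }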
asJavaRDD

public JavaRDD<byte[]> asJavaRDD()
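Constructing an RRDD directly requires internal bridge arguments (the serialized R function, serializer names, broadcast variables), so the sketch below shows the equivalent Scala-to-Java wrapping on an ordinary RDD of byte arrays. That asJavaRDD performs this kind of wrap is an assumption, not a statement about its implementation.

    import java.util.Arrays;

    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    public class AsJavaRDDSketch {
      public static void main(String[] args) {
        JavaSparkContext jsc = new JavaSparkContext("local[2]", "as-java-rdd-sketch");

        JavaRDD<byte[]> original = jsc.parallelize(Arrays.asList(
            new byte[] {1, 2}, new byte[] {3, 4}));

        // JavaRDD.fromRDD wraps a Scala RDD<T> as a JavaRDD<T>; asJavaRDD
        // presumably exposes the RRDD to Java callers in the same spirit.
        JavaRDD<byte[]> wrapped =
            JavaRDD.fromRDD(original.rdd(), original.classTag());

        System.out.println("count: " + wrapped.count());
        jsc.stop();
      }
    }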