Package org.apache.spark.ml.feature
Class NGram
Object
org.apache.spark.ml.PipelineStage
org.apache.spark.ml.Transformer
org.apache.spark.ml.UnaryTransformer<scala.collection.immutable.Seq<String>,scala.collection.immutable.Seq<String>,NGram>
org.apache.spark.ml.feature.NGram
- All Implemented Interfaces:
Serializable,org.apache.spark.internal.Logging,Params,HasInputCol,HasOutputCol,DefaultParamsWritable,Identifiable,MLWritable
public class NGram
extends UnaryTransformer<scala.collection.immutable.Seq<String>,scala.collection.immutable.Seq<String>,NGram>
implements DefaultParamsWritable
A feature transformer that converts the input array of strings into an array of n-grams. Null
values in the input array are ignored.
It returns an array of n-grams where each n-gram is represented by a space-separated string of
words.
When the input is empty, an empty array is returned. When the input array length is less than n (number of elements per n-gram), no n-grams are returned.
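The windowing behavior described above can be illustrated with a minimal plain-Java sketch. It is not Spark's implementation: the class `NGramSketch` and the helper `ngrams` are hypothetical names introduced only for illustration, and the sketch assumes non-null tokens, so it does not model the null handling mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

public class NGramSketch {
    // Hypothetical helper (not part of Spark): slides a window of n tokens
    // over the input and joins each window with single spaces.
    static List<String> ngrams(List<String> tokens, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        List<String> out = new ArrayList<>();
        // The loop body never runs when tokens.size() < n, so short or
        // empty inputs produce an empty result.
        for (int i = 0; i + n <= tokens.size(); i++) {
            out.add(String.join(" ", tokens.subList(i, i + n)));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("I", "heard", "about", "Spark");
        System.out.println(ngrams(tokens, 2)); // [I heard, heard about, about Spark]
        System.out.println(ngrams(tokens, 5)); // [] (input shorter than n)
        System.out.println(ngrams(List.of(), 2)); // [] (empty input)
    }
}
```

With n = 2, a four-token input yields three bigrams; an input shorter than n, or an empty input, yields an empty list, matching the description above.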
-
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging
org.apache.spark.internal.Logging.LogStringContext, org.apache.spark.internal.Logging.SparkShellLoggingFilter -
Constructor Summary
Constructors -
Method Summary
Methods inherited from class org.apache.spark.ml.UnaryTransformer
copy, inputCol, outputCol, setInputCol, setOutputCol, transform, transformSchema
Methods inherited from class org.apache.spark.ml.Transformer
transform, transform, transform
Methods inherited from class org.apache.spark.ml.PipelineStage
params
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface org.apache.spark.ml.util.DefaultParamsWritable
write
Methods inherited from interface org.apache.spark.ml.param.shared.HasInputCol
getInputCol
Methods inherited from interface org.apache.spark.ml.param.shared.HasOutputCol
getOutputCol
Methods inherited from interface org.apache.spark.internal.Logging
initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logBasedOnLevel, logDebug, logDebug, logDebug, logDebug, logError, logError, logError, logError, logInfo, logInfo, logInfo, logInfo, logName, LogStringContext, logTrace, logTrace, logTrace, logTrace, logWarning, logWarning, logWarning, logWarning, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq, withLogContext
Methods inherited from interface org.apache.spark.ml.util.MLWritable
save
Methods inherited from interface org.apache.spark.ml.param.Params
clear, copyValues, defaultCopy, defaultParamMap, estimateMatadataSize, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, onParamChange, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn
-
Constructor Details
-
NGram
-
NGram
public NGram()
-
-
Method Details
-
load
-
read
-
uid
Description copied from interface: Identifiable
An immutable unique ID for the object and its derivatives.
- Specified by:
uid in interface Identifiable
- Returns:
- (undocumented)
-
n
Number of elements per n-gram; must be greater than or equal to 1. Default: 2 (bigram features).
- Returns:
- (undocumented)
-
setN
-
getN
public int getN() -
toString
- Specified by:
toString in interface Identifiable
- Overrides:
toString in class Object
-