Package org.apache.spark.ml.feature
Class VarianceThresholdSelector
Object
org.apache.spark.ml.PipelineStage
org.apache.spark.ml.Estimator<VarianceThresholdSelectorModel>
org.apache.spark.ml.feature.VarianceThresholdSelector
- All Implemented Interfaces:
- Serializable, org.apache.spark.internal.Logging, VarianceThresholdSelectorParams, Params, HasFeaturesCol, HasOutputCol, DefaultParamsWritable, Identifiable, MLWritable
public final class VarianceThresholdSelector
extends Estimator<VarianceThresholdSelectorModel>
implements VarianceThresholdSelectorParams, DefaultParamsWritable
Feature selector that removes all low-variance features. Features with a
 (sample) variance not greater than the threshold will be removed. The default is to keep
 all features with non-zero variance, i.e. remove the features that have the
 same value in all samples.
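
The selection rule described above can be illustrated without Spark. The following is a minimal plain-Java sketch, not Spark's implementation: the class name VarianceThresholdSketch and the helpers sampleVariance and selectFeatures are hypothetical names introduced here, and the sketch assumes dense rows of equal length and the unbiased (n - 1) form of the sample variance.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the selection rule; not part of the Spark API.
public class VarianceThresholdSketch {

    // Unbiased sample variance (divides by n - 1) of one feature column.
    public static double sampleVariance(double[] values) {
        int n = values.length;
        if (n < 2) {
            // Assumption for this sketch: fewer than two samples carry no spread.
            return 0.0;
        }
        double mean = 0.0;
        for (double v : values) {
            mean += v;
        }
        mean /= n;
        double sumSquares = 0.0;
        for (double v : values) {
            double d = v - mean;
            sumSquares += d * d;
        }
        return sumSquares / (n - 1);
    }

    // Returns the indices of features whose sample variance is strictly
    // greater than the threshold; all other features are removed.
    public static int[] selectFeatures(double[][] rows, double threshold) {
        if (rows.length == 0) {
            return new int[0];
        }
        int numFeatures = rows[0].length;
        List<Integer> kept = new ArrayList<>();
        for (int j = 0; j < numFeatures; j++) {
            double[] column = new double[rows.length];
            for (int i = 0; i < rows.length; i++) {
                column[i] = rows[i][j];
            }
            if (sampleVariance(column) > threshold) {
                kept.add(j);
            }
        }
        return kept.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        double[][] rows = {
            {1.0, 5.0, 0.0},
            {1.0, 7.0, 1.0},
            {1.0, 9.0, 2.0}
        };
        // Default threshold 0.0: only the constant feature 0 is removed.
        System.out.println(Arrays.toString(selectFeatures(rows, 0.0))); // [1, 2]
        // Threshold 1.0: feature 2 has variance exactly 1.0, so it is removed too.
        System.out.println(Arrays.toString(selectFeatures(rows, 1.0))); // [1]
    }
}
```

With these rows, feature 0 is constant and is dropped even at the default threshold of 0.0. Feature 2 has a sample variance of exactly 1.0, so a threshold of 1.0 removes it as well, because a feature is kept only when its variance is strictly greater than the threshold.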
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging
org.apache.spark.internal.Logging.LogStringContext, org.apache.spark.internal.Logging.SparkShellLoggingFilter

Constructor Summary
Constructors

Method Summary
- copy: Creates a copy of this instance with the same UID and some extra params.
- featuresCol: Param for features column name.
- fit: Fits a model to the input data.
- static VarianceThresholdSelector load
- outputCol: Param for output column name.
- static MLReader<T> read()
- setFeaturesCol(String value)
- setOutputCol(String value)
- setVarianceThreshold(double value)
- transformSchema(StructType schema): Check transform validity and derive the output schema from the input schema.
- uid(): An immutable unique ID for the object and its derivatives.
- final DoubleParam varianceThreshold: Param for variance threshold.

Methods inherited from class org.apache.spark.ml.PipelineStage
params

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface org.apache.spark.ml.util.DefaultParamsWritable
write

Methods inherited from interface org.apache.spark.ml.param.shared.HasFeaturesCol
getFeaturesCol

Methods inherited from interface org.apache.spark.ml.param.shared.HasOutputCol
getOutputCol

Methods inherited from interface org.apache.spark.ml.util.Identifiable
toString

Methods inherited from interface org.apache.spark.internal.Logging
initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logBasedOnLevel, logDebug, logDebug, logDebug, logDebug, logError, logError, logError, logError, logInfo, logInfo, logInfo, logInfo, logName, LogStringContext, logTrace, logTrace, logTrace, logTrace, logWarning, logWarning, logWarning, logWarning, MDC, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq, withLogContext

Methods inherited from interface org.apache.spark.ml.util.MLWritable
save

Methods inherited from interface org.apache.spark.ml.param.Params
clear, copyValues, defaultCopy, defaultParamMap, estimateMatadataSize, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, onParamChange, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn

Methods inherited from interface org.apache.spark.ml.feature.VarianceThresholdSelectorParams
getVarianceThreshold

Constructor Details

VarianceThresholdSelector

VarianceThresholdSelector
public VarianceThresholdSelector()

Method Details

load

read

varianceThreshold
Description copied from interface: VarianceThresholdSelectorParams
Param for variance threshold. Features with a variance not greater than this threshold will be removed. The default value is 0.0.
- Specified by:
- varianceThreshold in interface VarianceThresholdSelectorParams
- Returns:
- (undocumented)

outputCol
Description copied from interface: HasOutputCol
Param for output column name.
- Specified by:
- outputCol in interface HasOutputCol
- Returns:
- (undocumented)

featuresCol
Description copied from interface: HasFeaturesCol
Param for features column name.
- Specified by:
- featuresCol in interface HasFeaturesCol
- Returns:
- (undocumented)

uid
Description copied from interface: Identifiable
An immutable unique ID for the object and its derivatives.
- Specified by:
- uid in interface Identifiable
- Returns:
- (undocumented)

setVarianceThreshold

setFeaturesCol

setOutputCol

fit
Description copied from class: Estimator
Fits a model to the input data.
- Specified by:
- fit in class Estimator<VarianceThresholdSelectorModel>
- Parameters:
- dataset - (undocumented)
- Returns:
- (undocumented)

transformSchema
Description copied from class: PipelineStage
Check transform validity and derive the output schema from the input schema.
We check validity for interactions between parameters during transformSchema and raise an exception if any parameter value is invalid. Parameter value checks which do not depend on other parameters are handled by Param.validate().
A typical implementation should first verify the schema change and parameter validity, including complex parameter interaction checks.
- Specified by:
- transformSchema in class PipelineStage
- Parameters:
- schema - (undocumented)
- Returns:
- (undocumented)

copy
Description copied from interface: Params
Creates a copy of this instance with the same UID and some extra params. Subclasses should implement this method and set the return type properly. See defaultCopy().
- Specified by:
- copy in interface Params
- Specified by:
- copy in class Estimator<VarianceThresholdSelectorModel>
- Parameters:
- extra - (undocumented)
- Returns:
- (undocumented)