Package org.apache.spark.mllib.feature
Class IDFModel
Object
org.apache.spark.mllib.feature.IDFModel
- All Implemented Interfaces:
 Serializable, scala.Serializable
Represents an IDF model that can transform term frequency vectors.
Method Details
idf
docFreq
public long[] docFreq()
numDocs
public long numDocs()
transform
Transforms term frequency (TF) vectors to TF-IDF vectors. If minDocFreq was set for the IDF calculation, the terms which occur in fewer than minDocFreq documents will have an entry of 0.
- Parameters:
 dataset - an RDD of term frequency vectors
- Returns:
 an RDD of TF-IDF vectors
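
The effect of minDocFreq on the transformed vectors can be illustrated with a small plain-Java sketch. It does not use Spark and is not the library's implementation: the class TfIdfSketch and its methods are hypothetical names, dense double[] arrays stand in for Spark vectors, and the formula idf = log((numDocs + 1) / (docFreq + 1)) is an assumption taken from the documentation of the companion IDF class rather than from this page.

```java
import java.util.Arrays;

// Hypothetical illustration only; not part of Spark.
final class TfIdfSketch {

    // Computes one IDF weight per term.
    // Assumed formula: log((numDocs + 1) / (docFreq[j] + 1)).
    // Terms seen in fewer than minDocFreq documents keep a weight of 0.
    static double[] idf(long[] docFreq, long numDocs, int minDocFreq) {
        double[] weights = new double[docFreq.length];
        for (int j = 0; j < docFreq.length; j++) {
            if (docFreq[j] >= minDocFreq) {
                weights[j] = Math.log((numDocs + 1.0) / (docFreq[j] + 1.0));
            }
        }
        return weights;
    }

    // Scales each term frequency by the IDF weight of the same term.
    static double[] transform(double[] idf, double[] tf) {
        double[] out = new double[tf.length];
        for (int j = 0; j < tf.length; j++) {
            out[j] = tf[j] * idf[j];
        }
        return out;
    }

    public static void main(String[] args) {
        long[] docFreq = {4, 1, 2};   // documents containing each term
        long numDocs = 4;             // corpus size
        double[] weights = idf(docFreq, numDocs, 2);
        double[] tfidf = transform(weights, new double[] {1.0, 3.0, 2.0});
        System.out.println(Arrays.toString(tfidf));
    }
}
```

In this sketch the second term appears in only one document, which is below the minDocFreq of 2, so its TF-IDF entry is 0 regardless of its term frequency. The first term appears in every document, so its weight is log(5/5) = 0 as well.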
 
transform
Transforms a term frequency (TF) vector to a TF-IDF vector.
- Parameters:
 v - a term frequency vector
- Returns:
 a TF-IDF vector
 
transform
Transforms term frequency (TF) vectors to TF-IDF vectors (Java version).
- Parameters:
 dataset - a JavaRDD of term frequency vectors
- Returns:
 a JavaRDD of TF-IDF vectors
 
 