Package org.apache.spark.mllib.feature


  • Class
    Description
    Creates a chi-squared feature selector.
    Chi-squared selector model.
    Outputs the Hadamard product (i.e., the element-wise product) of each input vector with a provided "weight" vector.
    Maps a sequence of terms to their term frequencies using the hashing trick.
    Inverse document frequency (IDF).
    Document frequency aggregator.
    Represents an IDF model that can transform term frequency vectors.
    Normalizes each sample individually to unit L^p norm.
    A feature transformer that projects vectors to a low-dimensional space using PCA.
    Model fitted by PCA that can project vectors to a low-dimensional space.
    Standardizes features by removing the mean and scaling to unit standard deviation, using column summary statistics computed on the samples in the training set.
    Represents a StandardScaler model that can transform vectors.
    Trait for transforming a vector.
    Entry in the vocabulary.
    Word2Vec creates vector representations of words in a text corpus.
    Word2Vec model. Parameters: wordIndex maps each word to an index, which is used to retrieve the corresponding vector from wordVectors; wordVectors is an array of length numWords * vectorSize, in which the vector for the word with index i is the slice [i * vectorSize, i * vectorSize + vectorSize).
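
The flat wordVectors layout described for the Word2Vec model can be illustrated with a minimal sketch. It is plain Python rather than the Spark API: the function name word_vector and the toy data are hypothetical, and only the slice arithmetic is taken from the description above.

```python
from typing import Dict, List, Sequence


def word_vector(word: str,
                word_index: Dict[str, int],
                word_vectors: Sequence[float],
                vector_size: int) -> List[float]:
    """Return the vector for `word` from a flat array of numWords * vectorSize values."""
    i = word_index[word]        # index assigned to the word
    start = i * vector_size     # first element of the word's slice
    # Half-open slice [i * vectorSize, i * vectorSize + vectorSize)
    return list(word_vectors[start:start + vector_size])


# Hypothetical toy model: 3 words with vectorSize = 2, so the flat array holds 3 * 2 values.
word_index = {"spark": 0, "mllib": 1, "vector": 2}
word_vectors = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]

print(word_vector("mllib", word_index, word_vectors, 2))  # [0.3, 0.4]
```

Storing all vectors in one flat array keeps the model in a single contiguous buffer; the index map is the only structure needed to locate a word's vector.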