org.apache.spark.mllib.stat.correlation

## Interface Correlation

`public interface Correlation`
Trait for correlation algorithms.
### Method Summary

| Modifier and Type | Method | Description |
|---|---|---|
| `double` | `computeCorrelation(RDD<Object> x, RDD<Object> y)` | Compute correlation for two datasets. |
| `Matrix` | `computeCorrelationMatrix(RDD<Vector> X)` | Compute the correlation matrix S for the input matrix, where S(i, j) is the correlation between columns i and j. |
| `double` | `computeCorrelationWithMatrixImpl(RDD<Object> x, RDD<Object> y)` | Combine the two input RDD[Double]s into an RDD[Vector] and compute the correlation using the correlation implementation for RDD[Vector]. |
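### Method Detail

#### computeCorrelation

`double computeCorrelation(RDD<Object> x, RDD<Object> y)`
Compute correlation for two datasets.
Parameters:
`x` - the first dataset (an `RDD[Double]` in the Scala source, shown here as `RDD<Object>`)
`y` - the second dataset, paired element by element with `x`
Returns:
the correlation between `x` and `y`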
#### computeCorrelationMatrix

`Matrix computeCorrelationMatrix(RDD<Vector> X)`
Compute the correlation matrix S for the input matrix, where S(i, j) is the correlation between columns i and j. S(i, j) can be NaN if the correlation is undefined for columns i and j.
Parameters:
`X` - the input matrix, given as an RDD of row vectors
Returns:
the correlation matrix S
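The definition of S(i, j) and the NaN case can be illustrated without a cluster. The following is a minimal local sketch, assuming Pearson correlation as the measure and plain `double[][]` rows in place of `RDD<Vector>`; the class and method names are hypothetical and this is not Spark's implementation. A constant column has zero variance, so its correlation with any other column evaluates to `0.0 / 0.0`, which is NaN. How a real implementation fills the diagonal entry for such a column is not stated on this page; the sketch simply leaves it as NaN.

```java
import java.util.Arrays;

/** Illustrative only: a local, non-distributed stand-in, not Spark's implementation. */
final class CorrelationMatrixSketch {

    /**
     * Pearson correlation matrix of the columns of rows.
     * Each inner array is one observation; assumes at least one row and equal row lengths.
     */
    static double[][] correlationMatrix(double[][] rows) {
        int n = rows.length;
        int d = rows[0].length;

        // Column means: sum first, then divide once.
        double[] mean = new double[d];
        for (double[] row : rows) {
            for (int j = 0; j < d; j++) mean[j] += row[j];
        }
        for (int j = 0; j < d; j++) mean[j] /= n;

        // Sums of centered cross-products for every pair of columns.
        double[][] cov = new double[d][d];
        for (double[] row : rows) {
            for (int i = 0; i < d; i++) {
                for (int j = 0; j < d; j++) {
                    cov[i][j] += (row[i] - mean[i]) * (row[j] - mean[j]);
                }
            }
        }

        // Normalize. A zero-variance column yields 0.0 / 0.0, which is NaN.
        double[][] s = new double[d][d];
        for (int i = 0; i < d; i++) {
            for (int j = 0; j < d; j++) {
                s[i][j] = cov[i][j] / Math.sqrt(cov[i][i] * cov[j][j]);
            }
        }
        return s;
    }

    public static void main(String[] args) {
        // Column 1 is twice column 0; column 2 is constant.
        double[][] rows = {
            {1.0, 2.0, 5.0},
            {2.0, 4.0, 5.0},
            {3.0, 6.0, 5.0}
        };
        for (double[] r : correlationMatrix(rows)) {
            System.out.println(Arrays.toString(r));
        }
    }
}
```

In the sample data, columns 0 and 1 are perfectly linearly related, while every entry involving the constant column 2 is undefined.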
#### computeCorrelationWithMatrixImpl

`double computeCorrelationWithMatrixImpl(RDD<Object> x, RDD<Object> y)`
Combine the two input RDD[Double]s into an RDD[Vector] and compute the correlation using the correlation implementation for RDD[Vector]. Can be NaN if the correlation is undefined for the input vectors.
Parameters:
`x` - the first input, an RDD[Double] (shown here as `RDD<Object>`)
`y` - the second input, an RDD[Double]
Returns:
the correlation between `x` and `y`, or NaN if it is undefined
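The combine-then-delegate idea described above can be sketched locally: pair `x[k]` with `y[k]` as a two-element row, hand the rows to a matrix routine, and read the off-diagonal entry S(0, 1). This is only an illustration under stated assumptions: arrays replace RDDs, the measure is assumed to be Pearson, and `PairwiseViaMatrixSketch`, `correlationMatrix`, and `correlationWithMatrixImpl` are hypothetical names, not Spark API.

```java
/** Illustrative only: local arrays stand in for RDD[Double] and RDD[Vector]. */
final class PairwiseViaMatrixSketch {

    /** Hypothetical matrix routine: Pearson correlation between the columns of rows. */
    static double[][] correlationMatrix(double[][] rows) {
        int n = rows.length;
        int d = rows[0].length;
        double[] mean = new double[d];
        for (double[] row : rows) {
            for (int j = 0; j < d; j++) mean[j] += row[j];
        }
        for (int j = 0; j < d; j++) mean[j] /= n;

        double[][] cov = new double[d][d];
        for (double[] row : rows) {
            for (int i = 0; i < d; i++) {
                for (int j = 0; j < d; j++) {
                    cov[i][j] += (row[i] - mean[i]) * (row[j] - mean[j]);
                }
            }
        }
        double[][] s = new double[d][d];
        for (int i = 0; i < d; i++) {
            for (int j = 0; j < d; j++) {
                // 0.0 / 0.0 is NaN when either column has zero variance.
                s[i][j] = cov[i][j] / Math.sqrt(cov[i][i] * cov[j][j]);
            }
        }
        return s;
    }

    /** Zips x and y into two-column rows, then reads S(0, 1) from the matrix routine. */
    static double correlationWithMatrixImpl(double[] x, double[] y) {
        if (x.length != y.length || x.length == 0) {
            throw new IllegalArgumentException("x and y must be non-empty and of equal length");
        }
        double[][] rows = new double[x.length][];
        for (int k = 0; k < x.length; k++) {
            rows[k] = new double[] {x[k], y[k]};
        }
        return correlationMatrix(rows)[0][1];
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0, 3.0, 4.0};
        double[] y = {8.0, 6.0, 4.0, 2.0};
        System.out.println(correlationWithMatrixImpl(x, y));
    }
}
```

Routing the pairwise case through the matrix routine means an algorithm only has to be written once, at the cost of building a 2 x 2 matrix to read a single entry.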