Class BlockMatrix
- All Implemented Interfaces:
- Serializable, org.apache.spark.internal.Logging, DistributedMatrix
 param:  blocks The RDD of sub-matrix blocks ((blockRowIndex, blockColIndex), sub-matrix) that
               form this distributed matrix. If multiple blocks with the same index exist, the
               results for operations like add and multiply will be unpredictable.
 param:  rowsPerBlock Number of rows that make up each block. The blocks forming the final
                     rows are not required to have the given number of rows.
 param:  colsPerBlock Number of columns that make up each block. The blocks forming the final
                     columns are not required to have the given number of columns.
 param:  nRows Number of rows of this matrix. If the supplied value is less than or equal to zero,
              the number of rows will be calculated when numRows is invoked.
 param:  nCols Number of columns of this matrix. If the supplied value is less than or equal to
              zero, the number of columns will be calculated when numCols is invoked.
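The relationship between nRows, nCols, rowsPerBlock and colsPerBlock can be illustrated with a little arithmetic. The following is a minimal sketch in plain Java, not Spark's implementation; the class and method names (BlockGridSketch, numBlocks, blockSize) are hypothetical and exist only for this example. It assumes the block grid is obtained by ceiling division and that only the final block along each dimension may be smaller.

```java
// Illustrative sketch only: plain Java arithmetic, not Spark code.
final class BlockGridSketch {
    /** Number of blocks needed to cover `total` rows (or columns) with blocks of size `perBlock`. */
    static int numBlocks(long total, int perBlock) {
        if (total < 0 || perBlock <= 0) {
            throw new IllegalArgumentException("total must be >= 0 and perBlock must be > 0");
        }
        return (int) ((total + perBlock - 1) / perBlock); // ceiling division
    }

    /** Size of the block at `blockIndex`; only the final block may be smaller than `perBlock`. */
    static int blockSize(long total, int perBlock, int blockIndex) {
        int blocks = numBlocks(total, perBlock);
        if (blockIndex < 0 || blockIndex >= blocks) {
            throw new IndexOutOfBoundsException("blockIndex " + blockIndex);
        }
        long remaining = total - (long) blockIndex * perBlock;
        return (int) Math.min(perBlock, remaining);
    }

    public static void main(String[] args) {
        // A 10 x 7 matrix with rowsPerBlock = 4 and colsPerBlock = 3 (example values).
        System.out.println("row blocks: " + numBlocks(10, 4));
        System.out.println("col blocks: " + numBlocks(7, 3));
        System.out.println("rows in last row block: " + blockSize(10, 4, 2));
        System.out.println("cols in last col block: " + blockSize(7, 3, 2));
    }
}
```

With these example values the grid is 3 x 3 blocks, and the final row and column blocks are smaller than the nominal block size.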
- 
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging: org.apache.spark.internal.Logging.LogStringContext, org.apache.spark.internal.Logging.SparkShellLoggingFilter
- 
Constructor Summary
- BlockMatrix(RDD<scala.Tuple2<scala.Tuple2<Object, Object>, Matrix>> blocks, int rowsPerBlock, int colsPerBlock): Alternate constructor for BlockMatrix without the input of the number of rows and columns.
- BlockMatrix(RDD<scala.Tuple2<scala.Tuple2<Object, Object>, Matrix>> blocks, int rowsPerBlock, int colsPerBlock, long nRows, long nCols)
- 
Method Summary
- add(BlockMatrix other): Adds the given block matrix other to this block matrix: this + other.
- blocks()
- cache(): Caches the underlying RDD.
- int colsPerBlock()
- multiply(BlockMatrix other)
- multiply(BlockMatrix other, int numMidDimSplits)
- int numColBlocks()
- long numCols(): Gets or computes the number of columns.
- int numRowBlocks()
- long numRows(): Gets or computes the number of rows.
- persist(StorageLevel storageLevel): Persists the underlying RDD with the specified storage level.
- int rowsPerBlock()
- subtract(BlockMatrix other): Subtracts the given block matrix other from this block matrix: this - other.
- toCoordinateMatrix(): Converts to CoordinateMatrix.
- toIndexedRowMatrix(): Converts to IndexedRowMatrix.
- toLocalMatrix(): Collect the distributed matrix on the driver as a DenseMatrix.
- transpose(): Transpose this BlockMatrix.
- void validate(): Validates the block matrix info against the matrix data (blocks) and throws an exception if any error is found.
Methods inherited from class java.lang.Object: equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.spark.internal.Logging: initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logBasedOnLevel, logDebug, logDebug, logDebug, logDebug, logError, logError, logError, logError, logInfo, logInfo, logInfo, logInfo, logName, LogStringContext, logTrace, logTrace, logTrace, logTrace, logWarning, logWarning, logWarning, logWarning, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq, withLogContext
- 
Constructor Details
-
BlockMatrix
- 
BlockMatrix
public BlockMatrix(RDD<scala.Tuple2<scala.Tuple2<Object, Object>, Matrix>> blocks, int rowsPerBlock, int colsPerBlock)
Alternate constructor for BlockMatrix that does not take the number of rows and columns.
- Parameters:
- blocks - The RDD of sub-matrix blocks ((blockRowIndex, blockColIndex), sub-matrix) that form this distributed matrix. If multiple blocks with the same index exist, the results for operations like add and multiply will be unpredictable.
- rowsPerBlock - Number of rows that make up each block. The blocks forming the final rows are not required to have the given number of rows.
- colsPerBlock - Number of columns that make up each block. The blocks forming the final columns are not required to have the given number of columns.
 
 
- 
- 
Method Details
-
add
Adds the given block matrix other to this block matrix: this + other. The matrices must have the same size and matching rowsPerBlock and colsPerBlock values. If either of the blocks being added is an instance of SparseMatrix, the resulting sub-matrix will also be a SparseMatrix, even if it is being added to a DenseMatrix. If two dense matrices are added, the output will also be a DenseMatrix.
- Parameters:
- other - (undocumented)
- Returns:
- (undocumented)
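Block-wise addition can be pictured as adding the sub-matrices that share a (blockRowIndex, blockColIndex) key. The sketch below is plain Java over in-memory maps, not Spark's RDD-based implementation; BlockAddSketch and its methods are hypothetical names. It assumes dense blocks only, and it assumes that a block present on only one side is carried over unchanged (that is, the missing block is treated as zeros).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: dense blocks held in a local map, keyed by (blockRow, blockCol).
final class BlockAddSketch {
    static List<Integer> key(int blockRow, int blockCol) {
        return Arrays.asList(blockRow, blockCol);
    }

    /** Returns a deep copy so the inputs are never modified. */
    static double[][] copy(double[][] m) {
        double[][] out = new double[m.length][];
        for (int i = 0; i < m.length; i++) {
            out[i] = m[i].clone();
        }
        return out;
    }

    /** Adds blocks with equal keys; a block found on one side only is copied as-is (sketch assumption). */
    static Map<List<Integer>, double[][]> add(Map<List<Integer>, double[][]> a,
                                              Map<List<Integer>, double[][]> b) {
        Map<List<Integer>, double[][]> out = new HashMap<>();
        for (Map.Entry<List<Integer>, double[][]> e : a.entrySet()) {
            out.put(e.getKey(), copy(e.getValue()));
        }
        for (Map.Entry<List<Integer>, double[][]> e : b.entrySet()) {
            double[][] acc = out.get(e.getKey());
            double[][] src = e.getValue();
            if (acc == null) {
                out.put(e.getKey(), copy(src));
                continue;
            }
            if (acc.length != src.length || acc[0].length != src[0].length) {
                throw new IllegalArgumentException("block dimensions differ at " + e.getKey());
            }
            for (int i = 0; i < acc.length; i++) {
                for (int j = 0; j < acc[i].length; j++) {
                    acc[i][j] += src[i][j];
                }
            }
        }
        return out;
    }
}
```

The dimension check mirrors the documented requirement that both matrices use the same rowsPerBlock and colsPerBlock values.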
 
- 
blocks
- 
cache
Caches the underlying RDD.
- 
colsPerBlock
public int colsPerBlock()
- 
multiply
Multiplies this BlockMatrix by other, another BlockMatrix, computing this * other. The colsPerBlock of this matrix must equal the rowsPerBlock of other. If other contains SparseMatrix blocks, they will have to be converted to DenseMatrix blocks. The output BlockMatrix will only consist of DenseMatrix blocks. This may cause some performance issues until support for multiplying two sparse matrices is added.
- Parameters:
- other - (undocumented)
- Returns:
- (undocumented)
- Note:
- The behavior of multiply changed in 1.6.0. multiply used to throw an error when there were blocks with duplicate indices. Now, blocks with duplicate indices are added together.
 
- 
multiply
Multiplies this BlockMatrix by other, another BlockMatrix, computing this * other. The colsPerBlock of this matrix must equal the rowsPerBlock of other. If other contains SparseMatrix blocks, they will have to be converted to DenseMatrix blocks. The output BlockMatrix will only consist of DenseMatrix blocks. This may cause some performance issues until support for multiplying two sparse matrices is added. Blocks with duplicate indices are added together.
- Parameters:
- other - Matrix B in A * B = C
- numMidDimSplits - Number of splits to cut on the middle dimension when doing multiplication. For example, when multiplying a Matrix A of size m x n with Matrix B of size n x k, this parameter configures the parallelism to use when grouping the matrices. The parallelism will increase from m x k to m x k x numMidDimSplits, which in some cases also reduces total shuffled data.
- Returns:
- (undocumented)
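To make the block-level arithmetic concrete: each output block C(i, j) is the sum over the middle index k of A(i, k) * B(k, j), which is why the colsPerBlock of the left matrix must match the rowsPerBlock of the right one. The following is a minimal local sketch in plain Java, not Spark's distributed algorithm (it does no partitioning and ignores numMidDimSplits); BlockMultiplySketch and its methods are hypothetical names, and blocks are assumed to be dense.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: local block-by-block product over in-memory maps.
final class BlockMultiplySketch {
    static List<Integer> key(int blockRow, int blockCol) {
        return Arrays.asList(blockRow, blockCol);
    }

    /** Plain dense product of two local blocks. */
    static double[][] multiplyDense(double[][] a, double[][] b) {
        if (a[0].length != b.length) {
            throw new IllegalArgumentException("inner block dimensions differ");
        }
        double[][] c = new double[a.length][b[0].length];
        for (int i = 0; i < a.length; i++) {
            for (int k = 0; k < b.length; k++) {
                for (int j = 0; j < b[0].length; j++) {
                    c[i][j] += a[i][k] * b[k][j];
                }
            }
        }
        return c;
    }

    /** C(i, j) = sum over k of A(i, k) * B(k, j), where i, k, j are block indices. */
    static Map<List<Integer>, double[][]> multiply(Map<List<Integer>, double[][]> a,
                                                   Map<List<Integer>, double[][]> b) {
        Map<List<Integer>, double[][]> out = new HashMap<>();
        for (Map.Entry<List<Integer>, double[][]> ea : a.entrySet()) {
            for (Map.Entry<List<Integer>, double[][]> eb : b.entrySet()) {
                int aCol = ea.getKey().get(1);
                int bRow = eb.getKey().get(0);
                if (aCol != bRow) {
                    continue; // only blocks sharing the middle index contribute
                }
                double[][] product = multiplyDense(ea.getValue(), eb.getValue());
                List<Integer> target = key(ea.getKey().get(0), eb.getKey().get(1));
                double[][] acc = out.get(target);
                if (acc == null) {
                    out.put(target, product);
                } else {
                    for (int i = 0; i < acc.length; i++) {
                        for (int j = 0; j < acc[i].length; j++) {
                            acc[i][j] += product[i][j];
                        }
                    }
                }
            }
        }
        return out;
    }
}
```

The nested loop over all block pairs is quadratic and is only meant to show which blocks meet; a distributed implementation would group blocks by the middle index instead.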
 
- 
numColBlocks
public int numColBlocks()
- 
numCols
public long numCols()
Description copied from interface: DistributedMatrix
Gets or computes the number of columns.
- Specified by:
- numCols in interface DistributedMatrix
 
- 
numRowBlocks
public int numRowBlocks()
- 
numRows
public long numRows()
Description copied from interface: DistributedMatrix
Gets or computes the number of rows.
- Specified by:
- numRows in interface DistributedMatrix
 
- 
persist
Persists the underlying RDD with the specified storage level.
- 
rowsPerBlock
public int rowsPerBlock()
- 
subtract
Subtracts the given block matrix other from this block matrix: this - other. The matrices must have the same size and matching rowsPerBlock and colsPerBlock values. If either of the blocks involved is an instance of SparseMatrix, the resulting sub-matrix will also be a SparseMatrix, even if it is being subtracted from a DenseMatrix. If two dense matrices are subtracted, the output will also be a DenseMatrix.
- Parameters:
- other - (undocumented)
- Returns:
- (undocumented)
 
- 
toCoordinateMatrix
Converts to CoordinateMatrix.
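A coordinate representation addresses each value by its global (row, column) position, so a conversion from blocks has to map block-local positions to global ones. The sketch below shows that index arithmetic in plain Java; it is not Spark's implementation, and BlockToEntriesSketch and its Entry type are hypothetical. As a simplifying assumption it emits one entry per stored cell of a dense block.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: maps block-local cells to global (row, col, value) entries.
final class BlockToEntriesSketch {
    /** Hypothetical entry type for this example. */
    static final class Entry {
        final long row;
        final long col;
        final double value;

        Entry(long row, long col, double value) {
            this.row = row;
            this.col = col;
            this.value = value;
        }
    }

    static List<Entry> entries(int blockRow, int blockCol,
                               int rowsPerBlock, int colsPerBlock, double[][] block) {
        List<Entry> out = new ArrayList<>();
        // Global offset of the block's top-left cell; long arithmetic avoids int overflow.
        long rowOffset = (long) blockRow * rowsPerBlock;
        long colOffset = (long) blockCol * colsPerBlock;
        for (int i = 0; i < block.length; i++) {
            for (int j = 0; j < block[i].length; j++) {
                out.add(new Entry(rowOffset + i, colOffset + j, block[i][j]));
            }
        }
        return out;
    }
}
```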
- 
toIndexedRowMatrix
Converts to IndexedRowMatrix. The number of columns must be within the integer range.
- 
toLocalMatrix
Collects the distributed matrix on the driver as a DenseMatrix.
- Returns:
- (undocumented)
 
- 
transpose
Transposes this BlockMatrix. Returns a new BlockMatrix instance sharing the same underlying data. This is a lazy operation.
- Returns:
- (undocumented)
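At the block level, a transpose amounts to swapping each block's (blockRowIndex, blockColIndex) key and transposing the sub-matrix itself. The sketch below shows this in plain Java with hypothetical names (BlockTransposeSketch). Unlike the operation described above, it is eager and copies the data, so it illustrates only the index mapping, not the lazy, data-sharing behavior.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: eager, copying transpose over an in-memory block map.
final class BlockTransposeSketch {
    static List<Integer> key(int blockRow, int blockCol) {
        return Arrays.asList(blockRow, blockCol);
    }

    /** Transposes one dense block. */
    static double[][] transposeDense(double[][] m) {
        int rows = m.length;
        int cols = rows == 0 ? 0 : m[0].length;
        double[][] t = new double[cols][rows];
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                t[j][i] = m[i][j];
            }
        }
        return t;
    }

    /** Block (r, c) of the input becomes block (c, r) of the output, itself transposed. */
    static Map<List<Integer>, double[][]> transpose(Map<List<Integer>, double[][]> blocks) {
        Map<List<Integer>, double[][]> out = new HashMap<>();
        for (Map.Entry<List<Integer>, double[][]> e : blocks.entrySet()) {
            int r = e.getKey().get(0);
            int c = e.getKey().get(1);
            out.put(key(c, r), transposeDense(e.getValue()));
        }
        return out;
    }
}
```

Note that rowsPerBlock and colsPerBlock swap roles in the transposed matrix.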
 
- 
validate
public void validate()
Validates the block matrix info against the matrix data (blocks) and throws an exception if any error is found.
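The kind of consistency check described here can be sketched as follows. This is plain Java with hypothetical names (BlockValidateSketch), not Spark's validate: the specific checks (index range, duplicate indices, expected block shape) and the use of IllegalStateException are assumptions made for the example. Each block is described only by its index and shape, as {blockRow, blockCol, rows, cols}.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only: checks block descriptors against the declared matrix layout.
final class BlockValidateSketch {
    /** Each descriptor is {blockRow, blockCol, rows, cols}. Throws on the first problem found. */
    static void validate(List<int[]> blocks, int rowsPerBlock, int colsPerBlock,
                         long nRows, long nCols) {
        int numRowBlocks = (int) ((nRows + rowsPerBlock - 1) / rowsPerBlock);
        int numColBlocks = (int) ((nCols + colsPerBlock - 1) / colsPerBlock);
        Set<Long> seen = new HashSet<>();
        for (int[] b : blocks) {
            int r = b[0];
            int c = b[1];
            if (r < 0 || r >= numRowBlocks || c < 0 || c >= numColBlocks) {
                throw new IllegalStateException("block index out of range: (" + r + ", " + c + ")");
            }
            // Encode (r, c) as one number so duplicates can be detected with a set.
            if (!seen.add((long) r * numColBlocks + c)) {
                throw new IllegalStateException("duplicate block index: (" + r + ", " + c + ")");
            }
            // Assumption: interior blocks are full size; final blocks hold the remainder.
            long expectedRows = Math.min(rowsPerBlock, nRows - (long) r * rowsPerBlock);
            long expectedCols = Math.min(colsPerBlock, nCols - (long) c * colsPerBlock);
            if (b[2] != expectedRows || b[3] != expectedCols) {
                throw new IllegalStateException("unexpected block shape at (" + r + ", " + c + ")");
            }
        }
    }
}
```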
 
-