Class ColumnarBatch

Object
org.apache.spark.sql.vectorized.ColumnarBatch
All Implemented Interfaces:
AutoCloseable

@DeveloperApi public class ColumnarBatch extends Object implements AutoCloseable
This class wraps multiple ColumnVectors as a row-wise table. It provides a row view of this batch so that Spark can access the data row by row. An instance is meant to be reused during the entire data loading process. A data source may extend this class with customized logic.
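The reuse pattern described above can be sketched in plain Java. This is only a model under stated assumptions: `ReusableBatch` and `BatchReuseSketch` are hypothetical stand-ins written for this illustration, not Spark classes, and plain `int[]` arrays take the place of ColumnVectors. The sketch shows one batch being refilled for every chunk, `setNumRows` marking how many entries are valid, and `close()` running once at the end.

```java
// Hypothetical stand-in for illustration only; this is NOT Spark's ColumnarBatch.
final class ReusableBatch implements AutoCloseable {
    private final int[][] columns;  // one preallocated array per column
    private int numRows;            // how many leading entries are currently valid
    private boolean closed;

    ReusableBatch(int numCols, int capacity) {
        this.columns = new int[numCols][capacity];
    }

    int[] column(int ordinal) { return columns[ordinal]; }
    int numRows() { return numRows; }
    void setNumRows(int numRows) { this.numRows = numRows; }
    boolean isClosed() { return closed; }

    @Override
    public void close() { closed = true; }  // a real batch would release column memory here
}

final class BatchReuseSketch {
    // Loads `source` in chunks of `capacity`, refilling the same batch each time,
    // and returns the sum of everything read.
    static long loadAndSum(int[] source, int capacity) {
        long total = 0;
        try (ReusableBatch batch = new ReusableBatch(1, capacity)) {
            for (int offset = 0; offset < source.length; offset += capacity) {
                int n = Math.min(capacity, source.length - offset);
                System.arraycopy(source, offset, batch.column(0), 0, n);
                batch.setNumRows(n);  // only the first n entries are valid now
                for (int i = 0; i < batch.numRows(); i++) {
                    total += batch.column(0)[i];
                }
            }
        }  // try-with-resources calls close() here
        return total;
    }

    public static void main(String[] args) {
        System.out.println(loadAndSum(new int[] {1, 2, 3, 4, 5}, 2));  // prints 15
    }
}
```

In the last chunk the backing array still holds a stale value from the previous fill. Only the row count keeps it out of the result, which is why the count has to be updated after each refill in this model.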
  • Constructor Summary

    Constructors
    Constructor
    Description
    ColumnarBatch(ColumnVector[] columns)
    ColumnarBatch(ColumnVector[] columns, int numRows)
    Create a new batch from existing column vectors.
  • Method Summary

    Modifier and Type
    Method
    Description
    void
    close()
    Called to close all the columns in this batch.
    ColumnVector
    column(int ordinal)
    Returns the column at `ordinal`.
    org.apache.spark.sql.catalyst.InternalRow
    getRow(int rowId)
    Returns the row in this batch at `rowId`.
    int
    numCols()
    Returns the number of columns that make up this batch.
    int
    numRows()
    Returns the number of rows available to read, including filtered rows.
    Iterator<org.apache.spark.sql.catalyst.InternalRow>
    rowIterator()
    Returns an iterator over the rows in this batch.
    void
    setNumRows(int numRows)
    Sets the number of rows in this batch.

    Methods inherited from class java.lang.Object

    equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • ColumnarBatch

      public ColumnarBatch(ColumnVector[] columns)
    • ColumnarBatch

      public ColumnarBatch(ColumnVector[] columns, int numRows)
      Create a new batch from existing column vectors.
      Parameters:
      columns - The columns of this batch
      numRows - The number of rows in this batch
  • Method Details

    • close

      public void close()
      Called to close all the columns in this batch. It is not valid to access the data after calling this. This must be called at the end to clean up memory allocations.
      Specified by:
      close in interface AutoCloseable
    • rowIterator

      public Iterator<org.apache.spark.sql.catalyst.InternalRow> rowIterator()
      Returns an iterator over the rows in this batch.
    • setNumRows

      public void setNumRows(int numRows)
      Sets the number of rows in this batch.
    • numCols

      public int numCols()
      Returns the number of columns that make up this batch.
    • numRows

      public int numRows()
      Returns the number of rows available to read, including filtered rows.
    • column

      public ColumnVector column(int ordinal)
      Returns the column at `ordinal`.
    • getRow

      public org.apache.spark.sql.catalyst.InternalRow getRow(int rowId)
      Returns the row in this batch at `rowId`. The returned row is reused across calls.
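
Because the returned row is reused across calls, holding on to the reference does not preserve the row's values. The plain-Java model below illustrates the pitfall and one way around it. `SketchRow`, `SketchBatch` and `RowReuseSketch` are hypothetical stand-ins invented for this sketch, not Spark classes. With Spark's own InternalRow, the analogous step would be to call `copy()` before requesting the next row.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT Spark classes.
final class SketchRow {
    private final int[][] columns;
    int rowId;  // the batch repositions this on every getRow call

    SketchRow(int[][] columns) { this.columns = columns; }

    int getInt(int ordinal) { return columns[ordinal][rowId]; }

    // Snapshot of the current values, independent of later repositioning.
    int[] copy() {
        int[] values = new int[columns.length];
        for (int c = 0; c < columns.length; c++) values[c] = columns[c][rowId];
        return values;
    }
}

final class SketchBatch {
    private final SketchRow row;  // single row object shared by all getRow calls
    private final int numRows;

    SketchBatch(int[][] columns, int numRows) {
        this.row = new SketchRow(columns);
        this.numRows = numRows;
    }

    int numRows() { return numRows; }

    SketchRow getRow(int rowId) {
        row.rowId = rowId;  // reposition and hand back the same instance
        return row;
    }
}

final class RowReuseSketch {
    static SketchBatch sampleBatch() {
        return new SketchBatch(new int[][] {{10, 20, 30}}, 3);
    }

    // Pitfall: every list element is the same object, positioned at the last row.
    static List<Integer> retainedReferences() {
        SketchBatch batch = sampleBatch();
        List<SketchRow> kept = new ArrayList<>();
        for (int i = 0; i < batch.numRows(); i++) kept.add(batch.getRow(i));
        List<Integer> seen = new ArrayList<>();
        for (SketchRow r : kept) seen.add(r.getInt(0));
        return seen;
    }

    // Remedy: copy the values out before asking for the next row.
    static List<Integer> copiedValues() {
        SketchBatch batch = sampleBatch();
        List<int[]> kept = new ArrayList<>();
        for (int i = 0; i < batch.numRows(); i++) kept.add(batch.getRow(i).copy());
        List<Integer> seen = new ArrayList<>();
        for (int[] values : kept) seen.add(values[0]);
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(retainedReferences());  // prints [30, 30, 30]
        System.out.println(copiedValues());        // prints [10, 20, 30]
    }
}
```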