org.apache.spark.graphx (Spark 3.5.1 JavaDoc)

package org.apache.spark.graphx

ALPHA COMPONENT: GraphX is a graph processing framework built on top of Spark.

Related Packages

Package

Description

org.apache.spark

Core Spark classes in Scala.

org.apache.spark.graphx.impl

org.apache.spark.graphx.lib

Various analytics functions for graphs.

org.apache.spark.graphx.util

Collections of utilities used by GraphX.
Class

Description

Edge<ED>

A single directed edge consisting of a source id, target id, and the data associated with the edge.

EdgeContext<VD,ED,A>

Represents an edge along with its neighboring vertices and allows sending messages along the edge.

EdgeDirection

The direction of a directed edge relative to a vertex.

EdgeRDD<ED>

EdgeRDD[ED] extends RDD[Edge[ED]] by storing the edges in columnar format on each partition for performance.

EdgeTriplet<VD,ED>

An edge triplet represents an edge along with the vertex attributes of its neighboring vertices.

Graph<VD,ED>

The Graph abstractly represents a graph with arbitrary objects associated with vertices and edges.

GraphLoader

Provides utilities for loading Graphs from files.

GraphOps<VD,ED>

Contains additional functionality for Graph.

GraphXUtils

PartitionStrategy

Represents the way edges are assigned to edge partitions based on their source and destination vertex IDs.

PartitionStrategy.CanonicalRandomVertexCut$

Assigns edges to partitions by hashing the source and destination vertex IDs in a canonical direction, resulting in a random vertex cut that colocates all edges between two vertices, regardless of direction.

PartitionStrategy.EdgePartition1D$

Assigns edges to partitions using only the source vertex ID, colocating edges with the same source.

PartitionStrategy.EdgePartition2D$

Assigns edges to partitions using a 2D partitioning of the sparse edge adjacency matrix, guaranteeing a 2 * sqrt(numParts) bound on vertex replication.

PartitionStrategy.RandomVertexCut$

Assigns edges to partitions by hashing the source and destination vertex IDs, resulting in a random vertex cut that colocates all same-direction edges between two vertices.

Pregel

Implements a Pregel-like bulk-synchronous message-passing API.

TripletFields

Represents a subset of the fields of an EdgeTriplet or EdgeContext.

VertexRDD<VD>

Extends RDD[(VertexId, VD)] by ensuring that there is only one entry for each vertex and by pre-indexing the entries for fast, efficient joins.
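The colocation properties described for the four PartitionStrategy entries above can be compared with a small, self-contained Java sketch. It is not the GraphX implementation: the class name PartitionSketch, its helper methods, and the hash functions are hypothetical stand-ins, and it assumes vertex IDs are plain long values. It only illustrates which edges end up in the same partition under each strategy.

```java
import java.util.Objects;

/** Simplified stand-ins for the four partition strategies; not the GraphX source. */
final class PartitionSketch {
    private PartitionSketch() {}

    // Hypothetical pair hash; GraphX uses its own hashing internally.
    static int hashPair(long a, long b) {
        return Objects.hash(a, b);
    }

    // Non-negative remainder, so negative hashes still map into [0, numParts).
    static int bucket(long h, int numParts) {
        return (int) Math.floorMod(h, (long) numParts);
    }

    // 1D: only the source ID is used (dst is ignored), so edges sharing a source colocate.
    static int edgePartition1D(long src, long dst, int numParts) {
        return bucket(Long.hashCode(src), numParts);
    }

    // Random vertex cut: hash the ordered pair, so same-direction edges colocate.
    static int randomVertexCut(long src, long dst, int numParts) {
        return bucket(hashPair(src, dst), numParts);
    }

    // Canonical variant: order the pair first, so both directions colocate.
    static int canonicalRandomVertexCut(long src, long dst, int numParts) {
        return bucket(hashPair(Math.min(src, dst), Math.max(src, dst)), numParts);
    }

    // 2D: place the edge in a grid cell chosen by (hash of src, hash of dst).
    static int edgePartition2D(long src, long dst, int numParts) {
        int side = (int) Math.ceil(Math.sqrt(numParts));
        int col = bucket(Long.hashCode(src), side);
        int row = bucket(Long.hashCode(dst), side);
        return (col * side + row) % numParts;
    }

    public static void main(String[] args) {
        int numParts = 16;
        System.out.println("1D, same source colocated: "
            + (edgePartition1D(5L, 9L, numParts) == edgePartition1D(5L, 42L, numParts)));
        System.out.println("canonical, both directions colocated: "
            + (canonicalRandomVertexCut(3L, 10L, numParts)
                == canonicalRandomVertexCut(10L, 3L, numParts)));
        System.out.println("2D partition of edge (3, 10): " + edgePartition2D(3L, 10L, numParts));
    }
}
```

In this sketch the 2D variant lays the partitions out as a square grid with ceil(sqrt(numParts)) cells per side. All out-edges of a vertex fall in one grid column and all in-edges in one grid row, which is where a replication bound of roughly 2 * sqrt(numParts) comes from when numParts is a perfect square.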

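The "Pregel-like bulk-synchronous message-passing" model named in the Pregel entry can be illustrated with a toy loop in plain Java. This is a sketch of the general idea only, not the GraphX Pregel API: the class BspSketch, the method shortestPaths, and the array-based graph layout are hypothetical, and edge weights are assumed to be non-negative. Each superstep has a send phase, a min-merge of messages per target vertex, and a vertex update phase; the loop stops once no vertex changes.

```java
import java.util.Arrays;

/** Toy bulk-synchronous loop in the Pregel style; a sketch, not the GraphX Pregel API. */
final class BspSketch {
    private BspSketch() {}

    /**
     * Single-source shortest paths over directed, weighted edges.
     * edges[i] = {src, dst}; weights[i] is the length of edge i (assumed non-negative).
     */
    static double[] shortestPaths(int numVertices, int[][] edges, double[] weights, int source) {
        double[] dist = new double[numVertices];
        Arrays.fill(dist, Double.POSITIVE_INFINITY);
        dist[source] = 0.0;

        // A vertex is active if its value changed in the previous superstep.
        boolean[] active = new boolean[numVertices];
        active[source] = true;
        boolean anyActive = true;

        while (anyActive) {
            // Send phase: active vertices offer dist + weight along their out-edges.
            // Messages addressed to the same vertex are merged with min.
            double[] inbox = new double[numVertices];
            Arrays.fill(inbox, Double.POSITIVE_INFINITY);
            for (int i = 0; i < edges.length; i++) {
                int src = edges[i][0];
                int dst = edges[i][1];
                if (active[src]) {
                    inbox[dst] = Math.min(inbox[dst], dist[src] + weights[i]);
                }
            }

            // Update phase: runs only after every message of this superstep is collected.
            anyActive = false;
            for (int v = 0; v < numVertices; v++) {
                active[v] = inbox[v] < dist[v];
                if (active[v]) {
                    dist[v] = inbox[v];
                    anyActive = true;
                }
            }
        }
        return dist;
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1}, {1, 2}, {0, 2}};
        double[] weights = {1.0, 2.0, 5.0};
        // Vertex 3 has no incoming edges, so it stays unreachable.
        System.out.println(Arrays.toString(shortestPaths(4, edges, weights, 0)));
    }
}
```

The three pieces of the loop (update a vertex from its merged message, send along edges, merge messages for one target) correspond to the three functions a Pregel-style operator is usually parameterized by: a vertex program, a send-message function, and a merge-message function.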