🆕 dgl.graphbolt

dgl.graphbolt is a data loading framework for GNNs that provides well-defined APIs for each stage of the data pipeline, along with multiple standard implementations.

Dataset

A dataset is a collection of graph structure data, feature data and tasks.

Dataset

An abstract dataset that provides an interface for accessing the data required for training.

OnDiskDataset

An on-disk dataset that reads graph topology, feature data and train/validation/test sets from disk.

BuiltinDataset

A utility class to download a built-in dataset from AWS S3 and load it as an OnDiskDataset.

LegacyDataset

A GraphBolt dataset for a legacy DGLDataset.

Task

An abstract task that consists of meta information and train/validation/test sets.

Graph

A graph is a collection of nodes and edges. It can be a homogeneous graph or a heterogeneous graph.

SamplingGraph

Class for a sampling graph.

FusedCSCSamplingGraph

A sampling graph in CSC format.
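To make the CSC layout concrete, here is a minimal sketch that stores a small graph as two plain Python lists. The names `indptr`, `indices` and the helper `in_neighbors` are illustrative assumptions and are not part of the dgl.graphbolt API.

```python
# Illustrative CSC layout for a 4-node graph (plain lists, not GraphBolt objects).
# Edges (src -> dst): 1->0, 2->0, 0->1, 3->2, 0->3, 2->3
indptr = [0, 2, 3, 4, 6]      # indptr[v]:indptr[v + 1] delimits the in-edges of node v
indices = [1, 2, 0, 3, 0, 2]  # source node of each in-edge, grouped by destination


def in_neighbors(indptr, indices, node):
    """Return the source nodes of all edges pointing to `node`."""
    return indices[indptr[node]:indptr[node + 1]]


print(in_neighbors(indptr, indices, 0))  # [1, 2]
print(in_neighbors(indptr, indices, 3))  # [0, 2]
```

Because in-edges are grouped by destination, looking up the neighbors that feed into a seed node is a single slice, which is why this layout suits neighbor sampling.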

Feature and FeatureStore

A feature is a collection of data (tensor, array). A feature store is a collection of features.

Feature

A wrapper of feature data for access.

FeatureStore

A store to manage multiple features for access.

BasicFeatureStore

A basic feature store to manage multiple features for access.

TorchBasedFeature

A wrapper of a PyTorch-based feature.

TorchBasedFeatureStore

A store to manage multiple PyTorch-based features for access.

GPUCachedFeature

A GPU-cached feature wrapping a fallback feature.
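The idea of a cache in front of a fallback feature can be sketched in plain Python. Everything below is hypothetical: the class name `CachedFeatureSketch`, its `read` method, and the first-in-first-out eviction policy are assumptions for illustration, and no GPU is involved.

```python
class CachedFeatureSketch:
    """Serve reads from a small cache; on a miss, ask the fallback (assumed design)."""

    def __init__(self, fallback_read, capacity):
        self._fallback_read = fallback_read  # callable: id -> feature row
        self._capacity = capacity
        self._cache = {}                     # dicts preserve insertion order
        self.misses = 0

    def read(self, ids):
        rows = []
        for i in ids:
            if i not in self._cache:
                self.misses += 1
                if len(self._cache) >= self._capacity:
                    # Evict the oldest inserted entry (FIFO, an assumed policy).
                    self._cache.pop(next(iter(self._cache)))
                self._cache[i] = self._fallback_read(i)
            rows.append(self._cache[i])
        return rows


slow_store = {0: [0.0], 1: [1.0], 2: [2.0]}
feature = CachedFeatureSketch(slow_store.__getitem__, capacity=2)
print(feature.read([0, 1, 0]))  # [[0.0], [1.0], [0.0]]
print(feature.misses)           # 2
```

The third lookup of ID 0 is served from the cache, so only two reads reach the fallback.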

DataLoader

A dataloader iterates over a dataset and generates mini-batches.

DataLoader

Multiprocessing DataLoader.

ItemSet

An item set is an iterable collection of items.

ItemSet

A wrapper of iterable data or tuple of iterable data.

ItemSetDict

Dictionary wrapper of ItemSet.

ItemSampler

An item sampler is for sampling items from an item set.

ItemSampler

A sampler to iterate over input items and create subsets.
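As a rough illustration of "iterate over input items and create subsets", the sketch below batches an iterable in plain Python. The function `iterate_minibatches` and its parameters are assumptions made for this example and do not reproduce the ItemSampler interface.

```python
import random


def iterate_minibatches(items, batch_size, shuffle=False, drop_last=False, rng=None):
    """Yield consecutive subsets (mini-batches) of `items`."""
    items = list(items)
    if shuffle:
        # Use the supplied random.Random instance, or the module-level generator.
        (rng or random).shuffle(items)
    for start in range(0, len(items), batch_size):
        batch = items[start:start + batch_size]
        if drop_last and len(batch) < batch_size:
            break  # skip the final incomplete batch
        yield batch


print(list(iterate_minibatches(range(10), batch_size=4)))
# [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

With `drop_last=True` the trailing `[8, 9]` batch would be omitted, which keeps every batch the same size.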

DistributedItemSampler

A sampler to iterate over input items and create subsets in a distributed manner.

MiniBatch

A mini-batch is a collection of sampled subgraphs and their corresponding features. It is the basic unit for training a GNN model.

MiniBatch

A composite data class for data structures in GraphBolt.

MiniBatchTransformer

A mini-batch transformer used to manipulate mini-batches.

NegativeSampler

A negative sampler is for sampling negative items from mini-batches.

NegativeSampler

A negative sampler used to generate negative samples and return a mix of positive and negative samples.

UniformNegativeSampler

Sample negative destination nodes for each source node based on a uniform distribution.
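Uniform negative sampling can be sketched as follows. The function `sample_uniform_negatives` and its parameters (`num_nodes`, `negative_ratio`, `rng`) are hypothetical names chosen for this example; the sketch does not filter out pairs that happen to be true edges.

```python
import random


def sample_uniform_negatives(src_nodes, num_nodes, negative_ratio, rng):
    """For each source node, draw `negative_ratio` destination nodes uniformly."""
    negatives = []
    for src in src_nodes:
        for _ in range(negative_ratio):
            # Every node ID in [0, num_nodes) is equally likely.
            negatives.append((src, rng.randrange(num_nodes)))
    return negatives


rng = random.Random(0)
pairs = sample_uniform_negatives([0, 1, 2], num_nodes=5, negative_ratio=2, rng=rng)
print(len(pairs))  # 6
```

Each source node contributes exactly `negative_ratio` pairs, so the number of negatives scales linearly with the number of positive source nodes.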

SubgraphSampler

A subgraph sampler is for sampling subgraphs from a graph.

SubgraphSampler

A subgraph sampler used to sample a subgraph from a given set of nodes from a larger graph.

SampledSubgraph

An abstract class for a sampled subgraph.

NeighborSampler

Sample neighbor edges from a graph and return a subgraph.
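A single hop of neighbor sampling over a CSC graph might look like the sketch below. The helper `sample_in_neighbors`, the `fanout` parameter name, and the plain-list graph are assumptions for illustration, not the NeighborSampler implementation.

```python
import random


def sample_in_neighbors(indptr, indices, seeds, fanout, rng):
    """Sample up to `fanout` in-edges per seed node; return (src, dst) pairs."""
    sampled = []
    for dst in seeds:
        neighbors = indices[indptr[dst]:indptr[dst + 1]]
        if len(neighbors) > fanout:
            # Sample without replacement when there are more neighbors than needed.
            neighbors = rng.sample(neighbors, fanout)
        sampled.extend((src, dst) for src in neighbors)
    return sampled


# CSC graph with edges 1->0, 2->0, 0->1, 3->2, 0->3, 2->3
indptr = [0, 2, 3, 4, 6]
indices = [1, 2, 0, 3, 0, 2]
edges = sample_in_neighbors(indptr, indices, seeds=[0, 3], fanout=5, rng=random.Random(0))
print(edges)  # [(1, 0), (2, 0), (0, 3), (2, 3)]
```

With a fanout larger than every in-degree, all in-edges of the seeds are kept; a smaller fanout caps the number of edges per seed.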

LayerNeighborSampler

Sample layer neighbor edges from a graph and return a subgraph.

SampledSubgraphImpl

Sampled subgraph of FusedCSCSamplingGraph.

InSubgraphSampler

Sample the subgraph induced on the inbound edges of the given nodes.
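The subgraph induced on the inbound edges of a node set simply keeps every edge whose destination is one of the given nodes. A minimal sketch over a plain-list CSC graph, with the hypothetical helper `in_subgraph_edges`:

```python
def in_subgraph_edges(indptr, indices, nodes):
    """Return all (src, dst) edges whose destination is in `nodes`."""
    return [
        (src, dst)
        for dst in nodes
        for src in indices[indptr[dst]:indptr[dst + 1]]
    ]


# CSC graph with edges 1->0, 2->0, 0->1, 3->2, 0->3, 2->3
indptr = [0, 2, 3, 4, 6]
indices = [1, 2, 0, 3, 0, 2]
print(in_subgraph_edges(indptr, indices, [1, 3]))  # [(0, 1), (0, 3), (2, 3)]
```

Unlike fanout-based sampling, nothing is dropped here: every inbound edge of the chosen nodes is retained.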

FeatureFetcher

A feature fetcher is for fetching features from a feature store.

FeatureFetcher

A feature fetcher used to fetch features for nodes/edges in GraphBolt.

CopyTo

This datapipe is for copying data to a device.

CopyTo

DataPipe that transfers each element yielded from the previous DataPipe to the given device.

Utilities

fused_csc_sampling_graph

Create a FusedCSCSamplingGraph object from a CSC representation.

load_from_shared_memory

Load a FusedCSCSamplingGraph object from shared memory.

from_dglgraph

Convert a DGLGraph to FusedCSCSamplingGraph.

etype_str_to_tuple

Convert canonical etype from string to tuple.

etype_tuple_to_str

Convert canonical etype from tuple to string.
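The two conversions above are inverses of each other. The sketch below assumes the string form joins the source type, edge type and destination type with `:`; that separator and the `_sketch` function names are assumptions of this example rather than details stated here.

```python
def etype_tuple_to_str_sketch(etype):
    """('user', 'follows', 'user') -> 'user:follows:user' (assumed ':' separator)."""
    src_type, edge_type, dst_type = etype
    return ":".join((src_type, edge_type, dst_type))


def etype_str_to_tuple_sketch(etype):
    """'user:follows:user' -> ('user', 'follows', 'user')."""
    parts = etype.split(":")
    if len(parts) != 3:
        raise ValueError(f"expected 'src:edge:dst', got {etype!r}")
    return tuple(parts)


print(etype_tuple_to_str_sketch(("user", "follows", "user")))  # user:follows:user
print(etype_str_to_tuple_sketch("user:likes:item"))            # ('user', 'likes', 'item')
```

Round-tripping a tuple through both functions returns the original tuple, as long as none of the type names contains the separator.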

isin

Tests if each element of elements is in test_elements.
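The membership test can be illustrated with plain lists. `isin_sketch` is a hypothetical stand-in that returns a list of booleans, where the real function would operate on tensors.

```python
def isin_sketch(elements, test_elements):
    """Return one boolean per entry of `elements`: is it in `test_elements`?"""
    lookup = set(test_elements)  # set membership is O(1) on average
    return [element in lookup for element in elements]


print(isin_sketch([1, 2, 3, 4], [2, 4, 6]))  # [False, True, False, True]
```

The result has the same length as `elements`, so it can serve directly as a mask.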

seed

Set the random seed of GraphBolt.

index_select

Returns a new tensor which indexes the input tensor along dimension dim using the entries in index.
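To show what indexing "along dimension dim" means, the sketch below treats a nested list as a 2-D tensor. `index_select_sketch` and its argument order are assumptions for illustration and only cover two dimensions.

```python
def index_select_sketch(rows, dim, index):
    """Pick entries of a 2-D nested list along `dim` in the order given by `index`."""
    if dim == 0:
        return [list(rows[i]) for i in index]              # select whole rows
    if dim == 1:
        return [[row[i] for i in index] for row in rows]   # select columns
    raise ValueError("this sketch only supports dim 0 or 1")


data = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(index_select_sketch(data, 0, [2, 0]))  # [[7, 8, 9], [1, 2, 3]]
print(index_select_sketch(data, 1, [0, 2]))  # [[1, 3], [4, 6], [7, 9]]
```

Selecting along dimension 0 with a list of node IDs is the typical way to gather feature rows for a mini-batch.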

expand_indptr

Converts a given indptr offset tensor to a COO format tensor.
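Expanding an indptr array means repeating each node ID once per edge it owns, which yields the destination half of a COO edge list. A minimal plain-Python sketch, with the hypothetical name `expand_indptr_sketch`:

```python
def expand_indptr_sketch(indptr):
    """Repeat node v exactly (indptr[v + 1] - indptr[v]) times."""
    expanded = []
    for node in range(len(indptr) - 1):
        degree = indptr[node + 1] - indptr[node]
        expanded.extend([node] * degree)
    return expanded


print(expand_indptr_sketch([0, 2, 3, 4, 6]))  # [0, 0, 1, 2, 3, 3]
```

Pairing the result element-wise with the CSC `indices` array gives one (source, destination) pair per edge.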

add_reverse_edges

This function finds the reverse edges of the given edges and returns the composition of them.
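One way to read "the composition of them" is the original edges followed by their reversed copies. The sketch below makes that assumption explicit; `add_reverse_edges_sketch` is a hypothetical helper that works on (src, dst) tuples in a homogeneous graph.

```python
def add_reverse_edges_sketch(edges):
    """Return the given (src, dst) edges followed by their (dst, src) reverses."""
    edges = list(edges)
    return edges + [(dst, src) for src, dst in edges]


print(add_reverse_edges_sketch([(0, 1), (2, 3)]))
# [(0, 1), (2, 3), (1, 0), (3, 2)]
```

This is commonly used to make a directed edge list behave like an undirected one during message passing.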

exclude_seed_edges

Exclude seed edges with or without their reverse edges from the sampled subgraphs in the minibatch.

compact_csc_format

Relabel the row (source) IDs in the CSC formats into a contiguous range starting from 0 and return the original row node IDs per type.
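Relabeling into a contiguous range can be sketched for a single node type as follows. The helper `compact_rows_sketch` is hypothetical, and the choice to give the destination nodes the first compact IDs is an assumption of this example.

```python
def compact_rows_sketch(indices, dst_nodes):
    """Map source IDs in `indices` to 0..n-1; return (compacted, original_ids).

    `dst_nodes` must be unique; they receive IDs 0..len(dst_nodes)-1 (assumed).
    """
    original_ids = list(dst_nodes)
    new_id = {node: i for i, node in enumerate(original_ids)}
    compacted = []
    for src in indices:
        if src not in new_id:
            # First time this source is seen: assign the next free compact ID.
            new_id[src] = len(original_ids)
            original_ids.append(src)
        compacted.append(new_id[src])
    return compacted, original_ids


compacted, original_ids = compact_rows_sketch([20, 30, 10, 30], dst_nodes=[10, 20])
print(compacted)     # [1, 2, 0, 2]
print(original_ids)  # [10, 20, 30]
```

`original_ids[k]` recovers the original node ID for compact ID `k`, which is what a later feature lookup needs.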

unique_and_compact

Compact a list of node tensors.

unique_and_compact_csc_formats

Compact CSC formats and return unique nodes (per type).