BuiltinDataset

class dgl.graphbolt.BuiltinDataset(name: str, root: str = 'datasets')

Bases: OnDiskDataset

A utility class that downloads a built-in dataset from AWS S3 and loads it as an OnDiskDataset.

Available built-in datasets include:

cora

The cora dataset is a homogeneous citation network designed for the node classification task.

ogbn-mag

The ogbn-mag dataset is a heterogeneous network composed of a subset of the Microsoft Academic Graph (MAG). See more details in ogbn-mag.

Note

Reverse edges are added to the original graph and duplicated edges are removed.

ogbl-citation2

The ogbl-citation2 dataset is a directed graph, representing the citation network between a subset of papers extracted from MAG. See more details in ogbl-citation2.

Note

Reverse edges are added to the original graph and duplicated edges are removed.

ogbn-arxiv

The ogbn-arxiv dataset is a directed graph, representing the citation network between all Computer Science (CS) arXiv papers indexed by MAG. See more details in ogbn-arxiv.

Note

Reverse edges are added to the original graph and duplicated edges are removed.

ogbn-papers100M

The ogbn-papers100M dataset is a directed citation graph of 111 million papers indexed by MAG. See more details in ogbn-papers100M.

Note

Reverse edges are added to the original graph and duplicated edges are removed.

ogbn-products

The ogbn-products dataset is an undirected and unweighted graph, representing an Amazon product co-purchasing network. See more details in ogbn-products.

Note

Reverse edges are added to the original graph. Node features are stored as float32.

ogb-lsc-mag240m

The ogb-lsc-mag240m dataset is a heterogeneous academic graph extracted from the Microsoft Academic Graph (MAG). See more details in ogb-lsc-mag240m.

Note

Reverse edges are added to the original graph.
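Several of the notes above say that reverse edges are added and duplicated edges are removed. As a rough illustration of what that preprocessing means for an edge list, here is a minimal sketch in plain Python. The function name `add_reverse_and_dedup` is hypothetical and this is not the code DGL uses to prepare the datasets; it only shows the effect on a small example.

```python
def add_reverse_and_dedup(edges):
    """Return the edges plus their reverses, keeping the first copy of each.

    Illustrative only: a hypothetical helper, not DGL's preprocessing code.
    """
    seen = set()
    result = []
    for src, dst in edges:
        # Consider the original edge and its reverse in turn.
        for edge in ((src, dst), (dst, src)):
            if edge not in seen:
                seen.add(edge)
                result.append(edge)
    return result


# A tiny directed edge list containing a repeated edge and an existing reverse.
edges = [(0, 1), (1, 2), (1, 0), (0, 1)]
print(add_reverse_and_dedup(edges))
# prints [(0, 1), (1, 0), (1, 2), (2, 1)]
```

After this step every edge has a matching reverse edge and no (source, destination) pair appears twice.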

Parameters:
  • name (str) – The name of the built-in dataset.

  • root (str, optional) – The root directory of the dataset. Defaults to 'datasets'.
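The following is a minimal usage sketch, assuming DGL with GraphBolt is installed and network access to the download location is available. The wrapper `load_builtin` and the tuple `BUILTIN_NAMES` are hypothetical names introduced for illustration; the names in the tuple are copied from the list above and may not match every DGL release. The sketch also assumes the `load()` method inherited from OnDiskDataset is called before the dataset is used.

```python
# Dataset names copied from the list in this page (an assumption; the set of
# names accepted by a given DGL release may differ).
BUILTIN_NAMES = (
    "cora",
    "ogbn-mag",
    "ogbl-citation2",
    "ogbn-arxiv",
    "ogbn-papers100M",
    "ogbn-products",
    "ogb-lsc-mag240m",
)


def load_builtin(name: str, root: str = "datasets"):
    """Hypothetical helper: check the name, then download and load the dataset."""
    if name not in BUILTIN_NAMES:
        raise ValueError(
            f"Unknown built-in dataset {name!r}; expected one of {BUILTIN_NAMES}"
        )
    # Imported lazily so the name check works even without DGL installed.
    import dgl.graphbolt as gb

    # BuiltinDataset fetches the files into `root`; load() (inherited from
    # OnDiskDataset) then reads them from disk.
    return gb.BuiltinDataset(name, root=root).load()
```

With this in place, `dataset = load_builtin("cora")` would download cora into the `datasets` directory on first use, after which `dataset.graph`, `dataset.feature`, and `dataset.tasks` are available as on any OnDiskDataset.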