dgl.data

The dgl.data package contains datasets hosted by DGL and also utilities for downloading, processing, saving and loading data from external resources.

Base Class

DGLDataset

The basic DGL dataset for creating graph datasets.

CSVDataset

Dataset class that loads and parses graph data from CSV files.

Node Prediction Datasets

Datasets for node classification/regression tasks

SSTDataset

Stanford Sentiment Treebank dataset.

KarateClubDataset

Karate Club dataset for node classification.

CoraGraphDataset

Cora citation network dataset.

CiteseerGraphDataset

Citeseer citation network dataset.

PubmedGraphDataset

Pubmed citation network dataset.

CoraFullDataset

CORA-Full dataset for node classification task.

AIFBDataset

AIFB dataset for node classification task

MUTAGDataset

MUTAG dataset for node classification task

BGSDataset

BGS dataset for node classification task

AMDataset

AM dataset for node classification task.

AmazonCoBuyComputerDataset

'Computer' part of the AmazonCoBuy dataset for node classification task.

AmazonCoBuyPhotoDataset

'Photo' part of the AmazonCoBuy dataset for node classification task.

CoauthorCSDataset

'Computer Science (CS)' part of the Coauthor dataset for node classification task.

CoauthorPhysicsDataset

'Physics' part of the Coauthor dataset for node classification task.

PPIDataset

Protein-Protein Interaction dataset for inductive node classification

RedditDataset

Reddit dataset for community detection (node classification)

SBMMixtureDataset

Symmetric Stochastic Block Model Mixture

FraudDataset

Fraud node prediction dataset.

FraudYelpDataset

Fraud Yelp Dataset

FraudAmazonDataset

Fraud Amazon Dataset

BAShapeDataset

BA-SHAPES dataset from GNNExplainer: Generating Explanations for Graph Neural Networks

BACommunityDataset

BA-COMMUNITY dataset from GNNExplainer: Generating Explanations for Graph Neural Networks

TreeCycleDataset

TREE-CYCLES dataset from GNNExplainer: Generating Explanations for Graph Neural Networks

TreeGridDataset

TREE-GRIDS dataset from GNNExplainer: Generating Explanations for Graph Neural Networks

WikiCSDataset

Wiki-CS is a Wikipedia-based dataset for node classification from Wiki-CS: A Wikipedia-Based Benchmark for Graph Neural Networks

FlickrDataset

Flickr dataset for node classification from GraphSAINT: Graph Sampling Based Inductive Learning Method

YelpDataset

Yelp dataset for node classification from GraphSAINT: Graph Sampling Based Inductive Learning Method

PATTERNDataset

PATTERN dataset for graph pattern recognition task.

CLUSTERDataset

CLUSTER dataset for semi-supervised clustering task.

ChameleonDataset

Wikipedia page-page network on chameleons from Multi-scale Attributed Node Embedding and later modified by Geom-GCN: Geometric Graph Convolutional Networks

SquirrelDataset

Wikipedia page-page network on squirrels from Multi-scale Attributed Node Embedding and later modified by Geom-GCN: Geometric Graph Convolutional Networks

ActorDataset

Actor-only induced subgraph of the film-director-actor-writer network from Social Influence Analysis in Large-scale Networks <https://dl.acm.org/doi/10.1145/1557019.1557108>, introduced by Geom-GCN: Geometric Graph Convolutional Networks <https://arxiv.org/abs/2002.05287>

CornellDataset

Cornell subset of WebKB, later modified by Geom-GCN: Geometric Graph Convolutional Networks

TexasDataset

Texas subset of WebKB, later modified by Geom-GCN: Geometric Graph Convolutional Networks

WisconsinDataset

Wisconsin subset of WebKB, later modified by Geom-GCN: Geometric Graph Convolutional Networks

RomanEmpireDataset

Roman-empire dataset from the paper A Critical Look at the Evaluation of GNNs under Heterophily: Are We Really Making Progress? <https://arxiv.org/abs/2302.11640>

AmazonRatingsDataset

Amazon-ratings dataset from the paper A Critical Look at the Evaluation of GNNs under Heterophily: Are We Really Making Progress? <https://arxiv.org/abs/2302.11640>

MinesweeperDataset

Minesweeper dataset from the paper A Critical Look at the Evaluation of GNNs under Heterophily: Are We Really Making Progress? <https://arxiv.org/abs/2302.11640>

TolokersDataset

Tolokers dataset from the paper A Critical Look at the Evaluation of GNNs under Heterophily: Are We Really Making Progress? <https://arxiv.org/abs/2302.11640>

QuestionsDataset

Questions dataset from the paper A Critical Look at the Evaluation of GNNs under Heterophily: Are We Really Making Progress? <https://arxiv.org/abs/2302.11640>

Edge Prediction Datasets

Datasets for edge classification/regression and link prediction

MovieLensDataset

MovieLens dataset for edge prediction tasks.

FB15k237Dataset

FB15k237 link prediction dataset.

FB15kDataset

FB15k link prediction dataset.

WN18Dataset

WN18 link prediction dataset.

BitcoinOTCDataset

BitcoinOTC dataset for fraud detection

ICEWS18Dataset

ICEWS18 dataset for temporal graph

GDELTDataset

GDELT dataset for event-based temporal graph

Graph Prediction Datasets

Datasets for graph classification/regression tasks

QM7bDataset

QM7b dataset for graph property prediction (regression)

QM9Dataset

QM9 dataset for graph property prediction (regression)

QM9EdgeDataset

QM9Edge dataset for graph property prediction (regression)

MiniGCDataset

The synthetic graph classification dataset class.

TUDataset

TUDataset contains a collection of graph kernel datasets for graph classification.

LegacyTUDataset

LegacyTUDataset contains a collection of graph kernel datasets for graph classification.

GINDataset

Dataset class for the paper How Powerful Are Graph Neural Networks?

FakeNewsDataset

Fake News Graph Classification dataset.

BA2MotifDataset

BA-2motifs dataset from Parameterized Explainer for Graph Neural Network

ZINCDataset

ZINC dataset for the graph regression task.

MNISTSuperPixelDataset

MNIST superpixel dataset for the graph classification task.

CIFAR10SuperPixelDataset

CIFAR10 superpixel dataset for the graph classification task.

Dataset Adapters

AsNodePredDataset

Repurpose a dataset for a standard semi-supervised transductive node prediction task.

AsLinkPredDataset

Repurpose a dataset for link prediction task.

AsGraphPredDataset

Repurpose a dataset for standard graph property prediction task.

Utilities

utils.get_download_dir

Get the absolute path to the download directory.

utils.download

Download a given URL.

utils.check_sha1

Check whether the sha1 hash of the file content matches the expected hash.
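To make the check concrete, here is a minimal sketch of how such a verification can be done with Python's standard `hashlib`. The helper name `sha1_matches` and the chunk size are illustrative assumptions; this is not DGL's implementation, only the idea behind `utils.check_sha1`.

```python
import hashlib
import os
import tempfile


def sha1_matches(filename, expected_hash, chunk_size=1 << 20):
    """Return True if the SHA-1 digest of the file equals expected_hash.

    Hypothetical helper for illustration. The file is read in chunks so
    that large downloads do not have to fit in memory.
    """
    sha1 = hashlib.sha1()
    with open(filename, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            sha1.update(chunk)
    return sha1.hexdigest() == expected_hash.lower()


# Usage: write a small file and compare against a digest computed in memory.
payload = b"graph data"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    tmp_path = tmp.name

expected = hashlib.sha1(payload).hexdigest()
check_ok = sha1_matches(tmp_path, expected)      # digest agrees
check_bad = sha1_matches(tmp_path, "0" * 40)     # deliberately wrong digest
os.remove(tmp_path)
```

A check like this is typically run right after a download, so that a corrupted or truncated file is detected before it is processed.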

utils.extract_archive

Extract archive file.

utils.split_dataset

Split dataset into training, validation and test set.
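The splitting idea can be sketched on bare indices with the standard library alone. The function `split_indices`, its parameter names, and the 80/10/10 default are assumptions made for this illustration; DGL's own utility operates on dataset objects, and its exact behavior should be checked against its API page.

```python
import random


def split_indices(n, frac_list=(0.8, 0.1, 0.1), shuffle=False, seed=None):
    """Split range(n) into train/val/test index lists by fraction.

    Hypothetical helper: the test part receives whatever remains after
    the train and validation parts have been cut.
    """
    if len(frac_list) != 3 or abs(sum(frac_list) - 1.0) > 1e-8:
        raise ValueError("frac_list must hold three fractions summing to 1")
    indices = list(range(n))
    if shuffle:
        # A seeded generator keeps the shuffled split reproducible.
        random.Random(seed).shuffle(indices)
    n_train = int(n * frac_list[0])
    n_val = int(n * frac_list[1])
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


# Usage: an unshuffled split of ten samples keeps the original order.
train_idx, val_idx, test_idx = split_indices(10)
shuffled = split_indices(10, shuffle=True, seed=42)
```

Shuffling before the cut matters when the dataset is stored in a meaningful order, for example sorted by label.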

utils.load_labels

Load label dict from file

utils.save_info

Save dataset-related information to disk.

utils.load_info

Load dataset related information from disk.
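As an illustration of the save/load round trip, the sketch below persists a small metadata dictionary with `pickle`. The function names `save_info_sketch` and `load_info_sketch` and the choice of `pickle` as the on-disk format are assumptions for this example; they are not taken from DGL's source.

```python
import os
import pickle
import tempfile


def save_info_sketch(path, info):
    """Write a dictionary of dataset metadata to path (hypothetical helper)."""
    with open(path, "wb") as f:
        pickle.dump(info, f)


def load_info_sketch(path):
    """Read back the dictionary written by save_info_sketch."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Usage: round-trip some metadata through a temporary directory.
info = {"num_classes": 7, "split": [0.8, 0.1, 0.1]}
with tempfile.TemporaryDirectory() as tmp_dir:
    info_path = os.path.join(tmp_dir, "info.pkl")
    save_info_sketch(info_path, info)
    restored = load_info_sketch(info_path)
```

Metadata of this kind (number of classes, split ratios, preprocessing flags) is usually stored next to the processed graphs so that a cached dataset can be reloaded without reprocessing.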

utils.add_nodepred_split

Split the given dataset into training, validation and test sets for a transductive node prediction task.
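In a transductive split, all nodes stay in the graph and only boolean masks decide which nodes contribute to training, validation, or testing. The sketch below builds such masks with the standard library; the name `make_split_masks`, the ratio argument, and the use of plain Python lists instead of tensors are assumptions made to keep the example self-contained.

```python
import random


def make_split_masks(num_nodes, ratio=(0.8, 0.1, 0.1), seed=0):
    """Return (train, val, test) boolean masks over num_nodes nodes.

    Hypothetical helper: nodes are permuted at random and then assigned
    to exactly one of the three sets according to ratio.
    """
    perm = list(range(num_nodes))
    random.Random(seed).shuffle(perm)
    n_train = int(num_nodes * ratio[0])
    n_val = int(num_nodes * ratio[1])
    train_mask = [False] * num_nodes
    val_mask = [False] * num_nodes
    test_mask = [False] * num_nodes
    for i in perm[:n_train]:
        train_mask[i] = True
    for i in perm[n_train:n_train + n_val]:
        val_mask[i] = True
    for i in perm[n_train + n_val:]:
        test_mask[i] = True
    return train_mask, val_mask, test_mask


# Usage: split 20 nodes into 16 training, 2 validation and 2 test nodes.
train_mask, val_mask, test_mask = make_split_masks(20)
```

Because the masks are disjoint and cover every node, a loss or metric computed under one mask never sees labels from another set.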

utils.mask_nodes_by_property

Provide the split masks for a node split with distributional shift based on a given node property, as proposed in Evaluating Robustness and Uncertainty of Graph Models Under Structural Distributional Shifts

utils.add_node_property_split

Create a node split with distributional shift based on a given node property, as proposed in Evaluating Robustness and Uncertainty of Graph Models Under Structural Distributional Shifts

utils.Subset

Subset of a dataset at specified indices
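The idea of a subset as an index-based view can be sketched in a few lines. The class name `IndexSubset` is hypothetical and a plain list stands in for a graph dataset; the sketch only shows the indirection, not DGL's actual class.

```python
class IndexSubset:
    """View of a dataset restricted to the given indices (illustrative)."""

    def __init__(self, dataset, indices):
        self.dataset = dataset
        self.indices = list(indices)

    def __getitem__(self, item):
        # Position `item` in the subset maps to indices[item] in the parent.
        return self.dataset[self.indices[item]]

    def __len__(self):
        return len(self.indices)


# Usage: pick three samples, in a custom order, without copying the data.
base = ["g0", "g1", "g2", "g3", "g4"]
subset = IndexSubset(base, [4, 0, 2])
picked = [subset[i] for i in range(len(subset))]
```

Since only indices are stored, several subsets (for example train, validation and test) can share one underlying dataset.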