BAShapeDataset

class dgl.data.BAShapeDataset(num_base_nodes=300, num_base_edges_per_node=5, num_motifs=80, perturb_ratio=0.01, seed=None, raw_dir=None, force_reload=False, verbose=True, transform=None)

Bases: dgl.data.dgl_dataset.DGLBuiltinDataset

BA-SHAPES dataset from the paper "GNNExplainer: Generating Explanations for Graph Neural Networks".

This is a synthetic dataset for node classification. It is generated by performing the following steps in order.

  • Construct a base Barabási–Albert (BA) graph.

  • Construct a set of five-node house-structured network motifs.

  • Attach the motifs to randomly selected nodes of the base graph.

  • Perturb the graph by adding random edges.

  • Assign each node to one of 4 classes. Nodes with label 0 belong to the base BA graph. Nodes with labels 1, 2, and 3 are at the middle, bottom, and top of a house, respectively.

  • Assign every node the same constant feature, with value 1.
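
The steps above can be sketched in plain Python. This is an illustrative approximation, not the DGL implementation: the function names `ba_edges` and `ba_shapes_sketch` are hypothetical, the choice of which house node is joined to the base graph is an assumption, and the number of perturbation edges is assumed to be the product of `perturb_ratio` and the edge count, truncated to an integer.

```python
import random


def ba_edges(num_nodes, m, rng):
    """Preferential attachment: each new node links to m distinct existing nodes."""
    edges = []
    targets = list(range(m))  # the first new node connects to nodes 0..m-1
    repeated = []             # every node appears once per incident edge
    for new in range(m, num_nodes):
        edges.extend((t, new) for t in targets)
        repeated.extend(targets)
        repeated.extend([new] * m)
        # Drawing from `repeated` picks nodes with probability proportional to degree.
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(repeated))
        targets = list(chosen)
    return edges


def ba_shapes_sketch(num_base_nodes=300, m=5, num_motifs=80,
                     perturb_ratio=0.01, seed=0):
    rng = random.Random(seed)
    # Step 1: base BA graph; all base nodes get label 0.
    edges = set(ba_edges(num_base_nodes, m, rng))
    labels = [0] * num_base_nodes
    n = num_base_nodes
    for _ in range(num_motifs):
        # Step 2: a house with middle nodes n, n+1, bottom nodes n+2, n+3, top node n+4.
        edges.update([(n, n + 1), (n + 1, n + 2), (n + 2, n + 3),
                      (n, n + 3), (n, n + 4), (n + 1, n + 4)])
        labels.extend([1, 1, 2, 2, 3])
        # Step 3: attach the house to a random base node (assumed: via middle node n).
        edges.add((rng.randrange(num_base_nodes), n))
        n += 5
    # Step 4: add random edges that are neither self-loops nor duplicates.
    num_random = int(len(edges) * perturb_ratio)
    added = 0
    while added < num_random:
        u, v = rng.sample(range(n), 2)
        edge = (min(u, v), max(u, v))
        if edge not in edges:
            edges.add(edge)
            added += 1
    # Final step: every node gets the constant feature 1.
    feats = [1.0] * n
    return edges, labels, feats
```

Edges are stored here as undirected pairs with the smaller node ID first, which keeps duplicate detection simple; a DGL graph would represent each of them as two directed edges.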

Parameters
  • num_base_nodes (int, optional) – Number of nodes in the base BA graph. Default: 300

  • num_base_edges_per_node (int, optional) – Number of edges to attach from a new node to existing nodes in constructing the base BA graph. Default: 5

  • num_motifs (int, optional) – Number of house-structured network motifs to use. Default: 80

  • perturb_ratio (float, optional) – Number of random edges to add in perturbation divided by the number of edges in the original graph. Default: 0.01

  • seed (integer, random_state, or None, optional) – Indicator of random number generation state. Default: None

  • raw_dir (str, optional) – Raw file directory to store the processed data. Default: ~/.dgl/

  • force_reload (bool, optional) – Whether to always generate the data from scratch rather than load a cached version. Default: False

  • verbose (bool, optional) – Whether to print progress information. Default: True

  • transform (callable, optional) – A transform that takes in a DGLGraph object and returns a transformed version. The DGLGraph object will be transformed before every access. Default: None
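
As a quick sanity check on how the size parameters interact, the expected node and label counts follow directly from `num_base_nodes` and `num_motifs`. The helper below is a hypothetical illustration, not part of DGL, and assumes each house contributes two middle nodes, two bottom nodes, and one top node.

```python
def expected_sizes(num_base_nodes=300, num_motifs=80):
    """Node and per-label counts implied by the dataset parameters."""
    nodes_per_house = 5  # 2 middle + 2 bottom + 1 top (assumed house layout)
    return {
        "num_nodes": num_base_nodes + nodes_per_house * num_motifs,
        "label_counts": {
            0: num_base_nodes,  # base BA graph
            1: 2 * num_motifs,  # middle of houses
            2: 2 * num_motifs,  # bottom of houses
            3: num_motifs,      # top of houses
        },
    }


sizes = expected_sizes()
print(sizes["num_nodes"])     # 700 with the default parameters
print(sizes["label_counts"])  # {0: 300, 1: 160, 2: 160, 3: 80}
```

With the defaults, fewer than half of the nodes carry label 0, so the classes are imbalanced but not dominated by the base graph.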

num_classes (int) – Number of node classes.

Examples

>>> from dgl.data import BAShapeDataset
>>> dataset = BAShapeDataset()
>>> dataset.num_classes
4
>>> g = dataset[0]
>>> label = g.ndata['label']
>>> feat = g.ndata['feat']

__getitem__(idx)

Gets the data object at the given index.

__len__()

The number of examples in the dataset.