PPIDataset
class dgl.data.PPIDataset(mode='train', raw_dir=None, force_reload=False, verbose=False, transform=None)
Bases: dgl.data.dgl_dataset.DGLBuiltinDataset
Protein-Protein Interaction dataset for inductive node classification.
A toy Protein-Protein Interaction network dataset. The dataset contains 24 graphs, with an average of 2372 nodes per graph. Each node has 50 features and 121 labels. The graphs are split into 20 for training, 2 for validation, and 2 for testing.
Reference: http://snap.stanford.edu/graphsage/
Statistics:
Train examples: 20
Valid examples: 2
Test examples: 2
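The split sizes above can be checked by loading each mode in turn and taking its length. The following is a minimal sketch, assuming DGL is installed and the dataset can be downloaded; `count_graphs` and `main` are hypothetical helpers for illustration, not part of DGL.

```python
def count_graphs(dataset_cls, modes=("train", "valid", "test")):
    """Return {mode: number of graphs} for a dataset class that accepts `mode`.

    Hypothetical helper: it only relies on the class taking a `mode`
    keyword and supporting len().
    """
    return {mode: len(dataset_cls(mode=mode)) for mode in modes}


def main():
    # Imported lazily so count_graphs can be reused without DGL installed.
    from dgl.data import PPIDataset

    # Downloads the raw data on first use (assumes network access).
    print(count_graphs(PPIDataset))
```

Calling `main()` should report counts that match the statistics listed above.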
- Parameters
mode (str) – Must be one of (‘train’, ‘valid’, ‘test’). Default: ‘train’
raw_dir (str) – Directory where the raw input data is downloaded and stored. Default: ~/.dgl/
force_reload (bool) – Whether to reload the dataset. Default: False
verbose (bool) – Whether to print out progress information. Default: False
transform (callable, optional) – A transform that takes in a DGLGraph object and returns a transformed version. The DGLGraph object will be transformed before every access.
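To illustrate the `transform` parameter, the sketch below wraps a graph transform in a small callable that counts how often it is invoked. `CallCounter` and `main` are hypothetical names introduced here; the sketch assumes DGL is installed and uses `dgl.add_self_loop` only as an example transform.

```python
class CallCounter:
    """Hypothetical wrapper: applies `fn` to a graph and counts invocations."""

    def __init__(self, fn=None):
        # Fall back to the identity transform when no function is given.
        self.fn = fn if fn is not None else (lambda graph: graph)
        self.calls = 0

    def __call__(self, graph):
        self.calls += 1
        return self.fn(graph)


def main():
    # Imported lazily so CallCounter can be used without DGL installed.
    import dgl
    from dgl.data import PPIDataset

    counter = CallCounter(dgl.add_self_loop)
    dataset = PPIDataset(mode="valid", transform=counter)

    # Access the same graph twice, then report how often the transform ran.
    _ = dataset[0]
    _ = dataset[0]
    print(counter.calls)
```

Because the transform is described as being applied before every access, the counter is expected to grow with each indexing operation rather than once per graph.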
labels
Node labels
- Type
Tensor
features
Node features
- Type
Tensor
Examples
>>> from dgl.data import PPIDataset
>>> dataset = PPIDataset(mode='valid')
>>> num_classes = dataset.num_classes
>>> for g in dataset:
...     feat = g.ndata['feat']
...     label = g.ndata['label']
...     # your code here
...