dgl.data.utils.add_nodepred_split

dgl.data.utils.add_nodepred_split(dataset, ratio, ntype=None)

Split the given dataset into training, validation and test sets for a transductive node prediction task.

It adds three node mask arrays, 'train_mask', 'val_mask' and 'test_mask', to each graph in the dataset. Each sample in the dataset must therefore be a DGLGraph.

The split is random. To make the result deterministic, set NumPy's random seed before calling this function:

numpy.random.seed(42)

Parameters
  • dataset (DGLDataset) – The dataset to modify.

  • ratio ((float, float, float)) – Split ratios for training, validation and test sets. Must sum to one.

  • ntype (str, optional) – The node type to add the masks for.
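
To illustrate how the ratio and the NumPy seed interact, here is a minimal NumPy-only sketch of a ratio-based mask split. It is not DGL's implementation: the helper name sketch_split_masks is hypothetical, and both the rounding (truncating with int) and the choice to put any leftover nodes in the test set are assumptions made for this illustration.

```python
import numpy as np


def sketch_split_masks(num_nodes, ratio):
    """Hypothetical helper (not part of DGL): build boolean
    train/val/test masks over ``num_nodes`` nodes from split ratios."""
    if len(ratio) != 3:
        raise ValueError("ratio must contain exactly three values")

    # Shuffle node indices with NumPy's global RNG, which is why
    # seeding NumPy beforehand makes the split reproducible.
    idx = np.arange(num_nodes)
    np.random.shuffle(idx)

    # Assumption: sizes are truncated, and leftover nodes go to test.
    n_train = int(num_nodes * ratio[0])
    n_val = int(num_nodes * ratio[1])
    parts = (
        idx[:n_train],
        idx[n_train:n_train + n_val],
        idx[n_train + n_val:],
    )

    masks = []
    for part in parts:
        mask = np.zeros(num_nodes, dtype=bool)
        mask[part] = True
        masks.append(mask)
    return tuple(masks)


# Seeding before each call yields the same masks both times.
np.random.seed(42)
train_a, val_a, test_a = sketch_split_masks(10, (0.8, 0.1, 0.1))
np.random.seed(42)
train_b, val_b, test_b = sketch_split_masks(10, (0.8, 0.1, 0.1))
```

In this sketch every node ends up in exactly one of the three masks, and repeating the call with the same seed reproduces the same assignment.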

Examples

>>> dataset = dgl.data.AmazonCoBuyComputerDataset()
>>> print('train_mask' in dataset[0].ndata)
False
>>> dgl.data.utils.add_nodepred_split(dataset, [0.8, 0.1, 0.1])
>>> print('train_mask' in dataset[0].ndata)
True