AsLinkPredDataset

class dgl.data.AsLinkPredDataset(dataset, split_ratio=None, neg_ratio=3, **kwargs)[source]

Bases: dgl.data.dgl_dataset.DGLDataset

Repurpose a dataset for the link prediction task.

The created dataset includes the data needed for link prediction. Only homogeneous graphs are currently supported. It keeps only the first graph in the provided dataset, generates train/val/test edges according to the given split ratio, and generates the corresponding negative edges based on neg_ratio. The generated edges are cached to disk for fast reloading. If the provided split ratio differs from the cached one, the dataset is re-processed.
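To make the split step concrete, here is a minimal pure-Python sketch of cutting a set of edge IDs into train/val/test subsets by ratio. It is an illustration only, not DGL's internal code: the helper name split_edges, the seed argument, and the choice to give rounding leftovers to the test set are all assumptions.

```python
import random


def split_edges(num_edges, split_ratio, seed=0):
    """Shuffle edge IDs and cut them into train/val/test lists (hypothetical helper)."""
    train_r, val_r, test_r = split_ratio
    if abs(train_r + val_r + test_r - 1.0) > 1e-6:
        raise ValueError("split_ratio must sum to one")

    ids = list(range(num_edges))
    random.Random(seed).shuffle(ids)  # seeded so the split is reproducible

    n_train = int(num_edges * train_r)
    n_val = int(num_edges * val_r)
    # Assumption: the test set takes whatever remains, so rounding drops no edge.
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]


train_ids, val_ids, test_ids = split_edges(1000, (0.8, 0.1, 0.1))
print(len(train_ids), len(val_ids), len(test_ids))
```

The three lists are disjoint and together cover every edge ID exactly once.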

Parameters
  • dataset (DGLDataset) – The dataset to be converted.

  • split_ratio ((float, float, float), optional) – Split ratios for training, validation and test sets. Must sum to one.

  • neg_ratio (int, optional) – Controls how many negative samples are drawn. The number of negative samples will be less than or equal to neg_ratio * num_positive_edges.
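One way the "less than or equal to" bound on negative samples can arise is sketched below in pure Python: candidates are drawn up to a budget of neg_ratio * num_positive_edges, and any candidate that collides with a real edge or repeats an earlier draw is discarded rather than replaced. This is not DGL's sampler; the function sample_negative_edges and the rule of rejecting self-loops are assumptions made for illustration.

```python
import random


def sample_negative_edges(num_nodes, positive_edges, neg_ratio=3, seed=0):
    """Draw up to neg_ratio * len(positive_edges) non-edges (hypothetical helper)."""
    existing = set(positive_edges)
    rng = random.Random(seed)
    budget = neg_ratio * len(positive_edges)

    negatives = set()
    for _ in range(budget):
        u, v = rng.randrange(num_nodes), rng.randrange(num_nodes)
        # Rejected draws are not retried, so the result may be smaller than the budget.
        if u != v and (u, v) not in existing:
            negatives.add((u, v))
    return sorted(negatives)


pos = [(0, 1), (1, 2), (2, 3)]
neg = sample_negative_edges(num_nodes=5, positive_edges=pos, neg_ratio=3)
print(len(neg) <= 3 * len(pos))
```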

feat_size

The size of the feature dimension in the graph

Type

int

train_graph

The DGLGraph for training

Type

DGLGraph

val_edges

The validation set edges, encoded as ((positive_edge_src, positive_edge_dst), (negative_edge_src, negative_edge_dst))

Type

Tuple[Tuple[Tensor, Tensor], Tuple[Tensor, Tensor]]

test_edges

The test set edges, encoded as ((positive_edge_src, positive_edge_dst), (negative_edge_src, negative_edge_dst))

Type

Tuple[Tuple[Tensor, Tensor], Tuple[Tensor, Tensor]]
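The nested tuple layout of val_edges and test_edges can be unpacked in a single assignment. The sketch below uses plain Python lists as stand-ins for the tensors so that it runs without DGL installed; the helper labeled_pairs is hypothetical and simply attaches label 1 to positive pairs and 0 to negative pairs.

```python
def labeled_pairs(edges):
    """Flatten ((pos_src, pos_dst), (neg_src, neg_dst)) into (src, dst, label) triples."""
    (pos_src, pos_dst), (neg_src, neg_dst) = edges
    pairs = [(int(u), int(v), 1) for u, v in zip(pos_src, pos_dst)]
    pairs += [(int(u), int(v), 0) for u, v in zip(neg_src, neg_dst)]
    return pairs


# Toy stand-in with the same shape as val_edges / test_edges:
# two positive edges followed by three negative edges.
toy_edges = (([0, 1], [1, 2]), ([0, 3, 4], [2, 1, 0]))
pairs = labeled_pairs(toy_edges)
print(pairs)
```

With a real dataset, the same helper would presumably be called as labeled_pairs(new_ds.val_edges), since iterating a 1-D tensor yields scalars that int() accepts.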

Examples

>>> import dgl
>>> ds = dgl.data.CoraGraphDataset()
>>> print(ds)
Dataset("cora_v2", num_graphs=1, save_path=...)
>>> new_ds = dgl.data.AsLinkPredDataset(ds, [0.8, 0.1, 0.1])
>>> print(new_ds)
Dataset("cora_v2-as-linkpred", num_graphs=1, save_path=/home/ubuntu/.dgl/cora_v2-as-linkpred)
>>> print(hasattr(new_ds, "test_edges"))
True

__getitem__(idx)[source]

Gets the data object at index.

__len__()[source]

The number of examples in the dataset.