AsLinkPredDataset

class dgl.data.AsLinkPredDataset(dataset, split_ratio=None, neg_ratio=3, **kwargs)

Bases: DGLDataset

Repurpose a dataset for a link prediction task.

The created dataset includes the data needed for link prediction. Currently only homogeneous graphs are supported. Only the first graph in the provided dataset is kept. Train/val/test edges are generated from it according to the given split ratio, and the corresponding negative edges are generated according to neg_ratio. The generated edges are cached to disk for fast reloading. If the provided split ratio differs from the cached one, the dataset is re-processed.
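The ratio-based split described above can be pictured with a minimal pure-Python sketch. This is not DGL's implementation: the function name split_edges, the seed argument, and the choice to give any rounding remainder to the test set are all assumptions made for illustration.

```python
import random


def split_edges(edges, split_ratio, seed=0):
    """Shuffle edges and cut them into train/val/test lists by ratio.

    Hypothetical helper for illustration only; DGL's own logic may differ.
    """
    train_r, val_r, test_r = split_ratio
    if abs(train_r + val_r + test_r - 1.0) > 1e-8:
        raise ValueError("split_ratio must sum to one")

    shuffled = list(edges)
    random.Random(seed).shuffle(shuffled)

    n = len(shuffled)
    n_train = int(n * train_r)
    n_val = int(n * val_r)

    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # assumption: remainder goes to test
    return train, val, test


edges = [(i, i + 1) for i in range(10)]
train, val, test = split_edges(edges, (0.8, 0.1, 0.1))
print(len(train), len(val), len(test))
```

Every input edge lands in exactly one of the three lists, which is the property a link prediction split needs so that evaluation edges never appear in the training set.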

Parameters:
  • dataset (DGLDataset) – The dataset to be converted.

  • split_ratio ((float, float, float), optional) – Split ratios for training, validation and test sets. Must sum to one.

  • neg_ratio (int, optional) – The number of negative samples to draw per positive edge. The total number of negative samples will be less than or equal to neg_ratio * num_positive_edges.
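One way to see why the number of negative samples is an upper bound rather than an exact count is the sketch below. It is an illustrative rejection sampler, not the sampler DGL uses: the function name sample_negative_edges and its num_nodes and seed parameters are hypothetical. Candidate pairs that collide with a positive edge, or that repeat an earlier candidate, are dropped, so the result can be smaller than neg_ratio * num_positive_edges.

```python
import random


def sample_negative_edges(num_nodes, positive_edges, neg_ratio=3, seed=0):
    """Draw at most neg_ratio * len(positive_edges) non-edges.

    Hypothetical helper for illustration only; DGL's sampler may differ.
    """
    rng = random.Random(seed)
    positives = set(positive_edges)
    target = neg_ratio * len(positive_edges)

    negatives = set()
    for _ in range(target):
        candidate = (rng.randrange(num_nodes), rng.randrange(num_nodes))
        # Skip pairs that are real edges; duplicates collapse in the set.
        if candidate not in positives:
            negatives.add(candidate)
    return sorted(negatives)


positive_edges = [(0, 1), (1, 2), (2, 3)]
negative_edges = sample_negative_edges(5, positive_edges, neg_ratio=3)
print(len(negative_edges) <= 3 * len(positive_edges))
```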

feat_size

The size of the feature dimension in the graph.

Type:

int

train_graph

The DGLGraph for training

Type:

DGLGraph

val_edges

The validation set edges, encoded as ((positive_edge_src, positive_edge_dst), (negative_edge_src, negative_edge_dst))

Type:

Tuple[Tuple[Tensor, Tensor], Tuple[Tensor, Tensor]]

test_edges

The test set edges, encoded as ((positive_edge_src, positive_edge_dst), (negative_edge_src, negative_edge_dst))

Type:

Tuple[Tuple[Tensor, Tensor], Tuple[Tensor, Tensor]]
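The nested encoding shared by val_edges and test_edges can be flattened into (source, destination) pairs with binary labels, which is a common input shape for evaluation metrics. The sketch below uses plain Python lists in place of tensors so that it runs without DGL; the helper name to_labeled_pairs is hypothetical, and with real tensors one would likely convert each tensor with .tolist() first.

```python
def to_labeled_pairs(edges):
    """Flatten ((pos_src, pos_dst), (neg_src, neg_dst)) into pairs and labels.

    Hypothetical helper for illustration; expects list-like sequences.
    """
    (pos_src, pos_dst), (neg_src, neg_dst) = edges
    pairs = list(zip(pos_src, pos_dst)) + list(zip(neg_src, neg_dst))
    # Positive edges are labeled 1, negative edges 0.
    labels = [1] * len(pos_src) + [0] * len(neg_src)
    return pairs, labels


# Toy stand-in for val_edges: two positive edges and three negative edges.
toy_val_edges = (([0, 1], [1, 2]), ([0, 3, 4], [4, 0, 2]))
pairs, labels = to_labeled_pairs(toy_val_edges)
print(pairs)
print(labels)
```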

Examples

>>> ds = dgl.data.CoraGraphDataset()
>>> print(ds)
Dataset("cora_v2", num_graphs=1, save_path=...)
>>> new_ds = dgl.data.AsLinkPredDataset(ds, [0.8, 0.1, 0.1])
>>> print(new_ds)
Dataset("cora_v2-as-linkpred", num_graphs=1, save_path=/home/ubuntu/.dgl/cora_v2-as-linkpred)
>>> print(hasattr(new_ds, "test_edges"))
True

__getitem__(idx)

Gets the data object at index.

__len__()

The number of examples in the dataset.