AMDataset
class dgl.data.AMDataset(print_every=10000, insert_reverse=True, raw_dir=None, force_reload=False, verbose=True, transform=None)
Bases: dgl.data.rdf.RDFGraphDataset
AM dataset for node classification tasks.
Namespace convention:
Instance:
http://purl.org/collections/nl/am/<type>-<id>
Relation:
http://purl.org/collections/nl/am/<name>
All literal nodes, and the relations connecting them, are omitted from the output graph.
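The namespace convention above can be illustrated with a small parsing sketch. This is not DGL's own parsing code: `split_am_instance` is a hypothetical helper, the sample URIs (`proxy-12345`, `material`) are made up for illustration, and the rule that the first hyphen separates `<type>` from `<id>` is an assumption.

```python
from typing import Optional, Tuple

# Prefix taken from the namespace convention above.
AM_PREFIX = "http://purl.org/collections/nl/am/"


def split_am_instance(uri: str) -> Optional[Tuple[str, str]]:
    """Hypothetical helper: split an AM instance URI into (type, id).

    Returns None when the URI is outside the AM namespace or its local
    name is not of the form "<type>-<id>" (for example, a relation URI).
    """
    if not uri.startswith(AM_PREFIX):
        return None
    local_name = uri[len(AM_PREFIX):]
    # Assumption: the first hyphen separates <type> from <id>.
    type_name, sep, instance_id = local_name.partition("-")
    if not sep or not type_name or not instance_id:
        return None
    return type_name, instance_id


# Made-up sample URIs, for illustration only.
print(split_am_instance(AM_PREFIX + "proxy-12345"))
print(split_am_instance(AM_PREFIX + "material"))
```

Under these assumptions, an instance URI yields a `(type, id)` pair, while a relation URI, which has no hyphenated local name, yields `None`.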
AM dataset statistics:
Nodes: 881680
Edges: 5668682 (including reverse edges)
Target Category: proxy
Number of Classes: 11
Label Split:
Train: 802
Test: 198
Parameters
print_every (int) – Print a preprocessing log message every print_every tuples. Default: 10000.
insert_reverse (bool) – If True, add reverse edges and reverse relations to the final graph. Default: True.
raw_dir (str) – Directory that the raw data is downloaded to, or that already contains the input data directory. Default: ~/.dgl/
force_reload (bool) – Whether to reload the dataset. Default: False.
verbose (bool) – Whether to print out progress information. Default: True.
transform (callable, optional) – A transform that takes in a DGLGraph object and returns a transformed version. The DGLGraph object will be transformed before every access.
Examples
>>> import dgl
>>> dataset = dgl.data.rdf.AMDataset()
>>> g = dataset[0]
>>> category = dataset.predict_category
>>> num_classes = dataset.num_classes
>>>
>>> train_mask = g.nodes[category].data['train_mask']
>>> test_mask = g.nodes[category].data['test_mask']
>>> label = g.nodes[category].data['label']