Dataset

class dgl.graphbolt.Dataset[source]

Bases: object

An abstract dataset that provides a uniform interface for accessing the data required for training.

The underlying data could be a native CPU memory block, a shared memory block, a file handle to an open file on disk, a service that exposes an API for accessing the data, etc. A dataset has three primary components:

  • Task

    A task consists of metadata and the train/validation/test sets. A dataset can have multiple tasks.

  • Feature Storage

    A key-value store that holds node, edge, and graph features.

  • Graph Topology

    Graph topology is used by the subgraph sampling algorithm to generate a subgraph.
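
The interplay of the three components can be sketched in plain Python. This is only an illustration under stated assumptions: `ToyTask`, `ToyFeatureStore`, and `ToyGraph` are hypothetical stand-ins, not GraphBolt classes, and the "sampling" simply keeps the first few neighbors so the result is deterministic.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ToyTask:
    """Hypothetical stand-in for a task: metadata plus train/validation/test sets."""
    metadata: Dict[str, object]
    train_set: List[int]
    validation_set: List[int]
    test_set: List[int]


@dataclass
class ToyFeatureStore:
    """Hypothetical key-value store keyed by (domain, feature_name)."""
    data: Dict[Tuple[str, str], List[List[float]]] = field(default_factory=dict)

    def read(self, domain: str, name: str, ids: List[int]) -> List[List[float]]:
        rows = self.data[(domain, name)]
        return [rows[i] for i in ids]


@dataclass
class ToyGraph:
    """Hypothetical topology: adjacency list mapping a node to its neighbors."""
    neighbors: Dict[int, List[int]]

    def sample_neighbors(self, seeds: List[int], fanout: int) -> Dict[int, List[int]]:
        # Deterministic stand-in for sampling: keep the first `fanout` neighbors.
        return {s: self.neighbors.get(s, [])[:fanout] for s in seeds}


task = ToyTask(
    metadata={"name": "node_classification", "num_classes": 2},
    train_set=[0, 1],
    validation_set=[2],
    test_set=[3],
)
features = ToyFeatureStore({("node", "feat"): [[0.0], [1.0], [2.0], [3.0]]})
graph = ToyGraph({0: [1, 2, 3], 1: [0], 2: [0, 3], 3: [2]})

# Data access for one training step: seeds -> subgraph -> features.
subgraph = graph.sample_neighbors(task.train_set, fanout=2)
involved = sorted(set(subgraph) | {n for ns in subgraph.values() for n in ns})
batch_features = features.read("node", "feat", involved)
```

The task supplies the seed nodes, the topology expands them into a subgraph, and the feature store is queried only for the nodes that the subgraph touches.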

property all_nodes_set: ItemSet | ItemSetDict

Return the itemset containing all nodes.

property dataset_name: str

Return the dataset name.

property feature: FeatureStore

Return the feature store.

property graph: SamplingGraph

Return the graph.

property tasks: List[Task]

Return the tasks.
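
As a rough sketch of how the five properties might be consumed together, the helper below reads them from any object that exposes the same attribute names. `summarize` is a hypothetical helper, and the `SimpleNamespace` object is a stand-in with placeholder values; a concrete `Dataset` would return GraphBolt objects instead.

```python
from types import SimpleNamespace


def summarize(dataset):
    """Collect a few facts from the five documented properties."""
    return {
        "name": dataset.dataset_name,
        "num_tasks": len(dataset.tasks),
        "has_graph": dataset.graph is not None,
        "has_feature": dataset.feature is not None,
        "has_all_nodes_set": dataset.all_nodes_set is not None,
    }


# Hypothetical stand-in object; the values are placeholders, not GraphBolt types.
toy = SimpleNamespace(
    dataset_name="toy",
    tasks=["task-0"],
    graph={0: [1]},
    feature={},
    all_nodes_set=range(4),
)
summary = summarize(toy)
```

Because the helper relies only on the property names, it should work unchanged with any dataset object that provides them.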