QM7bDataset

class dgl.data.QM7bDataset(raw_dir=None, force_reload=False, verbose=False, transform=None)[source]

Bases: DGLDataset

QM7b dataset for graph property prediction (regression)

This dataset consists of 7,211 molecules with 14 regression targets. Nodes represent atoms and edges represent bonds. The edge feature 'h' holds the corresponding entry of the Coulomb matrix.

Reference: http://quantum-machine.org/datasets/
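For readers unfamiliar with the Coulomb matrix, the sketch below computes it using the definition common in the QM7 literature: diagonal entries are 0.5 * Z_i^2.4 and off-diagonal entries are Z_i * Z_j / |R_i - R_j|. The function name `coulomb_matrix`, the H2 geometry, and the choice of units are illustrative assumptions. This is not the code DGL uses to build the dataset, and whether the stored 'h' values follow exactly this convention is not confirmed here.

```python
import math


def coulomb_matrix(charges, coords):
    """Return the Coulomb matrix as a nested list.

    charges: nuclear charges Z_i, one per atom.
    coords:  3D positions R_i, one tuple per atom (units are an assumption).
    """
    n = len(charges)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                # Diagonal term: 0.5 * Z_i ** 2.4
                matrix[i][j] = 0.5 * charges[i] ** 2.4
            else:
                # Off-diagonal term: pairwise nuclear repulsion
                dist = math.dist(coords[i], coords[j])
                matrix[i][j] = charges[i] * charges[j] / dist
    return matrix


# Hypothetical two-atom example: H2 with the atoms 1.4 length units apart.
cm = coulomb_matrix([1, 1], [(0.0, 0.0, 0.0), (0.0, 0.0, 1.4)])
```

The matrix is symmetric, so in a graph view the entries for the pairs (i, j) and (j, i) carry the same value.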

Statistics:

  • Number of graphs: 7,211

  • Number of regression targets: 14

  • Average number of nodes: 15

  • Average number of edges: 245

  • Edge feature size: 1

Parameters:
  • raw_dir (str) – Directory where the raw input data is downloaded and stored. Default: ~/.dgl/

  • force_reload (bool) – Whether to reload the dataset. Default: False

  • verbose (bool) – Whether to print out progress information. Default: False.

  • transform (callable, optional) – A transform that takes in a DGLGraph object and returns a transformed version. The DGLGraph object will be transformed before every access.
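As a rough illustration of the transform contract described above (a callable that takes a graph and returns a graph), the sketch below builds a transform that rescales an edge feature. The helper `make_edge_scaler` is hypothetical, and a `SimpleNamespace` stands in for a DGLGraph so that the snippet runs without DGL. The copy step is a precaution based on the assumption that the dataset may pass the same stored graph to the transform on every access, and that the graph object supports `copy.deepcopy`.

```python
import copy
from types import SimpleNamespace


def make_edge_scaler(factor, key="h"):
    """Return a transform that multiplies edge feature `key` by `factor`."""

    def transform(g):
        # Work on a copy so repeated accesses do not compound the scaling
        # (assumption: the same stored graph may be passed in each time).
        g = copy.deepcopy(g)
        g.edata[key] = g.edata[key] * factor
        return g

    return transform


# Stand-in for a DGLGraph: only the `edata` mapping is modelled here.
fake_graph = SimpleNamespace(edata={"h": 2.0})
scaled = make_edge_scaler(0.5)(fake_graph)
```

With DGL installed, such a callable would be passed as `QM7bDataset(transform=make_edge_scaler(0.5))`; multiplying a tensor by a scalar behaves the same way as the float used in the stand-in.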

num_tasks

Number of prediction tasks

Type:

int

num_labels

(DEPRECATED, use num_tasks instead) Number of prediction tasks

Type:

int

Raises:

UserWarning – If the raw data on the remote server has been changed by the author.

Examples

>>> from dgl.data import QM7bDataset
>>> data = QM7bDataset()
>>> data.num_tasks
14
>>>
>>> # iterate over the dataset
>>> for g, label in data:
...     edge_feat = g.edata['h']  # get edge feature
...     # your code here...
...
__getitem__(idx)[source]

Get graph and label by index

Parameters:

idx (int) – Item index

Return type:

(dgl.DGLGraph, Tensor)

__len__()[source]

Number of graphs in the dataset.

Return type:

int
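Because the dataset supports `len()` and integer indexing, a train/validation split can be expressed purely in terms of indices. The sketch below is one simple way to do this; the helper `split_indices`, the 80/20 ratio, and the seed are illustrative choices, not part of the DGL API. The DGL-dependent usage is left in comments so the snippet runs without downloading the data.

```python
import random


def split_indices(num_items, train_frac=0.8, seed=0):
    """Shuffle range(num_items) and split it into train and validation index lists."""
    if not 0.0 < train_frac < 1.0:
        raise ValueError("train_frac must be strictly between 0 and 1")
    indices = list(range(num_items))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    cut = int(num_items * train_frac)
    return indices[:cut], indices[cut:]


# 7,211 is the number of graphs listed in the statistics of this dataset.
train_idx, val_idx = split_indices(7211)

# Hypothetical usage with DGL installed:
#   from dgl.data import QM7bDataset
#   data = QM7bDataset()
#   train_idx, val_idx = split_indices(len(data))
#   train_pairs = [data[i] for i in train_idx]  # list of (graph, label) tuples
```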