dgl.to_homogeneous

dgl.to_homogeneous(G, ndata=None, edata=None, store_type=True, return_count=False)

Convert a heterogeneous graph to a homogeneous graph and return it.

By default, the function stores the node and edge types of the input graph as the dgl.NTYPE and dgl.ETYPE features in the returned graph. Each feature is an integer representing the type id, as determined by the DGLGraph.get_ntype_id() and DGLGraph.get_etype_id() methods. To omit these features, specify store_type=False.

The resulting graph assigns nodes and edges of the same type IDs in a contiguous range (i.e., nodes of the first type have IDs 0 to G.num_nodes(G.ntypes[0]) - 1; nodes of the second type come next; and so on). Therefore, a more memory-efficient format for type information is an integer list in which the i^th element is the number of nodes/edges of the i^th type. One can choose this format by specifying return_count=True.
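The relationship between the count list and the per-node type IDs can be sketched in plain Python. This is only an illustration of the layout described above, not DGL's implementation; the helper names type_ranges and expand_type_ids and the counts [3, 2, 2] are hypothetical.

```python
from itertools import accumulate


def type_ranges(counts):
    """Return the half-open ID range [start, stop) occupied by each type."""
    stops = list(accumulate(counts))
    starts = [0] + stops[:-1]
    return list(zip(starts, stops))


def expand_type_ids(counts):
    """Expand per-type counts into one type ID per node (the dense form)."""
    return [type_id for type_id, n in enumerate(counts) for _ in range(n)]


# Hypothetical counts: 3 nodes of type 0, 2 of type 1, 2 of type 2.
ntype_count = [3, 2, 2]
ranges = type_ranges(ntype_count)      # one (start, stop) pair per type
dense = expand_type_ids(ntype_count)   # one entry per node
```

The count list has one entry per type, whereas the dense form has one entry per node or edge, which is why the former is the more compact representation.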

Parameters:
  • G (DGLGraph) – The heterogeneous graph.

  • ndata (list[str], optional) – The node features to combine across all node types. For each feature feat in ndata, it concatenates G.nodes[T].data[feat] across all node types T. Therefore, the feature feat must have the same data type and the same shape in every dimension except the first (the number of nodes) for all node types. By default, the returned graph will not have any node features.

  • edata (list[str], optional) – The edge features to combine across all edge types. For each feature feat in edata, it concatenates G.edges[T].data[feat] across all edge types T. Therefore, the feature feat must have the same data type and the same shape in every dimension except the first (the number of edges) for all edge types. By default, the returned graph will not have any edge features.

  • store_type (bool, optional) – If True, store type information as the dgl.NTYPE and dgl.ETYPE features in the returned graph.

  • return_count (bool, optional) – If True, return type information as an integer list; the i^th element corresponds to the number of nodes/edges of the i^th type.

Returns:

  • DGLGraph – A homogeneous graph.

  • ntype_count (list[int], optional) – Number of nodes of each type. Returned only when return_count is True.

  • etype_count (list[int], optional) – Number of edges of each type. Returned only when return_count is True.
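Because IDs of the same type are contiguous, a count list is enough to recover the type and the within-type ID of any node or edge in the homogeneous graph. The following is a minimal plain-Python sketch under that assumption; the function locate is hypothetical and not part of DGL.

```python
from bisect import bisect_right
from itertools import accumulate


def locate(homo_id, counts):
    """Map an ID in the homogeneous graph to (type_id, id_within_type)."""
    stops = list(accumulate(counts))  # exclusive end of each type's ID range
    total = stops[-1] if stops else 0
    if not 0 <= homo_id < total:
        raise IndexError(f"ID {homo_id} is outside [0, {total})")
    # The first range whose exclusive end is greater than homo_id owns the ID.
    type_id = bisect_right(stops, homo_id)
    start = stops[type_id - 1] if type_id > 0 else 0
    return type_id, homo_id - start


# Hypothetical counts: 3 nodes of type 0, 2 of type 1, 2 of type 2.
counts = [3, 2, 2]
first_of_type_1 = locate(3, counts)
last_node = locate(6, counts)
```

The lookup costs one binary search over the number of types, so no per-node type array needs to be stored.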

Notes

  • Calculating type information may introduce noticeable cost. If type information is not needed, setting both store_type and return_count to False avoids that cost. Otherwise, DGL recommends using store_type=False and return_count=True because the count format is more memory-efficient.

  • The ntype_count and etype_count lists can help speed up some operations. See RelGraphConv for such an example.

  • Calling to_homogeneous() and then calling to_heterogeneous() on the result recovers the original graph.

Examples

The following example uses the PyTorch backend.

>>> import dgl
>>> import torch
>>> hg = dgl.heterograph({
...     ('user', 'follows', 'user'): ([0, 1], [1, 2]),
...     ('developer', 'develops', 'game'): ([0, 1], [0, 1])
...     })
>>> hg.nodes['user'].data['h'] = torch.ones(3, 1)
>>> hg.nodes['developer'].data['h'] = torch.zeros(2, 1)
>>> hg.nodes['game'].data['h'] = torch.ones(2, 1)
>>> g = dgl.to_homogeneous(hg)
>>> # The first three nodes are for 'user', the next two are for 'developer',
>>> # and the last two are for 'game'
>>> g.ndata
{'_TYPE': tensor([0, 0, 0, 1, 1, 2, 2]), '_ID': tensor([0, 1, 2, 0, 1, 0, 1])}
>>> # The first two edges are for 'follows', and the next two are for 'develops' edges.
>>> g.edata
{'_TYPE': tensor([0, 0, 1, 1]), '_ID': tensor([0, 1, 0, 1])}

Combine feature 'h' across all node types in the conversion.

>>> g = dgl.to_homogeneous(hg, ndata=['h'])
>>> g.ndata['h']
tensor([[1.], [1.], [1.], [0.], [0.], [1.], [1.]])

See also

to_heterogeneous