dgl.graphbolt.unique_and_compact_csc_formats

dgl.graphbolt.unique_and_compact_csc_formats(csc_formats: Tuple[Tensor, Tensor] | Dict[str, Tuple[Tensor, Tensor]], unique_dst_nodes: Tensor | Dict[str, Tensor])

Compact csc formats and return unique nodes (per type).

Parameters:
  • csc_formats (Union[CSCFormatBase, Dict[str, CSCFormatBase]]) – CSC formats representing source-destination edges.
      - If csc_formats is a CSCFormatBase: the graph is homogeneous. Its indptr and indices should be torch.Tensor objects representing source-destination pairs in CSC format, and the IDs inside are homogeneous IDs.
      - If csc_formats is a Dict[str, CSCFormatBase]: the keys are edge types and the values are the CSC-format node pairs for each edge type. The IDs inside are heterogeneous IDs.

  • unique_dst_nodes (torch.Tensor or Dict[str, torch.Tensor]) – Unique destination nodes across all node pairs.
      - If unique_dst_nodes is a tensor: the graph is homogeneous.
      - If unique_dst_nodes is a dictionary: the keys are node types and the values are the corresponding nodes. The IDs inside are heterogeneous IDs.

Returns:

The compacted csc formats and the unique nodes (per type). "Compacted csc formats" means that the node IDs in the input node pairs are replaced with mapped node IDs, where each type of node is mapped to a contiguous space of IDs ranging from 0 to N.

Return type:

Tuple[unique_nodes, csc_formats]
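
The compaction described above can be illustrated for the homogeneous case with plain PyTorch. This is only a sketch of the idea, not the library's implementation: the helper name compact_homogeneous is hypothetical, and it assumes that destination nodes occupy the first positions of the unique node list and that source-only nodes follow in ascending ID order.

```python
import torch


def compact_homogeneous(indices: torch.Tensor, unique_dst: torch.Tensor):
    """Hypothetical helper: map original source IDs to a contiguous range.

    Assumption of this sketch: destination nodes keep positions
    0..len(unique_dst)-1 and source-only nodes are appended after them
    in ascending ID order.
    """
    # Source IDs that do not already appear among the destination nodes.
    is_new = ~torch.isin(indices, unique_dst)
    extra = torch.unique(indices[is_new])  # sorted and deduplicated
    unique_nodes = torch.cat([unique_dst, extra])

    # Look up the position of each original ID inside unique_nodes.
    sorted_ids, order = torch.sort(unique_nodes)
    compacted = order[torch.searchsorted(sorted_ids, indices)]
    return unique_nodes, compacted


indices = torch.tensor([7, 3, 9])   # original source IDs (CSC indices)
unique_dst = torch.tensor([3, 5])   # original destination IDs

unique_nodes, compacted = compact_homogeneous(indices, unique_dst)
print(unique_nodes)  # tensor([3, 5, 7, 9])
print(compacted)     # tensor([2, 0, 3])
```

The indptr tensor is not touched by compaction, since it only records how many edges each destination node has.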

Examples

>>> import torch
>>> import dgl.graphbolt as gb
>>> N1 = torch.LongTensor([1, 2, 2])
>>> N2 = torch.LongTensor([5, 5, 6])
>>> unique_dst = {
...     "n1": torch.LongTensor([1, 2]),
...     "n2": torch.LongTensor([5, 6])}
>>> csc_formats = {
...     "n1:e1:n2": gb.CSCFormatBase(indptr=torch.tensor([0, 2, 3]), indices=N1),
...     "n2:e2:n1": gb.CSCFormatBase(indptr=torch.tensor([0, 1, 3]), indices=N2)}
>>> unique_nodes, compacted_csc_formats = gb.unique_and_compact_csc_formats(
...     csc_formats, unique_dst
... )
>>> print(unique_nodes)
{'n1': tensor([1, 2]), 'n2': tensor([5, 6])}
>>> print(compacted_csc_formats)
{"n1:e1:n2": CSCFormatBase(indptr=torch.tensor([0, 2, 3]),
                           indices=torch.tensor([0, 1, 1])),
 "n2:e2:n1": CSCFormatBase(indptr=torch.tensor([0, 1, 3]),
                           indices=torch.tensor([0, 0, 1]))}
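
The heterogeneous example can be reproduced by hand with PyTorch alone, which may help when checking the expected output. The sketch below is an illustration under stated assumptions, not the library's code: each CSC format is written as a plain (indptr, indices) pair, the edge type string is assumed to read "source_type:edge:destination_type", and destination nodes are assumed to come first in each per-type unique node list.

```python
import torch

# Same inputs as the example above, with each CSC format written as a plain
# (indptr, indices) pair so that the sketch needs only PyTorch.
csc = {
    "n1:e1:n2": (torch.tensor([0, 2, 3]), torch.tensor([1, 2, 2])),
    "n2:e2:n1": (torch.tensor([0, 1, 3]), torch.tensor([5, 5, 6])),
}
unique_dst = {"n1": torch.tensor([1, 2]), "n2": torch.tensor([5, 6])}
empty = torch.empty(0, dtype=torch.long)

# Step 1: collect, per node type, the source IDs seen across all edge types.
src_by_type = {}
for etype, (_, indices) in csc.items():
    src_type = etype.split(":")[0]  # assumed "src:edge:dst" layout
    src_by_type.setdefault(src_type, []).append(indices)

# Step 2: per node type, destination nodes first, then source-only nodes.
unique_nodes = {}
for ntype in sorted(set(src_by_type) | set(unique_dst)):
    dst = unique_dst.get(ntype, empty)
    src = torch.cat(src_by_type.get(ntype, [empty]))
    extra = torch.unique(src[~torch.isin(src, dst)])
    unique_nodes[ntype] = torch.cat([dst, extra])

# Step 3: rewrite indices as positions in the source type's unique node list.
compacted = {}
for etype, (indptr, indices) in csc.items():
    nodes = unique_nodes[etype.split(":")[0]]
    sorted_ids, order = torch.sort(nodes)
    compacted[etype] = (indptr, order[torch.searchsorted(sorted_ids, indices)])

print(unique_nodes)
print(compacted)
```

Here every source node is also a destination node, so the unique nodes equal unique_dst and the compacted indices are [0, 1, 1] and [0, 0, 1], matching the output shown above.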