SampledSubgraph
- class dgl.graphbolt.SampledSubgraph
Bases: object
An abstract class for sampled subgraphs. For a heterogeneous graph, each field should be a Dict. For a homogeneous graph, each field should hold its respective value type directly.
- exclude_edges(edges: Dict[str, Tensor] | Tensor, assume_num_node_within_int32: bool = True)
Exclude edges from the sampled subgraph.
This function can be used with sampled subgraphs, regardless of whether they have compacted row/column nodes or not. If the original subgraph has compacted row or column nodes, the corresponding row or column nodes in the returned subgraph will also be compacted.
- Parameters:
self (SampledSubgraph) – The sampled subgraph.
edges (Union[torch.Tensor, Dict[str, torch.Tensor]]) – Edges to exclude. If the sampled subgraph is homogeneous, edges should be an N*2 tensor representing the edges to exclude. If the sampled subgraph is heterogeneous, edges should be a dictionary mapping edge types to the corresponding edges to exclude.
assume_num_node_within_int32 (bool) – If True, assumes that the node IDs in the provided edges fall within the int32 range, which can significantly speed up computation. Default: True
- Returns:
An instance of a class that inherits from SampledSubgraph.
- Return type:
SampledSubgraph
Examples
>>> import dgl.graphbolt as gb
>>> import torch
>>> sampled_csc = {"A:relation:B": gb.CSCFormatBase(
...     indptr=torch.tensor([0, 1, 2, 3]),
...     indices=torch.tensor([0, 1, 2]))}
>>> original_column_node_ids = {"B": torch.tensor([10, 11, 12])}
>>> original_row_node_ids = {"A": torch.tensor([13, 14, 15])}
>>> original_edge_ids = {"A:relation:B": torch.tensor([19, 20, 21])}
>>> subgraph = gb.SampledSubgraphImpl(
...     sampled_csc=sampled_csc,
...     original_column_node_ids=original_column_node_ids,
...     original_row_node_ids=original_row_node_ids,
...     original_edge_ids=original_edge_ids
... )
>>> edges_to_exclude = {"A:relation:B": torch.tensor([[14, 11], [15, 12]])}
>>> result = subgraph.exclude_edges(edges_to_exclude)
>>> print(result.sampled_csc)
{'A:relation:B': CSCFormatBase(indptr=tensor([0, 1, 1, 1]),
              indices=tensor([0]),
)}
>>> print(result.original_column_node_ids)
{'B': tensor([10, 11, 12])}
>>> print(result.original_row_node_ids)
{'A': tensor([13, 14, 15])}
>>> print(result.original_edge_ids)
{'A:relation:B': tensor([19])}
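To make the filtering in this example easier to follow, here is a minimal pure-Python sketch that works on plain lists rather than tensors and handles a single edge type. It is not GraphBolt's implementation; the helper name `exclude_edges_sketch` and its arguments are hypothetical and exist only for illustration.

```python
def exclude_edges_sketch(indptr, indices, row_ids, col_ids, edge_ids, excluded):
    """Drop edges whose (original_row, original_column) pair is in `excluded`.

    Hypothetical helper for illustration; not part of dgl.graphbolt.
    """
    excluded = {tuple(pair) for pair in excluded}
    new_indptr, new_indices, new_edge_ids = [0], [], []
    # Column c owns the entries indices[indptr[c]:indptr[c + 1]].
    for col in range(len(indptr) - 1):
        for pos in range(indptr[col], indptr[col + 1]):
            row = indices[pos]
            # Translate compacted ids back to original ids before comparing.
            if (row_ids[row], col_ids[col]) not in excluded:
                new_indices.append(row)
                new_edge_ids.append(edge_ids[pos])
        # Close the column, even if every edge in it was removed.
        new_indptr.append(len(new_indices))
    return new_indptr, new_indices, new_edge_ids


new_indptr, new_indices, new_edge_ids = exclude_edges_sketch(
    indptr=[0, 1, 2, 3],
    indices=[0, 1, 2],
    row_ids=[13, 14, 15],
    col_ids=[10, 11, 12],
    edge_ids=[19, 20, 21],
    excluded=[[14, 11], [15, 12]],
)
print(new_indptr, new_indices, new_edge_ids)  # [0, 1, 1, 1] [0] [19]
```

The node id lists are left untouched, which mirrors the example above: only the CSC structure and the edge ids shrink.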
- property original_column_node_ids: Tensor | Dict[str, Tensor]
Returns the column's reverse node ids in the original graph. A graph structure can be treated as a coordinated row and column pair, and these are the mapped ids of the column.
If original_column_node_ids is a tensor: It represents the original node ids.
If original_column_node_ids is a dictionary: The keys should be node types and the values should be the corresponding original heterogeneous node ids.
If present, it means column IDs are compacted, and sampled_csc column IDs match these compacted ones.
- property original_edge_ids: Tensor | Dict[str, Tensor]
Returns the reverse edge ids in the original graph. This is useful when edge features are needed.
If original_edge_ids is a tensor: It represents the original edge ids.
If original_edge_ids is a dictionary: The keys should be edge types and the values should be the corresponding original heterogeneous edge ids.
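As a rough illustration of why the original edge ids matter for edge features, the sketch below gathers one feature row per sampled edge. The feature table, its values, and the helper `gather_edge_features` are all hypothetical; plain Python containers stand in for tensors and for whatever feature store is actually used.

```python
# Hypothetical per-edge feature table for the original graph, keyed by
# original edge id. The values are made up for illustration.
edge_features = {19: [0.1, 0.2], 20: [0.3, 0.4], 21: [0.5, 0.6]}


def gather_edge_features(original_edge_ids, features):
    """Return one feature row per sampled edge, in sampled order."""
    return [features[eid] for eid in original_edge_ids]


# Homogeneous case: original_edge_ids is a single sequence of ids.
homo_features = gather_edge_features([19, 21], edge_features)

# Heterogeneous case: one sequence of ids per edge type.
hetero_edge_ids = {"A:relation:B": [19, 20]}
hetero_tables = {"A:relation:B": edge_features}
hetero_features = {
    etype: gather_edge_features(ids, hetero_tables[etype])
    for etype, ids in hetero_edge_ids.items()
}

print(homo_features)    # [[0.1, 0.2], [0.5, 0.6]]
print(hetero_features)  # {'A:relation:B': [[0.1, 0.2], [0.3, 0.4]]}
```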
- property original_row_node_ids: Tensor | Dict[str, Tensor]
Returns the row's reverse node ids in the original graph. A graph structure can be treated as a coordinated row and column pair, and these are the mapped ids of the row.
If original_row_node_ids is a tensor: It represents the original node ids.
If original_row_node_ids is a dictionary: The keys should be node types and the values should be the corresponding original heterogeneous node ids.
If present, it means row IDs are compacted, and sampled_csc row IDs match these compacted ones.
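The relationship between compacted ids and the original row and column ids can be sketched as follows. This is an illustration with plain lists, not GraphBolt code: the helper `to_original_pairs` is hypothetical, and splitting the edge type string into source type, relation, and destination type is an assumption read off the "A:relation:B" example on this page.

```python
def to_original_pairs(indptr, indices, row_ids, col_ids):
    """Expand compacted CSC data into (original_row, original_column) pairs.

    Hypothetical helper for illustration; not part of dgl.graphbolt.
    """
    pairs = []
    for col in range(len(indptr) - 1):
        for pos in range(indptr[col], indptr[col + 1]):
            # indices[pos] and col are compacted ids; the id lists map
            # them back to node ids in the original graph.
            pairs.append((row_ids[indices[pos]], col_ids[col]))
    return pairs


# Heterogeneous case: every field is a dictionary.
sampled = {"A:relation:B": ([0, 1, 2, 3], [0, 1, 2])}  # (indptr, indices)
original_row_node_ids = {"A": [13, 14, 15]}
original_column_node_ids = {"B": [10, 11, 12]}

original_pairs = {}
for etype, (indptr, indices) in sampled.items():
    # Assumed convention: "source_type:relation:destination_type".
    src_type, _, dst_type = etype.split(":")
    original_pairs[etype] = to_original_pairs(
        indptr,
        indices,
        original_row_node_ids[src_type],
        original_column_node_ids[dst_type],
    )

print(original_pairs)  # {'A:relation:B': [(13, 10), (14, 11), (15, 12)]}
```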
- property sampled_csc: CSCFormatBase | Dict[str, CSCFormatBase]
Returns the node pairs representing edges in CSC format.
If sampled_csc is a CSCFormatBase: It should be in the csc format. indptr stores the index in the data array where each column starts. indices stores the row indices of the non-zero elements.
If sampled_csc is a dictionary: The keys should be edge types and the values should be the corresponding node pairs. The ids inside are heterogeneous ids.
Examples
Homogeneous graph.
>>> import dgl.graphbolt as gb
>>> import torch
>>> sampled_csc = gb.CSCFormatBase(
...     indptr=torch.tensor([0, 1, 2, 3]),
...     indices=torch.tensor([0, 1, 2]))
>>> print(sampled_csc)
CSCFormatBase(indptr=tensor([0, 1, 2, 3]),
              indices=tensor([0, 1, 2]),
)
Heterogeneous graph.
>>> sampled_csc = {"A:relation:B": gb.CSCFormatBase(
...     indptr=torch.tensor([0, 1, 2, 3]),
...     indices=torch.tensor([0, 1, 2]))}
>>> print(sampled_csc)
{'A:relation:B': CSCFormatBase(indptr=tensor([0, 1, 2, 3]),
              indices=tensor([0, 1, 2]),
)}
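For readers less familiar with the CSC layout, the following small sketch shows how indptr and indices are read together. It uses plain Python lists and a hypothetical helper, `rows_per_column`, with input values chosen so that one column is empty; none of this is part of the GraphBolt API.

```python
def rows_per_column(indptr, indices):
    """List the row ids stored for each column of a CSC structure.

    Column c owns the slice indices[indptr[c]:indptr[c + 1]].
    Hypothetical helper for illustration; not part of dgl.graphbolt.
    """
    return [indices[indptr[c]:indptr[c + 1]] for c in range(len(indptr) - 1)]


# Illustrative input: 3 columns and 3 edges, with column 1 left empty.
indptr = [0, 2, 2, 3]
indices = [1, 2, 0]

neighbors = rows_per_column(indptr, indices)
in_degrees = [len(rows) for rows in neighbors]

print(neighbors)   # [[1, 2], [], [0]]
print(in_degrees)  # [2, 0, 1]
```

An empty column shows up as two equal consecutive entries in indptr, which is the same pattern that appears in the exclude_edges output ([0, 1, 1, 1]).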