MiniBatch

Bases: object

A composite data class for data structures in GraphBolt.

It is designed to facilitate the exchange of data among different components involved in processing data. The purpose of this class is to unify the representation of input and output data across different stages, ensuring consistency and ease of use throughout the loading process.

edge_ids(layer_id: int) → Dict[str, Tensor] | Tensor: Get the edge ids of a layer.

node_ids() → Tensor | Dict[str, Tensor]: A representation of input nodes in the outermost layer. Contains all nodes in the sampled_subgraphs.
- If input_nodes is a tensor: It indicates the graph is homogeneous.
- If input_nodes is a dictionary: The keys should be node types and the values should be the corresponding heterogeneous node ids.

num_layers() → int: Return the number of layers.

set_edge_features(edge_features: List[Dict[str, Tensor] | Dict[Tuple[str, str], Tensor]]) → None: Set edge features.

set_node_features(node_features: Dict[str, Tensor] | Dict[Tuple[str, str], Tensor]) → None: Set node features.

to(device: device): Copy MiniBatch to the specified device using reflection.
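As a rough illustration of what copying "using reflection" can look like, the sketch below discovers the attributes of a small stand-in container at run time and calls `.to(device)` on anything that supports it, recursing into lists and dictionaries. `FakeTensor`, `ToyBatch`, and `copy_to` are hypothetical names invented for this example. It is not GraphBolt's implementation, only one plausible shape for it.

```python
from dataclasses import dataclass, fields
from typing import Any


@dataclass
class FakeTensor:
    """Hypothetical stand-in for a tensor: holds values and a device tag."""
    values: list
    device: str = "cpu"

    def to(self, device: str) -> "FakeTensor":
        # Return a copy tagged with the target device.
        return FakeTensor(list(self.values), device)


@dataclass
class ToyBatch:
    """Hypothetical container with a few MiniBatch-like attributes."""
    seeds: Any = None
    labels: Any = None
    node_features: Any = None


def _move(value: Any, device: str) -> Any:
    # Recurse into containers; move anything that exposes a callable `to`.
    if isinstance(value, dict):
        return {k: _move(v, device) for k, v in value.items()}
    if isinstance(value, list):
        return [_move(v, device) for v in value]
    if callable(getattr(value, "to", None)):
        return value.to(device)
    return value  # None and plain values are left untouched


def copy_to(batch: ToyBatch, device: str) -> ToyBatch:
    # Discover attributes at run time instead of naming each one.
    moved = {f.name: _move(getattr(batch, f.name), device) for f in fields(batch)}
    return ToyBatch(**moved)


batch = ToyBatch(
    seeds=FakeTensor([0, 1, 2]),
    node_features={"feat": FakeTensor([0.1, 0.2, 0.3])},
)
on_gpu = copy_to(batch, "cuda:0")
print(on_gpu.seeds.device, on_gpu.node_features["feat"].device, on_gpu.labels)
```

Iterating over the declared fields means a newly added attribute is moved without any change to `copy_to`, which is the main appeal of a reflection-based approach.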

to_pyg_data(): Construct a PyG Data from MiniBatch. This function supports only node classification tasks on a homogeneous graph, and the number of features cannot be more than one.

property blocks: Extracts DGL blocks from MiniBatch to construct a graphical structure and ID mappings.

compacted_seeds: Tensor | Dict[str, Tensor] = None: Representation of compacted seeds corresponding to ‘seeds’, where all node ids inside are compacted.

edge_features: List[Dict[str, Tensor] | Dict[Tuple[str, str], Tensor]] = None: Edge features associated with the ‘sampled_subgraphs’.
- If keys are single strings: It means the graph is homogeneous, and the keys are feature names.
- If keys are tuples: It means the graph is heterogeneous, and the keys are tuples of ‘(edge_type, feature_name)’. Note that the edge type is a single string of the format ‘str:str:str’.
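The two key conventions for edge features can be told apart with a small helper. The sketch below uses plain lists in place of tensors so that it runs without GraphBolt installed. `split_edge_feature_key` and the edge type `user:follows:user` are hypothetical, and reading the three parts as source type, relation, and destination type is an assumption layered on the ‘str:str:str’ format.

```python
from typing import Optional, Tuple, Union

FeatureKey = Union[str, Tuple[str, str]]
EdgeTypeParts = Tuple[str, str, str]


def split_edge_feature_key(key: FeatureKey) -> Tuple[Optional[EdgeTypeParts], str]:
    """Return (edge_type_parts, feature_name).

    edge_type_parts is None for a homogeneous (plain string) key.
    """
    if isinstance(key, str):
        return None, key
    edge_type, feature_name = key
    parts = edge_type.split(":")
    if len(parts) != 3:
        raise ValueError(f"expected 'str:str:str', got {edge_type!r}")
    return (parts[0], parts[1], parts[2]), feature_name


# One dictionary per layer; lists stand in for tensors.
homogeneous_layer = {"weight": [0.5, 0.25]}
heterogeneous_layer = {("user:follows:user", "weight"): [0.5, 0.25]}

for layer in (homogeneous_layer, heterogeneous_layer):
    for key in layer:
        print(split_edge_feature_key(key))
```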

indexes: Tensor | Dict[str, Tensor] = None

Indexes associated with seeds in the graph, which indicate to which query each seed belongs.
- If indexes is a tensor: It indicates the graph is homogeneous. The value should be the query corresponding to the given ‘seeds’.
- If indexes is a dictionary: It indicates the graph is heterogeneous. The keys should be node or edge types and the values should be the queries corresponding to the given ‘seeds’. For each key, indexes are consecutive integers starting from zero.
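To make the role of indexes concrete, the following sketch groups seeds by the query each one belongs to. Plain lists stand in for tensors, and `group_seeds_by_query`, the sample seed pairs, and the edge type names are all hypothetical; the example only illustrates the seed-to-query mapping described above.

```python
from collections import defaultdict
from typing import Dict, List


def group_seeds_by_query(seeds: List, indexes: List[int]) -> Dict[int, List]:
    """Collect the seeds that share a query index."""
    if len(seeds) != len(indexes):
        raise ValueError("seeds and indexes must have the same length")
    groups: Dict[int, List] = defaultdict(list)
    for seed, query in zip(seeds, indexes):
        groups[query].append(seed)
    return dict(groups)


# Homogeneous case: five seed pairs spread over three queries.
seeds = [[0, 1], [0, 7], [2, 3], [4, 5], [4, 9]]
indexes = [0, 0, 1, 2, 2]
print(group_seeds_by_query(seeds, indexes))

# Heterogeneous case: the same grouping, applied per type.
hetero_seeds = {
    "user:follows:user": [[0, 1], [0, 2]],
    "user:likes:item": [[3, 4]],
}
hetero_indexes = {"user:follows:user": [0, 0], "user:likes:item": [0]}
hetero_groups = {
    etype: group_seeds_by_query(hetero_seeds[etype], hetero_indexes[etype])
    for etype in hetero_seeds
}
print(hetero_groups)
```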

input_nodes: Tensor | Dict[str, Tensor] = None

A representation of input nodes in the outermost layer. Contains all nodes in the ‘sampled_subgraphs’.
- If input_nodes is a tensor: It indicates the graph is homogeneous.
- If input_nodes is a dictionary: The keys should be node types and the values should be the corresponding heterogeneous node ids.

labels: Tensor | Dict[str, Tensor] = None

Labels associated with seeds in the graph.
- If labels is a tensor: It indicates the graph is homogeneous. The value should be the labels corresponding to the given ‘seeds’.
- If labels is a dictionary: The keys should be node or edge types and the values should be the labels corresponding to the given ‘seeds’.

node_features: Dict[str, Tensor] | Dict[Tuple[str, str], Tensor] = None: A representation of node features.
- If keys are single strings: It means the graph is homogeneous, and the keys are feature names.
- If keys are tuples: It means the graph is heterogeneous, and the keys are tuples of ‘(node_type, feature_name)’.
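Code that consumes node features often needs to know which of the two layouts it received. The sketch below checks the key type and, for the heterogeneous layout, regroups features by node type. `is_heterogeneous`, `group_by_node_type`, and the sample node types are hypothetical, and lists stand in for tensors.

```python
from typing import Dict, Tuple, Union

FeatureKey = Union[str, Tuple[str, str]]


def is_heterogeneous(features: Dict[FeatureKey, list]) -> bool:
    """True when every key is a (node_type, feature_name) tuple."""
    return bool(features) and all(isinstance(key, tuple) for key in features)


def group_by_node_type(features: Dict[Tuple[str, str], list]) -> Dict[str, Dict[str, list]]:
    """Regroup {(node_type, feature_name): data} as {node_type: {feature_name: data}}."""
    grouped: Dict[str, Dict[str, list]] = {}
    for (node_type, feature_name), data in features.items():
        grouped.setdefault(node_type, {})[feature_name] = data
    return grouped


homogeneous = {"feat": [0.1, 0.2], "age": [31, 45]}
heterogeneous = {
    ("user", "feat"): [0.1, 0.2],
    ("user", "age"): [31, 45],
    ("item", "feat"): [0.9],
}

print(is_heterogeneous(homogeneous), is_heterogeneous(heterogeneous))
print(group_by_node_type(heterogeneous))
```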

sampled_subgraphs: List[SampledSubgraph] = None: A list of ‘SampledSubgraph’s, each one corresponding to one layer, representing a subset of a larger graph structure.

seeds: Tensor | Dict[str, Tensor] = None

Representation of seed items utilized in node classification tasks, link prediction tasks and hyperlink tasks.
- If seeds is a tensor: it indicates that the seeds originate from a homogeneous graph. It can be either a 1-dimensional or 2-dimensional tensor:
  - 1-dimensional tensor: Each element directly represents a seed node within the graph.
  - 2-dimensional tensor: Each row designates a seed item, which can encompass various entities such as edges, hyperlinks, or other graph components depending on the specific context.
- If seeds is a dictionary: it indicates that the seeds originate from a heterogeneous graph. The keys should be edge or node types, and each value should be a tensor, which can be either a 1-dimensional or 2-dimensional tensor:
  - 1-dimensional tensor: Each element directly represents a seed node of the given type within the graph.
  - 2-dimensional tensor: Each row designates a seed item of the given type, which can encompass various entities such as edges, hyperlinks, or other graph components depending on the specific context.
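The four seed layouts described above (tensor or dictionary, 1-dimensional or 2-dimensional) can be distinguished as in the sketch below. Nested lists stand in for tensors, so the dimension check inspects the first element; with real tensors one would more likely read the number of dimensions directly. `seed_rank`, `describe_seeds`, the `_homogeneous` placeholder key, and the sample types are hypothetical.

```python
from typing import Dict, List, Union

SeedList = Union[List[int], List[List[int]]]


def seed_rank(seeds: SeedList) -> int:
    """Return 1 for a flat list of node ids, 2 for a list of rows."""
    if seeds and isinstance(seeds[0], (list, tuple)):
        return 2
    return 1


def describe_seeds(seeds: Union[SeedList, Dict[str, SeedList]]) -> Dict[str, str]:
    """Map each type (or '_homogeneous') to 'nodes' or 'items'."""
    by_type = seeds if isinstance(seeds, dict) else {"_homogeneous": seeds}
    return {
        type_name: "nodes" if seed_rank(value) == 1 else "items"
        for type_name, value in by_type.items()
    }


# Homogeneous graph: seed nodes, then seed edges given as (source, destination) rows.
print(describe_seeds([3, 5, 8]))
print(describe_seeds([[0, 1], [2, 3]]))

# Heterogeneous graph: one entry per node type or edge type.
print(describe_seeds({"user": [0, 1], "user:likes:item": [[0, 4], [1, 5]]}))
```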