dgl.sampling

The dgl.sampling package contains operators and utilities for sampling from a graph via random walks, neighbor sampling, etc. They are typically used together with the DataLoaders in the dgl.dataloading package. The user guide, Chapter 6: Stochastic Training on Large Graphs, explains how the different components work together.

Random walk

random_walk(g, nodes, *[, metapath, length, …])

Generate random walk traces from an array of starting nodes based on the given metapath.
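To make the idea of a padded trace concrete, here is a minimal pure-Python sketch of a uniform random walk. It is not DGL's implementation: the helper name toy_random_walk, the dict-of-lists adjacency, and the -1 padding value are all assumptions of this sketch, and it ignores metapaths and edge types entirely.

```python
import random

def toy_random_walk(adj, start_nodes, length, seed=None):
    """Hypothetical helper: uniform random walk over a dict-of-lists adjacency.

    Each trace has length + 1 entries; walks that hit a node with no
    outgoing edges are padded with -1 (padding value assumed for this sketch).
    """
    rng = random.Random(seed)
    traces = []
    for node in start_nodes:
        trace = [node]
        current = node
        for _ in range(length):
            neighbors = adj.get(current, [])
            if not neighbors:
                break  # dead end: stop early and pad below
            current = rng.choice(neighbors)
            trace.append(current)
        trace += [-1] * (length + 1 - len(trace))
        traces.append(trace)
    return traces

# Toy graph: node 3 has no outgoing edges, so every walk eventually stops there.
adj = {0: [1, 2], 1: [2], 2: [3], 3: []}
traces = toy_random_walk(adj, [0, 3], length=4, seed=0)
```

Every trace has the same length, which is what allows the result to be stored as a rectangular array.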

node2vec_random_walk(g, nodes, p, q, walk_length)

Generate random walk traces from an array of starting nodes based on the node2vec model.

pack_traces(traces, types)

Pack the padded traces returned by random_walk() into a concatenated array.
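The packing step can be sketched in plain Python as follows. This is an illustration of the concept only: the helper name toy_pack_traces and the -1 padding value are assumptions, and the sketch omits the types argument that pack_traces() takes.

```python
def toy_pack_traces(traces, pad=-1):
    """Hypothetical helper: drop padding and concatenate traces.

    Returns the concatenated node IDs, the number of valid nodes in each
    trace, and the offset of each trace in the concatenated list.
    """
    concat, lengths, offsets = [], [], []
    for trace in traces:
        valid = [v for v in trace if v != pad]
        offsets.append(len(concat))
        lengths.append(len(valid))
        concat.extend(valid)
    return concat, lengths, offsets

padded = [[0, 1, 2, 3, -1], [3, -1, -1, -1, -1]]
concat, lengths, offsets = toy_pack_traces(padded)
# concat  == [0, 1, 2, 3, 3]
# lengths == [4, 1]
# offsets == [0, 4]
```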

Neighbor sampling

sample_neighbors(g, nodes, fanout[, …])

Sample neighboring edges of the given nodes and return the induced subgraph.
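As a rough illustration of fanout-based sampling, the sketch below picks at most fanout incoming edges per seed node, uniformly and without replacement. The helper name toy_sample_neighbors and the dict-based graph are assumptions of this sketch; it returns a plain edge list rather than a subgraph and is not DGL's implementation.

```python
import random

def toy_sample_neighbors(in_edges, nodes, fanout, seed=None):
    """Hypothetical helper: sample up to `fanout` in-edges per seed node.

    `in_edges` maps a destination node to the list of its source nodes.
    Returns a list of (src, dst) pairs.
    """
    rng = random.Random(seed)
    sampled = []
    for dst in nodes:
        srcs = in_edges.get(dst, [])
        # Keep every edge when the node has no more than `fanout` neighbors.
        picked = srcs if len(srcs) <= fanout else rng.sample(srcs, fanout)
        sampled.extend((src, dst) for src in picked)
    return sampled

in_edges = {0: [1, 2, 3, 4], 1: [0], 2: []}
sampled = toy_sample_neighbors(in_edges, [0, 1, 2], fanout=2, seed=0)
```

Node 0 contributes two of its four incoming edges, node 1 contributes its single edge, and node 2 contributes none.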

sample_labors(g, nodes, fanout[, edge_dir, …])

Sampler that builds the computational dependency of node representations via labor sampling for multilayer GNNs, from the NeurIPS 2023 paper Layer-Neighbor Sampling – Defusing Neighborhood Explosion in GNNs.

sample_neighbors_biased(g, nodes, fanout, bias)

Sample neighboring edges of the given nodes and return the induced subgraph, where the probability of picking each neighbor is determined by its tag.

select_topk(g, k, weight[, nodes, edge_dir, …])

Select the neighboring edges with k-largest (or k-smallest) weights of the given nodes and return the induced subgraph.
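Top-k selection is deterministic, unlike the sampling operators above. A minimal sketch, assuming a dict that maps each destination node to (source, weight) pairs; the helper name toy_select_topk and its ascending flag are this sketch's own, and the result is an edge list rather than a subgraph.

```python
import heapq

def toy_select_topk(in_edges, nodes, k, ascending=False):
    """Hypothetical helper: keep the k in-edges with the largest weights
    (or the smallest, when ascending=True) for each seed node.

    `in_edges` maps a destination node to a list of (src, weight) pairs.
    Returns a list of (src, dst, weight) triples.
    """
    pick = heapq.nsmallest if ascending else heapq.nlargest
    selected = []
    for dst in nodes:
        candidates = in_edges.get(dst, [])
        for src, weight in pick(k, candidates, key=lambda e: e[1]):
            selected.append((src, dst, weight))
    return selected

in_edges = {0: [(1, 0.5), (2, 0.9), (3, 0.1)]}
largest = toy_select_topk(in_edges, [0], k=2)
smallest = toy_select_topk(in_edges, [0], k=2, ascending=True)
# largest  == [(2, 0, 0.9), (1, 0, 0.5)]
# smallest == [(3, 0, 0.1), (1, 0, 0.5)]
```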

PinSAGESampler(G, ntype, other_type, …[, …])

PinSAGE-like neighbor sampler.

Negative sampling

global_uniform_negative_sampling(g, num_samples)

Performs negative sampling, which generates source-destination pairs such that edges with the given type do not exist.
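One simple way to realize this idea is rejection sampling: draw node pairs uniformly at random and discard any pair that is already an edge. The sketch below assumes a homogeneous graph given as an edge list; the helper name toy_negative_sampling and the max_tries cap are hypothetical, and the sketch does not claim to match how DGL implements the operator. It also makes no attempt to exclude self-loops or duplicate pairs.

```python
import random

def toy_negative_sampling(num_nodes, edges, num_samples, max_tries=1000, seed=None):
    """Hypothetical helper: uniform negative sampling by rejection.

    Draws (src, dst) pairs uniformly and keeps those that are not existing
    edges. May return fewer than `num_samples` pairs if `max_tries` runs out.
    """
    rng = random.Random(seed)
    existing = set(edges)
    negatives = []
    tries = 0
    while len(negatives) < num_samples and tries < max_tries:
        tries += 1
        pair = (rng.randrange(num_nodes), rng.randrange(num_nodes))
        if pair not in existing:
            negatives.append(pair)
    return negatives

edges = [(0, 1), (1, 2), (2, 0)]
negatives = toy_negative_sampling(num_nodes=4, edges=edges, num_samples=5, seed=0)
```

On a dense graph most draws are rejected, which is why the sketch caps the number of attempts instead of looping until it has enough pairs.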