NN Modules (TensorFlow)
Conv Layers
TF NN conv module
GraphConv
class dgl.nn.tensorflow.conv.GraphConv(in_feats, out_feats, norm='both', weight=True, bias=True, activation=None, allow_zero_in_degree=False)
Bases: tensorflow.python.keras.engine.base_layer.Layer
Graph convolution was introduced in GCN and is defined mathematically as follows:
\[h_i^{(l+1)} = \sigma(b^{(l)} + \sum_{j\in\mathcal{N}(i)}\frac{1}{c_{ij}}h_j^{(l)}W^{(l)})\]
where \(\mathcal{N}(i)\) is the set of neighbors of node \(i\), \(c_{ij}\) is the product of the square root of node degrees (i.e., \(c_{ij} = \sqrt{|\mathcal{N}(i)|}\sqrt{|\mathcal{N}(j)|}\)), and \(\sigma\) is an activation function.
- Parameters
in_feats (int) – Input feature size; i.e., the number of dimensions of \(h_j^{(l)}\).
out_feats (int) – Output feature size; i.e., the number of dimensions of \(h_i^{(l+1)}\).
norm (str, optional) – How to apply the normalizer. Can be one of the following values:
right, to divide the aggregated messages by each node’s in-degree, which is equivalent to averaging the received messages.
none, where no normalization is applied.
both (default), where the messages are scaled by \(1/c_{ij}\) above, equivalent to symmetric normalization.
left, to divide the messages sent out from each node by its out-degree, equivalent to random-walk normalization.
weight (bool, optional) – If True, apply a linear layer. Otherwise, aggregate the messages without a weight matrix.
bias (bool, optional) – If True, adds a learnable bias to the output. Default: True.
activation (callable activation function/layer or None, optional) – If not None, applies an activation function to the updated node features. Default: None.
allow_zero_in_degree (bool, optional) – If the graph contains 0-in-degree nodes, the output for those nodes is invalid because no message is passed to them. This can silently degrade model performance in some applications, so the module raises a DGLError if it detects 0-in-degree nodes in the input graph. Setting this to True suppresses the check and leaves the handling of those nodes to the user. Default: False.
weight
The learnable weight tensor.
- Type
tf.Variable
bias
The learnable bias tensor.
- Type
tf.Variable
Note
Zero in-degree nodes lead to invalid output values: no message is passed to those nodes, so the aggregation function is applied to empty input. A common way to avoid this on a homogeneous graph is to add a self-loop to each node:
>>> g = ... # a DGLGraph
>>> g = dgl.add_self_loop(g)
Calling add_self_loop does not work for some graphs, for example heterogeneous graphs, because the edge type of the self-loop edges cannot be decided. In those cases, set allow_zero_in_degree to True to unblock the code and handle zero-in-degree nodes manually. A common practice is to filter out the zero-in-degree nodes when using the output of the conv layer.
Examples
>>> import dgl
>>> import numpy as np
>>> import tensorflow as tf
>>> from dgl.nn import GraphConv
>>> # Case 1: Homogeneous graph
>>> with tf.device("CPU:0"):
...     g = dgl.graph(([0,1,2,3,2,5], [1,2,3,4,0,3]))
...     g = dgl.add_self_loop(g)
...     feat = tf.ones((6, 10))
...     conv = GraphConv(10, 2, norm='both', weight=True, bias=True)
...     res = conv(g, feat)
>>> print(res)
<tf.Tensor: shape=(6, 2), dtype=float32, numpy=
array([[ 0.6208475 , -0.4896223 ],
       [ 0.68356586, -0.5390842 ],
       [ 0.6208475 , -0.4896223 ],
       [ 0.7859846 , -0.61985517],
       [ 0.8251371 , -0.65073216],
       [ 0.48335412, -0.38119012]], dtype=float32)>
>>> # allow_zero_in_degree example
>>> with tf.device("CPU:0"):
...     g = dgl.graph(([0,1,2,3,2,5], [1,2,3,4,0,3]))
...     conv = GraphConv(10, 2, norm='both', weight=True, bias=True, allow_zero_in_degree=True)
...     res = conv(g, feat)
>>> print(res)
<tf.Tensor: shape=(6, 2), dtype=float32, numpy=
array([[ 0.6208475 , -0.4896223 ],
       [ 0.68356586, -0.5390842 ],
       [ 0.6208475 , -0.4896223 ],
       [ 0.7859846 , -0.61985517],
       [ 0.8251371 , -0.65073216],
       [ 0., 0.]], dtype=float32)>
>>> # Case 2: Unidirectional bipartite graph
>>> u = [0, 1, 0, 0, 1]
>>> v = [0, 1, 2, 3, 2]
>>> with tf.device("CPU:0"):
...     g = dgl.heterograph({('A', 'r', 'B'): (u, v)})
...     u_fea = tf.convert_to_tensor(np.random.rand(2, 5))
...     v_fea = tf.convert_to_tensor(np.random.rand(4, 5))
...     conv = GraphConv(5, 2, norm='both', weight=True, bias=True)
...     res = conv(g, (u_fea, v_fea))
>>> res
<tf.Tensor: shape=(4, 2), dtype=float32, numpy=
array([[ 1.3607183, -0.1636453],
       [ 1.6665325, -0.2004239],
       [ 2.1405895, -0.2574358],
       [ 1.3607183, -0.1636453]], dtype=float32)>
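The symmetric normalization used by norm='both' can be checked by hand on the example graph. The following is a minimal pure-Python sketch, not the library's implementation: it leaves out the weight matrix, bias and activation, and it assumes that \(c_{ij}\) is built from the out-degree of the sending node and the in-degree of the receiving node. The helper name graph_conv_both is made up for illustration.

```python
import math

def graph_conv_both(src, dst, feats, num_nodes):
    """Sum neighbor features scaled by 1/c_ij (no weight, bias or activation).

    Assumption: c_ij = sqrt(out_degree(source)) * sqrt(in_degree(destination)).
    """
    out_deg = [0] * num_nodes
    in_deg = [0] * num_nodes
    for u, v in zip(src, dst):
        out_deg[u] += 1
        in_deg[v] += 1
    dim = len(feats[0])
    out = [[0.0] * dim for _ in range(num_nodes)]
    for u, v in zip(src, dst):
        c = math.sqrt(out_deg[u]) * math.sqrt(in_deg[v])
        for d in range(dim):
            out[v][d] += feats[u][d] / c
    return out

# The example graph from this section, with one self-loop added per node.
src = [0, 1, 2, 3, 2, 5] + list(range(6))
dst = [1, 2, 3, 4, 0, 3] + list(range(6))
feats = [[1.0] for _ in range(6)]
h = graph_conv_both(src, dst, feats, 6)
```

Node 5 only receives its own self-loop message, so it ends up with the smallest aggregated value, while nodes 0 and 2 aggregate the same value.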
RelGraphConv
class dgl.nn.tensorflow.conv.RelGraphConv(in_feat, out_feat, num_rels, regularizer='basis', num_bases=None, bias=True, activation=None, self_loop=True, low_mem=False, dropout=0.0, layer_norm=False)
Bases: tensorflow.python.keras.engine.base_layer.Layer
Relational graph convolution layer.
Relational graph convolution is introduced in “Modeling Relational Data with Graph Convolutional Networks” and can be described as below:
\[h_i^{(l+1)} = \sigma(\sum_{r\in\mathcal{R}} \sum_{j\in\mathcal{N}^r(i)}\frac{1}{c_{i,r}}W_r^{(l)}h_j^{(l)}+W_0^{(l)}h_i^{(l)})\]
where \(\mathcal{N}^r(i)\) is the neighbor set of node \(i\) w.r.t. relation \(r\), \(c_{i,r}\) is the normalizer equal to \(|\mathcal{N}^r(i)|\), \(\sigma\) is an activation function, and \(W_0\) is the self-loop weight.
The basis regularization decomposes \(W_r\) by:
\[W_r^{(l)} = \sum_{b=1}^B a_{rb}^{(l)}V_b^{(l)}\]
where \(B\) is the number of bases and the bases \(V_b^{(l)}\) are linearly combined with coefficients \(a_{rb}^{(l)}\).
The block-diagonal-decomposition regularization decomposes \(W_r\) into \(B\) block-diagonal matrices. We refer to \(B\) as the number of bases.
The block regularization decomposes \(W_r\) by:
\[W_r^{(l)} = \oplus_{b=1}^B Q_{rb}^{(l)}\]
where \(B\) is the number of bases and \(Q_{rb}^{(l)}\) are block bases with shape \(\mathbb{R}^{(d^{(l+1)}/B)\times(d^{(l)}/B)}\).
- Parameters
in_feat (int) – Input feature size; i.e., the number of dimensions of \(h_j^{(l)}\).
out_feat (int) – Output feature size; i.e., the number of dimensions of \(h_i^{(l+1)}\).
num_rels (int) – Number of relations.
regularizer (str) – Which weight regularizer to use, “basis” or “bdd”. “basis” is short for basis-decomposition. “bdd” is short for block-diagonal-decomposition.
num_bases (int, optional) – Number of bases. If None, the number of relations is used. Default: None.
bias (bool, optional) – True if bias is added. Default: True.
activation (callable, optional) – Activation function. Default: None.
self_loop (bool, optional) – True to include the self-loop message. Default: True.
low_mem (bool, optional) – True to use a low-memory implementation of the relation message passing function. This option trades speed for memory consumption and slows down the forward and backward passes. Turn it on when you encounter out-of-memory (OOM) problems during training or evaluation. Default: False.
dropout (float, optional) – Dropout rate. Default: 0.0.
layer_norm (bool, optional) – Add layer norm. Default: False.
Examples
>>> import dgl
>>> import numpy as np
>>> import tensorflow as tf
>>> from dgl.nn import RelGraphConv
>>>
>>> with tf.device("CPU:0"):
...     g = dgl.graph(([0,1,2,3,2,5], [1,2,3,4,0,3]))
...     feat = tf.ones((6, 10))
...     conv = RelGraphConv(10, 2, 3, regularizer='basis', num_bases=2)
...     etype = tf.convert_to_tensor(np.array([0,1,2,0,1,2]).astype(np.int64))
...     res = conv(g, feat, etype)
>>> res
<tf.Tensor: shape=(6, 2), dtype=float32, numpy=
array([[-0.02938664, 1.7932655 ],
       [ 0.1146394 , 0.48319 ],
       [-0.02938664, 1.7932655 ],
       [ 1.2054908 , -0.26098895],
       [ 0.1146394 , 0.48319 ],
       [ 0.75915515, 1.1454091 ]], dtype=float32)>
>>> # One-hot input
>>> with tf.device("CPU:0"):
...     one_hot_feat = tf.convert_to_tensor(np.array([0,1,2,3,4,5]).astype(np.int64))
...     res = conv(g, one_hot_feat, etype)
>>> res
<tf.Tensor: shape=(6, 2), dtype=float32, numpy=
array([[-0.24205256, -0.7922753 ],
       [ 0.62085056, 0.4893622 ],
       [-0.9484881 , -0.26546806],
       [-0.2163915 , -0.12585883],
       [-0.14293689, 0.77483284],
       [ 0.091169 , -0.06761569]], dtype=float32)>
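The basis regularization is easier to see with concrete numbers. The sketch below builds every \(W_r\) as a linear combination of \(B\) shared bases, using plain Python lists. The function name basis_weights and the toy values are illustrative assumptions, not part of the DGL API.

```python
def basis_weights(coeffs, bases):
    """Return W_r = sum_b a_rb * V_b for every relation r.

    coeffs: one list of B coefficients per relation (a_rb).
    bases:  B matrices of identical shape (V_b), shared by all relations.
    """
    rows, cols = len(bases[0]), len(bases[0][0])
    weights = []
    for a_r in coeffs:
        w = [[0.0] * cols for _ in range(rows)]
        for a, v in zip(a_r, bases):
            for i in range(rows):
                for j in range(cols):
                    w[i][j] += a * v[i][j]
        weights.append(w)
    return weights

# Two shared 2x2 bases (B = 2) and three relations.
bases = [[[1.0, 0.0], [0.0, 1.0]],
         [[0.0, 1.0], [1.0, 0.0]]]
coeffs = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
W = basis_weights(coeffs, bases)
```

With three relations and two bases, only the two basis matrices and the 3 x 2 coefficients need to be learned, instead of three full weight matrices.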
GATConv
class dgl.nn.tensorflow.conv.GATConv(in_feats, out_feats, num_heads, feat_drop=0.0, attn_drop=0.0, negative_slope=0.2, residual=False, activation=None, allow_zero_in_degree=False)
Bases: tensorflow.python.keras.engine.base_layer.Layer
Apply Graph Attention Network over an input signal.
\[h_i^{(l+1)} = \sum_{j\in \mathcal{N}(i)} \alpha_{ij} W^{(l)} h_j^{(l)}\]
where \(\alpha_{ij}\) is the attention score between node \(i\) and node \(j\):
\[ \begin{align}\begin{aligned}\alpha_{ij}^{l} &= \mathrm{softmax_i} (e_{ij}^{l})\\e_{ij}^{l} &= \mathrm{LeakyReLU}\left(\vec{a}^T [W h_{i} \| W h_{j}]\right)\end{aligned}\end{align} \]
- Parameters
in_feats (int, or pair of ints) – Input feature size; i.e., the number of dimensions of \(h_i^{(l)}\). GATConv can be applied to homogeneous graphs and unidirectional bipartite graphs. If the layer is applied to a unidirectional bipartite graph, in_feats specifies the input feature size on both the source and destination nodes. If a scalar is given, the source and destination node feature sizes take the same value.
out_feats (int) – Output feature size; i.e., the number of dimensions of \(h_i^{(l+1)}\).
num_heads (int) – Number of heads in multi-head attention.
feat_drop (float, optional) – Dropout rate on features. Default: 0.
attn_drop (float, optional) – Dropout rate on attention weights. Default: 0.
negative_slope (float, optional) – LeakyReLU angle of negative slope. Default: 0.2.
residual (bool, optional) – If True, use a residual connection. Default: False.
activation (callable activation function/layer or None, optional) – If not None, applies an activation function to the updated node features. Default: None.
allow_zero_in_degree (bool, optional) – If the graph contains 0-in-degree nodes, the output for those nodes is invalid because no message is passed to them. This can silently degrade model performance in some applications, so the module raises a DGLError if it detects 0-in-degree nodes in the input graph. Setting this to True suppresses the check and leaves the handling of those nodes to the user. Default: False.
Note
Zero in-degree nodes lead to invalid output values: no message is passed to those nodes, so the aggregation function is applied to empty input. A common way to avoid this on a homogeneous graph is to add a self-loop to each node:
>>> g = ... # a DGLGraph
>>> g = dgl.add_self_loop(g)
Calling add_self_loop does not work for some graphs, for example heterogeneous graphs, because the edge type of the self-loop edges cannot be decided. In those cases, set allow_zero_in_degree to True to unblock the code and handle zero-in-degree nodes manually. A common practice is to filter out the zero-in-degree nodes when using the output of the conv layer.
Examples
>>> import dgl
>>> import numpy as np
>>> import tensorflow as tf
>>> from dgl.nn import GATConv
>>>
>>> # Case 1: Homogeneous graph
>>> with tf.device("CPU:0"):
...     g = dgl.graph(([0,1,2,3,2,5], [1,2,3,4,0,3]))
...     g = dgl.add_self_loop(g)
...     feat = tf.ones((6, 10))
...     gatconv = GATConv(10, 2, num_heads=3)
...     res = gatconv(g, feat)
>>> res
<tf.Tensor: shape=(6, 3, 2), dtype=float32, numpy=
array([[[ 0.75311995, -1.8093625 ],
        [-0.12128812, -0.78072834],
        [-0.49870574, -0.15074375]],
       [[ 0.75311995, -1.8093625 ],
        [-0.12128812, -0.78072834],
        [-0.49870574, -0.15074375]],
       [[ 0.75311995, -1.8093625 ],
        [-0.12128812, -0.78072834],
        [-0.49870574, -0.15074375]],
       [[ 0.75311995, -1.8093626 ],
        [-0.12128813, -0.78072834],
        [-0.49870574, -0.15074375]],
       [[ 0.75311995, -1.8093625 ],
        [-0.12128812, -0.78072834],
        [-0.49870574, -0.15074375]],
       [[ 0.75311995, -1.8093625 ],
        [-0.12128812, -0.78072834],
        [-0.49870574, -0.15074375]]], dtype=float32)>
>>> # Case 2: Unidirectional bipartite graph
>>> u = [0, 1, 0, 0, 1]
>>> v = [0, 1, 2, 3, 2]
>>> g = dgl.heterograph({('A', 'r', 'B'): (u, v)})
>>> with tf.device("CPU:0"):
...     u_feat = tf.convert_to_tensor(np.random.rand(2, 5))
...     v_feat = tf.convert_to_tensor(np.random.rand(4, 10))
...     gatconv = GATConv((5,10), 2, 3)
...     res = gatconv(g, (u_feat, v_feat))
>>> res
<tf.Tensor: shape=(4, 3, 2), dtype=float32, numpy=
array([[[-0.89649093, -0.74841046],
        [ 0.5088224 , 0.10908248],
        [ 0.55670375, -0.6811229 ]],
       [[-0.7905004 , -0.1457274 ],
        [ 0.2248168 , 0.93014705],
        [ 0.12816726, -0.4093595 ]],
       [[-0.85875374, -0.53382933],
        [ 0.36841977, 0.51498866],
        [ 0.31893706, -0.5303393 ]],
       [[-0.89649093, -0.74841046],
        [ 0.5088224 , 0.10908248],
        [ 0.55670375, -0.6811229 ]]], dtype=float32)>
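The attention coefficients in the GATConv formula are a softmax, taken over the incoming edges of each node, of LeakyReLU-activated scores. A minimal pure-Python sketch of that step for a single head follows. It assumes the raw scores \(\vec{a}^T [W h_i \| W h_j]\) have already been computed; attention_weights is a hypothetical helper name.

```python
import math

def leaky_relu(x, negative_slope=0.2):
    """LeakyReLU with the same default slope as GATConv's negative_slope."""
    return x if x > 0 else negative_slope * x

def attention_weights(raw_scores_by_dst):
    """Softmax of LeakyReLU(score) over the incoming edges of each node.

    raw_scores_by_dst maps a destination node id to the list of raw
    (pre-activation) scores of its incoming edges.
    """
    alphas = {}
    for i, raw in raw_scores_by_dst.items():
        e = [leaky_relu(s) for s in raw]
        m = max(e)  # subtract the maximum for numerical stability
        exp = [math.exp(x - m) for x in e]
        total = sum(exp)
        alphas[i] = [x / total for x in exp]
    return alphas

alphas = attention_weights({0: [1.0, -5.0], 1: [2.0, 2.0, 2.0]})
```

The weights of each node sum to one, and equal scores give uniform attention. That is consistent with the homogeneous example above, where identical input features produce identical rows.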
SAGEConv
class dgl.nn.tensorflow.conv.SAGEConv(in_feats, out_feats, aggregator_type, feat_drop=0.0, bias=True, norm=None, activation=None)
Bases: tensorflow.python.keras.engine.base_layer.Layer
GraphSAGE layer from paper Inductive Representation Learning on Large Graphs.
\[ \begin{align}\begin{aligned}h_{\mathcal{N}(i)}^{(l+1)} &= \mathrm{aggregate} \left(\{h_{j}^{l}, \forall j \in \mathcal{N}(i) \}\right)\\h_{i}^{(l+1)} &= \sigma \left(W \cdot \mathrm{concat} (h_{i}^{l}, h_{\mathcal{N}(i)}^{(l+1)}) \right)\\h_{i}^{(l+1)} &= \mathrm{norm}(h_{i}^{(l+1)})\end{aligned}\end{align} \]
- Parameters
in_feats (int, or pair of ints) – Input feature size; i.e., the number of dimensions of \(h_i^{(l)}\). SAGEConv can be applied to homogeneous graphs and unidirectional bipartite graphs. If the layer is applied to a unidirectional bipartite graph, in_feats specifies the input feature size on both the source and destination nodes. If a scalar is given, the source and destination node feature sizes take the same value. If the aggregator type is gcn, the feature sizes of source and destination nodes are required to be the same.
out_feats (int) – Output feature size; i.e., the number of dimensions of \(h_i^{(l+1)}\).
feat_drop (float) – Dropout rate on features. Default: 0.
aggregator_type (str) – Aggregator type to use (mean, gcn, pool, lstm).
bias (bool) – If True, adds a learnable bias to the output. Default: True.
norm (callable activation function/layer or None, optional) – If not None, applies normalization to the updated node features.
activation (callable activation function/layer or None, optional) – If not None, applies an activation function to the updated node features. Default: None.
Examples
>>> import dgl
>>> import numpy as np
>>> import tensorflow as tf
>>> from dgl.nn import SAGEConv
>>>
>>> # Case 1: Homogeneous graph
>>> with tf.device("CPU:0"):
...     g = dgl.graph(([0,1,2,3,2,5], [1,2,3,4,0,3]))
...     g = dgl.add_self_loop(g)
...     feat = tf.ones((6, 10))
...     conv = SAGEConv(10, 2, 'pool')
...     res = conv(g, feat)
>>> res
<tf.Tensor: shape=(6, 2), dtype=float32, numpy=
array([[-3.6633523 , -0.90711546],
       [-3.6633523 , -0.90711546],
       [-3.6633523 , -0.90711546],
       [-3.6633523 , -0.90711546],
       [-3.6633523 , -0.90711546],
       [-3.6633523 , -0.90711546]], dtype=float32)>
>>> # Case 2: Unidirectional bipartite graph
>>> with tf.device("CPU:0"):
...     u = [0, 1, 0, 0, 1]
...     v = [0, 1, 2, 3, 2]
...     g = dgl.heterograph({('A', 'r', 'B'): (u, v)})
...     u_fea = tf.convert_to_tensor(np.random.rand(2, 5))
...     v_fea = tf.convert_to_tensor(np.random.rand(4, 10))
...     conv = SAGEConv((5, 10), 2, 'mean')
...     res = conv(g, (u_fea, v_fea))
>>> res
<tf.Tensor: shape=(4, 2), dtype=float32, numpy=
array([[-0.59453356, -0.4055441 ],
       [-0.47459763, -0.717764 ],
       [ 0.3221837 , -0.29876417],
       [-0.63356155, 0.09390211]], dtype=float32)>
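The first two lines of the SAGEConv formula can be traced on the bipartite example. This is a rough pure-Python sketch of the mean aggregator only: it returns concat(h_i, mean of neighbor features) and omits the weight matrix, activation and normalization. The name sage_mean and the toy feature values are assumptions made for illustration.

```python
def sage_mean(src, dst, h_src, h_dst):
    """Return concat(h_i, mean_{j in N(i)} h_j) for every destination node i."""
    dim = len(h_src[0])
    sums = [[0.0] * dim for _ in h_dst]
    counts = [0] * len(h_dst)
    for u, v in zip(src, dst):
        counts[v] += 1
        for d in range(dim):
            sums[v][d] += h_src[u][d]
    out = []
    for v, h_self in enumerate(h_dst):
        # Nodes without incoming edges get a zero neighborhood vector here.
        mean = [s / counts[v] if counts[v] else 0.0 for s in sums[v]]
        out.append(list(h_self) + mean)
    return out

# Same edge lists as the bipartite example, with 1-dimensional toy features.
u = [0, 1, 0, 0, 1]
v = [0, 1, 2, 3, 2]
h_src = [[1.0], [3.0]]
h_dst = [[10.0], [20.0], [30.0], [40.0]]
z = sage_mean(u, v, h_src, h_dst)
```

Destination node 2 receives messages from both source nodes, so its neighborhood part is their average.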
ChebConv
class dgl.nn.tensorflow.conv.ChebConv(in_feats, out_feats, k, activation=<function relu>, bias=True)
Bases: tensorflow.python.keras.engine.base_layer.Layer
Chebyshev Spectral Graph Convolution layer from paper Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering.
\[ \begin{align}\begin{aligned}h_i^{l+1} &= \sum_{k=0}^{K-1} W^{k, l}z_i^{k, l}\\Z^{0, l} &= H^{l}\\Z^{1, l} &= \tilde{L} \cdot H^{l}\\Z^{k, l} &= 2 \cdot \tilde{L} \cdot Z^{k-1, l} - Z^{k-2, l}\\\tilde{L} &= 2\left(I - \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2}\right)/\lambda_{max} - I\end{aligned}\end{align} \]
where \(\tilde{A}\) is \(A\) + \(I\) and \(W\) is a learnable weight.
- Parameters
in_feats (int) – Dimension of input features; i.e., the number of dimensions of \(h_i^{(l)}\).
out_feats (int) – Dimension of output features \(h_i^{(l+1)}\).
k (int) – Chebyshev filter size \(K\).
activation (function, optional) – Activation function. Default: ReLU.
bias (bool, optional) – If True, adds a learnable bias to the output. Default: True.
Example
>>> import dgl
>>> import numpy as np
>>> import tensorflow as tf
>>> from dgl.nn import ChebConv
>>> with tf.device("CPU:0"):
...     g = dgl.graph(([0,1,2,3,2,5], [1,2,3,4,0,3]))
...     feat = tf.ones((6, 10))
...     conv = ChebConv(10, 2, 2)
...     res = conv(g, feat)
...     res
<tf.Tensor: shape=(6, 2), dtype=float32, numpy=
array([[ 0.6163, -0.1809],
       [ 0.6163, -0.1809],
       [ 0.6163, -0.1809],
       [ 0.9698, -1.5053],
       [ 0.3664, 0.7556],
       [-0.2370, 3.0164]], dtype=float32)>
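The recurrence for \(Z^{k, l}\) in the ChebConv formula can be sketched without any graph library. The snippet below assumes the scaled Laplacian \(\tilde{L}\) is already available as a dense list of rows and returns the terms \(Z^{0}, \dots, Z^{K-1}\). The learnable weights \(W^{k, l}\) are left out, and the function names are illustrative.

```python
def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(a[i][m] * b[m][j] for m in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def chebyshev_terms(l_tilde, h, k):
    """Return [Z^0, ..., Z^(k-1)] with Z^0 = H, Z^1 = L~ H and
    Z^t = 2 L~ Z^(t-1) - Z^(t-2) for t >= 2."""
    terms = [h]
    if k > 1:
        terms.append(matmul(l_tilde, h))
    while len(terms) < k:
        lz = matmul(l_tilde, terms[-1])
        prev2 = terms[-2]
        terms.append([[2 * lz[i][j] - prev2[i][j] for j in range(len(lz[0]))]
                      for i in range(len(lz))])
    return terms

# A diagonal toy "Laplacian" makes the result easy to verify by hand:
# each row follows the Chebyshev polynomials T_t(x) at x = 0.5 and x = -1.
l_tilde = [[0.5, 0.0], [0.0, -1.0]]
h = [[1.0], [1.0]]
z = chebyshev_terms(l_tilde, h, 4)
```

For this input the third term matches \(T_2(x) = 2x^2 - 1\) and the fourth matches \(T_3(x) = 4x^3 - 3x\), evaluated at the two diagonal entries.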
SGConv
class dgl.nn.tensorflow.conv.SGConv(in_feats, out_feats, k=1, cached=False, bias=True, norm=None, allow_zero_in_degree=False)
Bases: tensorflow.python.keras.engine.base_layer.Layer
Simplifying Graph Convolution layer from paper Simplifying Graph Convolutional Networks.
\[H^{K} = (\tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2})^K X \Theta\]
where \(\tilde{A}\) is \(A\) + \(I\). Thus the graph input is expected to have self-loop edges added.
- Parameters
in_feats (int) – Number of input features; i.e., the number of dimensions of \(X\).
out_feats (int) – Number of output features; i.e., the number of dimensions of \(H^{K}\).
k (int) – Number of hops \(K\). Default: 1.
cached (bool) – If True, the module caches
\[(\tilde{D}^{-\frac{1}{2}}\tilde{A}\tilde{D}^{-\frac{1}{2}})^K X\Theta\]
at the first forward call. This parameter should only be set to True in a transductive learning setting.
bias (bool) – If True, adds a learnable bias to the output. Default: True.
norm (callable activation function/layer or None, optional) – If not None, applies normalization to the updated node features. Default: None.
allow_zero_in_degree (bool, optional) – If the graph contains 0-in-degree nodes, the output for those nodes is invalid because no message is passed to them. This can silently degrade model performance in some applications, so the module raises a DGLError if it detects 0-in-degree nodes in the input graph. Setting this to True suppresses the check and leaves the handling of those nodes to the user. Default: False.
Note
Zero in-degree nodes lead to invalid output values: no message is passed to those nodes, so the aggregation function is applied to empty input. A common way to avoid this on a homogeneous graph is to add a self-loop to each node:
>>> g = ... # a DGLGraph
>>> g = dgl.add_self_loop(g)
Calling add_self_loop does not work for some graphs, for example heterogeneous graphs, because the edge type of the self-loop edges cannot be decided. In those cases, set allow_zero_in_degree to True to unblock the code and handle zero-in-degree nodes manually. A common practice is to filter out the zero-in-degree nodes when using the output of the conv layer.
Example
>>> import dgl
>>> import numpy as np
>>> import tensorflow as tf
>>> from dgl.nn import SGConv
>>>
>>> with tf.device("CPU:0"):
...     g = dgl.graph(([0,1,2,3,2,5], [1,2,3,4,0,3]))
...     g = dgl.add_self_loop(g)
...     feat = tf.ones((6, 10))
...     conv = SGConv(10, 2, k=2, cached=True)
...     res = conv(g, feat)
>>> res
<tf.Tensor: shape=(6, 2), dtype=float32, numpy=
array([[0.61023676, 0.5246612 ],
       [0.61023676, 0.5246612 ],
       [0.61023676, 0.5246612 ],
       [0.8697353 , 0.7477695 ],
       [0.60570633, 0.520766 ],
       [0.6102368 , 0.52466124]], dtype=float32)>
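To make the SGConv formula concrete, the sketch below builds \(\tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2}\) as a dense matrix and applies it \(K\) times to the features, leaving out the learnable \(\Theta\). It assumes an undirected toy graph whose edge list already contains the self-loops. The helper names are illustrative, and this dense approach is only practical for small graphs.

```python
import math

def normalized_adjacency(src, dst, n):
    """Dense D~^{-1/2} A~ D~^{-1/2}.

    Assumption: the edge list already includes one self-loop per node,
    so every degree is at least 1.
    """
    a = [[0.0] * n for _ in range(n)]
    for u, v in zip(src, dst):
        a[v][u] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def sgc_propagate(s, x, k):
    """Apply the normalized adjacency k times: S^k X (without Theta)."""
    for _ in range(k):
        x = [[sum(s[i][m] * x[m][j] for m in range(len(x)))
              for j in range(len(x[0]))] for i in range(len(s))]
    return x

# Undirected path 0 - 1 - 2, stored in both directions, plus self-loops.
src = [0, 1, 1, 2, 0, 1, 2]
dst = [1, 0, 2, 1, 0, 1, 2]
s = normalized_adjacency(src, dst, 3)
x = [[1.0], [1.0], [1.0]]
h1 = sgc_propagate(s, x, 1)
```

Because the propagation has no learnable parameters, its result can be computed once and reused, which is what the cached option exploits in the transductive setting.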
APPNPConv
class dgl.nn.tensorflow.conv.APPNPConv(k, alpha, edge_drop=0.0)
Bases: tensorflow.python.keras.engine.base_layer.Layer
Approximate Personalized Propagation of Neural Predictions layer from paper Predict then Propagate: Graph Neural Networks meet Personalized PageRank.
\[ \begin{align}\begin{aligned}H^{0} & = X\\H^{t+1} & = (1-\alpha)\left(\hat{D}^{-1/2} \hat{A} \hat{D}^{-1/2} H^{t}\right) + \alpha H^{0}\end{aligned}\end{align} \]
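The APPNP iteration above mixes one step of normalized propagation with a "teleport" back to the input features. A minimal pure-Python sketch follows, assuming the normalized adjacency \(\hat{D}^{-1/2} \hat{A} \hat{D}^{-1/2}\) is given as a dense list of rows and ignoring edge dropout. The function name appnp_propagate is hypothetical.

```python
def appnp_propagate(s, x, k, alpha):
    """Iterate H^{t+1} = (1 - alpha) * S H^t + alpha * H^0 for k steps, H^0 = X."""
    h = x
    for _ in range(k):
        sh = [[sum(s[i][m] * h[m][j] for m in range(len(h)))
               for j in range(len(h[0]))] for i in range(len(s))]
        h = [[(1 - alpha) * sh[i][j] + alpha * x[i][j]
              for j in range(len(x[0]))] for i in range(len(x))]
    return h

# Two connected nodes with self-loops: every entry of S equals 1/2.
s = [[0.5, 0.5], [0.5, 0.5]]
x = [[1.0], [3.0]]
h = appnp_propagate(s, x, 2, alpha=0.1)
```

With alpha = 0.1, each node keeps 10% of its own input feature and takes 90% from the propagated signal, so the two nodes stay distinct instead of collapsing to a common average.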
GINConv
class dgl.nn.tensorflow.conv.GINConv(apply_func, aggregator_type, init_eps=0, learn_eps=False)
Bases: tensorflow.python.keras.engine.base_layer.Layer
Graph Isomorphism Network layer from paper How Powerful are Graph Neural Networks?.
\[h_i^{(l+1)} = f_\Theta \left((1 + \epsilon) h_i^{l} + \mathrm{aggregate}\left(\left\{h_j^{l}, j\in\mathcal{N}(i) \right\}\right)\right)\]
- Parameters
apply_func (callable activation function/layer or None) – If not None, this function is applied to the updated node features; it is the \(f_\Theta\) in the formula.
aggregator_type (str) – Aggregator type to use (sum, max or mean).
init_eps (float, optional) – Initial \(\epsilon\) value. Default: 0.
learn_eps (bool, optional) – If True, \(\epsilon\) will be a learnable parameter. Default: False.
Example
>>> import dgl
>>> import numpy as np
>>> import tensorflow as tf
>>> from dgl.nn import GINConv
>>>
>>> with tf.device("CPU:0"):
...     g = dgl.graph(([0,1,2,3,2,5], [1,2,3,4,0,3]))
...     feat = tf.ones((6, 10))
...     lin = tf.keras.layers.Dense(10)
...     conv = GINConv(lin, 'max')
...     res = conv(g, feat)
>>> res
<tf.Tensor: shape=(6, 10), dtype=float32, numpy=
array([[-0.1090256 , 1.9050574 , -0.30704725, -1.995831 , -0.36399186, 1.10414 , 2.4885745 , -0.35387516, 1.3568261 , 1.7267858 ],
       [-0.1090256 , 1.9050574 , -0.30704725, -1.995831 , -0.36399186, 1.10414 , 2.4885745 , -0.35387516, 1.3568261 , 1.7267858 ],
       [-0.1090256 , 1.9050574 , -0.30704725, -1.995831 , -0.36399186, 1.10414 , 2.4885745 , -0.35387516, 1.3568261 , 1.7267858 ],
       [-0.1090256 , 1.9050574 , -0.30704725, -1.995831 , -0.36399186, 1.10414 , 2.4885745 , -0.35387516, 1.3568261 , 1.7267858 ],
       [-0.1090256 , 1.9050574 , -0.30704725, -1.995831 , -0.36399186, 1.10414 , 2.4885745 , -0.35387516, 1.3568261 , 1.7267858 ],
       [-0.0545128 , 0.9525287 , -0.15352362, -0.9979155 , -0.18199593, 0.55207 , 1.2442873 , -0.17693758, 0.67841303, 0.8633929 ]], dtype=float32)>
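The GIN update can be reproduced in a few lines of plain Python. The sketch below computes \((1 + \epsilon) h_i\) plus the aggregated neighbor features and then optionally applies \(f_\Theta\). It assumes that a node without incoming edges aggregates to a zero vector; gin_update is an illustrative name, not a DGL function.

```python
def gin_update(src, dst, h, aggregator="sum", eps=0.0, apply_func=None):
    """Compute f((1 + eps) * h_i + aggregate({h_j : j in N(i)})) per node."""
    n, dim = len(h), len(h[0])
    inbox = [[] for _ in range(n)]
    for u, v in zip(src, dst):
        inbox[v].append(h[u])
    out = []
    for i in range(n):
        if not inbox[i]:
            # Assumption: an empty neighborhood contributes a zero vector.
            agg = [0.0] * dim
        elif aggregator == "sum":
            agg = [sum(col) for col in zip(*inbox[i])]
        elif aggregator == "max":
            agg = [max(col) for col in zip(*inbox[i])]
        elif aggregator == "mean":
            agg = [sum(col) / len(inbox[i]) for col in zip(*inbox[i])]
        else:
            raise ValueError("unknown aggregator: " + aggregator)
        z = [(1 + eps) * h[i][d] + agg[d] for d in range(dim)]
        out.append(apply_func(z) if apply_func is not None else z)
    return out

# The example graph from this section, with all-ones features.
src = [0, 1, 2, 3, 2, 5]
dst = [1, 2, 3, 4, 0, 3]
h = [[1.0] for _ in range(6)]
out_max = gin_update(src, dst, h, aggregator="max")
out_sum = gin_update(src, dst, h, aggregator="sum")
```

With the max aggregator, node 5 (no incoming edges) ends up with half the pre-transform value of the other nodes, which matches the pattern in the last row of the example output above.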
Global Pooling Layers
TensorFlow modules for graph global pooling.
SumPooling
class dgl.nn.tensorflow.glob.SumPooling
Bases: tensorflow.python.keras.engine.base_layer.Layer
Apply sum pooling over the nodes in the graph.
\[r^{(i)} = \sum_{k=1}^{N_i} x^{(i)}_k\]
call(graph, feat)
Compute sum pooling.
- Parameters
graph (DGLGraph) – The graph.
feat (tf.Tensor) – The input feature with shape \((N, *)\) where \(N\) is the number of nodes in the graph.
- Returns
The output feature with shape \((B, *)\), where \(B\) refers to the batch size.
- Return type
tf.Tensor
AvgPooling
class dgl.nn.tensorflow.glob.AvgPooling
Bases: tensorflow.python.keras.engine.base_layer.Layer
Apply average pooling over the nodes in the graph.
\[r^{(i)} = \frac{1}{N_i}\sum_{k=1}^{N_i} x^{(i)}_k\]
call(graph, feat)
Compute average pooling.
- Parameters
graph (DGLGraph) – The graph.
feat (tf.Tensor) – The input feature with shape \((N, *)\) where \(N\) is the number of nodes in the graph.
- Returns
The output feature with shape \((B, *)\), where \(B\) refers to the batch size.
- Return type
tf.Tensor
MaxPooling
class dgl.nn.tensorflow.glob.MaxPooling
Bases: tensorflow.python.keras.engine.base_layer.Layer
Apply max pooling over the nodes in the graph.
\[r^{(i)} = \max_{k=1}^{N_i}\left( x^{(i)}_k \right)\]
call(graph, feat)
Compute max pooling.
- Parameters
graph (DGLGraph) – The graph.
feat (tf.Tensor) – The input feature with shape \((N, *)\) where \(N\) is the number of nodes in the graph.
- Returns
The output feature with shape \((B, *)\), where \(B\) refers to the batch size.
- Return type
tf.Tensor
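The three readouts above differ only in the reduction applied to each graph's nodes. The following pure-Python sketch shows how an \((N, *)\) feature matrix of a batched graph turns into a \((B, *)\) output. It assumes the nodes of each graph are stored contiguously and that the per-graph node counts are known; the function name readout and its batch_num_nodes argument are illustrative.

```python
def readout(feat, batch_num_nodes, op="sum"):
    """Reduce node features to one row per graph in a batch.

    feat:            list of N feature rows, graphs stored back to back.
    batch_num_nodes: number of nodes N_i of each of the B graphs (N_i >= 1).
    """
    out, start = [], 0
    for n in batch_num_nodes:
        cols = list(zip(*feat[start:start + n]))  # one tuple per feature channel
        if op == "sum":
            out.append([sum(c) for c in cols])
        elif op == "mean":
            out.append([sum(c) / n for c in cols])
        elif op == "max":
            out.append([max(c) for c in cols])
        else:
            raise ValueError("unknown op: " + op)
        start += n
    return out

# A batch of two graphs with 2 and 1 nodes, 2 feature channels each.
feat = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
sizes = [2, 1]
r_sum = readout(feat, sizes, "sum")
r_avg = readout(feat, sizes, "mean")
r_max = readout(feat, sizes, "max")
```

The output always has one row per graph, regardless of how many nodes each graph contains.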
SortPooling
class dgl.nn.tensorflow.glob.SortPooling(k)
Bases: tensorflow.python.keras.engine.base_layer.Layer
Apply Sort Pooling (An End-to-End Deep Learning Architecture for Graph Classification) over the nodes in the graph.
- Parameters
k (int) – The number of nodes to hold for each graph.
call(graph, feat)
Compute sort pooling.
- Parameters
graph (DGLGraph) – The graph.
feat (tf.Tensor) – The input feature with shape \((N, D)\) where \(N\) is the number of nodes in the graph.
- Returns
The output feature with shape \((B, k * D)\), where \(B\) refers to the batch size.
- Return type
tf.Tensor
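The \((B, k * D)\) output shape follows from keeping exactly \(k\) node rows per graph and flattening them. Below is a rough pure-Python sketch of that idea. It assumes nodes are ranked by their last feature channel in descending order and that graphs with fewer than \(k\) nodes are padded with zero rows. These details follow the general sort-pooling idea and may differ from the library's exact behavior.

```python
def sort_pool(feat, batch_num_nodes, k):
    """Keep the top-k nodes of each graph and flatten them to one row.

    Assumptions: ranking by the last feature channel (descending),
    zero-padding when a graph has fewer than k nodes.
    """
    dim = len(feat[0])
    out, start = [], 0
    for n in batch_num_nodes:
        rows = sorted(feat[start:start + n], key=lambda r: r[-1], reverse=True)[:k]
        rows = rows + [[0.0] * dim] * (k - len(rows))
        out.append([x for row in rows for x in row])
        start += n
    return out

# Two graphs with 3 and 1 nodes, D = 2 channels, k = 2 nodes kept.
feat = [[1.0, 5.0], [2.0, 9.0], [3.0, 1.0], [4.0, 4.0]]
pooled = sort_pool(feat, [3, 1], k=2)
```

Every graph yields a row of length k * D = 4, so graphs of different sizes can be fed to a fixed-size downstream layer.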
GlobalAttentionPooling
class dgl.nn.tensorflow.glob.GlobalAttentionPooling(gate_nn, feat_nn=None)
Bases: tensorflow.python.keras.engine.base_layer.Layer
Apply Global Attention Pooling (Gated Graph Sequence Neural Networks) over the nodes in the graph.
\[r^{(i)} = \sum_{k=1}^{N_i}\mathrm{softmax}\left(f_{gate} \left(x^{(i)}_k\right)\right) f_{feat}\left(x^{(i)}_k\right)\]
- Parameters
gate_nn (tf.keras.layers.Layer) – A neural network that computes attention scores for each feature.
feat_nn (tf.keras.layers.Layer, optional) – A neural network applied to each feature before combining them with attention scores.
call(graph, feat)
Compute global attention pooling.
- Parameters
graph (DGLGraph) – The graph.
feat (tf.Tensor) – The input feature with shape \((N, D)\) where \(N\) is the number of nodes in the graph.
- Returns
The output feature with shape \((B, *)\), where \(B\) refers to the batch size.
- Return type
tf.Tensor
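In the global attention pooling formula, the softmax is taken over the nodes of each graph separately. A minimal pure-Python sketch follows, assuming the gate scores \(f_{gate}(x_k)\) and the transformed features \(f_{feat}(x_k)\) have already been computed and that each graph's nodes are stored contiguously. The function and argument names are illustrative.

```python
import math

def global_attention_pool(feat, gate, batch_num_nodes):
    """Softmax-weighted sum of node features, computed per graph.

    feat: N rows holding f_feat(x_k); gate: N scalar scores f_gate(x_k);
    batch_num_nodes: node count N_i of each graph (N_i >= 1).
    """
    out, start = [], 0
    for n in batch_num_nodes:
        scores = gate[start:start + n]
        m = max(scores)  # subtract the maximum for numerical stability
        w = [math.exp(s - m) for s in scores]
        total = sum(w)
        rows = feat[start:start + n]
        out.append([sum(w[k] / total * rows[k][d] for k in range(n))
                    for d in range(len(rows[0]))])
        start += n
    return out

# Two graphs with 2 and 1 nodes; equal gate scores give a plain average.
feat = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
gate = [0.0, 0.0, 7.0]
r = global_attention_pool(feat, gate, [2, 1])
```

A graph with a single node always returns that node's features, since its only attention weight is 1.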
Heterogeneous Graph Convolution Module
HeteroGraphConv
class dgl.nn.tensorflow.HeteroGraphConv(mods, aggregate='sum')
Bases: tensorflow.python.keras.engine.base_layer.Layer
A generic module for computing convolution on heterogeneous graphs.
The heterograph convolution applies each sub-module to its associated relation graph; the sub-module reads features from the source nodes and writes the updated features to the destination nodes. If multiple relations share the same destination node type, their results are aggregated by the specified method. If a relation graph has no edges, the corresponding module is not called.
Pseudo-code:
outputs = {nty : [] for nty in g.dsttypes}
# Apply sub-modules on their associating relation graphs in parallel
for relation in g.canonical_etypes:
    stype, etype, dtype = relation
    dstdata = relation_submodule(g[relation], ...)
    outputs[dtype].append(dstdata)

# Aggregate the results for each destination node type
rsts = {}
for ntype, ntype_outputs in outputs.items():
    if len(ntype_outputs) != 0:
        rsts[ntype] = aggregate(ntype_outputs)
return rsts
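The pseudo-code can be made runnable with plain dictionaries in place of a DGL graph. In this sketch each relation is stored as (source ids, destination ids, number of destination nodes), node features are single floats, and every sub-module simply sums incoming features. All names and the toy graph are illustrative assumptions, not the library's data structures.

```python
def neighbor_sum(src, dst, h_src, num_dst):
    """Toy sub-module: each destination node sums its incoming features."""
    out = [0.0] * num_dst
    for u, v in zip(src, dst):
        out[v] += h_src[u]
    return out

def sum_agg(tensors):
    """Element-wise sum of the per-relation outputs for one node type."""
    return [sum(vals) for vals in zip(*tensors)]

def hetero_conv(relations, inputs, mods, aggregate=sum_agg):
    outputs = {}
    for (stype, etype, dtype), (src, dst, num_dst) in relations.items():
        # Skip relations without edges or without source features.
        if stype not in inputs or len(src) == 0:
            continue
        dstdata = mods[etype](src, dst, inputs[stype], num_dst)
        outputs.setdefault(dtype, []).append(dstdata)
    return {ntype: aggregate(outs) for ntype, outs in outputs.items()}

relations = {
    ("user", "follows", "user"): ([0, 1], [1, 0], 2),
    ("user", "plays", "game"): ([0, 1], [0, 0], 1),
    ("store", "sells", "game"): ([0], [0], 1),
}
mods = {"follows": neighbor_sum, "plays": neighbor_sum, "sells": neighbor_sum}
out_all = hetero_conv(relations, {"user": [1.0, 2.0], "store": [10.0]}, mods)
out_store = hetero_conv(relations, {"store": [10.0]}, mods)
```

Both the 'plays' and 'sells' relations write to 'game', so their outputs are summed. Passing only 'store' features produces results for 'game' alone.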
Examples
Create a heterograph with three types of relations and nodes.
>>> import dgl
>>> g = dgl.heterograph({
...     ('user', 'follows', 'user') : edges1,
...     ('user', 'plays', 'game') : edges2,
...     ('store', 'sells', 'game') : edges3})
Create a HeteroGraphConv that applies different convolution modules to different relations. Note that the modules for 'follows' and 'plays' do not share weights.
>>> import dgl.nn.tensorflow as dglnn
>>> conv = dglnn.HeteroGraphConv({
...     'follows' : dglnn.GraphConv(...),
...     'plays' : dglnn.GraphConv(...),
...     'sells' : dglnn.SAGEConv(...)},
...     aggregate='sum')
Call forward with some 'user' features. This computes new features for both 'user' and 'game' nodes.
>>> import tensorflow as tf
>>> h1 = {'user' : tf.random.normal((g.number_of_nodes('user'), 5))}
>>> h2 = conv(g, h1)
>>> print(h2.keys())
dict_keys(['user', 'game'])
Call forward with both 'user' and 'store' features. Because both the 'plays' and 'sells' relations update the 'game' features, their results are aggregated by the specified method (i.e., summation here).
>>> f1 = {'user' : ..., 'store' : ...}
>>> f2 = conv(g, f1)
>>> print(f2.keys())
dict_keys(['user', 'game'])
Call forward with some 'store' features. This only computes new features for 'game' nodes.
>>> g1 = {'store' : ...}
>>> g2 = conv(g, g1)
>>> print(g2.keys())
dict_keys(['game'])
Calling forward with a pair of inputs is allowed, and each submodule is then also invoked with a pair of inputs.
>>> x_src = {'user' : ..., 'store' : ...}
>>> x_dst = {'user' : ..., 'game' : ...}
>>> y_dst = conv(g, (x_src, x_dst))
>>> print(y_dst.keys())
dict_keys(['user', 'game'])
- Parameters
mods (dict[str, tf.keras.layers.Layer]) – Modules associated with every edge type. The forward function of each module must take a DGLHeteroGraph object as its first argument; its second argument is either a tensor object representing the node features or a pair of tensor objects representing the source and destination node features.
aggregate (str, callable, optional) –
Method for aggregating node features generated by different relations. Allowed string values are ‘sum’, ‘max’, ‘min’, ‘mean’, ‘stack’. The ‘stack’ aggregation is performed along the second dimension, whose order is deterministic. Users can also customize the aggregator by providing a callable instance. For example, aggregation by summation is equivalent to the following:
def my_agg_func(tensors, dsttype):
    # tensors: is a list of tensors to aggregate
    # dsttype: string name of the destination node type for which the
    #          aggregation is performed
    stacked = tf.stack(tensors, axis=0)
    return tf.reduce_sum(stacked, axis=0)
call(g, inputs, mod_args=None, mod_kwargs=None)
Forward computation.
Invoke the forward function of each module and aggregate their results.
- Parameters
g (DGLHeteroGraph) – Graph data.
inputs (dict[str, Tensor] or pair of dict[str, Tensor]) – Input node features.
mod_args (dict[str, tuple[any]], optional) – Extra positional arguments for the sub-modules.
mod_kwargs (dict[str, dict[str, any]], optional) – Extra keyword arguments for the sub-modules.
- Returns
Output representations for every type of node.
- Return type