v0.8.0 is a major release with many new features, system improvements and fixes. Read the blog post for the highlighted features.
Major features
Mini-batch Sampling Pipeline Update
Enabled CUDA UVA-based optimization and feature prefetching for all built-in graph samplers (up to 4x speedup compared to v0.7). Users can now specify the features to prefetch and turn on UVA optimization in dgl.dataloading.Sampler and dgl.dataloading.DataLoader.
g = ...          # some DGLGraph data
train_nids = ... # training node IDs
sampler = dgl.dataloading.MultiLayerNeighborSampler(
    fanouts=[10, 15],
    prefetch_node_feats=['feat'],  # prefetch node feature 'feat'
    prefetch_labels=['label'],     # prefetch node label 'label'
)
dataloader = dgl.dataloading.DataLoader(
    g, train_nids, sampler,
    device='cuda:0',  # perform sampling on GPU 0
    batch_size=1024,
    shuffle=True,
    use_uva=True      # turn on UVA optimization
)
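Iterating over the dataloader then yields mini-batches whose prefetched features are already available on the target device. Below is a minimal training-loop sketch; model, opt and the F alias for torch.nn.functional are placeholders assumed to be defined elsewhere:

for input_nodes, output_nodes, blocks in dataloader:
    x = blocks[0].srcdata['feat']    # prefetched node features, already on the GPU
    y = blocks[-1].dstdata['label']  # prefetched labels, already on the GPU
    y_hat = model(blocks, x)         # `model` is a placeholder GNN defined elsewhere
    loss = F.cross_entropy(y_hat, y)
    opt.zero_grad()
    loss.backward()
    opt.step()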
We have done a major refactor of the sampling components to make it easier to implement new graph samplers. Added a new base class dgl.dataloading.Sampler with one abstract method sample for overriding. Added new APIs dgl.set_src_lazy_features, dgl.set_dst_lazy_features, dgl.set_node_lazy_features and dgl.set_edge_lazy_features for customizing prefetching rules. The code below shows the new user experience.
class NeighborSampler(dgl.dataloading.Sampler):
    def __init__(self,
                 fanouts: list[int],
                 prefetch_node_feats: list[str] = None,
                 prefetch_edge_feats: list[str] = None,
                 prefetch_labels: list[str] = None):
        super().__init__()
        self.fanouts = fanouts
        self.prefetch_node_feats = prefetch_node_feats
        self.prefetch_edge_feats = prefetch_edge_feats
        self.prefetch_labels = prefetch_labels

    def sample(self, g, seed_nodes):
        output_nodes = seed_nodes
        subgs = []
        for fanout in reversed(self.fanouts):
            # Sample a fixed number of neighbors of the current seed nodes.
            sg = g.sample_neighbors(seed_nodes, fanout)
            # Convert this subgraph to a message flow graph.
            sg = dgl.to_block(sg, seed_nodes)
            seed_nodes = sg.srcdata[dgl.NID]
            subgs.insert(0, sg)
        input_nodes = seed_nodes

        # Handle prefetching: node features on the input layer, labels on the
        # output layer, and edge features on every message flow graph.
        dgl.set_src_lazy_features(subgs[0], self.prefetch_node_feats)
        dgl.set_dst_lazy_features(subgs[-1], self.prefetch_labels)
        for subg in subgs:
            dgl.set_edge_lazy_features(subg, self.prefetch_edge_feats)

        return input_nodes, output_nodes, subgs
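The custom sampler can then be plugged into dgl.dataloading.DataLoader exactly like the built-in ones. A minimal sketch, where the graph g, the node IDs train_nids and the feature names are placeholders:

sampler = NeighborSampler([10, 15],
                          prefetch_node_feats=['feat'],
                          prefetch_labels=['label'])
dataloader = dgl.dataloading.DataLoader(
    g, train_nids, sampler, device='cuda:0',
    batch_size=1024, shuffle=True, use_uva=True)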
Related documentation:
- Reworked the user guide chapter for customizing graph samplers.
- Added a new user guide chapter for writing graph samplers with feature prefetching.
We thank Xin Yao (@yaox12) and Dominique LaSalle (@nv-dlasalle) from NVIDIA and David Min (@davidmin7) from UIUC for their contributions.
DGL-Go
DGL-Go is a new command line tool for users to get started with training, using and studying Graph Neural Networks (GNNs). Data scientists can quickly apply GNNs to their problems, whereas researchers will find it useful to customize their experiments.
The initial release includes:
- Four commands: dgl train, dgl recipe, dgl configure and dgl export.
- 3 training pipelines: node prediction using full-graph training, link prediction using full-graph training, and node prediction using neighbor sampling.
- 5 node encoding models: gat, gcn, gin, sage, sgc; 3 edge encoding models: bilinear, dot-product, element-wise.
- 10 datasets, including custom datasets in CSV format.
NN Modules
We have accelerated dgl.nn.RelGraphConv and dgl.nn.HGTConv by up to 36x and 12x compared with the baselines from v0.7 and PyG. Shortened the implementation of dgl.nn.RelGraphConv by about 3x (from 200 lines to 64 lines).
Breaking change: dgl.nn.RelGraphConv no longer accepts a 1-D integer tensor of node IDs during forward. Please switch to torch.nn.Embedding to explicitly represent trainable node embeddings.
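A minimal sketch of the new usage, assuming a homogeneous graph with integer edge-type IDs; the graph and sizes below are invented for illustration:

import torch
import torch.nn as nn
import dgl
from dgl.nn import RelGraphConv

num_nodes, num_rels = 1000, 4
g = dgl.rand_graph(num_nodes, 5000)                     # toy graph for illustration
etypes = torch.randint(0, num_rels, (g.num_edges(),))   # per-edge relation IDs

emb = nn.Embedding(num_nodes, 16)     # explicit trainable node embeddings
conv = RelGraphConv(16, 8, num_rels)

# Previously node IDs could be passed directly; now look up the embeddings first.
h = conv(g, emb(g.nodes()), etypes)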
Below are the new NN modules added in v0.8:
- GATv2Conv: GATv2 from How Attentive are Graph Attention Networks? (see the sketch after this list)
- EGATConv: Graph attention layer that handles edge features, from Rossmann-Toolbox
- EdgePredictor: Predictor/score function for pairs of node representations
- TransE: Similarity measure from Translating Embeddings for Modeling Multi-relational Data
- TransR: Similarity measure from Learning entity and relation embeddings for knowledge graph completion
- HeteroLinear: Apply linear transformations on heterogeneous inputs
- HeteroEmbedding: Create a heterogeneous embedding table
- HGTConv: Heterogeneous graph transformer convolution from Heterogeneous Graph Transformer
- TypedLinear: Linear transformation according to types
- JumpingKnowledge: The Jumping Knowledge aggregation module from Representation Learning on Graphs with Jumping Knowledge Networks
- GNNExplainer: GNNExplainer model from GNNExplainer: Generating Explanations for Graph Neural Networks
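As a quick illustration of one of the new modules, here is a minimal sketch applying GATv2Conv to a random graph; the graph, feature sizes and number of heads are made up for the example:

import torch
import dgl
from dgl.nn import GATv2Conv

# Toy graph and features, purely for illustration.
g = dgl.add_self_loop(dgl.rand_graph(100, 500))  # self-loops avoid zero-in-degree nodes
feat = torch.randn(100, 16)

conv = GATv2Conv(in_feats=16, out_feats=8, num_heads=4)
h = conv(g, feat)  # shape (100, 4, 8): one 8-dim output per attention head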
A new edge_weight argument has been added to several GNN modules to support training on weighted graphs. Added a new user guide chapter 5.5 about how to use edge weights in your GNN model.
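For instance, a minimal sketch of passing per-edge weights to GraphConv; the graph and feature sizes are invented for illustration, and the set of modules accepting edge_weight is listed in the module documentation:

import torch
import dgl
from dgl.nn import GraphConv

g = dgl.add_self_loop(dgl.rand_graph(100, 500))
feat = torch.randn(100, 16)
eweight = torch.rand(g.num_edges())     # one scalar weight per edge

conv = GraphConv(16, 8)
h = conv(g, feat, edge_weight=eweight)  # messages are scaled by eweight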
Graph Dataset and Transforms
Renamed the old dgl.transform package to dgl.transforms to follow PyTorch's namespace convention. All DGL datasets now accept an extra transform keyword argument for data augmentation and transformation:
import dgl
import dgl.transforms as T

t = T.Compose([
    T.AddSelfLoop(),
    T.GCNNorm(),
])

dataset = dgl.data.CoraGraphDataset(transform=t)
g = dataset[0]  # graph and features will be transformed automatically
Added 16 graph data transform modules:
- Compose: Create a transform composed of multiple transforms in sequence.
- AddSelfLoop: Add self-loops for each node in the graph and return a new graph.
- RemoveSelfLoop: Remove self-loops for each node in the graph and return a new graph.
- AddReverse: Add a reverse edge (i,j) for each edge (j,i) in the input graph and return a new graph.
- ToSimple: Convert a graph to a simple graph without parallel edges and return a new graph.
- LineGraph: Return the line graph of the input graph.
- KHopGraph: Return the graph whose edges connect the k-hop neighbors of the original graph.
- AddMetaPaths: Add new edges to an input graph based on given metapaths, as described in Heterogeneous Graph Attention Network.
- GCNNorm: Apply symmetric adjacency normalization to an input graph and save the resulting edge weights, as described in Semi-Supervised Classification with Graph Convolutional Networks.
- PPR: Apply personalized PageRank (PPR) to an input graph for diffusion, as introduced in The PageRank Citation Ranking: Bringing Order to the Web.
- HeatKernel: Apply the heat kernel to an input graph for diffusion, as introduced in Diffusion Kernels on Graphs and Other Discrete Structures.
- GDC: Apply graph diffusion convolution (GDC) to an input graph, as introduced in Diffusion Improves Graph Learning.
- NodeShuffle: Randomly shuffle the nodes.
- DropNode: Randomly drop nodes, as described in Graph Contrastive Learning with Augmentations.
- DropEdge: Randomly drop edges, as described in DropEdge: Towards Deep Graph Convolutional Networks on Node Classification and Graph Contrastive Learning with Augmentations.
- AddEdge: Randomly add edges, as described in Graph Contrastive Learning with Augmentations.
Added several dataset utilities:
- dgl.data.CSVDataset: A new dataset class for loading and parsing graph data stored in CSV format. Added a new user guide chapter about how to prepare CSV data and use this dataset.
- dgl.data.AsNodePredDataset: Repurpose a dataset for a standard semi-supervised transductive node prediction task (see the sketch after this list).
- dgl.data.AsLinkPredDataset: Repurpose a dataset for a link prediction task.
- dgl.data.utils.add_nodepred_split: Split the given dataset into training, validation and test sets for a transductive node prediction task.
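A minimal sketch of the node-prediction wrapper in action; the Cora dataset and the 80/10/10 split ratio are chosen only for illustration:

import dgl

dataset = dgl.data.AsNodePredDataset(
    dgl.data.CoraGraphDataset(), split_ratio=(0.8, 0.1, 0.1))
g = dataset[0]
train_mask = g.ndata['train_mask']  # train/val/test masks generated by the wrapper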
Model Examples
A major rework of two classical examples:
- GraphSAGE (https://github.com/dmlc/dgl/tree/master/examples/pytorch/graphsage): Cleaned up the main folder and kept the training scripts suitable for new users. Moved advanced training methodologies (e.g., unsupervised training, training with PyTorch Lightning) to the advanced subfolder. Renamed experimental to dist.
- RGCN (https://github.com/dmlc/dgl/tree/master/examples/pytorch/rgcn): Cleaned up the main folder and kept the training scripts suitable for new users. Simplified the RGCN model in the example to improve readability.
7 new examples:
- GATv2: https://github.com/dmlc/dgl/tree/master/examples/pytorch/gatv2
- Point Transformer: https://github.com/dmlc/dgl/tree/master/examples/pytorch/pointcloud/point_transformer
- GeniePath: https://github.com/dmlc/dgl/tree/master/examples/pytorch/geniepath
- CARE-GNN: https://github.com/dmlc/dgl/tree/master/examples/pytorch/caregnn
- GAS: https://github.com/dmlc/dgl/tree/59a7d0d1c023528974ef43a2e3c8b99b6dce1894/examples/pytorch/gas
- EvolveGCN: https://github.com/dmlc/dgl/tree/master/examples/pytorch/evolveGCN
- An example that re-creates the OGB leaderboard performance on ogbn-mag: https://github.com/dmlc/dgl/tree/master/examples/pytorch/ogb/ogbn-mag
GNNLens
GNNLens is an interactive visualization tool for graph neural networks (GNNs). It integrates GNN explanation models to analyze and understand graph data. See the repository here: https://github.com/dmlc/gnnlens2
Distributed Training
- Allowed launching a persistent graph server (one that will not exit even after all training workers have finished) to speed up distributed experiments on the same graph data. See the user guide chapter for more details.
- Breaking change: separated the data loaders for single-device and distributed training. Passing a DistGraph to dgl.dataloading.NodeDataLoader will cause an error. Please use dgl.dataloading.DistNodeDataLoader instead (see the sketch after this list).
- Replaced the low-level network communicator with pytorch/tensorpipe.
- dgl.sample_etype_neighbors now works for DistGraph. #3558
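A minimal sketch of the distributed data loader under the new API, assuming dgl.distributed has been set up with an ip_config.txt file and a partitioned graph named 'mygraph' (both names are placeholders for illustration):

import dgl

dgl.distributed.initialize('ip_config.txt')
g = dgl.distributed.DistGraph('mygraph')
train_nids = dgl.distributed.node_split(g.ndata['train_mask'])
sampler = dgl.dataloading.MultiLayerNeighborSampler([10, 15])
dataloader = dgl.dataloading.DistNodeDataLoader(
    g, train_nids, sampler, batch_size=1024, shuffle=True)
for input_nodes, output_nodes, blocks in dataloader:
    ...  # same mini-batch loop as in single-machine training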
Documentation
- DGL User Guide in Korean is now live. Thanks @muhyun for the contribution.
Other API Updates
- dgl.ops.segment_mm: An operator to perform matrix multiplication according to segments.
- dgl.ops.gather_mm: An operator to perform matrix multiplication according to look-up indices.
- dgl.merge: Merge a sequence of graphs into a single graph. @noncomputable #3522
- dgl.dataloading.GlobalUniform: A negative sampler that draws negative samples uniformly from all nodes. #3599
- dgl.DGLGraph.pin_memory_, dgl.DGLGraph.unpin_memory_ and dgl.DGLGraph.is_pinned to pin, unpin and check whether a DGLGraph is in page-locked memory (see the sketch after this list).
- A new CPU kernel for dgl.edge_softmax. @ranzhejiang #3650
- New CUDA kernel implementations that accelerate dgl.node_subgraph, dgl.in_subgraph and dgl.in_edges by several orders of magnitude. @ayasar70, #3745
- dgl.reorder_graph now supports reordering edges according to a user-provided permutation.
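A minimal sketch of the new pinning APIs; the random graph below is only for illustration:

import dgl

g = dgl.rand_graph(1000, 5000)  # a CPU graph
g.pin_memory_()                 # pin the graph structure into page-locked memory
assert g.is_pinned()
# GPU kernels can now access the pinned graph via UVA (e.g., for sampling).
g.unpin_memory_()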
Patch and Bugfixes
- Fixed an off-by-one bug in GenericRandomWalk(). @erickim555, #3500
- Cleaned up the codebase and removed an unused third_party dependency.
- Fixed a device error in the pytorch/MNIST example. @sinhaharsh #3527
- Improved the speed of PinSAGESampler by fusing several operations. @lixiaobai09, #3529
- Enable CUDA PinSAGESampler. @lixiaobai09, #3567
- Fixed GATv2Conv residual for mini-batch. @ksadowski13, #3535
- Fixed the output dimensions of residual connection for GATv2Conv. @schmidt-ju, #3584
- Improved csr2coo.cu:_RepeatKernal() for more robust GPU usage. @ayasar70, #3537
- Fixed the dimension mismatch issue in PinSAGE example. #3539
- Fixed a bug in GINConv when used with pickle. @lizeyan #3540
- Fixed a bug in distributed training where improper data splitting caused training to hang. #3542
- Fixed a bunch of bugs in distributed training. @xcwanAndy #3607
- Fixed a bug in the TGN example. #3543
- Improved building by only rebuilding libxsmm if necessary. #3497
- Improved the documentation of TUDataset regarding graph ordering. @sangyx #3549
- Fixed a bug in distributed SparseAdam optimizer. #3561
- Fixed a bug in TWIRLS module and example. @FFTYYY #3573
- Fixed a bug in ndata and edata where lazy copy is triggered unnecessarily. #3585
- Fixed a bug when using int32 array. @hirayaku, #3597
- Fixed a bug in KNN graph construction on TensorFlow.
- Fixed a bug in to_bidirected where a simple graph is needed. #3630
- Fixed a compilation bug in parallel_for.h. #3631
- Fixed a compilation crash related to libuv-devel. #3640
- Removed the RDFLib info message and the “using backend: xxx” message when importing DGL.
- Dataset dependencies are loaded only when the dataset object is created.
- Fixed a conflicting-ports bug in distributed training. #3658
- Improved the CompGCN example. @nxznm #3663
- Improved the reproducibility of the GIN example. @miziha-zp #3676
- Improved SAGEConv by adding a sanity check on the aggregator type. @thatlittleboy #3691
- Fixed a bug in launching multiple DGL programs in parallel. #3696
- Fixed the documentation of GraphSAGE normalization. @KoyamaSohei, #3711
Breaking Changes & Deprecations
- DGL now requires PyTorch >= 1.9.0.
- Building from source now requires a compiler with C++14 support.
- For multi-GPU training, the new strategy is to use shared memory to speed up inter-process communication. Users may sometimes experience a "not enough shared memory" error. If it happens, please increase the shared memory capacity.