v0.8.0 is a major release with many new features, system improvements and fixes. Read the blog post for the highlighted features.
Major features
Mini-batch Sampling Pipeline Update
Enabled CUDA UVA-based optimization and feature prefetching for all built-in graph samplers (up to 4x speedup compared to v0.7). Users can now specify the features to prefetch and turn on UVA optimization in dgl.dataloading.Sampler and dgl.dataloading.DataLoader.
g = ...          # some DGLGraph data
train_nids = ... # training node IDs
sampler = dgl.dataloading.MultiLayerNeighborSampler(
    fanouts=[10, 15],
    prefetch_node_feats=['feat'],  # prefetch node feature 'feat'
    prefetch_labels=['label'],     # prefetch node label 'label'
)
dataloader = dgl.dataloading.DataLoader(
    g, train_nids, sampler,
    device='cuda:0',  # perform sampling on GPU 0
    batch_size=1024,
    shuffle=True,
    use_uva=True      # turn on UVA optimization
)
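Iterating over the dataloader then yields mini-batches whose prefetched features are already available on the target device. Below is a minimal training-loop sketch; model, opt and the F alias for torch.nn.functional are placeholders assumed to be defined elsewhere:

for input_nodes, output_nodes, blocks in dataloader:
    x = blocks[0].srcdata['feat']    # prefetched node features, already on the GPU
    y = blocks[-1].dstdata['label']  # prefetched labels, already on the GPU
    y_hat = model(blocks, x)         # `model` is a placeholder GNN defined elsewhere
    loss = F.cross_entropy(y_hat, y)
    opt.zero_grad()
    loss.backward()
    opt.step()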
We have done a major refactor of the sampling components to make it easier to implement new graph samplers. Added a new base class dgl.dataloading.Sampler with one abstract method sample for overriding. Added new APIs dgl.set_src_lazy_features, dgl.set_dst_lazy_features, dgl.set_node_lazy_features and dgl.set_edge_lazy_features for customizing prefetching rules. The code below shows the new user experience.
class NeighborSampler(dgl.dataloading.Sampler):
    def __init__(self,
                 fanouts: list[int],
                 prefetch_node_feats: list[str] = None,
                 prefetch_edge_feats: list[str] = None,
                 prefetch_labels: list[str] = None):
        super().__init__()
        self.fanouts = fanouts
        self.prefetch_node_feats = prefetch_node_feats
        self.prefetch_edge_feats = prefetch_edge_feats
        self.prefetch_labels = prefetch_labels

    def sample(self, g, seed_nodes):
        output_nodes = seed_nodes
        subgs = []
        for fanout in reversed(self.fanouts):
            # Sample a fixed number of neighbors of the current seed nodes.
            sg = g.sample_neighbors(seed_nodes, fanout)
            # Convert this subgraph to a message flow graph.
            sg = dgl.to_block(sg, seed_nodes)
            seed_nodes = sg.srcdata[dgl.NID]
            subgs.insert(0, sg)
        input_nodes = seed_nodes

        # Handle prefetching: node features on the input layer, labels on the
        # output layer, and edge features on every message flow graph.
        dgl.set_src_lazy_features(subgs[0], self.prefetch_node_feats)
        dgl.set_dst_lazy_features(subgs[-1], self.prefetch_labels)
        for subg in subgs:
            dgl.set_edge_lazy_features(subg, self.prefetch_edge_feats)

        return input_nodes, output_nodes, subgs
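The custom sampler can then be plugged into dgl.dataloading.DataLoader exactly like the built-in ones. A minimal sketch, where the graph g, the node IDs train_nids and the feature names are placeholders:

sampler = NeighborSampler([10, 15],
                          prefetch_node_feats=['feat'],
                          prefetch_labels=['label'])
dataloader = dgl.dataloading.DataLoader(
    g, train_nids, sampler, device='cuda:0',
    batch_size=1024, shuffle=True, use_uva=True)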
Related documentation:
- Reworked the user guide chapter for customizing graph samplers.
- Added a new user guide chapter for writing graph samplers with feature prefetching.
We thank Xin Yao (@yaox12) and Dominique LaSalle (@nv-dlasalle) from NVIDIA and David Min (@davidmin7) from UIUC for their contributions.
DGL-Go
DGL-Go is a new command line tool for users to get started with training, using and studying Graph Neural Networks (GNNs). Data scientists can quickly apply GNNs to their problems, whereas researchers will find it useful to customize their experiments.
The initial release includes:
- Four commands: dgl train, dgl recipe, dgl configure and dgl export.
- 3 training pipelines: node prediction using full-graph training, link prediction using full-graph training, and node prediction using neighbor sampling.
- 5 node encoding models: gat, gcn, gin, sage, sgc; 3 edge encoding models: bilinear, dot-product, element-wise.
- 10 datasets, including custom datasets in CSV format.
NN Modules
We have accelerated dgl.nn.RelGraphConv and dgl.nn.HGTConv by up to 36x and 12x compared with the baselines from v0.7 and PyG. Shortened the implementation of dgl.nn.RelGraphConv by about 3x (from 200 lines to 64 lines).
Breaking change: dgl.nn.RelGraphConv no longer accepts a 1-D integer tensor of node IDs during forward. Please switch to torch.nn.Embedding to explicitly represent trainable node embeddings.
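A minimal sketch of the new usage, assuming a homogeneous graph with integer edge-type IDs; the graph and sizes below are invented for illustration:

import torch
import torch.nn as nn
import dgl
from dgl.nn import RelGraphConv

num_nodes, num_rels = 1000, 4
g = dgl.rand_graph(num_nodes, 5000)                     # toy graph for illustration
etypes = torch.randint(0, num_rels, (g.num_edges(),))   # per-edge relation IDs

emb = nn.Embedding(num_nodes, 16)     # explicit trainable node embeddings
conv = RelGraphConv(16, 8, num_rels)

# Previously node IDs could be passed directly; now look up the embeddings first.
h = conv(g, emb(g.nodes()), etypes)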
Below are the new NN modules added in v0.8:
- GATv2Conv: GATv2 from How Attentive are Graph Attention Networks? (see the sketch after this list)
- EGATConv: Graph attention layer that handles edge features, from Rossmann-Toolbox
- EdgePredictor: Predictor/score function for pairs of node representations
- TransE: Similarity measure from Translating Embeddings for Modeling Multi-relational Data
- TransR: Similarity measure from Learning entity and relation embeddings for knowledge graph completion
- HeteroLinear: Apply linear transformations on heterogeneous inputs
- HeteroEmbedding: Create a heterogeneous embedding table
- HGTConv: Heterogeneous graph transformer convolution from Heterogeneous Graph Transformer
- TypedLinear: Linear transformation according to types
- JumpingKnowledge: The Jumping Knowledge aggregation module from Representation Learning on Graphs with Jumping Knowledge Networks
- GNNExplainer: GNNExplainer model from GNNExplainer: Generating Explanations for Graph Neural Networks
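As a quick illustration of one of the new modules, here is a minimal sketch applying GATv2Conv to a random graph; the graph, feature sizes and number of heads are made up for the example:

import torch
import dgl
from dgl.nn import GATv2Conv

# Toy graph and features, purely for illustration.
g = dgl.add_self_loop(dgl.rand_graph(100, 500))  # self-loops avoid zero-in-degree nodes
feat = torch.randn(100, 16)

conv = GATv2Conv(in_feats=16, out_feats=8, num_heads=4)
h = conv(g, feat)  # shape (100, 4, 8): one 8-dim output per attention head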
A new edge_weight argument has been added to several GNN modules to support training on weighted graphs. Added a new user guide chapter 5.5 about how to use edge weights in your GNN model.
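For instance, a minimal sketch of passing per-edge weights to GraphConv; the graph and feature sizes are invented for illustration, and the set of modules accepting edge_weight is listed in the module documentation:

import torch
import dgl
from dgl.nn import GraphConv

g = dgl.add_self_loop(dgl.rand_graph(100, 500))
feat = torch.randn(100, 16)
eweight = torch.rand(g.num_edges())     # one scalar weight per edge

conv = GraphConv(16, 8)
h = conv(g, feat, edge_weight=eweight)  # messages are scaled by eweight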
Graph Dataset and Transforms
Renamed the old dgl.transform package to dgl.transforms to follow PyTorch's namespace convention. All DGL datasets now accept an extra transform keyword argument for data augmentation and transformation:
import dgl
import dgl.transforms as T

t = T.Compose([
    T.AddSelfLoop(),
    T.GCNNorm(),
])

dataset = dgl.data.CoraGraphDataset(transform=t)
g = dataset[0]  # graph and features will be transformed automatically
Added 16 graph data transform modules:
- Compose: Create a transform composed of multiple transforms in sequence.
- AddSelfLoop: Add self-loops for each node in the graph and return a new graph.
- RemoveSelfLoop: Remove self-loops for each node in the graph and return a new graph.
- AddReverse: Add a reverse edge (i,j) for each edge (j,i) in the input graph and return a new graph.
- ToSimple: Convert a graph to a simple graph without parallel edges and return a new graph.
- LineGraph: Return the line graph of the input graph.
- KHopGraph: Return the graph whose edges connect the k-hop neighbors of the original graph.
- AddMetaPaths: Add new edges to an input graph based on given metapaths, as described in Heterogeneous Graph Attention Network.
- GCNNorm: Apply symmetric adjacency normalization to an input graph and save the resulting edge weights, as described in Semi-Supervised Classification with Graph Convolutional Networks.
- PPR: Apply personalized PageRank (PPR) to an input graph for diffusion, as introduced in The PageRank Citation Ranking: Bringing Order to the Web.
- HeatKernel: Apply the heat kernel to an input graph for diffusion, as introduced in Diffusion Kernels on Graphs and Other Discrete Structures.
- GDC: Apply graph diffusion convolution (GDC) to an input graph, as introduced in Diffusion Improves Graph Learning.
- NodeShuffle: Randomly shuffle the nodes.
- DropNode: Randomly drop nodes, as described in Graph Contrastive Learning with Augmentations.
- DropEdge: Randomly drop edges, as described in DropEdge: Towards Deep Graph Convolutional Networks on Node Classification and Graph Contrastive Learning with Augmentations.
- AddEdge: Randomly add edges, as described in Graph Contrastive Learning with Augmentations.
Added several dataset utilities:
- dgl.data.CSVDataset: A new dataset class for loading and parsing graph data stored in CSV format. Added a new user guide chapter about how to prepare CSV data and use this dataset.
- dgl.data.AsNodePredDataset: Repurpose a dataset for a standard semi-supervised transductive node prediction task (see the sketch after this list).
- dgl.data.AsLinkPredDataset: Repurpose a dataset for a link prediction task.
- dgl.data.utils.add_nodepred_split: Split the given dataset into training, validation and test sets for a transductive node prediction task.
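A minimal sketch of the node-prediction wrapper in action; the Cora dataset and the 80/10/10 split ratio are chosen only for illustration:

import dgl

dataset = dgl.data.AsNodePredDataset(
    dgl.data.CoraGraphDataset(), split_ratio=(0.8, 0.1, 0.1))
g = dataset[0]
train_mask = g.ndata['train_mask']  # train/val/test masks generated by the wrapper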
Model Examples
A major rework of two classical examples:
- GraphSAGE (https://github.com/dmlc/dgl/tree/master/examples/pytorch/graphsage): Cleaned up the main folder and kept the training scripts suitable for new users. Moved advanced training methodologies (e.g., unsupervised training, training with PyTorch Lightning) to the advanced subfolder. Renamed experimental to dist.
- RGCN (https://github.com/dmlc/dgl/tree/master/examples/pytorch/rgcn): Cleaned up the main folder and kept the training scripts suitable for new users. Simplified the RGCN model in the example to improve readability.
7 new examples:
- GATv2: https://github.com/dmlc/dgl/tree/master/examples/pytorch/gatv2
- Point Transformer: https://github.com/dmlc/dgl/tree/master/examples/pytorch/pointcloud/point_transformer
- GeniePath: https://github.com/dmlc/dgl/tree/master/examples/pytorch/geniepath
- CARE-GNN: https://github.com/dmlc/dgl/tree/master/examples/pytorch/caregnn
- GAS: https://github.com/dmlc/dgl/tree/59a7d0d1c023528974ef43a2e3c8b99b6dce1894/examples/pytorch/gas
- EvolveGCN: https://github.com/dmlc/dgl/tree/master/examples/pytorch/evolveGCN
- An example that re-creates the OGB leaderboard performance on ogbn-mag: https://github.com/dmlc/dgl/tree/master/examples/pytorch/ogb/ogbn-mag
GNNLens
GNNLens is an interactive visualization tool for graph neural networks (GNNs). It integrates GNN explanation models to analyze and understand graph data. See the repository here: https://github.com/dmlc/gnnlens2
Distributed Training
- Allowed launching a persistent graph server (one that will not exit even after all training workers have finished) to speed up distributed experiments on the same graph data. See the user guide chapter for more details.
- Breaking change: separated the data loaders for single-device and distributed training. Passing a DistGraph to dgl.dataloading.NodeDataLoader will cause an error. Please use dgl.dataloading.DistNodeDataLoader instead (see the sketch after this list).
- Replaced the low-level network communicator with pytorch/tensorpipe.
- dgl.sample_etype_neighbors now works for DistGraph. #3558
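A minimal sketch of the distributed data loader under the new API, assuming dgl.distributed has been set up with an ip_config.txt file and a partitioned graph named 'mygraph' (both names are placeholders for illustration):

import dgl

dgl.distributed.initialize('ip_config.txt')
g = dgl.distributed.DistGraph('mygraph')
train_nids = dgl.distributed.node_split(g.ndata['train_mask'])
sampler = dgl.dataloading.MultiLayerNeighborSampler([10, 15])
dataloader = dgl.dataloading.DistNodeDataLoader(
    g, train_nids, sampler, batch_size=1024, shuffle=True)
for input_nodes, output_nodes, blocks in dataloader:
    ...  # same mini-batch loop as in single-machine training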
Documentation
- DGL User Guide in Korean is now live. Thanks @muhyun for the contribution.
Other API Updates
- dgl.ops.segment_mm: An operator to perform matrix multiplication according to segments.
- dgl.ops.gather_mm: An operator to perform matrix multiplication according to look-up indices.
- dgl.merge: Merge a sequence of graphs into a single graph. @noncomputable #3522
- dgl.dataloading.GlobalUniform: A negative sampler that draws negative samples uniformly from all nodes. #3599
- dgl.DGLGraph.pin_memory_, dgl.DGLGraph.unpin_memory_ and dgl.DGLGraph.is_pinned to pin, unpin and check whether a DGLGraph is in page-locked memory (see the sketch after this list).
- A new CPU kernel for dgl.edge_softmax. @ranzhejiang #3650
- New CUDA kernel implementations that accelerate dgl.node_subgraph, dgl.in_subgraph and dgl.in_edges by several orders of magnitude. @ayasar70, #3745
- dgl.reorder_graph now supports reordering edges according to a user-provided permutation.
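A minimal sketch of the new pinning APIs; the random graph below is only for illustration:

import dgl

g = dgl.rand_graph(1000, 5000)  # a CPU graph
g.pin_memory_()                 # pin the graph structure into page-locked memory
assert g.is_pinned()
# GPU kernels can now access the pinned graph via UVA (e.g., for sampling).
g.unpin_memory_()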
Patch and Bugfixes
- Fixed an off-by-one bug in GenericRandomWalk(). @erickim555, #3500
- Cleaned up the codebase and removed an unused third_party dependency.
- Fixed a device error in the pytorch/MNIST example. @sinhaharsh #3527
- Improved the speed of PinSAGESampler by fusing several operations. @lixiaobai09, #3529
- Enable CUDA PinSAGESampler. @lixiaobai09, #3567
- Fixed GATv2Conv residual for mini-batch. @ksadowski13, #3535
- Fixed the output dimensions of residual connection for GATv2Conv. @schmidt-ju, #3584
- Improved csr2coo.cu:_RepeatKernal() for more robust GPU usage. @ayasar70, #3537
- Fixed the dimension mismatch issue in PinSAGE example. #3539
- Fixed a bug in GINConv when used with pickle. @lizeyan #3540
- Fixed a bug in distributed training where improper data splitting caused training to hang. #3542
- Fixed a bunch of bugs in distributed training. @xcwanAndy #3607
- Fixed a bug in the TGN example. #3543
- Improved building by only rebuilding libxsmm if necessary. #3497
- Improved the documentation of TUDataset regarding graph ordering. @sangyx #3549
- Fixed a bug in distributed SparseAdam optimizer. #3561
- Fixed a bug in TWIRLS module and example. @FFTYYY #3573
- Fixed a bug in ndata and edata where lazy copy is triggered unnecessarily. #3585
- Fixed a bug when using int32 array. @hirayaku, #3597
- Fixed a bug in KNN graph construction on TensorFlow.
- Fixed a bug in to_bidirected where a simple graph is needed. #3630
- Fixed a compilation bug in parallel_for.h. #3631
- Fixed a compilation crash related to libuv-devel. #3640
- Removed the RDFLib info message and the “using backend: xxx” message when importing DGL.
- Dataset dependencies are loaded only when the dataset object is created.
- Fixed a conflicting-ports bug in distributed training. #3658
- Improved the CompGCN example. @nxznm #3663
- Improved the reproducibility of the GIN example. @miziha-zp #3676
- Improved SAGEConv by adding a sanity check on the aggregator type. @thatlittleboy #3691
- Fixed a bug in launching multiple DGL programs in parallel. #3696
- Fixed the documentation of GraphSAGE normalization. @KoyamaSohei, #3711
Breaking Changes & Deprecations
- DGL now requires PyTorch >= 1.9.0.
- Building from source now requires a compiler with C++14 support.
- For multi-GPU training, the new strategy is to use shared memory to speed up inter-process communication. Users may sometimes experience a "not enough shared memory" error. If it happens, please increase the shared memory capacity.