dmlc/dgl 0.5.3 on GitHub

This is a patch release, mainly to add CUDA 11.0 support. DGL now supports CUDA 11.0 and PyTorch 1.7 on Linux/Windows/Mac.

Other fixes include:

Performance fix of graph batching: #2363
Speedup on readout: #2361
Speedup in CPU SpMM with sum reducer: #2309 (thanks @pawelpiotrowicz)
Performance optimization that removes redundant copies between CPU and GPU: #2266 #2267 (thanks @nv-dlasalle)
Fix segment_reduce() ignoring trailing zero-length segments (#2228) (thanks @mjwen)
Fix crash due to a missing attribute (#2262) (thanks @Samiisd)
Performance optimization in COO-CSR conversion (#2356) (thanks @IzabelaMazur)
Parallelization in heterogeneous graph format conversion (#2148) (thanks @mozga-intel)
Fix a bug to enable distributed training of RGCN with CPU (#2345) (thanks @mszarma)
Numerous documentation fixes (kudos to @cafeal, @maqy1995, @sw32-seo, @157492196, @chwan-rice, @ZenoTan)
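
To illustrate the segment_reduce() fix listed above: a segment reduction is expected to return one result per entry in the segment-length list, including empty segments at the end. The following is a minimal pure-Python sketch of that expected behavior. The helper name segment_sum is hypothetical and this is not DGL's API or implementation; it only shows the output shape one would expect after the fix.

```python
def segment_sum(seglen, values):
    """Sum consecutive runs of `values`; run i has length seglen[i].

    Hypothetical helper for illustration only (not part of DGL).
    """
    if sum(seglen) != len(values):
        raise ValueError("segment lengths must add up to len(values)")
    out, start = [], 0
    for n in seglen:
        # An empty segment sums to 0.0, so it still yields an output entry.
        out.append(sum(values[start:start + n], 0.0))
        start += n
    return out


# Two trailing segments have length 0; the result still has four entries.
result = segment_sum([2, 1, 0, 0], [1.0, 2.0, 3.0])
print(result)  # [3.0, 3.0, 0.0, 0.0]
```

An implementation that stops once the input values are exhausted would return only two entries here, which is the kind of mismatch the fix addresses.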

New examples:

The Chinese user guide has been released for chapters 1 to 4 (#2351). Thanks to @zhjwy9343 for coordination, and kudos to all the offline contributors!
