Features
ONNX
- Add more ONNX export support to operators (#19625)
- onnx support more ops (#19653)
- ONNX _contrib_interleaved_matmul_selfatt_valatt and LayerNorm (#19661)
- Improve onnx test suite (#19662)
- Make ONNX export operators work properly with the node input shape (#19676)
- Onnx fix slice_axis and embedding and reshape (#19677)
- Add more onnx export unit tests, refactor onnxruntime tests. (#19689)
- Update onnx export support for FullyConnected and add unit tests (#19679)
- Add coverage to onnx test pipeline. (#19682)
- onnx test coverage for leakyrelu elemwise_add concat activation (#19687)
- ONNX fix softmax (#19691)
- More onnx export updates (#19692)
- onnx fix fullyconnected (#19693)
- ONNX fix embedding and slice (#19695)
- Add more CV models to onnxruntime inference test, add bert model test. (#19697)
- Add more ONNX export operator support (#19727)
- ONNX Support for MXNet repeat op (#19732)
- ONNX Support for MXNet _contrib_BilinearResize2D op (#19733)
- ONNX support adaptiveAveragePooling2D and update Softmax to support temperature (#19736)
- ONNX Support for MXNet reverse op (#19737)
- Add onnx export support for where and greater_scalar operators. (#19745)
- ONNX support for box_decode (#19750)
- ONNX contrib_box_nms (#19755)
- Onnx support for reshape_like (#19759)
- ONNX conversion for topk (#19761)
- _maximum_scalar (#19763)
- Onnx export support for gather_nd (#19767)
- ONNX support for broadcast_mod (#19770)
- Onnx export support for batch_dot (#19775)
- ONNX support for slice_like (#19782)
- ONNX export support for SwapAxis (#19789)
- broadcast_like (#19791)
- ONNX support for Softmax -- optimize for axis=-1 case (#19794)
- Onnx support for upsampling (#19795)
- ONNX export support for multiple input data types (#19796)
- Refactor onnx tests for object classification, add object detection tests (#19802)
- Onnx Reshape support for special cases (#19804)
- Onnx export support for ROIAlign (#19814)
- Add image segmentation end-to-end tests and expand object classification tests (#19815)
- Add onnx operator unit tests for sum, broadcast_mul (#19820)
- Add onnx export function for log2 operator, add operator unit test and update tests to allow comparing NaN values. (#19822)
- ONNX 1.6 compatibility fix + fix for when multiple nodes have the same name (#19823)
- Add ONNX export support for equal_scalar operator (#19824)
- ONNX Export Support for Pooling & Convolution (#19831)
- Add onnx end-to-end tests for pose estimation and action recognition models. (#19834)
- new cases (#19835)
- batchnorm tests (#19836)
- Onnx Support for Dropout (#19837)
- Bump Up CI ONNX Tests Thread Count (#19845)
- onnx export support for slicechannel and box_nms (#19846)
- Move majority of ONNX model tests to nightly, only test a few models in PR pipeline (#19848)
- ONNX export rewrite Take (#19851)
- ONNX export fix slice_axis (#19853)
- ONNX support for argsort (#19854)
- enable 3d convolution (#19855)
- ONNX export rewrite tile (#19868)
- reshape corner cases for mask rcnn (#19875)
- refactor code (#19887)
- Add onnx export operator for minimum_scalar. (#19888)
- ONNX Fixes (#19914)
- Add onnx export support and unit tests for zeros and ones. (#19951)
- Add onnx export support for one_hot and random_uniform_like and unit tests for one_hot. (#19952)
- ONNX support for SequenceReverse (#19954)
- ONNX export support for RNN (#19958)
- ONNX Fixes for some NLP models (#19973)
- ONNX Type inference support (#19990)
- add roberta tests (#19996)
- add ONNX DistilBERT tests (#19999)
- Onnx Dynamic Shapes Support (#20001)
- ONNX Support for pretrained StandardRNN models (#20017)
- Add AWDRNN Pretrained model test (#20018)
- fix squeeze (#20020)
- website update for 1.8.0 (#20021)
- add ernie onnx test (#20030)
- Onnx Support for Transformer (#20048)
- ONNX export support for GRU (#20060)
- ONNX support for gpt models (#20061)
- Rearrange ONNX tests in Nightly CI (#20075)
- ONNX Graduation (#20094)
- fix typo (#20106)
- MXNet export for ONNX 1.8 support (#20113)
- split cv tests (#20117)
- skip one test (#20122)
- fix onnx type inference issue (#20130)
- Add mx2onnx operator support matrix (#20139)
- fix mx2onnx wheel (#20157)
- increase test tolerance (#20161)
- ONNX legacy operator fix and test (#20165)
- Onnx Fix 6 MaskRCNN models (#20178)
- onnx legacy operator unit tests + fixes (#20179)
- add faster_rcnn_fpn models (#20190)
- fix test (#20191)
- Add onnx export operator unit tests. (#20192)
- Add more onnx operator export unit tests (#20194)
- ONNX support rewrite norm (#20195)
- ONNX export support from arg/aux params (#20198)
- bump onnxruntime version (#20199)
- skip cv tests (#20208)
- ONNX fix log_softmax for opset 12 (#20209)
- Add more ONNX model tests (#20210)
- ONNX export support for RNN and sum_axis (#20226)
- Add ONNX model support matrix (#20230)
- ONNX optimize softmax (#20231)
- fix (#20240)
- add example (#20245)
- ONNX add support coverage for Reshape and lstm (#20246)
- ONNX support for _split_v2 (#20250)
- ONNX fix RNN input shape (#20255)
- Update ONNX tutorial and doc (#20253)
- change some shapes from 10d to 8d (#20258)
- ONNX export support broadcast_not_equal (#20259)
- ONNX: fix error handling when op is not registered (#20261)
- ONNX tweak Resize op (#20264)
- ONNX docs and tutorial revision (#20269)
- onnx fix rnn (#20272)
OneDNN
- Implement oneDNN deconvolution primitives to deconvolution 2D (#20107)
- [Feature] Add oneDNN support for interleaved_matmulselfatt* operators (fp32/int8) (#20163)
- Bumped oneDNN version to 1.6.5 (#19449)
- [submodule] Upgrade oneDNN to v2.0 (#19670)
- Impose a plain format for concat’s output when oneDNN would use padding (#19735)
- [submodule] Upgrade to oneDNN v1.7 (#19559)
- Add test case for oneDNN RNN (#19464)
- Fusing gelu post operator in Fully Connected symbol (#19971)
- [submodule] Upgrade oneDNN to v1.6.4 (#19276)
- ElementWiseSum fix for oneDNN (#18777) (#19199)
ARM support
- Add aarch64 support (#20252)
- Revise MKLDNN Builds on Arm and Add a CMake Template for Arm (#20266)
CI-CD improvements
- Fix Nightly CI (#20019)
- correcting cuda 11.2 image name in CI and CD (#19960)
- CI fixes to make more stable and upgradable (#19895)
- Address CI failures with docker timeouts (v2) (#19890)
- Attempt to fix v1.x CI issues. (#19872)
- Update CI build scripts to install python 3.6 from deadsnakes repo (#19788)
- Fix R builds on CI (#19656)
- Update CD Jenkins config for include/mkldnn/oneapi/dnnl (#19725)
- Fix CI builds failing due to invalid GPG keys. (#19377)
- Disable unix-gpu-cu110 pipeline for v1.x build since we now build with cuda 11.0 in windows pipelines. (#19828)
- [BACKPORT]Enable CUDA 11.0 on nightly + CUDA 11.2 on pip (#19295)(#19764) (#19930)
- Fix nightly cd cu102 (#19940)
- Drop cu9x in cd (#19902)
- update cudnn from 7 to 8 for cu102 (#19522)
- update cudnn from 7 to 8 for cu102 (#19506)
- [v1.x] Attempt to fix v1.x cd by installing new cuda compt package (#19959)
- [FEATURE]Migrating all CD pipelines to Ninja build + fix cu112 CD pipeline (#19974)
- Fix nightly CD for python docker image releases (#19774)
- [CD] Fix nightly docker missing lib (#20120)
- [CD] Fix CD cu102 110 112 cuda compatibility (#20116)
- Disable codecov. (#20175)
- Static build for mxnet-cu110 (#19272)
Subgraph API
- Move block.optimize_for backend_opts to kwargs (#19386)
- Backport Enable Numpy support for Gluon Block optimize_for to v1.x (#19456)
- Save/Load Gluon Blocks & HybridBlocks (#19565)
- Fixed setting attributes in reviewSubgraph (#19274)
- Fix for optimize_for multiple subgraph properties issue (#19263) (#20142)
- Reuse params from cached_op_args (#20221)
MXNet-TensorRT
- Simplify TRT build by adding onnx_tensorrt targets in CMake (#19742)
- Add 1:many conversions in nnvm_to_onnx and non-flatten GEMM (#19652)
- TRT test update (#19296)
- Fix TRT INT8 unsupported hardware error handling (#19349)
- Update MXNet-TRT doc with the new optimize_for API (#19385)
- Fix precision vars initialization in TRT (#20277)
Build system
- Fix gcc 10 build (#20216)
- Change gcc 8 PPA to ppa:jonathonf/gcc (#19638)
- Add option to build with shared c runtime on windows (#19409) (#19932)
- Create tool for building source archives (#19972)
- [PIP] update manifest to include lib_api.cc (#19850) (#19912)
- Fix windows dll loading for compute capabilities >7.5 (#19931)
- [PIP] add build target in cmake for osx compat (#19110) (#19926)
Documentation
- update news.md and readme.md for 1.8.0 release (#19976)
- Fix python doc version dropdown (#20189)
- Fix cu100 pip link (#20084)
License
- adding License in libmxnet make config .sym and .ver files (#19937)
- add missing license fix from master to v1.x (#19916)
- Fix license for blockingconcurrentqueue (#19910)
- update notice year (#19893)
- Backport [LICENSE] Reorganize rat-excludes file to ease license auditing (#19743) (#19799)
- Update LICENSE (#19704)
- [LICENSE] Change intgemm to a submodule instead of fetch. (#19407)
Website improvements
- add djl and autogluon to website (#19981) (#19995)
- add website artifacts pipeline (#19397)
- v1.x website patch (#19192)
Bug fixes & misc
- Fix take gradient (#20166)
- Pip Build: use runtime.Features instead of manual check for mkldnn headers (#19195) (#19928)
- Fix AmpCast for float16 (#19749)
- Backport extension bug fixes to v1.x (#19469) (#19503)
- Fix MKLDNN BatchNorm with even number of channels (#19150) #19299 #19425 (#19445)
- Fix to quantization dshape bug (#19501)
- Update ps-lite to fix the zmq not found issue (#20248)
- update short desc for pip (#20237)
- set osx deploy target for v1.x wheels (#20127)
- downloading MNIST dataset from alternate URL (#20014)
- pass version param (#19982)
- Remove unmaintained BLC (#19801)
- Update setup.py for darwin builds (#19130) (#19927)
- Unskip Flaky test_gluon_data tests (#19919)
- WAR the dataloader issue with forked processes holding stale references (#19924)
- For ECR, ensure we sanitize region input from environment variable (#19882)
- Migrate to use ECR as docker cache instead of dockerhub (#19654)
- provide a faster PrefetchedDataLoader (#19748)
- Update dgl_graph.cc (#19827)
- initial commit (#19757)
- [PERFORMANCE] Layer normalization code from Marian for CPU (#19601)
- fixed macros with name (#19669)
- Fix local variable ‘optimizer’ referenced before assignment (#19666)
- Support destructors for custom stateful ops (#19607)
- Remove obsolete six dependency (#19620)
- Backport Faster pointwise fusion graph pass (#19269) (#19413)
- Don’t use namespace for pow() function, since it is built into cuda math library, and cast the second argument so it will find an acceptable form. (#19532)
- backport #19037 (#19514)
- Allow eliminating common subexpressions when temp space is used (#19487)
- Fix h5py version on (#nto)
- Relaxing type requirements for broadcast_like (#17977) (#19447)
- initial disclaimer update (#19402) (#19415)
- backport slice assign large tensor fix (#19399)
- Fix SoftReLU fused operator numerical stability (#17849) (#19391)
- backport #19393 to v1.x (#19396)
- Remove extra --build-arg causing docker command to fail. (#19411)
- backport fixes in master branch (#19356)
- Remove build_ccache_wrappers invocation from R-package unittests (#19305)
- Refactor cmake cpp-package & add missing inference/imagenet_inference (#19228)
- Backport PRs in v1.7.x missing from v1.x to v1.8.x (#19262) (#19281)
- fixing breaking change introduced in #17123 when batch_axis=0 (#19267)
- Backport of #19078 (#19095)
- added key for samskalicky (#19224)
- delete executor before reallocating its memory (#19214)
- Nightly Large Tensor test cherrypicks (#19194)
- Updated v1.x to version 1.9 after branching v1.8.x (#19196)
- Fix flaky test #19197 by avoiding case that 0.45 mapped to 0.5 (#19201)
- Tweaking syntax to be closer to other tests (#19186)
- Add code signing key (#20276)