0.11.0
Major Features
- Apple Core ML model converter
- Support for Keras v1.2.2
- Gluon Interface (experimental)
- Updated LICENSE and NOTICE files.
- For more information see full release notes
API Changes
- Added CachedOp. You can now cache operators that are called frequently with the same set of arguments to reduce overhead.
- Added sample_multinomial for sampling from multinomial distributions.
- Added trunc operator for rounding towards zero.
- Added linalg_gemm, linalg_potrf, ... operators for LAPACK support.
- Added verbose option to Initializer for printing out initialization details.
- Added DeformableConvolution to contrib from the Deformable Convolutional Networks paper.
- Added float64 support for dot and batch_dot operator.
- allow_extra is added to Module.set_params to ignore extra parameters.
- Added mod operator for modulo.
- Added multi_precision option to SGD optimizer to improve training with float16. Resnet50 now achieves the same accuracy when trained with float16 and gives a 50% speedup on Titan XP.
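To illustrate why a multi-precision option can help float16 training, the sketch below compares a weight updated purely in float16 with one that keeps a higher-precision master copy. It shows the general technique only and is not MXNet's implementation. The helper names (to_f16, sgd_fp16_only, sgd_with_master_copy) are hypothetical, and a Python float (float64) stands in for the float32 master weight.

```python
import struct


def to_f16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def sgd_fp16_only(w, grad, lr, steps):
    # The weight is stored in float16 and rounded back after every update.
    w = to_f16(w)
    for _ in range(steps):
        w = to_f16(w - lr * grad)
    return w


def sgd_with_master_copy(w, grad, lr, steps):
    # Assumption: a higher-precision master weight accumulates the updates,
    # and the float16 weight is derived from it afterwards.
    master = float(w)
    for _ in range(steps):
        master -= lr * grad
    return to_f16(master)


naive = sgd_fp16_only(1.0, grad=1.0, lr=1e-4, steps=100)
mixed = sgd_with_master_copy(1.0, grad=1.0, lr=1e-4, steps=100)
print("float16 only:", naive)
print("with master copy:", mixed)
```

In this setup each update of 1e-4 is smaller than half the float16 spacing just below 1.0, so the float16-only weight never moves, while the master copy accumulates all 100 updates before being rounded.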
Performance Improvements
- ImageRecordIter now stores data in pinned memory to improve GPU memcopy speed.
Bugfixes
- Fixed a bug in Adam that caused weight decay to be handled incorrectly. If you are using Adam, you may need to retune the learning rate slightly to get the same performance as previous versions.
- Removed WaitToRead in dist-kvstore, which improves performance by 20-30% for distributed training.
- Cython interface is fixed. make cython and python setup.py install --with-cython should install the cython interface and reduce overhead in applications that use imperative/bucketing.
- Fixed various bugs in Faster-RCNN example: #6486
- Fixed various bugs in SSD example.
- Fixed out argument not working for zeros, ones, full, etc.
- expand_dims now supports backward shape inference.
- Fixed a bug in rnn.BucketingSentenceIter that causes incorrect layout handling on multi-GPU.
- Fixed context mismatch when loading optimizer states.
- Fixed a bug in ReLU activation when using MKL.
- Fixed a few race conditions that cause crashes on shutdown.
- Fixed image-classification example code.
Refactors
- Refactored TShape/TBlob to use int64 dimensions and DLTensor as internal storage, in preparation for migration to DLPack. As a result, TBlob::dev_mask_ and TBlob::stride_ are removed.
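The note does not say why dimensions were widened to int64. One common motivation, assumed here, is that the element count of a large tensor can exceed what a 32-bit integer holds. The sketch below is plain Python arithmetic, not MXNet code, and num_elements is a hypothetical helper that wraps the count modulo 2**bits.

```python
from functools import reduce
from operator import mul


def num_elements(shape, bits):
    """Element count of `shape`, wrapped modulo 2**bits like a fixed-width integer."""
    total = reduce(mul, shape, 1)
    return total & ((1 << bits) - 1)


shape = (1 << 16, 1 << 16)  # 65536 x 65536 elements
count32 = num_elements(shape, 32)
count64 = num_elements(shape, 64)
print("32-bit count:", count32)
print("64-bit count:", count64)
```

With 32 bits the count wraps around to zero, while 64 bits represent it exactly.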
Known Issues
- Inception-V3 model can be converted into CoreML format but is unable to run on Xcode.