torch 1.0.1 on Python PyPI

Note: our conda install commands have changed slightly. Version specifiers such as cuda100 are now written as cudatoolkit=10.0, so conda install pytorch cuda100 -c pytorch becomes conda install pytorch cudatoolkit=10.0 -c pytorch
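
For scripts that still carry old-style specifiers, the rename can be sketched as a small helper. The function name is hypothetical, and the sketch assumes that the last digit of the old specifier is always the minor version (cuda92 means 9.2); only the cuda100 case is stated in the note above.

```python
import re


def new_cuda_specifier(old: str) -> str:
    """Translate an old-style specifier such as 'cuda100' to 'cudatoolkit=10.0'.

    Assumption: the final digit is the minor version, everything before it
    is the major version.
    """
    match = re.fullmatch(r"cuda(\d+)(\d)", old)
    if match is None:
        raise ValueError(f"not an old-style CUDA specifier: {old!r}")
    major, minor = match.groups()
    return f"cudatoolkit={major}.{minor}"


print(new_cuda_specifier("cuda100"))
```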

Breaking Changes

There are no breaking changes in this release.

Bug Fixes

Serious

Higher-order gradients for CPU Convolutions have been fixed (regressed in 1.0.0 under MKL-DNN setting) (#15686)
Correct gradients for non-contiguous weights in CPU Convolutions (#16301)
Fix ReLU on CPU Integer Tensors by fixing vec256 inversions (#15634)
Fix bincount for non-contiguous Tensors (#15109)
Fix torch.norm on CPU for large Tensors (#15602)
Fix eq_ to do equality on GPU (was doing greater-equal due to a typo) (#15475)
Work around a CuDNN bug that gave wrong results in certain strided convolution gradient setups
- blacklist FFT algorithms for strided dgrad (#16626)
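
Several of the fixes above can be spot-checked after upgrading. The following is a minimal sketch, not a reproduction of the original bug reports: the tensor shapes and values are made up for illustration, and the eq_ check only exercises the GPU path when a CUDA device is available.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Higher-order gradients through a CPU convolution: gradgradcheck compares
# analytical second-order gradients against numerical ones.
x = torch.randn(1, 2, 5, 5, dtype=torch.double, requires_grad=True)
w = torch.randn(3, 2, 3, 3, dtype=torch.double, requires_grad=True)
conv_ok = torch.autograd.gradgradcheck(lambda a, b: F.conv2d(a, b), (x, w))

# bincount on a non-contiguous tensor should match its contiguous copy.
base = torch.arange(10) % 3
strided = base[::2]  # a view with stride 2, hence non-contiguous
bincount_ok = torch.equal(
    torch.bincount(strided), torch.bincount(strided.contiguous())
)

# eq_ must test equality, not greater-or-equal.
device = "cuda" if torch.cuda.is_available() else "cpu"
a = torch.tensor([1.0, 2.0, 3.0], device=device)
b = torch.tensor([1.0, 1.0, 4.0], device=device)
eq_result = a.clone().eq_(b).cpu()

print(conv_ok, bincount_ok, eq_result)
```

With the greater-or-equal typo, the second element of eq_result would have been 1 because 2 >= 1; with correct equality it is 0.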

Correctness

Fix cuda native loss_ctc for varying input length (#15798)
- this avoids NaNs in variable length settings
C++ Frontend: Fix serialization (#15033)
- Fixes a bug when (de)serializing a hierarchy of submodules in which one submodule has no parameters of its own but its submodules do
Fix derivative for mvlgamma (#15049)
Fix numerical stability in log_prob for Gumbel distribution (#15878)
multinomial: fix detection and drawing of zero probability events (#16075)
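
The correctness fixes above lend themselves to quick sanity checks. This sketch uses illustrative inputs chosen here, not the cases from the linked pull requests: it checks that multinomial never draws a zero-probability category, that the Gumbel log_prob agrees with its closed form at one point, and that the mvlgamma derivative passes a numerical gradient check.

```python
import torch
from torch.distributions import Gumbel

torch.manual_seed(0)

# Categories 0 and 2 have zero probability and should never be drawn.
probs = torch.tensor([0.0, 0.3, 0.0, 0.7])
samples = torch.multinomial(probs, num_samples=10_000, replacement=True)
drawn = set(samples.tolist())

# For a standard Gumbel (loc=0, scale=1): log p(x) = -(x + exp(-x)),
# so log p(0) = -1.
gumbel_lp = Gumbel(0.0, 1.0).log_prob(torch.tensor(0.0)).item()

# mvlgamma with p=2 requires every input to be greater than (p - 1) / 2 = 0.5.
t = torch.tensor([1.5, 2.5, 3.0], dtype=torch.double, requires_grad=True)
mvlgamma_ok = torch.autograd.gradcheck(lambda v: torch.mvlgamma(v, p=2), (t,))

print(drawn, gumbel_lp, mvlgamma_ok)
```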

Crashes

PyTorch binaries were crashing on AWS Lambda and a few other niche systems, stemming from CPUInfo handling certain warnings as errors. Updated CPUInfo with relevant fixes.
MKL-DNN is now statically built, to avoid conflicts with system versions
Allow ReadyQueue to handle empty tasks (#15791)
- Fixes a segfault with a DataParallel + Checkpoint neural network setting
Avoid integer divide by zero error in index_put_ (#14984)
Fix for model inference crash on Win10 (#15919) (#16092)
Use CUDAGuard when serializing Tensors:
- Before this change, torch.save and torch.load would initialize the CUDA context on GPU 0 if it hadn't been initialized already, even if the serialized tensors are only on GPU 1.
Fix error with handling scalars and __rpow__, for example 1 ** x, where x is a PyTorch scalar (#16687)
Switch to CUDA implementation instead of CuDNN if batch size >= 65536 for affine_grid (#16403)
- CuDNN crashes when batch size >= 65536
[Distributed] TCP init method race condition fix (#15684)
[Distributed] Fix a memory leak in Gloo's CPU backend
[C++ Frontend] Fix LBFGS issue around using inplace ops (#16167)
[Hub] Fix github branch prefix v (#15552)
[Hub] Fix downloads from URLs served without a Content-Length header
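
Two of the entries above touch code paths that are easy to exercise on any machine: a Python number raised to the power of a zero-dimensional tensor, and a torch.save / torch.load round trip. The sketch below is illustrative only. It does not reproduce the GPU 0 context initialization described above (that needs a multi-GPU host), and the use of map_location here is a choice made for this example so that it also runs on CPU-only systems.

```python
import io

import torch

# A Python scalar on the left-hand side dispatches to Tensor.__rpow__.
x = torch.tensor(3.0)  # zero-dimensional tensor
y = 2 ** x

# Round-trip a small state dictionary through an in-memory buffer.
buffer = io.BytesIO()
torch.save({"weights": torch.ones(2, 2)}, buffer)
buffer.seek(0)
restored = torch.load(buffer, map_location="cpu")

print(y.item(), restored["weights"].shape)
```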

Performance

LibTorch binaries now ship with CuDNN enabled. Without this change, many users saw significant performance differences between LibTorch and PyTorch; this should now be fixed. (#14976)
Make btriunpack work for high dimensional batches and faster than before (#15286)
Improve performance of unique with inverse indices (#16145)
Re-enable OpenMP in binaries (got disabled because of a CMake refactor)
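
For readers unfamiliar with the inverse indices mentioned above, here is a minimal sketch of what torch.unique returns when they are requested. The input values are made up for illustration; the example says nothing about the speedup itself.

```python
import torch

x = torch.tensor([3, 1, 3, 2, 1])

# values holds the sorted distinct elements; inverse maps every element of x
# to its position in values.
values, inverse = torch.unique(x, sorted=True, return_inverse=True)

# Indexing values with inverse rebuilds the original tensor.
reconstructed = values[inverse]

print(values.tolist(), inverse.tolist())
```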

Other

Create type hint stub files for module torch (#16089)
- This will restore auto-complete functionality in PyCharm, VSCode etc.
Fix sum_to behavior with zero dimensions (#15796)
Match NumPy by considering NaNs to be larger than any number when sorting (#15886)
Fix various error messages and settings in dynamic weight GRUs / LSTMs (#15766)
C++ Frontend: Make call operator on module holder call forward (#15831)
C++ Frontend: Add the normalize transform to the core library (#15891)
Fix bug in torch::load and unpack torch::optim::detail namespace (#15926)
Implement batched upper triangular and lower triangular operations (#15257)
Add torch.roll to documentation (#14880)
(better errors) Add backend checks for batch norm (#15955)
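
A short sketch of three of the behaviours listed above: NaN ordering in sort, torch.roll, and triangular extraction on a batch of matrices. It assumes that the batched triangular entry refers to torch.triu and torch.tril; the inputs are illustrative.

```python
import torch

# NaN is treated as larger than any number, so it sorts last in ascending order.
values = torch.tensor([2.0, float("nan"), -1.0, 5.0])
sorted_values, order = torch.sort(values)

# roll shifts elements cyclically; a shift of 2 moves the last two to the front.
rolled = torch.roll(torch.arange(5), shifts=2)

# triu / tril applied to a batch operate on each trailing 2-D matrix.
batch = torch.ones(2, 3, 3)
upper = torch.triu(batch)
lower = torch.tril(batch)

print(order.tolist(), rolled.tolist(), upper.sum().item())
```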

JIT

Add better support for bools in the graph fuser (#15057)
Allow tracing with fork/wait (#15184)
Improve script/no script save error (#15321)
Add self to Python printer reserved words (#15318)
Better error when torch.load-ing a JIT model (#15578)
Fix select after chunk op (#15672)
Add script standard library documentation + cleanup (#14912)
