What's Changed
- Update cubek: tile matmul refactor (#4888) @louisfd
- Add ctc_loss backend trait hook + tch and cubecl impls (#4819) @antimora
- Centralize internal burn-* deps in [workspace.dependencies] (#4876) @antimora
- Update cubecl + cubek: fix matmul, reduce WASM and vector size check on strided tensors (#4874) @laggui
- Split Associated Types from Backend into BackendTypes (#4868) @skewballfox
- All reduce backward (#4873) @Charles23R
- Update/cubecl to client (#4866) @Charles23R
- Fix select_assign OOB units (#4870) @laggui
- Add linear op to ModuleOps for fused matmul+bias (#4747) @antimora
- Add burn-std::config runtime configuration with fusion logging and search optimization (#4864) @nathanielsimard
- Fix Typo in One Hot encoding class size error (#4869) @Baseng0815
- Fix fusion reduce broadcasted when multi block local might be a view (#4867) @laggui
- Add STFT/ISTFT and thread n through FFT backend trait (#4835) @antimora
- Fix burn-flex argmax NaN ordering; tighten expand; precise erf (#4859) @antimora (argmax sketch after this list)
- Fix burn-flex sum_dim reading contiguous storage on transposed input (#4861) @antimora
- Fix rustls-webpki audit (#4863) @laggui
- Add det (determinant) tensor operation (#4813) @softmaximalist (determinant sketch after this list)
- Add Blackman window function to signal module (#4842) @softmaximalist (window sketch after this list)
- Display FlexDevice as Cpu (#4857) @antimora
- Update cubecl: refactor toml config, fix autotune priority and fix persistent memory pool reset (#4858) @nathanielsimard
- Migrate default test backend from NdArray to Flex (#4854) @antimora
- Use burn-flex in docs and examples (#4841) @antimora
- Fix burn-flex to_contiguous fast path for prefix views (#4856) @antimora
- Migrate benchmarks from burn-flex to burn-backend-tests (#4853) @antimora
- Fix autotune context, remove unsafe code (#4781) @ArthurBrussee
- Override `float_mean` in cubecl backends (#4840) @laggui
- Device service usage (#4839) @nathanielsimard
- Fusion all reduce + refactor collective (#4803) @Charles23R
- Add missing dispatch overrides and native tch ops for softmax, layer_norm (#4834) @antimora
- Fix `CrossEntropyLoss` with probabilities (#4829) @laggui
- Move tensor tests from burn-flex to burn-backend-tests (#4812) @antimora
- Remove unused M param from SimpleOptimizerMapper. (#4823) @crutcher
- Forward gemm perf features and fix burn-flex SIMD flag cascade (#4826) @antimora
- Add Record<(R0,)> 1-Tuple (#4825) @crutcher
- Cleanup OptimizerAdaptor / GradAdaptor API. (#4822) @crutcher
- Prep for Group Multi Optimizers (#4818) @crutcher
- Fix clippy lints (#4820) @laggui
- Matmul selection (#4773) @nathanielsimard
- Fix conv x-backward padding_out bug (#4806) @antimora
- burn-flex: implement softmax and layer_norm backend op (#4805) @antimora
- Add `FloatInfo` for dtype-aware precision info (#4721) @antimora
- Add softmax and layer_norm backend trait hooks (#4797) @antimora
- Update bitstream-io & rustls-webpki (yanked + audit) (#4801) @laggui
- feat(burn-nn): add native LocalResponseNorm module (#4765) @jcwal1516
- Fix: make module cloning efficient for CPU devices (#4703) @antimora
- burn-flex: enable f16 tests and fix mean overflow, grid_sample and quantization (#4769) @antimora
- Seed CubeCL normal distribution test (#4791) @leohenon
- Drop burn-flex I64 debug_asserts (#4780) @antimora
- fix(vision): propagate backend features to burn-vision (#4753) @jcwal1516
- Optimize and update LU decomposition function (#4738) @softmaximalist
- Fix burn-flex attention rejecting broadcasted mask/bias (#4777) @antimora
- Fix burn-flex bool binary ops to broadcast operands (#4775) @antimora
- Add burn-flex CPU backend (#4761) @antimora
- Fix flaky initializer_normal_init test (#4766) @leohenon
- Fix unsqueeze_dims panic on duplicate sorted axes (#4764) @antimora
- fix(ndarray): grouped conv SIMD clamp + regressions (#4727) @dnvt
- Fix xtask CI renamed feature (#4763) @laggui
- Fix/fusion autotune context (#4759) @nathanielsimard
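A few of the changes above lend themselves to small illustrations. The argmax fix in #4859 concerns how NaN is ordered during the reduction; below is a minimal standalone sketch of one deterministic semantics, where NaN never outranks a real value. This policy is assumed for illustration and is not necessarily the exact rule burn-flex adopted.

```rust
/// Deterministic argmax over f32: NaN never beats a non-NaN value,
/// and ties keep the first index. These semantics are assumed here,
/// not taken from burn-flex.
fn argmax(xs: &[f32]) -> usize {
    assert!(!xs.is_empty(), "argmax of an empty slice is undefined");
    let mut best = 0;
    for i in 1..xs.len() {
        let (cur, cand) = (xs[best], xs[i]);
        // Take the candidate if the current best is NaN, or if the
        // candidate is a real value strictly greater than it.
        if cur.is_nan() || (!cand.is_nan() && cand > cur) {
            best = i;
        }
    }
    best
}
```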
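The det op in #4813 (and the LU work in #4738) rest on the standard identity det(A) = (−1)^swaps · ∏ diag(U) after LU factorization. A self-contained sketch with partial pivoting, using a plain `Vec<Vec<f64>>` in place of a tensor (the name and types are illustrative, not Burn's API):

```rust
/// det(A) via Gaussian elimination with partial pivoting:
/// det(A) = (-1)^swaps * product of U's diagonal.
fn det(mut a: Vec<Vec<f64>>) -> f64 {
    let n = a.len();
    let mut sign = 1.0;
    for k in 0..n {
        // Partial pivoting: bring the largest |entry| in column k to row k.
        let pivot = (k..n)
            .max_by(|&i, &j| a[i][k].abs().total_cmp(&a[j][k].abs()))
            .unwrap();
        if a[pivot][k] == 0.0 {
            return 0.0; // No usable pivot: the matrix is singular.
        }
        if pivot != k {
            a.swap(k, pivot);
            sign = -sign; // Every row swap flips the determinant's sign.
        }
        // Eliminate column k below the pivot.
        for i in (k + 1)..n {
            let factor = a[i][k] / a[k][k];
            for j in k..n {
                a[i][j] -= factor * a[k][j];
            }
        }
    }
    sign * (0..n).map(|i| a[i][i]).product::<f64>()
}
```

For example, `det` of `[[0, 1], [1, 0]]` takes one row swap and evaluates to `-1.0`.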
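The Blackman window added in #4842 follows the classic three-term cosine formula w[n] = 0.42 − 0.5·cos(2πn/(N−1)) + 0.08·cos(4πn/(N−1)). A minimal sketch of just the coefficients; the actual signal module presumably returns a tensor rather than a `Vec`, so this function is illustrative only:

```rust
use std::f64::consts::PI;

/// Symmetric Blackman window of the given length. Endpoints evaluate
/// to 0.0 and the midpoint to 1.0 (0.42 + 0.5 + 0.08).
fn blackman_window(size: usize) -> Vec<f64> {
    if size < 2 {
        return vec![1.0; size]; // Empty or single-sample window.
    }
    let denom = (size - 1) as f64;
    (0..size)
        .map(|n| {
            let x = 2.0 * PI * n as f64 / denom;
            0.42 - 0.5 * x.cos() + 0.08 * (2.0 * x).cos()
        })
        .collect()
}
```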
Full Changelog: v0.21.0-pre.2...v0.21.0-pre.3