What's Changed
- doc warning fix by @crutcher in #4130
- Fix tch bf16 into_data by @laggui in #4142
- Update raspberry-pi-pico example to use the Pico 2, and burnpack by @BjornTheProgrammer in #4132
- Unify all_reduce `LocalCollectiveClient` operation handling. by @crutcher in #4125
- Add direct tensor snapshot retrieval API to ModuleStore by @antimora in #4131
- Fix outer-scope variable references in ONNX subgraphs (If/Loop/Scan) by @antimora in #4119
- Add removed docs for tensor equal_elem by @laggui in #4145
- Add ceil_mode support to pooling operations (MaxPool, AvgPool) by @antimora in #4112 (see the output-size sketch after this list)
- chore: Update cubecl by @wingertge in #4134
- Implement Slice iterator and utility methods. by @crutcher in #4042
- Bump peter-evans/create-pull-request from 7 to 8 by @dependabot[bot] in #4148
- Add slice_dyn, slice_assign_dyn, and slice_fill_dyn variants. by @crutcher in #4127
- Add Reshape scalar optimization and Gather scalar input support by @antimora in #4146
- Shape FromStr/ToString by @crutcher in #4143 (see the parsing sketch after this list)
- Add contiguous reindexing for non-contiguous layer indices by @antimora in #4150
- Add warmup epochs to `MetricEarlyStoppingStrategy`. (#3970) by @crutcher in #4041
- fix(onnx): Use activation function for GELU codegen instead of non-existent tensor method by @antimora in #4161
- Refactor more basic ops by @laggui in #4156
- Refactor `LocalCollectiveServer` for improved clarity and error handling by @crutcher in #4126
- Fix typo in comment for logger_task function by @crutcher in #4159
- Refactor configurable backend tests (no more testgen macros) by @laggui in #4129
- Zero-copy loading for embedded burnpack weights by @antimora in #4154
- Fix candle cuda imports by @laggui in #4171
- Backends no longer depend on `burn-tensor`, but strictly `burn-backend` by @laggui in #4169
- Chore/update cubek cubecl by @nathanielsimard in #4172
- Add ONNX CumSum operator support by @antimora in #4162
- Add backend supports_dtype by @laggui in #4155
- Fix attention shapes and out rank by @laggui in #4192
- Fix matmul & reduce execute fuse no autotune by @laggui in #4193
- Fix output dtype for argmin / argmax by @laggui in #4195
- Add `flatten_dims` method to `Shape` and refactor tensor flattening API by @crutcher in #4189
- Return slice for each dimension in shape by @laggui in #4152
- Make xtask validate run no-std checks first. by @crutcher in #4198
- Fix: CubeCL Reduce by @nathanielsimard in #4197
- Reorganize and tracing::instrument collective operations. by @crutcher in #4157
- Log running values by @Charles23R in #4199
- Remove global ONNX opset version restriction, recommend opset 16 by @antimora in #4168
- Fix dtype preservation when loading tensors in burn-store by @antimora in #4194
- Fix TchTensor::from_data bf16 by @laggui in #4203
- Perf/reduce cpu + Fix OOB by @nathanielsimard in #4204
- feat: Implicit GEMM weight gradients for convolution by @wingertge in #4182
- Fix checkpoint and summary log level by @J-F-Liu in #4201
- fix: handle 1D slope when importing prelu from onnx by @mertalev in #4205
- Zero-copy tensor loading for NdArray backend by @antimora in #4178
- Fix quantized tensor storage data length calculation by @antimora in #4180
- Fix handling scalar scan outputs in ONNX loop nodes by @antimora in #4210
- Perf/improve reduce autotuning + plane non uniform control flow check by @nathanielsimard in #4208
- Add ONNX external data support for models >2GB by @antimora in #4158
- Update/cubek by @louisfd in #4214
- Refactor: Replace `canonicalize_dim` with `expect_dim` by @crutcher in #4196
- fix: handle negative indices in onnx gather op by @mertalev in #4207
- Refactor/cube dim by @nathanielsimard in #4217
- Refactor: Consolidate shape and slice error handling into `ExpressionError` by @crutcher in #4218
- Update: CubeK by @louisfd in #4222
- feat: Accelerated convolution data gradient by @wingertge in #4220
- Fix repeat 0 times by @laggui in #4216
- Burn train api refactor by @Charles23R in #4223
- Chore/pre release 6 by @nathanielsimard in #4224
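
For context on the ceil_mode pooling change (#4112): ceil_mode only changes how the output length is rounded, so a ceil-mode pool can emit one extra, partially covered window. A minimal sketch of the standard output-size formula (kernel without dilation); this is illustrative arithmetic, not Burn's actual implementation, and real frameworks may additionally clamp the last window to start inside the input:

```rust
/// Standard pooling output-size formula. With ceil_mode, the division
/// rounds up, which can add one extra (partially padded) output position.
fn pool_output_size(input: usize, kernel: usize, stride: usize, padding: usize, ceil_mode: bool) -> usize {
    let numerator = input + 2 * padding - kernel;
    if ceil_mode {
        numerator.div_ceil(stride) + 1 // round up
    } else {
        numerator / stride + 1 // round down (the default)
    }
}

fn main() {
    // 7-wide input, kernel 2, stride 2, no padding:
    assert_eq!(pool_output_size(7, 2, 2, 0, false), 3); // floor((7-2)/2) + 1 = 3
    assert_eq!(pool_output_size(7, 2, 2, 0, true), 4);  // ceil((7-2)/2) + 1 = 4
}
```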
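Regarding Shape FromStr/ToString (#4143): the string format is not spelled out here, so the following is a hypothetical, self-contained sketch of the `FromStr`/`Display` trait pattern on a stand-in `Shape` type, assuming a `"[2, 3, 4]"` format. It is not Burn's implementation.

```rust
use std::fmt;
use std::str::FromStr;

/// Stand-in shape type for illustration; not Burn's `Shape`.
#[derive(Debug, PartialEq)]
struct Shape {
    dims: Vec<usize>,
}

impl fmt::Display for Shape {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Render as "[2, 3, 4]" (assumed format); `ToString` comes for free.
        let dims: Vec<String> = self.dims.iter().map(|d| d.to_string()).collect();
        write!(f, "[{}]", dims.join(", "))
    }
}

impl FromStr for Shape {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Accept the same assumed "[2, 3, 4]" format that Display produces.
        let inner = s.trim().trim_start_matches('[').trim_end_matches(']');
        let dims = inner
            .split(',')
            .map(|d| d.trim().parse::<usize>().map_err(|e| e.to_string()))
            .collect::<Result<Vec<_>, _>>()?;
        Ok(Shape { dims })
    }
}

fn main() {
    let shape: Shape = "[2, 3, 4]".parse().expect("valid shape string");
    assert_eq!(shape.dims, vec![2, 3, 4]);
    assert_eq!(shape.to_string(), "[2, 3, 4]");
}
```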