Jan 9, 2025
- Add support to train and validate in pure `bfloat16` or `float16`
- `wandb` project name arg added by https://github.com/caojiaolong, use arg.experiment for name
- Fix old issue w/ checkpoint saving not working on filesystem w/o hard-link support (e.g. FUSE fs mounts)
- 1.0.13 release
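The pure half-precision support above boils down to keeping both model weights and inputs in a low-precision dtype end to end. A minimal sketch (a plain `torch.nn.Linear` stands in for a real `timm` model; the actual train/validate scripts expose this through their own command-line arguments):

```python
import torch

# Minimal sketch: run a forward pass entirely in bfloat16.
# A Linear layer stands in for a timm model here; any nn.Module
# can be cast the same way.
model = torch.nn.Linear(8, 4).to(torch.bfloat16)
x = torch.randn(2, 8, dtype=torch.bfloat16)

with torch.no_grad():
    out = model(x)

print(out.dtype)  # torch.bfloat16
```

Unlike AMP-style mixed precision, nothing is kept in float32 here, which halves memory but can be less numerically forgiving, particularly for `float16`.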
Jan 6, 2025
- Add `torch.utils.checkpoint.checkpoint()` wrapper in `timm.models` that defaults `use_reentrant=False`, unless `TIMM_REENTRANT_CKPT=1` is set in env.
Dec 31, 2024
- `convnext_nano` 384x384 ImageNet-12k pretrain & fine-tune. https://huggingface.co/models?search=convnext_nano%20r384
- Add AIM-v2 encoders from https://github.com/apple/ml-aim, see on Hub: https://huggingface.co/models?search=timm%20aimv2
- Add PaliGemma2 encoders from https://github.com/google-research/big_vision to existing PaliGemma, see on Hub: https://huggingface.co/models?search=timm%20pali2
- Add missing L/14 DFN2B 39B CLIP ViT, `vit_large_patch14_clip_224.dfn2b_s39b`
- Fix existing `RmsNorm` layer & fn to match standard formulation, use PT 2.5 impl when possible. Move old impl to `SimpleNorm` layer, it's LN w/o centering or bias. There were only two `timm` models using it, and they have been updated.
- Allow override of `cache_dir` arg for model creation
- Pass through `trust_remote_code` for HF datasets wrapper
- `inception_next_atto` model added by creator
- Adan optimizer caution, and Lamb decoupled weight decay options
- Some feature_info metadata fixed by https://github.com/brianhou0208
- All OpenCLIP and JAX (CLIP, SigLIP, Pali, etc) model weights that used load time remapping were given their own HF Hub instances so that they work with `hf-hub:` based loading, and thus will work with new Transformers `TimmWrapperModel`
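The `RmsNorm` fix above amounts to the following distinction, sketched in plain Python with the learned scale omitted: standard RMSNorm divides by the root mean square of the inputs, while the old implementation (now `SimpleNorm`) divided by the LayerNorm-style standard deviation without centering the inputs, exactly as described in the entry.

```python
import math

def rms_norm(xs, eps=1e-6):
    # Standard RMSNorm formulation: divide by the root mean square,
    # i.e. sqrt(mean(x^2) + eps). No mean is involved anywhere.
    ms = sum(x * x for x in xs) / len(xs)
    return [x / math.sqrt(ms + eps) for x in xs]

def simple_norm(xs, eps=1e-6):
    # The old behavior (LayerNorm w/o centering or bias): divide by the
    # standard deviation around the mean, but do not subtract the mean.
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return [x / math.sqrt(var + eps) for x in xs]
```

The two agree only when the inputs are zero-mean, which is why weights trained against one formulation can misbehave under the other; hence the separate `SimpleNorm` layer rather than a silent change.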
What's Changed
- Punch cache_dir through model factory / builder / pretrain helpers by @rwightman in #2356
- Yuweihao inception next atto merge by @rwightman in #2360
- Dataset trust remote tweaks by @rwightman in #2361
- Add --dataset-trust-remote-code to the train.py and validate.py scripts by @grodino in #2328
- Fix feature_info.reduction by @brianhou0208 in #2369
- Add caution to Adan. Add decouple decay option to LAMB. by @rwightman in #2357
- Switching to timm specific weight instances for open_clip image encoders by @rwightman in #2376
- Fix broken image link in `Quickstart` doc by @ariG23498 in #2381
- Supporting aimv2 encoders by @rwightman in #2379
- fix: minor typos in markdowns by @ruidazeng in #2382
- Add 384x384 in12k pretrain and finetune for convnext_nano by @rwightman in #2384
- Fixed unfused attn2d scale by @laclouis5 in #2387
- Fix MQA V2 by @laclouis5 in #2388
- Wrap torch checkpoint() fn to default use_reentrant flag to False and allow env var override by @rwightman in #2394
- Add half-precision (bfloat16, float16) support to train & validate scripts by @rwightman in #2397
- Merging wandb project name changes w/ addition by @rwightman in #2398
New Contributors
- @brianhou0208 made their first contribution in #2369
- @ariG23498 made their first contribution in #2381
- @ruidazeng made their first contribution in #2382
- @laclouis5 made their first contribution in #2387
Full Changelog: v1.0.12...v1.0.13