Feb 7, 2023
- New inference benchmark numbers added in results folder.
- Add convnext LAION CLIP trained weights and initial set of in1k fine-tunes
  - `convnext_base.clip_laion2b_augreg_ft_in1k` - 86.2% @ 256x256
  - `convnext_base.clip_laiona_augreg_ft_in1k_384` - 86.5% @ 384x384
  - `convnext_large_mlp.clip_laion2b_augreg_ft_in1k` - 87.3% @ 256x256
  - `convnext_large_mlp.clip_laion2b_augreg_ft_in1k_384` - 87.9% @ 384x384
- Add DaViT models. Supports `features_only=True`. Adapted from https://github.com/dingmyu/davit by Fredo.
- Use a common NormMlpClassifierHead across MaxViT, ConvNeXt, DaViT
- Add EfficientFormer-V2 model, update EfficientFormer, and refactor LeViT (closely related architectures). Weights on HF hub.
  - New EfficientFormer-V2 arch, significant refactor from original at (https://github.com/snap-research/EfficientFormer). Supports `features_only=True`.
  - Minor updates to EfficientFormer.
  - Refactor LeViT models to stages, add `features_only=True` support to new `conv` variants, weight remap required.
- Move ImageNet meta-data (synsets, indices) from `/results` to `timm/data/_info`.
- Add ImageNetInfo / DatasetInfo classes to provide labelling for various ImageNet classifier layouts in `timm`
  - Update `inference.py` to use them, try: `python inference.py /folder/to/images --model convnext_small.in12k --label-type detail --topk 5`
- Ready for 0.8.10 pypi pre-release (final testing).
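
The `features_only=True` support noted above for DaViT, EfficientFormer-V2 and the LeViT `conv` variants can be tried with a short sketch like the one below. It assumes `torch` and a `timm` release that includes these models are installed; the helper name `feature_summary`, the choice of `davit_tiny`, and the 224x224 input size are illustrative assumptions, not part of these release notes.

```python
def feature_summary(model_name="davit_tiny", img_size=224):
    """Return (channels, reduction, output_shape) for each feature level.

    Hypothetical helper for illustration; model name and input size are
    assumptions and can be swapped for any model supporting features_only.
    """
    # Imported lazily so this file can be loaded without torch/timm present.
    import torch
    import timm

    # pretrained=False avoids a weight download; features_only=True makes
    # forward() return a list of feature maps instead of classifier logits.
    model = timm.create_model(model_name, pretrained=False, features_only=True)
    model.eval()

    with torch.no_grad():
        feats = model(torch.randn(1, 3, img_size, img_size))

    # feature_info describes each returned level (channel count and stride).
    info = model.feature_info
    return [
        (channels, reduction, tuple(f.shape))
        for channels, reduction, f in zip(info.channels(), info.reduction(), feats)
    ]
```

Calling `feature_summary()` and printing each tuple shows how many channels each level has and how far it is downsampled, which is the information typically needed when attaching a detection or segmentation head.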
Jan 20, 2023
- Add two convnext 12k -> 1k fine-tunes at 384x384
  - `convnext_tiny.in12k_ft_in1k_384` - 85.1 @ 384
  - `convnext_small.in12k_ft_in1k_384` - 86.2 @ 384
- Push all MaxxViT weights to HF hub, and add new ImageNet-12k -> 1k fine-tunes for `rw` base MaxViT and CoAtNet 1/2 models
model | top1 | top5 | samples / sec | Params (M) | GMAC | Act (M) |
---|---|---|---|---|---|---|
maxvit_xlarge_tf_512.in21k_ft_in1k | 88.53 | 98.64 | 21.76 | 475.77 | 534.14 | 1413.22 |
maxvit_xlarge_tf_384.in21k_ft_in1k | 88.32 | 98.54 | 42.53 | 475.32 | 292.78 | 668.76 |
maxvit_base_tf_512.in21k_ft_in1k | 88.20 | 98.53 | 50.87 | 119.88 | 138.02 | 703.99 |
maxvit_large_tf_512.in21k_ft_in1k | 88.04 | 98.40 | 36.42 | 212.33 | 244.75 | 942.15 |
maxvit_large_tf_384.in21k_ft_in1k | 87.98 | 98.56 | 71.75 | 212.03 | 132.55 | 445.84 |
maxvit_base_tf_384.in21k_ft_in1k | 87.92 | 98.54 | 104.71 | 119.65 | 73.80 | 332.90 |
maxvit_rmlp_base_rw_384.sw_in12k_ft_in1k | 87.81 | 98.37 | 106.55 | 116.14 | 70.97 | 318.95 |
maxxvitv2_rmlp_base_rw_384.sw_in12k_ft_in1k | 87.47 | 98.37 | 149.49 | 116.09 | 72.98 | 213.74 |
coatnet_rmlp_2_rw_384.sw_in12k_ft_in1k | 87.39 | 98.31 | 160.80 | 73.88 | 47.69 | 209.43 |
maxvit_rmlp_base_rw_224.sw_in12k_ft_in1k | 86.89 | 98.02 | 375.86 | 116.14 | 23.15 | 92.64 |
maxxvitv2_rmlp_base_rw_224.sw_in12k_ft_in1k | 86.64 | 98.02 | 501.03 | 116.09 | 24.20 | 62.77 |
maxvit_base_tf_512.in1k | 86.60 | 97.92 | 50.75 | 119.88 | 138.02 | 703.99 |
coatnet_2_rw_224.sw_in12k_ft_in1k | 86.57 | 97.89 | 631.88 | 73.87 | 15.09 | 49.22 |
maxvit_large_tf_512.in1k | 86.52 | 97.88 | 36.04 | 212.33 | 244.75 | 942.15 |
coatnet_rmlp_2_rw_224.sw_in12k_ft_in1k | 86.49 | 97.90 | 620.58 | 73.88 | 15.18 | 54.78 |
maxvit_base_tf_384.in1k | 86.29 | 97.80 | 101.09 | 119.65 | 73.80 | 332.90 |
maxvit_large_tf_384.in1k | 86.23 | 97.69 | 70.56 | 212.03 | 132.55 | 445.84 |
maxvit_small_tf_512.in1k | 86.10 | 97.76 | 88.63 | 69.13 | 67.26 | 383.77 |
maxvit_tiny_tf_512.in1k | 85.67 | 97.58 | 144.25 | 31.05 | 33.49 | 257.59 |
maxvit_small_tf_384.in1k | 85.54 | 97.46 | 188.35 | 69.02 | 35.87 | 183.65 |
maxvit_tiny_tf_384.in1k | 85.11 | 97.38 | 293.46 | 30.98 | 17.53 | 123.42 |
maxvit_large_tf_224.in1k | 84.93 | 96.97 | 247.71 | 211.79 | 43.68 | 127.35 |
coatnet_rmlp_1_rw2_224.sw_in12k_ft_in1k | 84.90 | 96.96 | 1025.45 | 41.72 | 8.11 | 40.13 |
maxvit_base_tf_224.in1k | 84.85 | 96.99 | 358.25 | 119.47 | 24.04 | 95.01 |
maxxvit_rmlp_small_rw_256.sw_in1k | 84.63 | 97.06 | 575.53 | 66.01 | 14.67 | 58.38 |
coatnet_rmlp_2_rw_224.sw_in1k | 84.61 | 96.74 | 625.81 | 73.88 | 15.18 | 54.78 |
maxvit_rmlp_small_rw_224.sw_in1k | 84.49 | 96.76 | 693.82 | 64.90 | 10.75 | 49.30 |
maxvit_small_tf_224.in1k | 84.43 | 96.83 | 647.96 | 68.93 | 11.66 | 53.17 |
maxvit_rmlp_tiny_rw_256.sw_in1k | 84.23 | 96.78 | 807.21 | 29.15 | 6.77 | 46.92 |
coatnet_1_rw_224.sw_in1k | 83.62 | 96.38 | 989.59 | 41.72 | 8.04 | 34.60 |
maxvit_tiny_rw_224.sw_in1k | 83.50 | 96.50 | 1100.53 | 29.06 | 5.11 | 33.11 |
maxvit_tiny_tf_224.in1k | 83.41 | 96.59 | 1004.94 | 30.92 | 5.60 | 35.78 |
coatnet_rmlp_1_rw_224.sw_in1k | 83.36 | 96.45 | 1093.03 | 41.69 | 7.85 | 35.47 |
maxxvitv2_nano_rw_256.sw_in1k | 83.11 | 96.33 | 1276.88 | 23.70 | 6.26 | 23.05 |
maxxvit_rmlp_nano_rw_256.sw_in1k | 83.03 | 96.34 | 1341.24 | 16.78 | 4.37 | 26.05 |
maxvit_rmlp_nano_rw_256.sw_in1k | 82.96 | 96.26 | 1283.24 | 15.50 | 4.47 | 31.92 |
maxvit_nano_rw_256.sw_in1k | 82.93 | 96.23 | 1218.17 | 15.45 | 4.46 | 30.28 |
coatnet_bn_0_rw_224.sw_in1k | 82.39 | 96.19 | 1600.14 | 27.44 | 4.67 | 22.04 |
coatnet_0_rw_224.sw_in1k | 82.39 | 95.84 | 1831.21 | 27.44 | 4.43 | 18.73 |
coatnet_rmlp_nano_rw_224.sw_in1k | 82.05 | 95.87 | 2109.09 | 15.15 | 2.62 | 20.34 |
coatnext_nano_rw_224.sw_in1k | 81.95 | 95.92 | 2525.52 | 14.70 | 2.47 | 12.80 |
coatnet_nano_rw_224.sw_in1k | 81.70 | 95.64 | 2344.52 | 15.14 | 2.41 | 15.41 |
maxvit_rmlp_pico_rw_256.sw_in1k | 80.53 | 95.21 | 1594.71 | 7.52 | 1.85 | 24.86 |
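
One way to read the table above is as an accuracy/throughput trade-off. The sketch below copies a handful of rows verbatim and picks the highest top-1 model that still meets a throughput floor; it also splits each name at the first dot into architecture and pretrained tag. The helper names `split_name` and `best_under_budget` are hypothetical, and throughput depends on hardware, so the thresholds are only meaningful relative to the numbers in this table.

```python
# A few rows copied from the table: (model, top1, samples_per_sec).
ROWS = [
    ("maxvit_xlarge_tf_512.in21k_ft_in1k", 88.53, 21.76),
    ("maxvit_rmlp_base_rw_384.sw_in12k_ft_in1k", 87.81, 106.55),
    ("coatnet_rmlp_2_rw_384.sw_in12k_ft_in1k", 87.39, 160.80),
    ("maxxvitv2_rmlp_base_rw_224.sw_in12k_ft_in1k", 86.64, 501.03),
    ("coatnet_2_rw_224.sw_in12k_ft_in1k", 86.57, 631.88),
    ("coatnet_rmlp_1_rw2_224.sw_in12k_ft_in1k", 84.90, 1025.45),
    ("coatnext_nano_rw_224.sw_in1k", 81.95, 2525.52),
]


def split_name(model_name):
    """Split 'architecture.pretrained_tag' at the first dot."""
    arch, _, tag = model_name.partition(".")
    return arch, tag


def best_under_budget(rows, min_samples_per_sec):
    """Return the highest top-1 row meeting the throughput floor, or None."""
    fast_enough = [row for row in rows if row[2] >= min_samples_per_sec]
    return max(fast_enough, key=lambda row: row[1], default=None)


choice = best_under_budget(ROWS, min_samples_per_sec=500)
if choice is not None:
    arch, tag = split_name(choice[0])
    print(f"arch={arch} tag={tag} top1={choice[1]} samples/sec={choice[2]}")
```

Raising the floor trades accuracy for speed: with the rows above, a floor of 1000 samples / sec selects `coatnet_rmlp_1_rw2_224.sw_in12k_ft_in1k` instead.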