Aug 28, 2023
- Add dynamic img size support to models in
vision_transformer.py
,vision_transformer_hybrid.py
,deit.py
, andeva.py
w/o breaking backward compat.- Add
dynamic_img_size=True
to args at model creation time to allow changing the grid size (interpolate abs and/or ROPE pos embed each forward pass). - Add
dynamic_img_pad=True
to allow image sizes that aren't divisible by patch size (pad bottom right to patch size each forward pass). - Enabling either dynamic mode will break FX tracing unless PatchEmbed module added as leaf.
- Existing method of resizing position embedding by passing different
img_size
(interpolate pretrained embed weights once) on creation still works. - Existing method of changing
patch_size
(resize pretrained patch_embed weights once) on creation still works. - Example validation cmd
python validate.py /imagenet --model vit_base_patch16_224 --amp --amp-dtype bfloat16 --img-size 255 --crop-pct 1.0 --model-kwargs dynamic_img_size=True dyamic_img_pad=True
- Add
Aug 25, 2023
- Many new models since last release
- FastViT - https://arxiv.org/abs/2303.14189
- MobileOne - https://arxiv.org/abs/2206.04040
- InceptionNeXt - https://arxiv.org/abs/2303.16900
- RepGhostNet - https://arxiv.org/abs/2211.06088 (thanks https://github.com/ChengpengChen)
- GhostNetV2 - https://arxiv.org/abs/2211.12905 (thanks https://github.com/yehuitang)
- EfficientViT (MSRA) - https://arxiv.org/abs/2305.07027 (thanks https://github.com/seefun)
- EfficientViT (MIT) - https://arxiv.org/abs/2205.14756 (thanks https://github.com/seefun)
- Add
--reparam
arg tobenchmark.py
,onnx_export.py
, andvalidate.py
to trigger layer reparameterization / fusion for models with any one ofreparameterize()
,switch_to_deploy()
orfuse()
- Including FastViT, MobileOne, RepGhostNet, EfficientViT (MSRA), RepViT, RepVGG, and LeViT
- Preparing 0.9.6 'back to school' release
Aug 11, 2023
- Swin, MaxViT, CoAtNet, and BEiT models support resizing of image/window size on creation with adaptation of pretrained weights
- Example validation cmd to test w/ non-square resize
python validate.py /imagenet --model swin_base_patch4_window7_224.ms_in22k_ft_in1k --amp --amp-dtype bfloat16 --input-size 3 256 320 --model-kwargs window_size=8,10 img_size=256,320