This is a PyTorch implementation of our paper "SMMix: Self-Motivated Image Mixing for Vision Transformers".
- python 3.8.0
- pytorch 1.7.1
- torchvision 0.8.2
- The ImageNet dataset should be prepared as follows:
ImageNet
├── train
│ ├── folder 1 (class 1)
│ ├── folder 2 (class 2)
│ ├── ...
├── val
│ ├── folder 1 (class 1)
│ ├── folder 2 (class 2)
│ ├── ...
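Before launching a long run, it can help to confirm that the dataset directory matches the tree above. The following is a minimal sketch using only the Python standard library; the function names `summarize_imagenet_layout` and `layout_problems` are hypothetical helpers and not part of this repository. It assumes that every immediate subdirectory of `train` and `val` is one class, which is the same one-folder-per-class convention that `torchvision.datasets.ImageFolder` reads (whether the training script uses that class is an assumption here).

```python
from pathlib import Path


def summarize_imagenet_layout(root):
    """Return the sorted class-folder names found under train/ and val/.

    Raises FileNotFoundError if either split directory is missing.
    """
    root = Path(root)
    classes = {}
    for split in ("train", "val"):
        split_dir = root / split
        if not split_dir.is_dir():
            raise FileNotFoundError(f"missing split directory: {split_dir}")
        # Each immediate subdirectory is treated as one class.
        classes[split] = sorted(p.name for p in split_dir.iterdir() if p.is_dir())
    return classes


def layout_problems(classes):
    """List human-readable problems; an empty list means the layout looks fine."""
    problems = []
    for split, names in classes.items():
        if not names:
            problems.append(f"{split}/ contains no class folders")
    only_train = set(classes["train"]) - set(classes["val"])
    only_val = set(classes["val"]) - set(classes["train"])
    if only_train:
        problems.append(f"classes only in train/: {sorted(only_train)}")
    if only_val:
        problems.append(f"classes only in val/: {sorted(only_val)}")
    return problems
```

For example, `layout_problems(summarize_imagenet_layout("/media/DATASET/ImageNet"))` should return an empty list when both splits contain the same set of class folders.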
Model | Top-1 Accuracy (%) | Download |
---|---|---|
DeiT-T | 73.6 | model & log |
DeiT-S | 81.1 | model & log |
PVT-T | 76.4 | model & log |
PVT-S | 81.0 | model & log |
PVT-M | 82.2 | model & log |
PVT-L | 82.7 | model & log |
To evaluate a trained model, run:
./script/eval.sh --data-path DATASET_PATH --model MODEL_NAME --resume CHECKPOINT_PATH
Examples:
./script/eval.sh --data-path /media/DATASET/ImageNet --model pvt_small --resume ./checkpoints/pvt_small_smmix.pth
./script/eval.sh --data-path /media/DATASET/ImageNet --model vit_deit_small_patch16_224 --resume ./checkpoints/deit_small_smmix.pth
To train a model with SMMix, run:
./script/train.sh --data-path DATASET_PATH --model MODEL_NAME --output_dir LOG_PATH --batch_size 256
Examples:
./script/train.sh --data-path /media/DATASET/ImageNet --model pvt_small --output_dir ./log/pvt_small_smmix --batch_size 256
./script/train.sh --data-path /media/DATASET/ImageNet --model vit_deit_small_patch16_224 --output_dir ./log/deit_small_smmix --batch_size 256
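When training several models in a row, the command lines above can be generated rather than typed by hand. The sketch below only assembles the arguments and does not run anything. `RUNS`, `build_train_command`, and `format_commands` are hypothetical names introduced for illustration; the flags and the model/output pairs are copied from the examples above, and no other options of `train.sh` are assumed.

```python
import shlex

# (model name, output directory) pairs copied from the training examples.
RUNS = [
    ("pvt_small", "./log/pvt_small_smmix"),
    ("vit_deit_small_patch16_224", "./log/deit_small_smmix"),
]


def build_train_command(data_path, model, output_dir, batch_size=256):
    """Return the train.sh invocation as an argument list."""
    return [
        "./script/train.sh",
        "--data-path", data_path,
        "--model", model,
        "--output_dir", output_dir,
        "--batch_size", str(batch_size),
    ]


def format_commands(data_path, runs=RUNS):
    """Return one shell-quoted command line per (model, output_dir) pair."""
    return [
        shlex.join(build_train_command(data_path, model, output_dir))
        for model, output_dir in runs
    ]
```

Each list returned by `build_train_command` can be passed to `subprocess.run(cmd, check=True)` from the repository root to launch the run; `shlex.join` requires Python 3.8 or later.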