Code and configuration files for reproducing the semantic segmentation results of our paper. All experiments are conducted on the ADE20K dataset using mmsegmentation.
Backbone | Pretrain | Lr Schd | mIoU | mAcc | #params | FLOPs | config | model |
---|---|---|---|---|---|---|---|---|
Agent-Swin-T | ImageNet-1K | 160K | 46.68 | 58.53 | 61M | 954G | config | TsinghuaCloud |
Agent-Swin-S | ImageNet-1K | 160K | 48.08 | 59.78 | 81M | 1043G | config | TsinghuaCloud |
Agent-Swin-B | ImageNet-1K | 160K | 48.73 | 60.01 | 121M | 1196G | config | TsinghuaCloud |
Backbone | Pretrain | Lr Schd | mIoU | mAcc | #params | FLOPs | config | model |
---|---|---|---|---|---|---|---|---|
Agent-PVT-T | ImageNet-1K | 40K | 40.18 | 51.76 | 15M | 147G | config | TsinghuaCloud |
Agent-PVT-S | ImageNet-1K | 40K | 44.18 | 56.17 | 24M | 211G | config | TsinghuaCloud |
Agent-PVT-M | ImageNet-1K | 40K | 44.30 | 56.42 | 40M | 321G | config | TsinghuaCloud |
Agent-PVT-L | ImageNet-1K | 40K | 46.52 | 58.50 | 52M | 434G | config | TsinghuaCloud |
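For readers unfamiliar with the two metrics in the tables, the sketch below shows how mIoU and mAcc are commonly computed from a per-class confusion matrix. It is a pure-Python illustration, not the mmsegmentation implementation; the function name `miou_macc` and the toy matrix are hypothetical.

```python
def miou_macc(confusion):
    """Return (mIoU, mAcc) in percent.

    confusion[i][j] = number of pixels whose ground-truth class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    ious, accs = [], []
    for c in range(n):
        tp = confusion[c][c]                            # correctly labelled pixels of class c
        gt = sum(confusion[c])                          # all pixels whose label is c
        pred = sum(confusion[r][c] for r in range(n))   # all pixels predicted as c
        if gt == 0:
            continue  # assumption of this sketch: skip classes absent from the ground truth
        ious.append(tp / (gt + pred - tp))              # intersection over union
        accs.append(tp / gt)                            # per-class pixel accuracy
    return 100 * sum(ious) / len(ious), 100 * sum(accs) / len(accs)


# Toy two-class example (made-up numbers, not results from the paper).
toy = [[8, 2],
       [1, 9]]
miou, macc = miou_macc(toy)
```

mIoU penalises both missed pixels and false positives for a class, while mAcc only measures how many ground-truth pixels of each class were recovered, which is why mAcc is the higher number in every row.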
Prepare the ADE20K dataset, and change the `data_root` argument in `configs/_base_/datasets/ade20k.py` to the dataset path.
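As a quick sanity check after editing `data_root`, the sketch below verifies that the directory has the layout mmsegmentation conventionally uses for ADE20K (`images/` and `annotations/`, each with `training/` and `validation/`). The helper name `missing_ade20k_dirs` and the example path are hypothetical, and the layout is an assumption based on the standard mmsegmentation setup rather than something this repository states.

```python
import os


def missing_ade20k_dirs(data_root):
    """Return the expected ADE20K sub-directories that are missing under data_root."""
    # Assumed layout: <data_root>/{images,annotations}/{training,validation}
    expected = [
        os.path.join(kind, split)
        for kind in ("images", "annotations")
        for split in ("training", "validation")
    ]
    return [d for d in expected if not os.path.isdir(os.path.join(data_root, d))]


# Example (hypothetical path):
#   missing_ade20k_dirs("/path/to/ADEChallengeData2016")
# An empty list means all four directories were found.
```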
Please place the ImageNet-1K pretrained models under the `./data/` folder and rename them to `{MODEL_STRUCTURE}_max_acc.pth`, e.g. `agent_swin_t_max_acc.pth`.
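The naming rule above can be applied with a small helper such as the one below. This is only a convenience sketch: the function name `place_pretrained` and the source file name in the example are hypothetical, and it copies rather than moves the file so the original download is preserved.

```python
import os
import shutil


def place_pretrained(src_path, model_structure, data_dir="./data"):
    """Copy a downloaded checkpoint to <data_dir>/<model_structure>_max_acc.pth."""
    os.makedirs(data_dir, exist_ok=True)
    dst = os.path.join(data_dir, f"{model_structure}_max_acc.pth")
    shutil.copyfile(src_path, dst)
    return dst


# Example (hypothetical source file name):
#   place_pretrained("downloads/agent_swin_t.pth", "agent_swin_t")
# creates ./data/agent_swin_t_max_acc.pth
```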
For convenience, we provide the conda environment file and a pre-built `mmcv`.
Please download the pre-built `mmcv` here, and place it under `../`. We use an empty `mmcv` directory as a placeholder.
conda env create -f agent_segmentation.yaml
cd ../mmcv/
pip install -v -e .
cd ../segmentation/
pip install -v -e .
# single-gpu testing
python tools/test.py <CONFIG_FILE> <DET_CHECKPOINT_FILE> --eval mIoU
# multi-gpu testing
tools/dist_test.sh <CONFIG_FILE> <DET_CHECKPOINT_FILE> <GPU_NUM> --eval mIoU
To train a segmentation model with pre-trained models, run:
# single-gpu training
python tools/train.py <CONFIG_FILE>
# multi-gpu training
torchrun --nproc_per_node <GPU_NUM> tools/train.py <CONFIG_FILE> --launcher="pytorch"
If you find this repo helpful, please consider citing us.
@inproceedings{han2024agent,
title={Agent attention: On the integration of softmax and linear attention},
author={Han, Dongchen and Ye, Tianzhu and Han, Yizeng and Xia, Zhuofan and Pan, Siyuan and Wan, Pengfei and Song, Shiji and Huang, Gao},
booktitle={European Conference on Computer Vision},
year={2024},
}