We propose a global attention module named GPA, which trades off model performance against complexity. It improves ResNet-18 and R(2+1)D-18 by more than 18.9 on UCF101 compared to [1] when training from scratch. Additionally, the paper attempts to explain how the attention maps are generated; details will be released once the paper is published.
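The GPA design itself is not yet released, so as a rough illustration only, here is a generic global (channel) attention sketch in the squeeze-and-excitation style, written in NumPy. The pooling choice, bottleneck weights `w1`/`w2`, reduction ratio, and sigmoid gating below are placeholder assumptions for illustration, not the actual GPA architecture.

```python
import numpy as np

def global_attention(x, w1, w2):
    """Generic global channel attention (squeeze-and-excitation style):
    squeeze by global average pooling, excite with a two-layer bottleneck,
    then rescale the input channels with sigmoid gates."""
    # x: (channels, time, height, width) feature map from a 3D backbone
    squeezed = x.mean(axis=(1, 2, 3))                # global average pool -> (C,)
    hidden = np.maximum(0.0, w1 @ squeezed)          # ReLU bottleneck -> (C // r,)
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))     # sigmoid gates in (0, 1) -> (C,)
    return x * gates[:, None, None, None]            # channel-wise reweighting

# Toy usage with random weights (reduction ratio r = 2)
rng = np.random.default_rng(0)
c, r = 8, 2
x = rng.standard_normal((c, 4, 6, 6))
w1 = rng.standard_normal((c // r, c)) * 0.1
w2 = rng.standard_normal((c, c // r)) * 0.1
y = global_attention(x, w1, w2)   # same shape as x, channels rescaled
```

Since the gates lie in (0, 1), the module can only attenuate channels here; the real trade-off between performance and complexity depends on the bottleneck size and where the module is inserted in the backbone.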
[1] Kensho Hara, Hirokatsu Kataoka, and Yutaka Satoh, "Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?", CVPR 2018.