- NLP
  - Knowledge Distillation
  - Pruning
### Knowledge Distillation

| Summary | Title & Authors | Introduction | Links |
|---|---|---|---|
| #1 LGTM Experiments | Tailoring Instructions to Student's Learning Levels Boosts Knowledge Distillation<br>Yuxin Ren, Zihan Zhong, Xingjian Shi, Yi Zhu, Chun Yuan, Mu Li | | Github, Paper |
| #2 SCOTT | SCOTT: Self-Consistent Chain-of-Thought Distillation<br>Peifeng Wang, Zhengyang Wang, Zheng Li, Yifan Gao, Bing Yin, Xiang Ren | | Paper |
| #3 KD-QAT | Understanding and Improving Knowledge Distillation for Quantization Aware Training of Large Transformer Encoders<br>Minsoo Kim, Sihwa Lee, Suk-Jin Hong, Du-Seong Chang, Jungwook Choi | | Github, Paper |
| #4 FlanT5 | Specializing Smaller Language Models towards Multi-Step Reasoning<br>Yao Fu, Hao Peng, Litu Ou, Ashish Sabharwal, Tushar Khot | | Github, Paper |
| #5 DML Experiments | Deep Mutual Learning<br>Ying Zhang, Tao Xiang, Timothy M. Hospedales, Huchuan Lu | | Paper |
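The papers in this table all build on the standard soft-target distillation objective. As a point of reference only, below is a minimal PyTorch sketch of that vanilla KD loss (temperature-softened teacher probabilities mixed with the hard-label cross-entropy); the temperature `T` and mixing weight `alpha` are illustrative defaults, not values taken from any of the listed papers.

```python
# Minimal sketch of the vanilla knowledge distillation loss.
# T and alpha are illustrative hyperparameters, not from any listed paper.
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Combine soft-target KL distillation with the usual cross-entropy."""
    # Soften both distributions with temperature T; scale by T^2 so the
    # gradient magnitude stays comparable to the hard-label term.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```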
### Pruning

| Summary | Title & Authors | Introduction | Links |
|---|---|---|---|
| #1 PuMer Evaluation | PuMer: Pruning and Merging Tokens for Efficient Vision Language Models<br>Qingqing Cao, Bhargavi Paranjape, and Hannaneh Hajishirzi | | Github, Paper |
| #2 PLMS | Specializing Pre-trained Language Models for Better Relational Reasoning via Network Pruning<br>Siyu Ren and Kenny Zhu | | Github, Paper |
| #3 DPF | Dynamic Model Pruning with Feedback<br>Tao Lin, Sebastian U. Stich, Luis Barba, Daniil Dmitriev, Martin Jaggi | | Paper |
| #4 GB | Gender Biases and Where to Find Them: Exploring Gender Bias in Pre-Trained Transformer-based Language Models Using Movement Pruning<br>Przemyslaw Joniak and Akiko Aizawa | | Github, Paper |
| #5 MP | Movement Pruning: Adaptive Sparsity by Fine-Tuning<br>Victor Sanh, Thomas Wolf, Alexander M. Rush | | Github, Paper |
| #6 BMP | Block Pruning For Faster Transformers<br>François Lagunas, Ella Charlaix, Victor Sanh, Alexander Rush | | Github, Paper |
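For orientation, the sketch below shows plain unstructured magnitude pruning using PyTorch's built-in `torch.nn.utils.prune` utilities. It is only a baseline illustration, not the method of any listed paper: movement pruning learns importance scores during fine-tuning, and block pruning removes whole blocks of weights rather than individual entries.

```python
# Minimal sketch of unstructured magnitude pruning with PyTorch's pruning
# utilities; a baseline illustration only, not the method of any listed paper.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

layer = nn.Linear(768, 768)  # stand-in for a single transformer weight matrix

# Zero out the 30% of weights with the smallest absolute value (L1 criterion).
prune.l1_unstructured(layer, name="weight", amount=0.3)

# Make the sparsity permanent (folds the mask into the weight) and inspect it.
prune.remove(layer, "weight")
sparsity = (layer.weight == 0).float().mean().item()
print(f"weight sparsity: {sparsity:.2%}")
```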