From-Redundancy-to-Relevance

Code for the paper: An Information Flow Perspective for Exploring Large Vision Language Models on Reasoning Tasks. The paper is under review; we will release the code in October.


Setup

The main implementation of our method is in transformers-4.29.2/src/transformers/generation/utils.py, so you can use our decoding simply by installing the modified transformers package:

conda env create -f environment.yml
conda activate redundancy
python -m pip install -e transformers-4.29.2

Note: to implement this on another version of transformers, follow these steps (a sketch is given after the list):

  • Find the file at transformers-4.29.2/src/transformers/generation/utils.py.
  • Add the new arguments to the transformers.generate function.
  • Add the corresponding code to the transformers.generate function.
  • Copy and paste the opera_decoding function.
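
As a rough illustration of those steps, the sketch below monkey-patches generate at runtime instead of editing the file in place. The flag use_custom_decoding and the placeholder custom_decoding are hypothetical stand-ins for illustration only; the actual arguments and decoding logic live in the modified utils.py shipped in this repository.

import torch
from transformers.generation.utils import GenerationMixin

_stock_generate = GenerationMixin.generate  # keep the original decoding path

def custom_decoding(model, *args, **kwargs):
    # Step 4: the custom decoding function would go here; this placeholder
    # simply falls through to the stock implementation.
    return _stock_generate(model, *args, **kwargs)

@torch.no_grad()
def generate(self, *args, use_custom_decoding=False, **kwargs):
    # Steps 2-3: accept the new argument and branch on it inside generate.
    if use_custom_decoding:
        return custom_decoding(self, *args, **kwargs)
    return _stock_generate(self, *args, **kwargs)

GenerationMixin.generate = generate  # apply the patch

After the patch, model.generate(..., use_custom_decoding=True) routes through the custom path, while default calls behave exactly as before.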

Evaluation

The following evaluation requires the MSCOCO 2014 dataset. Please download it and extract it into your data path.
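
For reference, a minimal download sketch, assuming the standard images.cocodataset.org mirror and a placeholder data directory (DATA_PATH); adjust both to your setup.

import pathlib
import urllib.request
import zipfile

DATA_PATH = pathlib.Path("data/coco")  # placeholder; use your own data path
URL = "http://images.cocodataset.org/zips/val2014.zip"  # standard COCO mirror

DATA_PATH.mkdir(parents=True, exist_ok=True)
archive = DATA_PATH / "val2014.zip"
urllib.request.urlretrieve(URL, archive)  # large download (several GB)
with zipfile.ZipFile(archive) as zf:
    zf.extractall(DATA_PATH)  # yields DATA_PATH/val2014/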

In addition, you need to prepare checkpoints of the following 7B base models:

Citation

@article{zhang2024redundancy,
  title={From Redundancy to Relevance: Enhancing Explainability in Multimodal Large Language Models},
  author={Zhang, Xiaofeng and Shen, Chen and Yuan, Xiaosong and Yan, Shaotian and Xie, Liang and Wang, Wenxiao and Gu, Chaochen and Tang, Hao and Ye, Jieping},
  journal={arXiv preprint arXiv:2406.06579},
  year={2024}
}

