
Does the repo support OFA pre-training? #1

Open
zzhanghub opened this issue Sep 22, 2022 · 4 comments

@zzhanghub

Awesome job! The Huggingface version of OFA looks more concise. Can this framework support multi-task pre-training like the original repo?
I see that this code file contains many different tasks. Could you provide more details about pre-training (such as data preparation and the submission script)?

Thank you again, and I look forward to your reply!

@faychu (Collaborator) commented Sep 22, 2022

Thanks for your support! I just released a version of the pre-training script; please try pretrain.sh. However, I do not recommend using this framework for pre-training. We have tried pre-training under this framework before, and its performance is not as good as that of the original repo (the fairseq version).

@zzhanghub (Author)

@faychu

Thank you for your prompt reply.

Does this suboptimal performance refer to training from scratch? Have you tried loading a checkpoint (from the fairseq version) and continuing pre-training with new data or tasks?
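
Concretely, I am thinking of something along the lines of the rough sketch below, in plain PyTorch. The tiny linear model, the saved/reloaded checkpoint, and the random batches are only placeholders so the snippet runs on its own; none of it is actual code from this repo or from fairseq.

```python
import torch

# Placeholder stand-in for the real pre-trained model (e.g. OFA weights
# converted from a fairseq checkpoint).
model = torch.nn.Linear(8, 8)

# Saving and reloading here only makes the sketch self-contained; in practice
# the state dict would come from a converted fairseq checkpoint instead.
torch.save(model.state_dict(), "converted_ckpt.pt")
state_dict = torch.load("converted_ckpt.pt", map_location="cpu")

# strict=False tolerates naming mismatches between checkpoint formats.
missing, unexpected = model.load_state_dict(state_dict, strict=False)
print("missing keys:", missing, "unexpected keys:", unexpected)

# Continue pre-training on new data/tasks with a small learning rate.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
model.train()
for _ in range(10):
    batch = torch.randn(4, 8)            # placeholder batch of new data
    loss = model(batch).pow(2).mean()    # placeholder pre-training loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```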

By the way, will OFA pre-training continue to be maintained in the fairseq version in the future, or will it gradually move to the Huggingface version?

Thank you!!!

@faychu (Collaborator) commented Sep 23, 2022

Yes, the performance refers to training from scratch, and we haven't tried pre-training on new data from a loaded checkpoint. If you are interested, you are welcome to try it.

In the future, OFA pre-training will still be maintained in the fairseq version, but the compression features (including distillation, pruning, and quantization) will be supported by this repo.

@zzhanghub (Author)

Thank you!
I checked the code and found a difference in the pre-training settings between this version and the fairseq version; I am not sure whether it is what affects the pre-training results. In the fairseq version, pure image and text tasks account for a smaller share: from this line of code, the ratio is 8:1 (the corresponding code in the Huggingface version is different).
Do you have any experience with the proportion of each pre-training task?
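
To make sure I understand the ratio, here is a small illustrative sketch of ratio-based task sampling; the task names and weights are placeholders for illustration, not code from either repo.

```python
import random
from collections import Counter

# Illustrative only: an 8:1 weighting means multimodal (vision-and-language)
# batches are sampled eight times as often as pure image/text batches.
TASK_WEIGHTS = {
    "vision_language": 8,     # e.g. captioning, VQA, grounding, image-text matching
    "pure_image_or_text": 1,  # e.g. text infilling, image infilling
}

tasks = list(TASK_WEIGHTS)
weights = list(TASK_WEIGHTS.values())
rng = random.Random(0)

def sample_task() -> str:
    """Pick which task the next pre-training batch is drawn from."""
    return rng.choices(tasks, weights=weights, k=1)[0]

# Rough check: over 9000 batches we expect about 8000 multimodal and 1000 pure ones.
print(Counter(sample_task() for _ in range(9000)))
```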

In addition, I am a little confused about some details in three pieces of code (process_image_text_pair, L12&45 of pretrain.sh, and init_task.py): their configurations and dimensions seem inconsistent.
