
Pull Request: Addition of "general-preference" Branch for General Preference Model (GPM) #201

Open · wants to merge 8 commits into base: main

Conversation

kirigayahitsugi

Dear RewardBench Team,

I am submitting a Pull Request to introduce a new branch named "general-preference" to support the General Preference Model (GPM). The following modifications have been implemented:

  1. Added a new pipeline, GPMPipeline.
  2. Updated run_rm.py to accommodate our model's requirements.
  3. Included our model in the REWARD_MODEL_CONFIG.
  4. Added a script named run_rm_rewardbench.sh.
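
For illustration, the config registration (item 3) might look roughly like this. The key names below are assumptions for the sketch, not the actual rewardbench schema:

```python
# Hypothetical sketch of registering the GPM models in REWARD_MODEL_CONFIG;
# key names are illustrative, not the real rewardbench schema.
REWARD_MODEL_CONFIG = {
    "general-preference/GPM-Llama-3.1-8B": {
        "pipeline_builder": "GPMPipeline",  # the new pipeline added in this PR
        "quantized": False,
        "custom_dialogue": False,
    },
}

# Looking up a model name resolves to its pipeline and options.
config = REWARD_MODEL_CONFIG["general-preference/GPM-Llama-3.1-8B"]
```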

Thank you for considering this addition!

Best regards,
Grace Zhang

natolambert (Collaborator) left a comment

Hey @kirigayahitsugi -- sorry for the delay. In general this is great. I'm wondering how we can minimize modifications to the script. Right now, some of the modifications affect all models (e.g. truncation) instead of just the new model types. I am wondering if we can do this a bit more cleanly.

The same goes for the custom classifier section, which adds a lot of code.

Lastly, it seems like we want to remove the extra bash script you accidentally checked in?

@@ -89,6 +89,13 @@ def get_args():
choices=["eager", "sdpa", "flash_attention_2"],
help="Attention implementation to use (default: None)",
)
#### custom arguments added for general-preference/GPM-Llama-3.1-8B and general-preference/GPM-Gemma-2B
natolambert (Collaborator):

I'd like to not add any args if possible, if just for one model. Given we already have a config structure.
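
One way to do this (a minimal sketch; the function and option names here are hypothetical, not the existing rewardbench API) is to resolve per-model options from the config structure rather than from new CLI arguments:

```python
# Sketch: pull model-specific options from the existing config structure
# instead of adding new CLI flags; all names here are hypothetical.
DEFAULT_OPTIONS = {"truncation_side": None, "padding_side": None}

def resolve_model_options(model_name, reward_model_config):
    """Merge a model's config overrides over the shared defaults."""
    options = dict(DEFAULT_OPTIONS)
    options.update(reward_model_config.get(model_name, {}))
    return options

# Only the GPM entry opts into non-default tokenizer sides; other
# models fall through to the defaults untouched.
config = {
    "general-preference/GPM-Gemma-2B": {
        "truncation_side": "right",
        "padding_side": "left",
    },
}
gpm_opts = resolve_model_options("general-preference/GPM-Gemma-2B", config)
other_opts = resolve_model_options("some/other-model", config)
```

This keeps `get_args()` unchanged and scopes the new behavior to the models that declare it.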

Comment on lines 137 to 138
# #### added for general-preference/GPM-Llama-3.1-8B and general-preference/GPM-Gemma-2B for debugging
# config = REWARD_MODEL_CONFIG["general-preference/GPM-Gemma-2B"]
natolambert (Collaborator):

remove

Comment on lines 188 to 190
#### added for general-preference/GPM-Llama-3.1-8B and general-preference/GPM-Gemma-2B
tokenizer.truncation_side = "right"
tokenizer.padding_side = "left"
natolambert (Collaborator):

See above comment
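
One way to realize that suggestion (a minimal sketch with hypothetical config keys and a stand-in tokenizer, not the actual rewardbench code) is to gate the tokenizer settings behind the model's config entry:

```python
# Sketch: apply tokenizer-side overrides only when the model's config
# requests them, so all other models keep their defaults.
class FakeTokenizer:
    """Stand-in for a Hugging Face tokenizer's side attributes."""
    truncation_side = "left"
    padding_side = "right"

def apply_tokenizer_overrides(tokenizer, model_config):
    # Touch the tokenizer only for models that declare overrides.
    if model_config.get("truncation_side"):
        tokenizer.truncation_side = model_config["truncation_side"]
    if model_config.get("padding_side"):
        tokenizer.padding_side = model_config["padding_side"]
    return tokenizer

gpm_tok = apply_tokenizer_overrides(
    FakeTokenizer(), {"truncation_side": "right", "padding_side": "left"}
)
default_tok = apply_tokenizer_overrides(FakeTokenizer(), {})
```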
