Reproducing WMT14 #3175

Open
MaxHahnbueck opened this issue Nov 19, 2024 · 2 comments

Comments

@MaxHahnbueck

I want to reproduce the WMT2014 benchmark and then adjust its data, so I can't use HELM directly (if I understand correctly). Instead, I want to use the dataset and the way the prompt is constructed in my own pipeline. The same applies to the evaluation code.

Unfortunately, I can't find my way through the code.

I believe the PromptRunExpander is responsible for creating the prompt, but I can't figure out exactly what it does, which values it uses, and where they come from.

I would be happy for any explanation of how the information flows and how the final prompt is constructed.

@yifanmai
Collaborator

yifanmai commented Dec 7, 2024

Hi @MaxHahnbueck, here are some pointers to the code:

The main flow for prompt generation happens in Runner.run_one. Specifically, it calls WMT14Scenario.get_instances to download the instances and load them into memory, and then calls GenerationAdapter.generate_requests to turn them into prompt strings, which are placed in RequestState.request.input.text.

You may also want to check out the documentation if you haven't already. Hope this helps.
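
For orientation, here is a minimal sketch of how a few-shot generation prompt of this kind can be assembled outside of HELM. It is not HELM's code: `Example`, `build_prompt`, the instruction text, and the `German: ` / `English: ` prefixes are all hypothetical placeholders, and the real values for WMT14 should be read from the run spec and adapter configuration in the repository.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Example:
    """One source/target pair; a hypothetical stand-in for a scenario instance."""
    source: str
    target: str


def build_prompt(
    instructions: str,
    train_examples: List[Example],
    eval_source: str,
    input_prefix: str = "German: ",     # assumed prefix, not taken from HELM
    output_prefix: str = "English: ",   # assumed prefix, not taken from HELM
) -> str:
    """Assemble a few-shot generation prompt as a single string."""
    blocks = []
    if instructions:
        blocks.append(instructions)
    # Each in-context example shows the input together with its reference output.
    for ex in train_examples:
        blocks.append(f"{input_prefix}{ex.source}\n{output_prefix}{ex.target}")
    # The evaluated instance ends with a bare output prefix for the model to complete.
    blocks.append(f"{input_prefix}{eval_source}\n{output_prefix.rstrip()}")
    return "\n\n".join(blocks)


prompt = build_prompt(
    instructions="Translate the following sentences from German to English.",
    train_examples=[Example("Wiederaufnahme der Sitzungsperiode", "Resumption of the session")],
    eval_source="Guten Morgen.",
)
print(prompt)
```

Ending the prompt on the bare output prefix is what nudges the model to continue with a single translation, which also makes the completion easier to score.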

@MaxHahnbueck
Author

MaxHahnbueck commented Jan 8, 2025

Thanks. I think I have figured out most of it now.

If I understand correctly, the annotators in scenario_state describe how postprocessing is done. WMT14 (and other scenarios such as med_qa) do not have these annotators.

But if I want to use BLEU-4 as the metric, I believe I need some kind of preprocessing before scoring, as my output looks like this:

It seems like there are two sentences to translate. Here are the translations:

1. German: Wiederaufnahme der Sitzungsperiode
   English: Resumption of the session

2. German: Sie stehen keine 100 Meter voneinander entfernt: Am Dienstag ist in Gutach die neue B 33-Fußgängerampel am Dorfparkplatz in Betrieb genommen worden - in Sichtweite der älteren Rathausampel.
   English: They are not 100 meters apart from each other: On Dienstag, in Gutach, the new B 33 pedestrian traffic light at Dorfparkplatz has been put into operation - in sight of the older town hall traffic light.

Maybe that's just unclean output from the model, but I think I only need the last part after the "English:".
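
A minimal sketch of that extraction, assuming the translation of interest always follows the last "English:" marker and ends at the next blank line. The function name `extract_last_translation` is hypothetical and this is not a HELM postprocessing step.

```python
import re

# Case-insensitive marker, so "english:" and "English:" are both matched.
_MARKER = re.compile(r"english:\s*", re.IGNORECASE)


def extract_last_translation(output: str) -> str:
    """Return the text after the last 'English:' marker, up to the next blank line."""
    matches = list(_MARKER.finditer(output))
    if not matches:
        # No marker found: fall back to the whole output, stripped.
        return output.strip()
    tail = output[matches[-1].end():]
    # Keep only the first paragraph after the marker.
    return tail.split("\n\n", 1)[0].strip()


sample = (
    "Here are the translations:\n\n"
    "1. German: Wiederaufnahme der Sitzungsperiode\n"
    "   English: Resumption of the session\n\n"
    "2. German: Guten Morgen.\n"
    "   English: Good morning.\n"
)
print(extract_last_translation(sample))
```

This heuristic breaks if the translation itself contains the string "English:", so it is worth spot-checking the extracted text on a sample of outputs.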

Where can I find the postprocessing steps?
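
For reference when checking scores in a separate pipeline, here is a rough sketch of sentence-level BLEU-4 with a single reference. It assumes whitespace tokenization and no smoothing, so its numbers will not match HELM's metric code or corpus-level tools such as sacreBLEU exactly; `bleu4` and `ngrams` are names chosen for illustration.

```python
import math
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu4(hypothesis: str, reference: str) -> float:
    """Unsmoothed sentence-level BLEU-4 against one reference."""
    hyp, ref = hypothesis.split(), reference.split()
    if not hyp:
        return 0.0
    log_precisions = []
    for n in range(1, 5):
        hyp_counts, ref_counts = ngrams(hyp, n), ngrams(ref, n)
        # Clipped matches: an n-gram counts at most as often as it occurs in the reference.
        overlap = sum((hyp_counts & ref_counts).values())
        total = sum(hyp_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, any zero precision zeroes the score
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: penalize hypotheses that are not longer than the reference.
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / len(hyp))
    # Geometric mean of the four modified n-gram precisions.
    return bp * math.exp(sum(log_precisions) / 4)


print(bleu4("the cat sat on the mat today", "the cat sat on the mat"))
```

Because the score depends on exact n-gram overlap, leftover text such as list numbering or a repeated source sentence in the model output lowers it noticeably, which is why the hypothesis should contain only the translation.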
