Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Identify and translate missing plural keys #596

Open
wants to merge 18 commits into
base: main
Choose a base branch
from

Conversation

markedmondson
Copy link
Contributor

@markedmondson markedmondson commented Aug 30, 2024

From #582

  • For locales identified in a call to translate commands, determines the existing and required pluralization keys and adds the nodes to the forest before translating. It will additionally remove unrequired plural translation keys.
  • Adds a label to the front of missing plural keys when running the missing command as the table headers were a little unclear.
  • Add a --skip-interpolation option to the missing command (but specifically intended for translation) that won't send only interpolated strings to translate, the reasoning behind this was a) I found OpenAI would end up getting a mismatched number of results returned if there were a lot of them and b) it's kind of wasteful as a correctly configured Rails implementation would fallback and work correctly.

I think there's still some cleanup to do here, it feels a little clobbered together as I was learning as I went so totally happy to solicit feedback (or hear and attempt to address any problems).

@@ -23,6 +23,7 @@ en:
nostdin: Do not read from stdin
out_format: 'Output format: %{valid_text}'
pattern_router: 'Use pattern router: keys moved per config data.write'
skip_interpolation: Skip translating strings that are straight interpolations (eg. "%{comment}")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than adding an option, perhaps we should always simply copy such keys verbatim? There is never a reason to translate them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be preferential in our use case but I didn't want to assume...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually we don't even copy them, we have Rails configured with fallback which then just works... Not sure that's a safe assumption either though.

Copy link
Owner

@glebm glebm Sep 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is never any reason to pass interpolation-only values through machine translation, so let's copy them verbatim?

Copy link
Contributor Author

@markedmondson markedmondson Oct 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me! :)

Edit: Although in our case we don't want to copy them either, I'd prefer to skip and leave them to fallback. Should I leave a mechanism in place for that?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, makes sense to have that as an option.

Comment on lines 113 to 142
describe 'multi-line' do

end
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: Remove

@markedmondson markedmondson requested a review from glebm September 3, 2024 23:57
@markedmondson
Copy link
Contributor Author

markedmondson commented Oct 17, 2024

I spent a little more time on this on a large translation (~610 keys in 30 locales) and added a tweak.

Namely, ChatGPT is pretty bad at returning the correct number of responses back, I've added the array size to the prompt but even then it struggles if there are repeated translations - so there's a fail safe that'll retry and then fall back to doing line by line. I also found dropping the default array size from 50 to 25 improved things somewhat.

Edit: Looks like there's a problem with the removal of pluralization keys that aren't required, eg we translate one, other and zero from en.yml, they get copied to ru.yml even though it only needs one, few, many, other because it gets caught and merged in the missing_diff_forest (even though missing_plural_forest removes it).

lib/i18n/tasks/base_task.rb Outdated Show resolved Hide resolved
@markedmondson
Copy link
Contributor Author

markedmondson commented Oct 22, 2024

Edit: Looks like there's a problem with the removal of pluralization keys that aren't required, eg we translate one, other and zero from en.yml, they get copied to ru.yml even though it only needs one, few, many, other because it gets caught and merged in the missing_diff_forest (even though missing_plural_forest removes it).

I fixed this in 7e47ded

I think a separate task to remove plurals from locales that are not required would be kinda useful as a distinct thing.

@markedmondson markedmondson force-pushed the pluralization branch 2 times, most recently from 0fdd55f to 179e0b1 Compare November 7, 2024 18:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants