Improve indexing for multi-lingual items #275

Open
ruthtillman opened this issue Jan 25, 2021 · 0 comments
Labels
enhancement New feature or request

Our language faceting is based on the primary language code in 008[35-37]. The 041's subfields contain additional language data. We have not been indexing this data because it can be overkill, e.g. https://searchworks.stanford.edu/view/13749584 (The Crown, which lists English, Arabic, Danish, Dutch, Estonian, Finnish, French, German, Italian, Norwegian, Polish, Spanish, Swedish, and Turkish).

But if 008[35-37] is "mul" (multiple languages), the 041a will often be useful. The 041a may repeat. It uses the same language codes as 008[35-37], so the mapping won't need to be updated. But indexing it will turn the facet from a single value into an array/list.

Some records will not have an 041a, so we'll need to do an "if exists" check.

So new logic would be:

  • If 008[35-37] == mul:
  • THEN index the 041a fields too
  • AND make unique (because the 041a may contain "mul" as well)

We'll want to retain "mul" itself as a facet value as well.
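
A minimal sketch of the logic above, in Python for illustration only. It does not reflect our actual indexing code: the function name `language_codes` and its plain string/list inputs are assumptions, standing in for however the indexer exposes the 008 and the 041a values.

```python
from typing import List


def language_codes(field_008: str, field_041a: List[str]) -> List[str]:
    """Return the language codes to index for one record.

    field_008  -- the full 008 control field as a string (hypothetical input)
    field_041a -- every 041$a value on the record; empty list if there is none
    """
    # 008[35-37] is inclusive, so the Python slice is [35:38].
    primary = field_008[35:38].strip()
    codes = [primary] if primary else []

    # Only pull in the 041a values when the primary language is "mul".
    if primary == "mul":
        for code in field_041a:  # an empty list covers the "if exists" check
            code = code.strip()
            # Make unique: the 041a may itself contain "mul" or repeat a code.
            if code and code not in codes:
                codes.append(code)

    return codes
```

Because "mul" is added first and never removed, it is retained alongside the 041a codes. Records whose primary language is anything else still produce a single value, just wrapped in a list.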

@ruthtillman ruthtillman added the enhancement New feature or request label Jan 25, 2021