Improve indexing for multi-lingual items #275

ruthtillman · 2021-01-25T15:12:19Z

Our language faceting is based on the primary language code in 008[35-37]. The 041's subfields contain additional language data. We have not been indexing this data because it can be overkill, e.g. https://searchworks.stanford.edu/view/13749584 (The Crown, which lists English, Arabic, Danish, Dutch, Estonian, Finnish, French, German, Italian, Norwegian, Polish, Spanish, Swedish, and Turkish.)

But if the 008[35-37] is "mul" (multiple languages), the 041a will often be useful. The 041a may repeat. It uses the same language codes as the 008[35-37], so the mapping won't need to be updated. But indexing it will turn the facet from a single value into an array/list.

Some records will not have an 041a, so we'll need to do an "if exists" check.

So new logic would be:

If 008[35-37] == mul:
THEN index the 041a fields too
AND make unique (because the 041a may contain "mul" as well)

We'll want to retain the "mul" as well.
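
A minimal sketch of that logic in Python, just to pin down the behavior. It is not tied to our indexing code: the function name `language_codes` and its inputs (the raw 008 string and a list of 041a values) are made up for illustration, and it returns codes rather than mapped display names.

```python
def language_codes(field_008, field_041a_values):
    """Return the language codes to index for one record.

    field_008: the full 008 control field as a string, or None.
    field_041a_values: list of 041 $a values, possibly empty or None.
    Both parameters are hypothetical stand-ins for however the
    indexer exposes the record.
    """
    # 008[35-37] is inclusive, so the Python slice is [35:38].
    primary = (field_008 or "")[35:38].strip()
    codes = [primary] if primary else []

    # Only pull in the 041a values when the primary code is "mul".
    if primary == "mul":
        # "if exists" check: a missing 041a is treated as an empty list.
        for value in field_041a_values or []:
            code = value.strip()
            # Make unique while keeping order; this also drops a
            # repeated "mul" coming from the 041a.
            if code and code not in codes:
                codes.append(code)
    return codes
```

With this shape, a record whose 008[35-37] is "mul" and whose 041a values are eng, mul, fre yields mul, eng, fre, while a record with a non-"mul" primary code keeps a single value and ignores the 041a.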


ruthtillman added the enhancement label Jan 25, 2021
