Added polish diacritics in non_ascii_equivalents.py #386
Thanks. This is somewhat similar to #387
Extending the tag list that way makes sense to me. The new tags seem to be in line with how the plugin was originally conceived. It would probably be even better to have some configuration for this, but in its absence I think the extensions as presented here are useful.
What also applies here is my comment on #387 about using Picard's picard.util.textencoding.unaccent function (please see my detailed comment there). This would allow us to get rid of the explicit mapping of most accented characters. As I see it, the first mapping section can then be removed completely, except for the two letters "Ł" and "ł" (which could then be placed under "Misc letters").
In the future this will also avoid the need to add further accented characters; there are likely a few that we still miss.
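To illustrate why "Ł" and "ł" would still need an explicit entry, here is a minimal sketch using only Python's standard unicodedata module. It is not Picard's unaccent implementation; strip_accents, MISC_LETTERS and to_ascii_equivalent are hypothetical names used only for this example. The idea is that most accented letters decompose into a base letter plus a combining mark, while "Ł" and "ł" have no such decomposition.

```python
import unicodedata


def strip_accents(text):
    """Hypothetical stand-in for an unaccent-style helper (not Picard's code).

    Decomposes each character (NFKD) and drops the combining marks.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Assumption: letters without a Unicode decomposition keep an explicit mapping.
MISC_LETTERS = {"Ł": "L", "ł": "l"}


def to_ascii_equivalent(text):
    """Strip accents first, then apply the small explicit mapping."""
    stripped = strip_accents(text)
    return "".join(MISC_LETTERS.get(ch, ch) for ch in stripped)


print(strip_accents("Zażółć gęślą jaźń"))        # "ł" survives this step
print(to_ascii_equivalent("Zażółć gęślą jaźń"))  # "ł" is handled by the mapping
```

With this split, the explicit table only has to list characters that decomposition cannot handle.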
Hi. I implemented
@Sophist-UK Yeah, sure.
Thanks for the update, this looks good to me.
What about this: "č": "c"? Does unaccent handle this?
Yes.
>>> unaccent("čšș")
'css'
Thanks a lot
I found two similar signs in my collection:
µ (U+00B5, micro sign) - https://symbl.cc/en/00B5/ (already in the plugin)
μ (U+03BC, Greek small letter mu) - https://symbl.cc/en/03BC/ (from the song Bjork - Hunter (μ-Ziq remix))
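A quick way to confirm that these really are two different code points, sketched with Python's standard unicodedata module. The variable names are only for this example. Whether the plugin should map both characters is a separate decision; the snippet only shows that compatibility normalization (NFKC) folds the micro sign into the Greek letter.

```python
import unicodedata

micro_sign = "\u00b5"  # µ, the character already in the plugin
greek_mu = "\u03bc"    # μ, the character from the track title

# The two characters look alike but have different names and code points.
print(unicodedata.name(micro_sign))
print(unicodedata.name(greek_mu))
print(micro_sign == greek_mu)

# NFKC replaces the micro sign with its compatibility equivalent, Greek mu.
print(unicodedata.normalize("NFKC", micro_sign) == greek_mu)
```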
@Echelon666 as @phw said, in the current state of this plugin we will almost always miss some characters. That's why he proposed adding scripting functionality, so that users will be able to define missing characters or change transliterations themselves. I'd be willing to add scripting functionality, but for now I should focus on defending my thesis.
Sure. ;) I thought this was important. ;)
As the title says, I've added some Polish diacritics and also extended the list of filtered tags.