This is the code for the paper Simple ways to improve NER in every language using markup. The paper explores several quick approaches to improving the performance of NER systems. The experiments were run on several languages, including English, Spanish, Croatian and Finnish.
An improved version of the code can be found at https://github.com/EMBEDDIA/NER_FEDA, where we also explore Frustratingly Easy Domain Adaptation for NER.
Please cite this work using the following paper:
@inproceedings{cabrera-diego_simple_2021,
address = {Ljubljana, Slovenia},
title = {Simple ways to improve {NER} in every language using markup},
volume = {2829},
url = {http://ceur-ws.org/Vol-2829/paper2.pdf},
booktitle = {Proceedings of the 2nd {International} {Workshop} on {Cross}-lingual {Event}-centric {Open} {Analytics} ({CLEOPATRA} 2021)},
publisher = {CEUR-WS},
author = {Cabrera-Diego, Luis Adrián and Moreno, Jose G. and Doucet, Antoine},
editor = {{Elena Demidova} and {Sherzod Hakimov} and {Jane Winters} and {Marko Tadić}},
year = {2021},
pages = {17--31}
}
We cleaned the code before making it public. If you find a bug, please let us know by raising an issue.
This work is a result of the European Union H2020 project Embeddia. Embeddia creates NLP tools focused on under-represented European languages and aims to make these tools more accessible to the general public and to media enterprises. Visit Embeddia's GitHub to discover more NLP tools and models created within this project.