31 May Croatian and Serbian lemmatiser [legacy]
This tool is considered legacy: the NLP pipeline achieves better results on the same task, although the pipeline is not yet available as a web service.
A tool for automatic lemmatisation, i.e. returning the base (dictionary) form of an inflected word. The tool first looks the word up in the hrLex/srLex lexicons. For out-of-vocabulary words (OOVs), which are not found in the lexicons, it uses a predictive model trained on available corpora and lexicons.
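The lookup-then-predict strategy described above can be illustrated with a minimal Python sketch. This is not the lemmatiser's actual code or model: the six-entry `TOY_LEXICON` merely stands in for hrLex/srLex, and a simple suffix-rewrite guesser (an assumption made for illustration) stands in for the trained predictive model. All names in the block are hypothetical.

```python
from collections import Counter, defaultdict

# Toy stand-in for a lexicon such as hrLex: inflected form -> lemma.
# Illustrative entries only; the real lexicons are far larger.
TOY_LEXICON = {
    "kuće": "kuća",
    "kući": "kuća",
    "žene": "žena",
    "ženi": "žena",
    "gradovi": "grad",
    "zidovi": "zid",
}


def suffix_rule(form, lemma):
    """Strip the shared prefix and return (form_suffix, lemma_suffix)."""
    i = 0
    while i < min(len(form), len(lemma)) and form[i] == lemma[i]:
        i += 1
    return form[i:], lemma[i:]


def train_suffix_rules(lexicon):
    """Count how often each form suffix is rewritten to each lemma suffix."""
    rules = defaultdict(Counter)
    for form, lemma in lexicon.items():
        form_suffix, lemma_suffix = suffix_rule(form, lemma)
        rules[form_suffix][lemma_suffix] += 1
    return rules


def lemmatise(token, lexicon, rules):
    """Look the token up in the lexicon; fall back to suffix rules for OOVs."""
    key = token.lower()
    if key in lexicon:
        return lexicon[key]
    # OOV: try the longest known form suffix first, keeping at least
    # one character of the stem.
    for start in range(1, len(key)):
        suffix = key[start:]
        if suffix in rules:
            lemma_suffix = rules[suffix].most_common(1)[0][0]
            return key[:start] + lemma_suffix
    # No rule applies: return the (lowercased) token unchanged.
    return key


RULES = train_suffix_rules(TOY_LEXICON)

print(lemmatise("kuće", TOY_LEXICON, RULES))     # in lexicon -> kuća
print(lemmatise("škole", TOY_LEXICON, RULES))    # OOV, rule e -> a -> škola
print(lemmatise("mostovi", TOY_LEXICON, RULES))  # OOV, rule ovi -> "" -> most
```

A guesser this simple mislemmatises many words (any OOV ending in "e" is rewritten to end in "a", for example), which is one reason a model trained on corpora and lexicons is used for OOVs in practice.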
- For local use, the code and models of the lemmatiser can be downloaded from this GitHub repository.
- The lemmatiser web service can be used online via our web interface, which can be found here.
- Our web service can also be accessed through our Python library, which can be downloaded from the CLARIN.SI GitHub repository. Instructions on how to install the ReLDI library from GitHub can be found here (in Serbian). Alternatively, the easiest way to install it is through PyPI from the command line; detailed instructions are also available on GitHub.
The third option, i.e. using the ReLDI Python library, is the one we recommend for handling larger amounts of data.