Morphosyntactic tagset converter for positional tagsets
TaCo is a statistical morphosyntactic tagset converter designed for positional tagsets, particularly those used for Polish. Its typical use is to convert the manual annotation of a corpus from one tagset to another. It is based on decision trees produced by the C5.0 algorithm and additionally makes use of the morphological analyzer Morfeusz. The tool can be configured to convert between various pairs of tagsets and, with some additional effort, can be modified to use a different morphological analyzer. The converter comes with an example configuration and a trained model for conversion from the IPIPAN Corpus tagset to the National Corpus of Polish tagset.
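
For readers unfamiliar with positional tagsets, the sketch below shows one way such a tag could be split into fixed attribute slots of the kind a decision-tree learner can consume. It is an illustration only and not TaCo's actual code or API: the function names, the colon separator, the padding symbol, and the sample tag `subst:sg:nom:m1` are all assumptions made for the example.

```python
from typing import Dict, List, Tuple


def split_tag(tag: str, sep: str = ":") -> Tuple[str, List[str]]:
    """Split a positional tag into its grammatical class and attribute values.

    Hypothetical helper: assumes the first field is the grammatical class
    and the remaining fields are attribute values separated by `sep`.
    """
    parts = tag.split(sep)
    return parts[0], parts[1:]


def tag_features(tag: str, max_positions: int = 4, pad: str = "_") -> Dict[str, str]:
    """Turn a tag into a fixed-width feature dictionary.

    Tags with fewer than `max_positions` attribute values are padded with
    `pad`, so every tag yields the same set of feature names.
    """
    grammatical_class, values = split_tag(tag)
    padded = (values + [pad] * max_positions)[:max_positions]
    features = {"class": grammatical_class}
    for index, value in enumerate(padded, start=1):
        features["pos%d" % index] = value
    return features


if __name__ == "__main__":
    # Illustrative tag: a noun, singular, nominative, masculine gender.
    print(tag_features("subst:sg:nom:m1"))
```

A fixed-width representation is used here because decision-tree learners generally expect every training case to have the same attributes; how TaCo itself encodes tags for C5.0 is defined by its configuration, not by this sketch.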