LC-STAR English-Slovenian Bilingual Aligned Phrasal lexicon
The LC-STAR English-Slovenian Bilingual Aligned Phrasal lexicon was created within the scope of the LC-STAR project (IST 2001-32216), which was sponsored by the European Commission. It was designed for speech-to-speech translation (SST).
The lexicon comprises 12,722 phrases from the tourist domain. It is based on a list of short sentences obtained by translation from a 10,522-phrase US-English corpus. The total number of unique words is 43,209.
The lexicon contains the following information:
- US-English phrase (orthography),
- its translation into Slovenian (orthography),
For each Slovenian token in a phrase, the lexicon provides the following:
- orthography of a word,
- part of speech,
- whether the phrase is idiomatic or not,
- whether the word is a foreign word. In this lexicon, foreign words were tagged only if they were written with foreign orthography (e.g. English characters).
The lexicon is provided in XML format. The database is stored on 1 CD.
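As a rough illustration of how the fields listed above might be read from an XML file, the following Python sketch parses a small sample. The element and attribute names (entry, source, target, token, pos, idiomatic, foreign), the part-of-speech labels and the sample phrase are hypothetical and chosen only for this example; the actual LC-STAR DTD is not described here and may differ.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List

# Hypothetical layout for illustration only; the real LC-STAR markup may differ.
SAMPLE = """
<lexicon>
  <entry>
    <source lang="en-US">good morning</source>
    <target lang="sl">dobro jutro</target>
    <token pos="ADJ" idiomatic="no" foreign="no">dobro</token>
    <token pos="NOUN" idiomatic="no" foreign="no">jutro</token>
  </entry>
</lexicon>
"""


@dataclass
class Token:
    orthography: str  # written form of the Slovenian word
    pos: str          # part-of-speech label
    idiomatic: bool   # idiomatic flag
    foreign: bool     # True if written with foreign orthography


@dataclass
class Entry:
    english: str      # US-English phrase
    slovenian: str    # Slovenian translation
    tokens: List[Token]


def parse_lexicon(xml_text: str) -> List[Entry]:
    """Parse the assumed XML layout into a list of Entry objects."""
    root = ET.fromstring(xml_text.strip())
    entries = []
    for node in root.iter("entry"):
        tokens = [
            Token(
                orthography=(t.text or "").strip(),
                pos=t.get("pos", ""),
                idiomatic=t.get("idiomatic") == "yes",
                foreign=t.get("foreign") == "yes",
            )
            for t in node.iter("token")
        ]
        entries.append(
            Entry(
                english=(node.findtext("source") or "").strip(),
                slovenian=(node.findtext("target") or "").strip(),
                tokens=tokens,
            )
        )
    return entries


entries = parse_lexicon(SAMPLE)
for entry in entries:
    print(entry.english, "->", entry.slovenian)
    for tok in entry.tokens:
        print("  ", tok.orthography, tok.pos, tok.idiomatic, tok.foreign)
```

To use this with the distributed files, the tag and attribute names would need to be adapted to the DTD that accompanies the lexicon.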