TC-STAR English Training Corpora for ASR: Transcriptions of EPPS Speech
Corpus d’entraînement TC-STAR anglais pour l’ASR: Transcriptions EPPS
ID: ELRA-S0249
TC-STAR is a European integrated project focusing on all core technologies for Speech-to-Speech Translation (SST): Automatic Speech Recognition (ASR), Spoken Language Translation (SLT), and Text-to-Speech Synthesis (TTS).
This corpus consists of transcriptions of 92 hours of EPPS (European Parliament Plenary Sessions) speeches delivered or interpreted in European English (a mixture of native and non-native English). The recordings (not included in the present package) were obtained from Europe by Satellite (http://europa.eu.it/comm/ebs) between May 2004 and May 2006. The corpus comprises 63 transcription files, stored in Transcriber XML file format.
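As an illustration only, the sketch below shows one way transcription files in Transcriber XML format might be read into time-stamped segments. It assumes the commonly used Transcriber layout (Turn elements carrying a speaker attribute, with Sync elements marking segment start times); the sample text is hand-written and not taken from the corpus, and the function name read_segments is hypothetical. Actual files may contain further elements (for example event or comment tags inside a turn) that this simplified reader does not handle.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the style of a Transcriber file.
# Hypothetical content: it is NOT taken from the corpus.
SAMPLE = """<Trans version="1">
<Speakers>
<Speaker id="spk1" name="speaker one"/>
</Speakers>
<Episode>
<Section type="report" startTime="0" endTime="5.0">
<Turn speaker="spk1" startTime="0" endTime="5.0">
<Sync time="0"/>
thank you mister president
<Sync time="2.5"/>
the debate is closed
</Turn>
</Section>
</Episode>
</Trans>"""


def read_segments(xml_text):
    """Return (start_time, speaker_id, text) tuples, one per Sync segment.

    Assumption: the text of a segment directly follows its Sync element
    and is not interrupted by other elements.
    """
    root = ET.fromstring(xml_text)
    segments = []
    for turn in root.iter("Turn"):
        speaker = turn.get("speaker")
        for sync in turn.iter("Sync"):
            # The transcribed words are the "tail" text after the empty Sync tag.
            text = (sync.tail or "").strip()
            if text:
                segments.append((float(sync.get("time")), speaker, text))
    return segments


if __name__ == "__main__":
    for start, speaker, text in read_segments(SAMPLE):
        print(start, speaker, text)
```

For a file on disk, `ET.parse(path).getroot()` could be used in place of `ET.fromstring`, so that the parser honours the encoding declared in the file.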
The speech databases made within the TC-STAR project were validated by SPEX, in the Netherlands, to assess their compliance with the TC-STAR format and content specifications.
For corresponding recordings, see ELRA-S0251.