This corpus comprises the audio and the manual transcriptions of the LECTRA Corpus (classroom LECture TRAnscriptions in European Portuguese). It includes seven one-semester university courses. All lectures were taught at the Technical University of Lisbon (IST) and recorded in the presence of students, except IICT, which was recorded at another university in a quiet office environment and targeted an Internet audience. The corpus contains a total of 21 hours of speech, manually transcribed by several trained annotators.
For a complete description of the corpus and a report of Automatic Speech Recognition results, see Trancoso et al. (2008) and Pellegrini et al. (2012).