Hungarian Poem (János vitéz/John the Valiant by Sándor Petőfi) Reading Speech and Aligned
Text Selection Database
Database of text portions and the corresponding audio of a Hungarian poem. (The audio data is not stored in this database but can be freely downloaded from librivox.org.) The recordings are segmented at speech pauses, which do not necessarily correspond to sentence boundaries. The reading is mostly, but not completely, accurate. Hence, an automatic speech recognizer was used to select only those segments where the automatic recognition result closely matches the original text. Thus the database comprises only those segments that are considered to have a reliable transcription. The database can be used in speech technology research, in phonetic and phonological research, and for developing and testing speech recognition systems.
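The selection step described above can be illustrated with a minimal sketch. The description does not state which similarity measure or threshold was used, so everything here is an assumption: the word-level ratio from Python's difflib, the 0.9 threshold, and the hypothetical names match_ratio, select_reliable, and the "text"/"asr" fields. The sample segments are made up for illustration and are not taken from the database.

```python
from difflib import SequenceMatcher


def match_ratio(reference: str, hypothesis: str) -> float:
    """Word-level similarity in [0, 1] between the original text and an ASR result.

    Assumption: a simple difflib ratio stands in for whatever match
    measure the database authors actually used.
    """
    ref_words = reference.lower().split()
    hyp_words = hypothesis.lower().split()
    # autojunk=False keeps difflib from discarding frequent words as "junk".
    return SequenceMatcher(None, ref_words, hyp_words, autojunk=False).ratio()


def select_reliable(segments, threshold=0.9):
    """Keep only segments whose ASR output closely matches the original text.

    The 0.9 threshold is a hypothetical value, not one given in the source.
    """
    return [
        seg for seg in segments
        if match_ratio(seg["text"], seg["asr"]) >= threshold
    ]


# Illustrative segments only: "text" is the original text of a pause-delimited
# segment, "asr" is a made-up recognizer output for the same audio.
segments = [
    {"id": "seg_001", "text": "a b c d", "asr": "a b c d"},  # perfect match
    {"id": "seg_002", "text": "a b c d", "asr": "a b x d"},  # one word misrecognized
]

reliable = select_reliable(segments)
print([seg["id"] for seg in reliable])
```

With these inputs only the first segment passes the threshold; the second, with one of four words differing, scores 0.75 and is dropped. A stricter or looser threshold trades database size against transcription reliability.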