Austrian SpeechDat(AT) MDB-1000 database
The Austrian SpeechDat(AT) MDB-1000 database contains the recordings of 1,000 Austrian speakers (543 male, 457 female) recorded over the Austrian mobile telephone network. The database is distributed on 5 CD-ROMs in ISO 9660 format.
Speech samples are stored as uncompressed sequences of 8-bit, 8 kHz A-law values. Each prompted utterance is stored in a separate file, and each signal file is accompanied by an ASCII SAM label file which contains the relevant descriptive information.
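As an illustration of the storage format, the following minimal sketch expands 8-bit A-law bytes to 16-bit linear PCM using the standard G.711 A-law expansion. It assumes that a signal file is a headerless stream of A-law bytes at 8,000 samples per second; the file layout, the function names and the file name in the usage note are assumptions made for this example, not details taken from the database documentation.

```python
SAMPLE_RATE_HZ = 8000  # 8 kHz, as stated in the description


def alaw_to_linear(code: int) -> int:
    """Expand one 8-bit A-law code to a signed 16-bit linear sample (G.711)."""
    code ^= 0x55                      # undo the even-bit inversion applied by A-law
    mantissa = (code & 0x0F) << 4     # 4-bit quantisation step within the segment
    segment = (code & 0x70) >> 4      # 3-bit segment (exponent) number
    if segment == 0:
        magnitude = mantissa + 8
    else:
        magnitude = (mantissa + 0x108) << (segment - 1)
    # In A-law a set sign bit marks a positive sample.
    return magnitude if code & 0x80 else -magnitude


# Precompute all 256 possible codes once; decoding is then a table lookup.
ALAW_TABLE = [alaw_to_linear(b) for b in range(256)]


def decode_alaw(data: bytes) -> list[int]:
    """Decode a headerless A-law byte stream into linear PCM samples."""
    return [ALAW_TABLE[b] for b in data]


def duration_seconds(data: bytes) -> float:
    """One byte per sample, so duration is the byte count over the sample rate."""
    return len(data) / SAMPLE_RATE_HZ
```

A signal file could then be read with `open("utterance.raw", "rb").read()` (a hypothetical file name) and passed to `decode_alaw`. The accompanying SAM label file is plain ASCII and is read separately.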
This speech database was validated by SPEX (the Netherlands) to assess its compliance with the SpeechDat format and content specifications.
Each speaker uttered the following items:
* 3 isolated digits
* 4 connected digits (prompt sheet number: 5 digits, telephone number: 9/11 digits, credit card number: 15/16 digits, PIN code: 6 digits)
* 1 natural number
* 2 money amounts (currency amount, mixed size and units)
* 2 yes/no questions (predominantly "yes", predominantly "no")
* 3 dates (spontaneous date e.g. birthday, prompted date, relative and general date expression)
* 2 times (spontaneous time of day, prompted time in mixed analogue/digital form)
* 6 application words
* 1 word spotting phrase using embedded application words
* 7 directory assistance names (spontaneous names e.g. forenames, city of birth, a name out of a set of 150 SDB full names, most frequent cities, most frequent companies)
* 3 spellings (spontaneous e.g. forename, directory city name, real/artificial city name)
* 4 isolated words
* 12 phonetically rich sentences
* 7 items of speaker-specific material (speaker gender question, call from fixed or mobile network, speaker region question, today’s date, environment of call, native language, educational level)
The following age distribution has been obtained: 18 speakers are under 16, 550 are between 16 and 30, 262 are between 31 and 45, 157 are between 46 and 60, and 13 speakers are over 60.
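The figures above can be cross-checked with a few lines of arithmetic. The sketch below simply re-enters the counts quoted in this description, confirms that both the age bands and the gender split add up to the 1,000 speakers, and derives the percentage share of each age band. The variable names are chosen for this example only.

```python
# Speaker counts as quoted in this description.
age_counts = {
    "under 16": 18,
    "16-30": 550,
    "31-45": 262,
    "46-60": 157,
    "over 60": 13,
}
gender_counts = {"male": 543, "female": 457}

total_by_age = sum(age_counts.values())
total_by_gender = sum(gender_counts.values())

# Share of each age band as a percentage of all speakers.
age_shares = {band: 100 * n / total_by_age for band, n in age_counts.items()}

for band, share in age_shares.items():
    print(f"{band:>8}: {age_counts[band]:4d} speakers ({share:.1f}%)")
print(f"total by age: {total_by_age}, total by gender: {total_by_gender}")
```

Both totals come to 1,000, and the 16 to 30 band alone accounts for 55% of the speakers.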
A pronunciation lexicon with a phonemic transcription in SAMPA is also included.