[Apologies for multiple postings]
We are happy to announce that one new phonetic database and one new speech corpus are available in our catalogue.
*Comprehensive Arabic Phonetic Database https://catalog.elra.info/en-us/repository/browse/ELRA-S0493/*
ISLRN: 511-751-240-544-8 https://islrn.org/resources/511-751-240-544-8/
The Comprehensive Arabic Phonetic Database is a detailed linguistic resource offering both phonemic and phonetic transcriptions that reflect how Modern Standard Arabic words are realized in actual speech. It covers over 329,000 entries, including over 61,000 general vocabulary entries, 101,000 Arab personal names, 143,000 foreign personal names in Arabic, and 21,000 worldwide place names, both Arab and non-Arab. Each entry consists of canonical forms, both vocalized and unvocalized (as in natural language), accompanied by phonetic transcriptions in IPA and X-SAMPA and by the user-friendly CARS phonemic transcription system. Additional features include explicit indication of vowel neutralization, accurate word stress, gender and number codes (singular or plural), and POS (part-of-speech) codes. The database is provided as a flat TSV text file.
See also the *DiaLEX https://catalog.elra.info/en-us/repository/search/?q=dialex* and *ArabLEX https://catalog.elra.info/en-us/repository/search/?q=arablex* collections for Arabic from the same provider…
*EthioSpeech https://catalog.elra.info/en-us/repository/browse/ELRA-S0494/*
ISLRN: 886-456-351-764-8 https://islrn.org/resources/886-456-351-764-8/
The EthioSpeech corpora comprise over 391 hours of recorded read speech in six Ethiopian languages, with ca. 200 speakers per language: Amharic (68 hours), Tigrigna (62 hours), Oromo (70 hours), Somali (56 hours), Afar (68 hours), and Sidama (68 hours). The dominant domain is media (mainly newspapers), but for some of the languages texts from other domains were used, including spiritual content. The recordings were made on mobile devices using the LIG-Aikuma speech recording tool installed on the devices. The gender and age balance of readers is nearly even for Amharic, Tigrigna and Oromo, whereas the readers for the other three languages are mainly male. Speakers are between 18 and 40 years old.
For more information on the catalogue or if you would like to enquire about having your resources distributed by ELRA, please *contact us* mailto:contact@elda.org.
_________________________________________
Visit the *ELRA Catalogue of Language Resources* http://catalog.elra.info
*Archives* https://www.elra.info/catalogues/language-resources-announcements/ of ELRA Language Resources Catalogue Updates