Hi Luis,
I don't know if this could be useful to you, but the DBnary dataset currently contains phonetic (IPA) transcriptions for many entries.
DBnary is linked data and can be explored through its public endpoint using the SPARQL query language: http://kaiko.getalp.org/sparql
For instance, the following query will tell you how many phonetic representations are available in each language.
    SELECT ?lang COUNT(?pr)
    WHERE {
      [] ontolex:phoneticRep ?pr .
      BIND (lang(?pr) AS ?lang)
    }
    GROUP BY ?lang
    ORDER BY DESC(COUNT(?pr))
This will give you a long table; I only include the first lines here (results are ordered by the number of phoneticRep values).
    lang          callret-1
    fr-fonipa     2657875
    en-fonipa     663697
    ru-fonipa     389891
    de-fonipa     230875
    fi-fonipa     199269
    es-fonipa     187090
    la-fonipa     171134
    it-fonipa     154881
    pl-fonipa     136446
    sh-fonipa     116478
    pt-fonipa     90199
    ca-fonipa     86385
    eo-fonipa     84626
    avk-fonipa    73459
    es-ipa        72652
    vi-fonipa     72147

As the data is continuously extracted from the Wiktionaries, the numbers will evolve. Several language extractors do not yet extract the phonetic representation, so feel free to file a feature request on the DBnary bug tracker.
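If you would rather run the query from a script than from the web form, here is a minimal Python sketch using only the standard library. Several things in it are my assumptions rather than anything DBnary prescribes: the helper names (build_url, parse_counts, fetch_counts), the explicit PREFIX line (the usual W3C OntoLex namespace), the ?n alias that makes the query standard SPARQL 1.1, and the expectation that the endpoint returns SPARQL JSON results when asked via the Accept header.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "http://kaiko.getalp.org/sparql"

# Same query as in the message, with two assumed adjustments:
# an explicit PREFIX and a named aggregate (?n) instead of an unnamed one.
QUERY = """
PREFIX ontolex: <http://www.w3.org/ns/lemon/ontolex#>
SELECT ?lang (COUNT(?pr) AS ?n)
WHERE {
  [] ontolex:phoneticRep ?pr .
  BIND (lang(?pr) AS ?lang)
}
GROUP BY ?lang
ORDER BY DESC(?n)
"""


def build_url(endpoint, query):
    """Build a SPARQL-protocol GET URL carrying the query as a parameter."""
    return endpoint + "?" + urllib.parse.urlencode({"query": query})


def parse_counts(results):
    """Turn a SPARQL JSON result document into (language tag, count) pairs."""
    rows = []
    for binding in results["results"]["bindings"]:
        # ?lang can be unbound for a row; fall back to an empty tag.
        lang = binding.get("lang", {}).get("value", "")
        rows.append((lang, int(binding["n"]["value"])))
    return rows


def fetch_counts(endpoint=ENDPOINT, query=QUERY, timeout=60):
    """Send the query to the endpoint and return the parsed counts."""
    request = urllib.request.Request(
        build_url(endpoint, query),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_counts(json.load(response))
```

Calling fetch_counts() should return the rows in the same order as the table above, with counts that reflect the dataset at the time of the call.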
More info at:
http://kaiko.getalp.org/about-dbnary/
Regards,
Gilles,
On 7 Sep 2022, at 16:26, Luis Camacho Caballero camacho.l@pucp.edu.pe wrote:
Dear colleagues
I'm devoted to the revitalization and massification of the Andean Amazonian native language with computational processing as a key enabler.
Among the many tasks to do, I am currently working on the creation of neologisms. That is why I'm looking for the largest multilingual dictionary of phonetic spellings, even better if that database includes Asian languages (Mandarin, Japanese, Korean, Hindi, Urdu, etc.).
If you have this kind of database, I kindly ask you to give me access; if you don't, I'd appreciate any clue about where and/or how to access one.
Kind regards
Luis Camacho https://orcid.org/0000-0001-6569-550X
Corpora mailing list -- corpora@list.elra.info https://list.elra.info/mailman3/postorius/lists/corpora.list.elra.info/ To unsubscribe send an email to corpora-leave@list.elra.info