Dear Fanny,
Just to complement Salva's response: in the LivingNER corpus, human entities have been normalized to the NCBI Taxonomy code 9606, so you can simply select the mentions linked to this species code. This corpus also has "silver standard" versions in English, French, Italian, Portuguese, Romanian, Catalan and Galician, which can be freely accessed from:
https://zenodo.org/records/7684093
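In case it helps, here is a minimal sketch of that selection step in Python. It assumes the annotations come as a tab-separated file with a header row; the column names `span` and `code`, the `|` and `+` separators for composite codes, and the sample rows are all assumptions for illustration, so please check them against the actual files in the release.

```python
import csv
import io

HUMAN_TAXON_ID = "9606"  # NCBI Taxonomy ID for Homo sapiens


def human_mentions(tsv_file, code_column="code", span_column="span"):
    """Yield the mention text of every row whose code field contains 9606.

    The column names are assumptions; adjust them to the real file header.
    """
    reader = csv.DictReader(tsv_file, delimiter="\t")
    for row in reader:
        # A mention may carry several codes; the "|" and "+" separators
        # handled here are an assumption about the file format.
        codes = row[code_column].replace("+", "|").split("|")
        if HUMAN_TAXON_ID in codes:
            yield row[span_column]


# Made-up rows, used only to show the filtering logic.
sample = io.StringIO(
    "filename\tspan\tcode\n"
    "case1\tpaciente\t9606\n"
    "case1\tEscherichia coli\t562\n"
    "case2\tmadre\t9606\n"
)
print(sorted(set(human_mentions(sample))))
```

With a real file, the `io.StringIO` object would be replaced by something like `open(path, encoding="utf-8", newline="")`. Deduplicating the resulting mentions gives a first rough list of human-denoting expressions, though clinical reports will naturally skew it towards words such as "patient".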
Best regards,
Martin
On Mon, 6 Nov 2023 at 12:09, Salvador Lima via Corpora (<corpora@list.elra.info>) wrote:
Dear Fanny,
You might be interested in the LivingNER corpus, which includes a label for humans mentioned in a collection of clinical case reports in Spanish. We also released some multilingual Silver Standard versions of the corpus, generated automatically, which are not perfect but might be useful anyway. It does not include any of the features you mention, though. Here is the link: https://temu.bsc.es/livingner/
There is another clinical corpus that could be used to incorporate some of those features. The MEDDOPROF corpus (https://temu.bsc.es/meddoprof/) annotates mentions of occupations and working statuses, which quite often overlap with human mentions (not always, as there are also job descriptions and more).
Hope it helps!
Best wishes, Salva
On Mon, 6 Nov 2023 at 12:02, fanny.ducel--- via Corpora (corpora@list.elra.info) wrote:
Dear all,
For an experiment, I need some resources that include semantic annotations for inflectional languages (especially Italian, German, and Spanish). More precisely, I would need a list of common nouns that refer to human entities in these languages (i.e. "man", "woman", "teenager", "uncle", "baker", "liar", ...). If the annotations also include information on gender, it would be even better.
For instance, for French, DELA (UNITEX dictionaries), Démonette, or FrSemCor are appropriate, but their counterparts in other languages (especially for DELA) do not include the semantic annotations I am looking for. To give you a more concrete idea, in the mentioned French resources, we can find lexical entities followed by annotations such as ":Person", "+Hum", "+Profession" (occupation) or "@AGM/@AGF" (masculine/feminine agent). Do you have any ideas or suggestions?
So far, the relevant resources I found are either under a prohibitive license or the links are not working anymore. I am looking for resources that are free for non-commercial, academic use.
Thanks a lot and have a nice week,
Fanny Ducel - PhD Student at LISN, Université Paris-Saclay (France) - fanny.ducel@lisn.fr
_______________________________________________
Corpora mailing list -- corpora@list.elra.info
https://list.elra.info/mailman3/postorius/lists/corpora.list.elra.info/
To unsubscribe send an email to corpora-leave@list.elra.info