We are very happy to release
Qabas - Open-Source Lexicographic Database
Qabas = 60k lemmas, manually linked with 12 corpora (2.3M tokens) and 110 lexicons (~300k lemmas)
Birzeit University's SinaLab for Computational Linguistics and Artificial Intelligence (https://sina.birzeit.edu/) has officially launched Qabas (https://sina.birzeit.edu/qabas), an open-source lexicographic database for Arabic designed specifically for Natural Language Processing (NLP) applications. Qabas stands out by linking its lexical entries (lemmas) with lemmas from 110 different lexicons and with numerous morphologically annotated corpora (around 2 million tokens), creating an extensive lexicographic graph. The project has been under development for over fourteen years.
Lexicons have evolved from being primarily hard-copy resources for human use to having substantial significance in NLP applications. Although Arabic is a highly resourced language in terms of traditional lexicons, not enough attention has been given to developing AI-oriented lexicographic databases. Additionally, none of these traditional lexicons are available as open source, due to copyright restrictions imposed by their owners.
Qabas, by contrast, is an open-source Arabic lexicon designed for NLP applications, and its novelty lies in its synthesis of many lexical resources. Each lexical entry (i.e., lemma) in Qabas is linked with equivalent lemmas in 110 other lexicons and with 12 morphologically annotated corpora (about 2M tokens). The philosophy of Qabas is to construct a large lexicographic data graph by linking existing Arabic lexicons and annotated corpora. Qabas stands as the largest Arabic lexicon, encompassing about 58K lemmas (45K nominal lemmas, 12.5K verbal lemmas, and 500 function-word lemmas).
Prof. Mustafa Jarrar, the project's manager and main author, emphasized the importance of making Qabas freely available as an open-source resource, allowing everyone to access and use it for both commercial and non-commercial purposes. Prof. Jarrar hopes that researchers, companies, and software developers will leverage the lexicon's data to develop innovative content and applications that benefit humanity.
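For readers who think in code, the linking idea described above can be pictured as an undirected graph whose nodes are lemmas from different resources. The following is only a minimal sketch under stated assumptions: the class names, field names, resource names, and identifiers are all hypothetical, the sample entries are placeholders and not actual Qabas records, and nothing here reflects the real Qabas schema or download format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Lemma:
    """A lexical entry, identified by the resource it comes from (hypothetical model)."""
    resource: str  # e.g. "qabas", or a placeholder lexicon/corpus name
    lemma_id: str  # identifier within that resource (placeholder values below)
    form: str      # the lemma form
    pos: str       # "noun", "verb", or "function_word"


class LexicographicGraph:
    """Undirected graph: an edge means two lemmas are considered equivalent."""

    def __init__(self):
        self._links = defaultdict(set)

    def link(self, a, b):
        # Store the link in both directions so lookups work from either side.
        self._links[a].add(b)
        self._links[b].add(a)

    def equivalents(self, lemma, resource=None):
        # Return linked lemmas, optionally restricted to a single resource.
        linked = self._links.get(lemma, set())
        if resource is None:
            return set(linked)
        return {other for other in linked if other.resource == resource}


# Placeholder entries for illustration only (not real Qabas data).
graph = LexicographicGraph()
qabas_entry = Lemma("qabas", "q-1", "كتاب", "noun")
lexicon_entry = Lemma("lexicon_A", "a-17", "كتاب", "noun")
corpus_entry = Lemma("corpus_B", "b-204", "كتاب", "noun")

graph.link(qabas_entry, lexicon_entry)
graph.link(qabas_entry, corpus_entry)

print(graph.equivalents(qabas_entry, resource="lexicon_A"))
```

In this sketch the Qabas lemma acts as a hub: starting from it, one can reach the equivalent entry in each linked lexicon or the lemma annotation used in each linked corpus. Modeling a corpus link as a lemma node is a simplification chosen for brevity.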
Prof. Talal Shahwan, President of Birzeit University, stated that despite the challenging conditions in Palestine, the university remains committed to excellence and to its mission towards knowledge. He emphasized that this achievement was made possible by the dedication of the university's faculty and researchers.
Qabas is publicly available online at: https://sina.birzeit.edu/qabas
To download Qabas and find out more, see: https://sina.birzeit.edu/qabas/about
Article: https://www.jarrar.info/publications/JH24.pdf
We'd love your feedback:
Facebook: https://www.facebook.com/watch?v=880418097306662
LinkedIn: https://www.facebook.com/watch?v=880418097306662
Best
--Mustafa
__________________________
Mustafa Jarrar, PhD
Professor of Artificial Intelligence
Chair, PhD Program in Computer Science
Birzeit University, Palestine
Page: http://www.jarrar.info
SinaLab: https://sina.birzeit.edu