Dear all,
We are happy to release six corpora (1.3 million tokens) with full morphological annotations for the Palestinian, Lebanese, Yemeni, Iraqi, Libyan, and Sudanese dialects. All are annotated using the LDC’s SAMA tagsets.
Search: https://portal.sina.birzeit.edu/curras
Download: https://portal.sina.birzeit.edu/curras/about-en.html
This video demonstrates how to search the corpora in Arabic/English.
https://twitter.com/mjarrar/status/1604078695068598273
#arabic_language_day We are very happy to release 6 Arabic dialect corpora (1.3 million tokens, morphologically annotated): Curras (Palestinian), Baladi (Lebanese), Lisani (Yemeni, Iraqi, Libyan, Sudanese) by @UN, @BirzeitU and @AUB_Lebanon. https://t.co/ZP3hqVSRWc
Best
--Mustafa
__________________________
Mustafa Jarrar, PhD
Professor of Artificial Intelligence
Chair, PhD Program in Computer Science
Birzeit University, Palestine
Whatsapp:+972599662258 | mjarrar@birzeit.edu
http://www.jarrar.info