We are very happy to announce the twenty-second release of annotated
treebanks in Universal Dependencies, v2.16, available at
https://universaldependencies.org/.
Universal Dependencies is a project that seeks to develop
cross-linguistically consistent treebank annotation for many
languages with the goal of facilitating multilingual parser
development, cross-lingual learning, and parsing research from a
language typology perspective (de Marneffe et al., 2021; Nivre et
al., 2020). The annotation scheme is based on (universal) Stanford
dependencies (de Marneffe et al., 2006, 2008, 2014), Google
universal part-of-speech tags (Petrov et al., 2012), and the
Interset interlingua for morphosyntactic tagsets (Zeman, 2008). The
general philosophy is to provide a universal inventory of categories
and guidelines to facilitate consistent annotation of similar
constructions across languages, while allowing language-specific
extensions when necessary.
The 319 treebanks in v2.16 are annotated according to
version 2 of the UD guidelines and represent the following 179
languages: Abaza, Abkhaz, Afrikaans, Akkadian, Akuntsu, Albanian,
Alemannic, Amharic, Ancient Greek, Ancient Hebrew, Apurina, Arabic,
Armenian, Assyrian, Azerbaijani, Bambara, Basque, Bavarian, Beja,
Belarusian, Bengali, Bhojpuri, Bokota, Bororo, Breton, Bulgarian,
Buryat, Cantonese, Cappadocian, Catalan, Cebuano, Chinese, Chukchi,
Classical Armenian, Classical Chinese, Coptic, Croatian, Czech,
Danish, Dutch, Egyptian, English, Erzya, Esperanto, Estonian,
Faroese, Finnish, French, Frisian Dutch, Galician, Georgian, German,
Gheg, Gothic, Greek, Guajajara, Guarani, Gujarati, Gwichin, Haitian
Creole, Hausa, Hebrew, Highland Puebla Nahuatl, Hindi, Hittite,
Hungarian, Icelandic, Ika, Indonesian, Irish, Italian, Japanese,
Javanese, Kaapor, Kangri, Karelian, Karo, Kazakh, Khoekhoe,
Khunsari, Kiche, Komi Permyak, Komi Zyrian, Korean, Kurmanji,
Kyrgyz, Latgalian, Latin, Latvian, Ligurian, Lithuanian, Livvi, Low
Saxon, Luxembourgish, Macedonian, Madi, Maghrebi Arabic French,
Makurap, Malayalam, Maltese, Manx, Marathi, Mbya Guarani, Middle
French, Moksha, Munduruku, Naga, Naija, Nayini, Neapolitan, Nenets,
Nheengatu, North Sami, Northwest Gbaya, Norwegian, Occitan, Odia,
Old Church Slavonic, Old East Slavic, Old English, Old French, Old
Irish, Old Turkish, Ottoman Turkish, Pashto, Paumari, Persian, Pesh,
Phrygian, Polish, Pomak, Portuguese, Romanian, Russian, Sanskrit,
Scottish Gaelic, Serbian, Sindhi, Sinhala, Skolt Sami, Slovak,
Slovenian, Soi, South Levantine Arabic, Spanish, Spanish Sign
Language, Swedish, Swedish Sign Language, Tagalog, Tamil, Tatar,
Teko, Telugu, Telugu English, Thai, Tswana, Tupinamba, Turkish,
Turkish English, Turkish German, Ukrainian, Umbrian, Upper Sorbian,
Urdu, Uyghur, Uzbek, Veps, Vietnamese, Warlpiri, Welsh, Western
Armenian, Western Sierra Puebla Nahuatl, Wolof, Xavante, Xibe,
Yakut, Yoruba, Yupik and Zaar. The 179 languages belong to 35
families: Afro-Asiatic, Arawakan, Arawan, Austro-Asiatic,
Austronesian, Basque, Bororoan, Chibchan, Chukotko-Kamchatkan, Code
switching, Constructed, Creole, Dravidian, Eskimo-Aleut,
Indo-European, Japanese, Kartvelian, Khoe-Kwadi, Korean, Macro-Je,
Mande, Mayan, Mongolic, Na-Dene, Niger-Congo, Northwest Caucasian,
Pama-Nyungan, Sign Language, Sino-Tibetan, Tai-Kadai, Tungusic,
Tupian, Turkic, Uralic and Uto-Aztecan. Depending on the language,
the treebanks range in size from fewer than 1,000 tokens to over 3
million tokens. We expect the next release to be available in
November 2025.
The size of the following 48 treebanks changed significantly since
the last release:
Abkhaz AbNC : 6363 → 9652
Alemannic UZH : 0 → 1444
Ancient Hebrew PTNK : 39036 → 90770
Azerbaijani TueCL : 663 → 912
Bokota ChibErgIS : 0 → 2713
Bororo BDT : 6993 → 160356
Classical Armenian CAVaL : 88009 → 99663
Coptic Bohairic : 0 → 32724
Czech PDT : 1506486 → 0
Czech PDTC : 0 → 3440052
Egyptian UJaen : 14650 → 21927
English CHILDES : 0 → 226470
English GUM : 211920 → 233926
English LinES : 94217 → 106305
Esperanto Cairo : 0 → 177
Esperanto Prago : 0 → 839
French ALTS : 0 → 43832
Georgian GNC : 0 → 18747
Greek Cretan : 0 → 4351
Greek Lesbian : 0 → 3333
Haitian Creole Adolphe : 0 → 71734
Ika ChibErgIS : 0 → 3706
Khoekhoe KDT : 0 → 29007
Korean KSL : 66989 → 108072
Korean LittlePrince : 0 → 13656
Kyrgyz TueCL : 1001 → 1250
Latin CIRCSE : 18968 → 24899
Middle French PROFITEROLE : 12025 → 68454
Naga Suansu : 0 → 3123
Neapolitan RB : 10 → 199
Nenets Tundra : 0 → 651
Nheengatu CompLin : 19278 → 21813
Northwest Gbaya Autogramm : 2417 → 2693
Occitan CorAG : 0 → 37585
Occitan TTB : 0 → 25619
Odia ODTB : 0 → 1029
Old English Cairo : 0 → 171
Ottoman Turkish DUDU : 813 → 10287
Pashto Sikaram : 995 → 2515
Pesh ChibErgIS : 2508 → 4275
Russian Taiga : 197001 → 1758939
Sindhi Isra : 0 → 15741
Swedish LinES : 90961 → 102538
Turkish English BUTR : 0 → 393
Turkish TueCL : 0 → 904
Ukrainian ParlaMint : 51997 → 84189
Uzbek TueCL : 0 → 939
Xavante XDT : 1740 → 2234
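For readers who want to process the list above programmatically, the
following is a minimal sketch, not part of the official UD tooling. It
assumes each entry has the form "Name : old → new" as printed above;
the names parse_change and classify are hypothetical helpers
introduced here for illustration only.

```python
import re

# Assumed entry format, taken from the list above: "Name : old → new".
# Whitespace around the colon and the arrow is treated as optional.
LINE_RE = re.compile(r"^(?P<name>.+?)\s*:\s*(?P<old>\d+)\s*→\s*(?P<new>\d+)$")


def parse_change(line):
    """Split one list entry into (treebank name, old size, new size)."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized entry: {line!r}")
    return match["name"], int(match["old"]), int(match["new"])


def classify(old, new):
    """Label a size change; 0 on either side marks an added or removed treebank."""
    if old == 0:
        return "added"
    if new == 0:
        return "removed"
    if new > old:
        return "grew"
    if new < old:
        return "shrank"
    return "unchanged"


if __name__ == "__main__":
    # A few entries copied from the list above.
    sample = [
        "Czech PDT : 1506486 → 0",
        "Czech PDTC : 0 → 3440052",
        "English GUM : 211920 → 233926",
        "Russian Taiga : 197001 → 1758939",
    ]
    for entry in sample:
        name, old, new = parse_change(entry)
        print(f"{name}: {classify(old, new)} ({new - old:+d})")
```

Under this reading, an entry such as "Czech PDT : 1506486 → 0" is
labeled as removed and "Czech PDTC : 0 → 3440052" as added; the sketch
only reports the numbers and draws no conclusion about how the two
treebanks relate.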
In total, the new release contains 2,263,318 sentences,
36,437,487 surface tokens and 37,158,675 syntactic words.
Daniel Zeman, Joakim Nivre, Mitchell Abrams, Elia Ackermann, Jephtey
Adolphe, Noëmi Aepli, Hamid Aghaei, Željko Agić, Amir Ahmadi, Lars
Ahrenberg, Chika Kennedy Ajede, Arofat Akhundjanova, Furkan Akkurt,
Gabrielė Aleksandravičiūtė, Ika Alfina, Avner Algom, Khalid
Alnajjar, Chiara Alzetta, Antonios Anastasopoulos, Erik Andersen,
Matthew Andrews, Lene Antonsen, Tatsuya Aoyama, Katya Aplonova,
Angelina Aquino, Carolina Aragon, Glyd Aranes, Maria Jesus Aranzabe,
Bilge Nas Arıcan, Þórunn Arnardóttir, Gashaw Arutie, Jessica
Naraiswari Arwidarasti, Masayuki Asahara, Katla Ásgeirsdóttir, Deniz
Baran Aslan, Cengiz Asmazoğlu, Luma Ateyah, Furkan Atmaca, Mohammed
Attia, Aitziber Atutxa, Liesbeth Augustinus, Mariana Avelãs, Elena
Badmaeva, Jana Bajorat, Keerthana Balasubramani, Miguel Ballesteros,
Esha Banerjee, Sebastian Bank, Bryan Khelven da Silva Barbosa,
Verginica Barbu Mititelu, Starkaður Barkarson, Rodolfo Basile,
Victoria Basmov, Colin Batchelor, John Bauer, Seyyit Talha Bedir,
Shabnam Behzad, Juan Belieni, Alevtina Bémová, Kepa Bengoetxea,
İbrahim Benli, Yifat Ben Moshe, Marie Benzerrak, Ansu Berg, Gözde
Berk, Riyaz Ahmad Bhat, Erica Biagetti, Eckhard Bick, Agnė
Bielinskienė, Esma Fatıma Bilgin Taşdemir, Helin Binici, Kristín
Bjarnadóttir, Verena Blaschke, Rogier Blokland, Nina Böbel, Victoria
Bobicev, Loïc Boizou, Stavros Bompolas, Johnatan Bonilla, Emanuel
Borges Völker, Carl Börstell, Cristina Bosco, Gosse Bouma, Sam
Bowman, Adriane Boyd, Anouck Braggaar, António Branco, Myriam Bras,
Kristina Brokaitė, Lanni Bu, Eva Buráňová, Aljoscha Burchardt,
Carmen Cabeza, Natalia Cáceres Arandia, Marisa Campos, Marie
Candito, Bernard Caron, Gauthier Caron, Catarina Carvalheiro, Rita
Carvalho, Lauren Cassidy, Maria Clara Castro, Sérgio Castro, Tatiana
Cavalcanti, Gülşen Cebiroğlu Eryiğit, Flavio Massimiliano Cecchini,
Giuseppe G. A. Celano, Anila Çepani, Slavomír Čéplö, Neslihan Cesur,
Savas Cetin, Özlem Çetinoğlu, Fabricio Chalub, Liyanage Chamila,
Claudine Chamoreau, Shweta Chauhan, Yifei Chen, Ethan Chi, Taishi
Chika, Yongseok Cho, Jinho Choi, Bermet Chontaeva, Jayeol Chun,
Juyeon Chung, Alessandra T. Cignarella, Silvie Cinková, Aurélie
Collomb, Çağrı Çöltekin, Miriam Connor, Claudia Corbetta, Daniela
Corbetta, Francisco Costa, Marine Courtin, Benoît Crabbé, Mihaela
Cristescu, Vladimir Cvetkoski, Netanel Dahan, Ingerid Løyning Dale,
Philemon Daniel, Khensa Daoudi, Bijayalaxmi Dash, Satya Ranjan Dash,
Elizabeth Davidson, Leonel Figueiredo de Alencar, Mathieu Dehouck,
Martina de Laurentiis, Marie-Catherine de Marneffe, Ahmet Demir,
Valeria de Paiva, Mehmet Oguz Derin, Elvis de Souza, Arantza Diaz de
Ilarraza, Roberto Antonio Díaz Hernández, Carly Dickerson, Ariani Di
Felippo, Arawinda Dinakaramani, Elisa Di Nuovo, Bamba Dione, Peter
Dirix, Hoa Do, Kaja Dobrovoljc, Caroline Döhmer, Adrian Doyle,
Timothy Dozat, Kira Droganova, Magali Sanches Duran, Puneet Dwivedi,
Christian Ebert, Hanne Eckhoff, Masaki Eguchi, Sandra Eiche, Roald
Eiselen, Marhaba Eli, Ali Elkahky, Binyam Ephrem, Olga Erina, Tomaž
Erjavec, Louise Esher, Soudabeh Eslami, Farah Essaidi, Aline
Etienne, Wograine Evelyn, Sidney Facundes, Richárd Farkas, Ján
Faryad, Federica Favero, Jannatul Ferdaousi, Marília Fernanda,
Hector Fernandez Alcalde, Amal Fethi, Jennifer Foster, Barbara
Francioni, Theodorus Fransen, Cláudia Freitas, Kazunori Fujita,
Katarína Gajdošová, Daniel Galbraith, Edith Galy, Federica Gamba,
Marcos Garcia, José María García-Miguel, Moa Gärdenfors, Tanja
Gaustad, Efe Eren Genç, Fabrício Ferraz Gerardi, Kim Gerdes, Luke
Gessler, Filip Ginter, Gustavo Godoy, Iakes Goenaga, Koldo Gojenola,
Memduh Gökırmak, Yoav Goldberg, Gili Goldin, Xavier Gómez Guinovart,
Berta González Saavedra, Bernadeta Griciūtė, Matias Grioni, Loïc
Grobol, Normunds Grūzītis, Bruno Guillaume, Kirian Guiller, Céline
Guillot-Barbance, Tunga Güngör, Vladimir Gurevich, Nizar Habash,
Hinrik Hafsteinsson, Michael Hahn, Jan Hajič, Jan Hajič jr., Eva
Hajičová, Mika Hämäläinen, Linh Hà Mỹ, Na-Rae Han, Muhammad
Yudistira Hanifmuti, Takahiro Harada, Sam Hardwick, Kim Harris,
Naïma Hassert, Dag Haug, Jiří Havelka, Johannes Heinecke, Oliver
Hellwig, Felix Hennig, Barbora Hladká, Jaroslava Hlaváčová, Florinel
Hociung, Diana Hoefels, Petter Hohle, Nick Howell, Yidi Huang,
Marivel Huerta Mendez, Jena Hwang, Takumi Ikeda, Inessa Iliadou,
Anton Karl Ingason, Radu Ion, Elena Irimia, Ọlájídé Ishola, Artan
Islamaj, Kaoru Ito, Federica Iurescia, Jessica K. Ivani, Sandra
Jagodzińska, Siratun Jannat, Tomáš Jelínek, Apoorva Jha, Katharine
Jiang, Sylvanus Job, Mayank Jobanputra, Anders Johannsen, Hildur
Jónsdóttir, Fredrik Jørgensen, Zhuoxuan Ju, Markus Juutinen, Hüner
Kaşıkara, Nadezhda Kabaeva, Sylvain Kahane, Hiroshi Kanayama, Jenna
Kanerva, Neslihan Kara, Ritván Karahóǧa, Jiří Kárník, Andre Kåsen,
Tolga Kayadelen, Sarveswaran Kengatharaiyer, Václava Kettnerová,
Lilit Kharatyan, Jesse Kirchner, Elena Klementieva, Elena Klyachko,
Petr Kocharov, Arne Köhn, Abdullatif Köksal, Veronika Kolářová,
Kamil Kopacewicz, Timo Korkiakangas, Mehmet Köse, Alexey Koshevoy,
Nelda Kote, Natalia Kotsyba, Barbara Kovačić, Jolanta Kovalevskaitė,
Emmanuelle Kowner, Simon Krek, Parameswari Krishnamurthy, Sandra
Kübler, Lucie Kučová, Adrian Kuqi, Oğuzhan Kuyrukçu, Aslı Kuzgun,
Sookyoung Kwak, Kris Kyle, Käbi Laan, Veronika Laippala, Lorenzo
Lambertino, Israel Landau, Tatiana Lando, Septina Dian Larasati,
Pierre Larrivée, Alexei Lavrentiev, John Lee, Phương Lê Hồng,
Alessandro Lenci, Saran Lertpradit, Herman Leung, Maria Levina,
Lauren Levine, Cheuk Ying Li, Josie Li, Keying Li, Yixuan Li, Yuan
Li, KyungTae Lim, Bruna Lima Padovani, Yi-Ju Jessica Lin, Krister
Lindén, Yang Janet Liu, Zoey Liu, Nikola Ljubešić, Irina
Lobzhanidze, Olga Loginova, Markéta Lopatková, Lucelene Lopes, Edita
Luftiu, Arsenii Lukashevskyi, Stefano Lusito, Anne-Marie Lutgen,
Andry Luthfi, Mikko Luukko, Olga Lyashevskaya, Teresa Lynn, Vivien
Macketanz, Menel Mahamdi, Jean Maillard, Ilya Makarchuk, Aibek
Makazhanov, Francesco Mambrini, Michael Mandl, Christopher Manning,
Ruli Manurung, Büşra Marşan, Cătălina Mărănduc, David Mareček,
Katrin Marheinecke, Stella Markantonatou, Héctor Martínez Alonso,
Lorena Martín Rodríguez, André Martins, Cláudia Martins, Jan Mašek,
Hiroshi Matsuda, Yuji Matsumoto, Alessandro Mazzei, Ryan McDonald,
Sarah McGuinness, Maitrey Mehta, Pierre André Ménard, Gustavo
Mendonça, Hilla Merhav, Tatiana Merzhevich, Paul Meurer, Niko
Miekka, Marie Mikulová, Emilia Milano, Aleksandra Miletić, Aaron
Miller, Junghyun Min, Yael Minerbi, Jiří Mírovský, Karina
Mischenkova, Anna Missilä, Cătălin Mititelu, Maria Mitrofan, Yusuke
Miyao, Biswakalpita Mohapatra, AmirHossein Mojiri Foroushani, Judit
Molnár, Amirsaeid Moloodi, Simonetta Montemagni, Amir More, Laura
Moreno Romero, Giovanni Moretti, Shinsuke Mori, Tomohiko Morioka,
Shigeki Moro, Bjartur Mortensen, Bohdan Moskalevskyi, Kadri
Muischnek, Robert Munro, Yugo Murawaki, Nikolett Mus, Kaili
Müürisep, Pinkey Nainwani, Mariam Nakhlé, Juan Ignacio Navarro
Horñiacek, Anna Nedoluzhko, Gunta Nešpore-Bērzkalne, Manuela Nevaci,
Lương Nguyễn Thị, Huyền Nguyễn Thị Minh, Yoshihiro Nikaido, Vitaly
Nikolaev, Rattima Nitisaroj, Victor Norrman, Alireza Nourian, Michal
Novák, Maria das Graças Volpe Nunes, Hanna Nurmi, Stina Ojala, Atul
Kr. Ojha, Hulda Óladóttir, Adédayọ̀ Olúòkun, Mai Omura, Emeka
Onwuegbuzia, Noam Ordan, Petya Osenova, Robert Östling, Annika Ott,
Lilja Øvrelid, Masanori Oya, Şaziye Betül Özateş, Merve Özçelik,
Arzucan Özgür, Balkız Öztürk Başaran, Teresa Paccosi, Petr Pajas,
Alessio Palmero Aprosio, Jarmila Panevová, Anastasia Panova, Thiago
Alexandre Salgueiro Pardo, Shantipriya Parida, Hyunji Hayley Park,
Niko Partanen, Elena Pascual, Marco Passarotti, Agnieszka Patejuk,
Guilherme Paulino-Passos, Giulia Pedonese, Oggi Peeters, Angelika
Peljak-Łapińska, Siyao Peng, Siyao Logan Peng, Rita Pereira, Sílvia
Pereira, Cenel-Augusto Perez, Natalia Perkova, Guy Perrier, Slav
Petrov, Daria Petrova, Andrea Peverelli, Jason Phelan, Claudel
Pierre-Louis, Jussi Piitulainen, Yuval Pinter, Clara Pinto, Rodrigo
Pintucci, Tommi A Pirinen, Emily Pitler, Magdalena Plamada, Barbara
Plank, Alistair Plum, Thierry Poibeau, Larisa Ponomareva, Martin
Popel, Clamença Poujade, Lauma Pretkalniņa, Rigardt Pretorius,
Sophie Prévost, Prokopis Prokopidis, Adam Przepiórkowski, Robert
Pugh, Tiina Puolakainen, Christoph Purschke, Sampo Pyysalo, Peng Qi,
Andreia Querido, Andriela Rääbis, Ella Rabinovich, Alexandre
Rademaker, Mutee-u Rahman, Mizanur Rahoman, Taraka Rama, Loganathan
Ramasamy, Carlos Ramisch, Joana Ramos, Fam Rashel, Mohammad Sadegh
Rasooli, Vinit Ravishankar, Livy Real, Petru Rebeja, Siva Reddy,
Mathilde Regnault, Georg Rehm, Arij Riabi, Ivan Riabov, Michael
Rießler, Erika Rimkutė, Larissa Rinaldi, Laura Rituma, Putri
Rizqiyah, Luisa Rocha, Eiríkur Rögnvaldsson, Ivan Roksandic, Norton
Trevisan Roman, Mykhailo Romanenko, Natalia Romanova, Rudolf Rosa,
Valentin Roșca, Paulette Roulon, Davide Rovati, Ben Rozonoyer, Olga
Rudina, Jack Rueter, Paolo Ruffolo, Kristján Rúnarsson, Rozana
Rushiti, Shoval Sadde, Pegah Safari, Aleksi Sahala, Kalyanamalini
Sahoo, Saraswati Sahoo, Shadi Saleh, Alessio Salomoni, Tanja
Samardžić, Konstantinos Sampanis, Stephanie Samson, Xulia
Sánchez-Rodríguez, Manuela Sanguinetti, Ezgi Sanıyar, Dage Särg,
Marta Sartor, Albina Sarymsakova, Mitsuya Sasaki, Baiba Saulīte,
Agata Savary, Yanin Sawanakunanon, Shefali Saxena, Kevin Scannell,
Salvatore Scarlata, Emmanuel Schang, Nathan Schneider, Sebastian
Schuster, Lane Schwartz, Djamé Seddah, Wolfgang Seeker, Sven
Sellmer, Mojgan Seraji, Magda Ševčíková, Petr Sgall, Syeda Shahzadi,
Mo Shen, Atsuko Shimada, Gyu-Ho Shin, Hiroyuki Shirasu, Yana
Shishkina, Muh Shohibussirri, Maria Shvedova, Jean Sibille, Janine
Siewert, Einar Freyr Sigurðsson, João Silva, Aline Silveira, Natalia
Silveira, Sara Silveira, Maria Simi, Radu Simionescu, Katalin Simkó,
Mária Šimková, Haukur Barri Símonarson, Kiril Simov, Dmitri
Sitchinava, Ted Sither, Aaron Smith, Isabela Soares-Bastos, Per Erik
Solberg, Dolores Sollberger, Barbara Sonnenhauser, Shafi Sourov,
Nina Speransky, Rachele Sprugnoli, Vivian Stamou, Steinþór
Steingrímsson, Antonio Stella, Jan Štěpánek, Barbora Štěpánková,
Abishek Stephen, Milan Straka, Omer Strass, Emmett Strickland, Jana
Strnadová, Alane Suhr, Yogi Lesmana Sulestio, Umut Sulubacak,
Hakyung Sung, Shingo Suzuki, Daniel Swanson, Zsolt Szántó, Chihiro
Taguchi, Dima Taji, Luigi Talamo, Fabio Tamburini, Mary Ann C. Tan,
Takaaki Tanaka, Dipta Tanaya, Mirko Tavoni, Nursena Teker, Samson
Tella, Isabelle Tellier, Marinella Testori, Guillaume Thomas, Tarık
Emre Tıraş, Thea Tollersrud, Sara Tonelli, Liisi Torga, Lucas
Toribio, Marsida Toska, Trond Trosterud, Anna Trukhina, Reut
Tsarfaty, Kira Tulchynska, Utku Türk, Francis Tyers, Sveinbjörn
Þórðarson, Vilhjálmur Þorsteinsson, Sumire Uematsu, Roman Untilov,
Zdeňka Urešová, Larraitz Uria, Hans Uszkoreit, Andrius Utka, Elena
Vagnoni, Sowmya Vajjala, Socrates Vak, Socrates Vakirtzian, Rob van
der Goot, Martine Vanhove, Daniel van Niekerk, Gertjan van Noord,
Viktor Varga, Uliana Vedenina, Giulia Venturi, Marianne
Vergez-Couret, Barbora Vidová Hladká, Eric Villemonte de la
Clergerie, Veronika Vincze, Anishka Vissamsetty, Natalia Vlasova,
Eleni Vligouridou, Aya Wakasa, Joel C. Wallenberg, Lars Wallin,
Abigail Walsh, John Wang, Jonathan North Washington, Leonie
Weissweiler, Maximilan Wendt, Paul Widmer, Shira Wigderson, Sri
Hartati Wijono, Vanessa Berwanger Wille, Seyi Williams, Miriam
Winkler, Shuly Wintner, Mats Wirén, Christian Wittern, Alena
Witzlack-Makarevich, Tsegay Woldemariam, Tak-sum Wong, Alina
Wróblewska, Qishen Wu, Mary Yako, Kayo Yamashita, Naoki Yamazaki,
Chunxiao Yan, Xiulin Yang, Koichi Yasuoka, Marat M. Yavrumyan, Arife
Betül Yenice, Enes Yılandiloğlu, Olcay Taner Yıldız, Zhuoran Yu,
Arlisa Yuliawati, Zdeněk Žabokrtský, Shorouq Zahra, Amir Zeldes, He
Zhou, Hanzhi Zhu, Yilun Zhu, Anna Zhuravleva, Rayan Ziane, Artūrs
Znotiņš
References
Marie-Catherine de Marneffe, Christopher Manning, Joakim Nivre, and
Daniel Zeman. 2021. Universal Dependencies. In Computational
Linguistics 47:2, pp. 255–308.
Joakim Nivre, Marie-Catherine de Marneffe, Filip Ginter, Jan Hajič,
Christopher D. Manning, Sampo Pyysalo, Sebastian Schuster, Francis
Tyers, and Daniel Zeman. 2020. Universal Dependencies v2: An
Evergrowing Multilingual Treebank Collection. In Proceedings of LREC.
Marie-Catherine de Marneffe, Bill MacCartney, and Christopher D.
Manning. 2006. Generating typed dependency parses from phrase
structure parses. In Proceedings of LREC.
Marie-Catherine de Marneffe and Christopher D. Manning. 2008. The
Stanford typed dependencies representation. In COLING Workshop on
Cross-framework and Cross-domain Parser Evaluation.
Marie-Catherine de Marneffe, Timothy Dozat, Natalia Silveira, Katri
Haverinen, Filip Ginter, Joakim Nivre, and Christopher Manning.
2014. Universal Stanford Dependencies: A cross-linguistic typology.
In Proceedings of LREC.
Joakim Nivre, Marie-Catherine de Marneffe, Filip Ginter, Yoav
Goldberg, Jan Hajič, Christopher D. Manning, Ryan McDonald, Slav
Petrov, Sampo Pyysalo, Natalia Silveira, Reut Tsarfaty, and Daniel
Zeman. 2016. Universal Dependencies v1: A Multilingual Treebank
Collection. In Proceedings of LREC.
Slav Petrov, Dipanjan Das, and Ryan McDonald. 2012. A universal
part-of-speech tagset. In Proceedings of LREC.
Daniel Zeman. 2008. Reusable Tagset Conversion Using Tagset Drivers.
In Proceedings of LREC.