We are happy to announce the release of version 2.12 of SUD (Surface Syntactic Universal Dependencies, see https://surfacesyntacticud.github.io/).
244 treebanks are available (https://grew.fr/download/sud-treebanks-v2.12.tgz): 8 are native SUD corpora and 236 are automatically converted from UD v2.12. See https://surfacesyntacticud.github.io/data/ for details.
All 2.12 corpora of UD and SUD are available on Grew-match: https://universal.grew.fr/
A set of “Universal tables”, giving a global view of the usage of features and dependency relations in UD and SUD treebanks, is available at https://tables.grew.fr/
See the UD announcement (https://list.elra.info/mailman3/hyperkitty/list/corpora@list.elra.info/thread/EZOTTXRH5QR4GXSBTQ32DEEIW7RCBYZ5/) for more information about corpora and contributors.
SUD is characterized by a distributional and functional definition of the head, and by syntactic relations that correspond to positional paradigms.
SUD offers several advantages for various studies, particularly in the areas of phrase structure, word order, and typology. For example, identifying noun phrases (NPs) can be difficult in UD because adpositions depend on nouns, and so can discussing subject-auxiliary order because the subject is attached directly to the lexical verb.
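To make the NP point concrete, here is a minimal Python sketch on the toy phrase "on the table". The two head maps are hand-written illustrations (they are not read from any released treebank), and the helper name `subtree` is hypothetical. In the UD-style analysis the adposition depends on the noun, so the noun's subtree is the whole prepositional phrase; in the SUD-style analysis the adposition is the head, so the noun's subtree is just the NP.

```python
from typing import Dict, List, Optional

# Hand-written toy analyses of "on the table" (token -> head token).
# None marks the local root. These are illustrative assumptions only.
UD_HEADS: Dict[str, Optional[str]] = {"on": "table", "the": "table", "table": None}
SUD_HEADS: Dict[str, Optional[str]] = {"on": None, "the": "table", "table": "on"}


def subtree(heads: Dict[str, Optional[str]], root: str) -> List[str]:
    """Return the tokens dominated by `root` (inclusive), in sentence order."""

    def dominated(token: Optional[str]) -> bool:
        # Walk up the head chain until we reach `root` or fall off the tree.
        while token is not None:
            if token == root:
                return True
            token = heads[token]
        return False

    return [token for token in heads if dominated(token)]


ud_span = subtree(UD_HEADS, "table")    # includes the adposition
sud_span = subtree(SUD_HEADS, "table")  # the bare noun phrase
```

With these toy inputs, the subtree of "table" contains "on" under the UD-style heads but not under the SUD-style heads, which is why extracting NPs is more direct in SUD.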
Note that the conversion from UD to SUD is performed by a universal Grew grammar that incorporates a set of heuristics. One such heuristic is that, among the function words attached to a lexical head, those farther from the head dominate those nearer to it. This heuristic works in many cases, but there are exceptions, so specific grammars have been developed for languages such as German and Wolof. We encourage you to report any issues on the SUD GitHub repository (https://github.com/surfacesyntacticud/guidelines/issues), and we will be in touch to collaborate on the development of specific grammars if needed.
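The distance heuristic can be sketched in a few lines of Python. This is only a toy illustration under strong assumptions, not the actual Grew grammar: it takes token positions as integers, assumes all the given function words attach to the same lexical head in UD, and ignores relation labels and every language-specific exception. The function name `rehead_chain` is hypothetical.

```python
from typing import Dict, List, Optional


def rehead_chain(lexical_head: int, function_words: List[int]) -> Dict[int, Optional[int]]:
    """Build an SUD-like head chain from a lexical head and its function words.

    Positions are token indices. The function word farthest from the lexical
    head becomes the top of the chain (head None), each function word governs
    the next nearer one, and the nearest one governs the lexical head.
    """
    # Farthest function word first, nearest last.
    ordered = sorted(function_words, key=lambda i: abs(i - lexical_head), reverse=True)
    chain = ordered + [lexical_head]

    heads: Dict[int, Optional[int]] = {chain[0]: None}
    for parent, child in zip(chain, chain[1:]):
        heads[child] = parent
    return heads


# "have been sleeping": have=1, been=2, sleeping=3 (the lexical head in UD).
heads = rehead_chain(3, [1, 2])
```

For "have been sleeping", the sketch makes "have" the top of the chain, "been" its dependent, and "sleeping" the dependent of "been". Cases where this ordering gives the wrong result are exactly the ones that call for a language-specific grammar.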
If you plan to develop a new UD treebank, consider starting with a native SUD treebank, especially if you are familiar with standard syntactic theories. If you already have a treebank in a different annotation scheme (including phrase-structure-based annotation), it can be simpler to convert it first to SUD and then to UD. In any case, feel free to contact us.
References about SUD
Kim Gerdes, Bruno Guillaume, Sylvain Kahane, Guy Perrier. Starting a new treebank? Go SUD! Theoretical and practical benefits of the Surface-Syntactic distributional approach (https://hal.inria.fr/hal-03509136v1). In DepLing 2021 (http://depling.org/depling2021/).
Kim Gerdes, Bruno Guillaume, Sylvain Kahane, Guy Perrier. Improving Surface-syntactic Universal Dependencies (SUD): surface-syntactic relations and deep syntactic features (https://hal.inria.fr/hal-02266003v1). In TLT 2019 (https://syntaxfest.github.io/syntaxfest19/tlt2019/tlt2019.html).
Kim Gerdes, Bruno Guillaume, Sylvain Kahane, Guy Perrier. SUD or Surface-Syntactic Universal Dependencies: An annotation scheme near-isomorphic to UD (https://hal.inria.fr/hal-01930614v1). In UDW 2018 (https://universaldependencies.org/udw18/).