*CFP Deadline Extended to February 24, 2020*

2nd Call For Papers:

The Linguistic Data Consortium (LDC) will host the workshop "Citizen Linguistics in Language Resource Development" (CLLRD 2020) at LREC 2020 in Marseille, France, on May 16, 2020.

Despite advances in data collection and processing, language-related research, education and technology development continue to suffer from an inadequate supply of Language Resources (LRs). To supplement traditional LR development, which typically relies upon top-down support from a government or private foundation, Citizen Linguistics (the Citizen Science of Language) changes the incentive model to attract a new workforce, which in turn requires a different kind of workflow. Incentives to Citizen Linguists may include opportunities to learn and develop new skills; to socialize, compete and earn status or recognition; to document their language and promote their culture; and, most importantly, to contribute directly to research and indirectly to a greater cause or social good. By offering human contributors sustained access to appropriate opportunities, activities, and incentives, we can enhance LR development well beyond what traditional direct funding alone can produce. However, along with these new incentives and workflows come new challenges whose solutions are relevant even to expert (paid) annotation.

The goal of this hybrid workshop/tutorial is two-fold. The first aim is to provide a forum for researchers and practitioners to explore and discuss the issues, advantages and challenges of using Citizen Linguistics as a method for the creation of language resources. The second is to introduce LanguageARC, a new Citizen Linguistics web portal for collecting language data and judgements.

Topics

There will be two sessions at the workshop. For the first session, papers are welcome on any topic related to Citizen Linguistics in the development of Language Resources including:

· language-specific challenges

· workforce recruitment, training and evaluation

· task design, granularity and assignment

· workflow and ordering

· response evaluation and aggregation

· the preparation of language resources from raw results and their use in research and in developing and evaluating HLTs.

For the second session, papers are welcome on any topic related specifically to the use of LanguageARC.org to create tasks that collect language data for research and development. Presenting authors of Best Papers Employing LanguageARC will receive travel subsidies to present during this workshop at LREC. The second session will also include a brief tutorial on LanguageARC for new or potential users. By the end of the tutorial, attendees will be fully capable of implementing their data collection or annotation project via LanguageARC.

Submissions

We will accept papers of 4 to 8 pages, excluding references. Accepted workshop papers will be published as workshop proceedings along with the main conference papers. Papers must follow the LREC 2020 style sheet and author’s kit templates. Papers are to be submitted via the workshop START page.

Important Dates

· submission deadline: February 24, 2020

· notification of acceptance: March 12, 2020

· deadline for camera-ready versions: April 2, 2020

Identify, Describe and Share your LRs!

Describing your LRs in the LRE Map is now a normal practice in the submission procedure of LREC (introduced in 2010 and adopted by other conferences). To continue the efforts on “Sharing LRs” (data, tools, web-services, etc.) initiated at LREC 2014, authors will have the possibility, when submitting a paper, to upload LRs in a special LREC repository. This effort of sharing LRs, linked to the LRE Map for their description, may become a new “regular” feature for conferences in our field, thus contributing to creating a common repository where everyone can deposit and share data.

As scientific work requires accurate citations of referenced work so as to allow the community to understand the whole context and also replicate the experiments conducted by other researchers, LREC 2020 endorses the need to uniquely identify LRs through the use of the International Standard Language Resource Number (ISLRN, www.islrn.org), a Persistent Unique Identifier to be assigned to each Language Resource. The assignment of ISLRNs to LRs cited in LREC papers will be offered at submission time.

For more information please visit the workshop website (https://sites.google.com/view/cllrd-2020/) or contact James Fiumara: jfiumara AT ldc.upenn.edu

The organizing committee,

Chris Callison-Burch, University of Pennsylvania

Christopher Cieri, Linguistic Data Consortium, University of Pennsylvania

James Fiumara, Linguistic Data Consortium, University of Pennsylvania

Mark Liberman, Linguistic Data Consortium, University of Pennsylvania

Chris
—
Christopher Cieri
Executive Director, Linguistic Data Consortium and Adjunct Associate Professor of Linguistics
University of Pennsylvania
3600 Market Street, Philadelphia, PA 19104
p: 215-573-5489, f: 215-573-2175, e: ccieri@ldc.upenn.edu