SIG Writing 2024
Call for Conference Papers & Research School Participation
The EARLI Special Interest Group Writing, Paris Nanterre
University (France), Sorbonne Nouvelle University (France), and
the University of Turku (Finland) invite proposals to the 20th biennial SIG
Writing conference to be held at Paris Nanterre University, Nanterre,
France, from 26 to 28 June 2024. The Research School will be held prior to
the conference, on 24 and 25 June 2024.
The conference theme for SIG Writing 2024 is ways2write. Writing is a
ubiquitous social and professional practice in most parts of the world. It
takes on diverse, increasingly mixed and heterogeneous forms, manifests
itself over ever wider spans of life and activity, exploits more and more
media and technologies, and is increasingly fluid. The source and its
individual and collective characteristics (age, cognitive abilities,
human/non-human...), the medium, the context and the production objectives,
time and space—all give rise to a diversity of writings and ways of
writing, raising numerous questions in an interdisciplinary field. While
the conference traditionally hosts a diversity of topics and approaches,
proposals within the ways2write theme are particularly welcome.
More information and full calls for conference papers and Research School
participation at the SIG WRITING 2024 website:
https://sites.google.com/view/sig-writing-2024/
* Submission deadline *EXTENDED*: November 17, 2023
* Submission guidelines
The aim of the conference is to promote interaction among researchers who
are interested in understanding the cognitive, social, and developmental
processes involved in writing, who are concerned with designing writing
instruction in various educational settings, or who are engaged with
exploring the functions of writing in different social and institutional
contexts. The scope of the conference is broad, and we hope to draw
together a wide range of researchers and professionals.
CONFERENCE: Available presentation formats are Paper, Symposium, Poster,
Roundtable, and Demonstration Session.
The abstract should be 250-350 words (including any references). All
proposals must be written in English and submitted using the online
submission system: https://www.earli-eapril.org
PhD students and early career researchers are invited to submit to and
participate in the pre-conference Research School.
RESEARCH SCHOOL: Available presentation formats are Paper Presentation,
Poster or 30' Article Manuscript Discussion.
The abstract should be 350 words (including any references). All proposals
must be written in English and submitted using the online submission
system: https://www.earli-eapril.org
For any questions, please contact us at sigwriting24.paris(a)gmail.com
You can also follow the updates on Twitter/X: @SIGWriting2024
Or on Facebook: SigWriting Paris (
https://www.facebook.com/profile.php?id=100091544465843)
On behalf of the organizing committee,
Aleksandra Miletić
----------
Aleksandra Miletić
Post-doctoral researcher on the CorCoDial
<https://blogs.helsinki.fi/yvesscherrer/corcodial-corpus-based-computational…>
Project
Language Technology Group, University of Helsinki
Hi Ada,
As these threads consist of a discussion rather than a set of scientific statements (the first being motivated by responding to a stimulus, while the second consists in defining/motivating a scientific position that is supposed to stand apart from any specific discussion), I forbid you to use any of my writings made on the corpora list on any of your web sites.
Of course, I still authorise corpora list to keep archives (as these are maintained along with the full discussion context).
Regards,
Gilles Sérasset,
> On 31 Oct 2023, at 19:19, Ada Wan <adawan919(a)gmail.com> wrote:
>
> Dear all
>
> I am about to post CorporaList threads which I have responded to on my own website, as it seems some of my replies are not yet showing on the public website (https://list.elra.info/mailman3/hyperkitty/list/corpora@list.elra.info/).
> If any of you should have any objections to this (because you don't want your replies to be seen), please let me know asap.
>
> Thanks and best
> Ada
>
>
> On Mon, Oct 30, 2023 at 9:31 PM Ada Wan <adawan919(a)gmail.com> wrote:
> [Disregard if not interested]
>
> Dear all
>
> Thanks for your emails. The issue of where the misunderstanding might lie is clearer to me now, esp. given Gilles' example with his niece.
> (@Anil: perhaps you are right in your observations in a possible style change in my correspondences --- I may well have been running out of patience at this point (considering I have been in rebuttal mode since at least 2019 [1]?! So it's a good thing that morphology is coming to an end!). In the beginning, I had expected the professionals whom I expect to be experienced in "language"/data matters (and the subscribers of the CorporaList) to be the first to appreciate my results, but it turned out to be the other way around, it seems. Those who have been exposed to fewer "language tales" [2] can be quicker in getting it. But anyway, please allow me to explain again below.)
>
> Most importantly, in the niece example, there are 2 things that should be discerned from one another:
> i. what the niece uttered [i.e. data/observation (do note also how the data is collected: recorded or transcribed?)], and
> ii. what one's interpretation/analysis of her utterance is [i.e. interpretation/analysis of observation].
>
> In "grammarese" formulation, the case in question is as follows: Gilles' niece conjugated an irregular verb with a regular verb conjugation pattern.[3]
>
> Gilles suspects that (linguistic) morphology exists (and/or is universal?) because the pattern of the niece's utterance resembled one of the patterns (sometimes formulated from "rules" [4]) often studied in literature on morphology.
>
> Re "she clearly showed me that her way of learning languages did not consist in reading/listening to huge amounts of utterances ...":
> even if the niece had only been exposed to 10 utterances, if 8 of them exhibit a certain pattern and 2 are more irregular/outlier-like, the chances of her applying habits that are in line with the more frequently observed pattern to the rarer/unobserved cases can be high --- and would you not agree that's rather reasonable?
> There are or may be un-/subconscious *patterns*, sure. But I do not argue against these, for such patterns do not have to be formulated in terms of "stems"/"roots"/"affixes", and more importantly, most of these patterns surface more often in books than in real life anyway. So the fact that one believes that a morphological paradigm is to be formulated in a certain way is pretty much a matter of preference of a (group of) researcher(s).
>
> Re "but she was able to learn some word formation rules from very few examples":
> what she "learned" might just be some patterns --- at least according to your/our analysis here. That is, she might not have yet had much exposure to "rules", but Gilles might have. (Hence his conviction of the reality of morphology may be stronger.)
>
> Re "In my humble opinion, this proves that morphology exists, if not in the LLM matrixes, at least in the human brain":
> I don't disagree with how one's mind can be clouded by archaic ideals or theories. But shouldn't a better theory exist outside of the mind of a person or a group of scientists as well?
>
> If one accounts for text data in its entirety, i.e. without disregarding or adding in whitespaces, and evaluates over bigger spans (as mentioned in the rebuttal here [5]), the notion of morphology is actually irrelevant to a comprehensive study of (language) data. Wouldn't you agree?
> With your plane and bird analogy: so you could claim that if you do insist on cherry-picking from data, shouldn't your analyses still matter? Well, if they don't generalize well, they may end up mattering to you only.
>
> Re "... (or issued from a colonialist point of view of Aves on the task at hand…) and asking them to renounce this oh so obsolete bad habit":
> I suppose it depends on which side of history one would like to be on too.
>
> I understand that it can be much harder for those who have lived in a country where "language" activities (and/or the concept of "language") have been officially and explicitly supported/promoted. This "privilege" now puts many of us in a rather disadvantageous position in unlearning much.
>
> Re "ML based language models":
> I don't know what you understand of these, but the logic behind such (e.g. a probabilistic processing/interpretation of sequences) is often not far from how "humans" are known to "process language(s)" --- which is why many modeling experiments can bridge "both spheres" (though I believe many experienced in modeling would buy less into this "human 'versus' machine" narrative).
>
> @Gilles: I am also curious what your takeaway is from Quine's "Word and Object" (e.g. at https://mitpress.mit.edu/9780262670012/word-and-object/) in relation to our conversation here.
>
> @Anil: the computational phenomenology is already in "Fairness in Representation" (note that the insights were obtained from a collection of many models, i.e. most of them are epi-phenomena). So I think what I have in mind is orthogonal to what you described. Crimes and other misconduct have also been around for millennia; are these things we want to keep?
> That having been clarified, do you have other objections to my contributions?
>
> I hope I have addressed your concerns sufficiently. If not, please let me know.
>
> Thanks and best
> Ada
>
>
> [1] The results that ended up getting published in Fairness in Representation <https://openreview.net/forum?id=-llS6TiOew> (ICLR 2022) had been rejected about 5 times, those in "Statistical (Un-)typology" (even with "greedy" research incentives so as to fit in) about another 5 times from May 2019 to April 2022, in addition to other attempts/withdrawals. Then all I have been dealing with is just retaliation. In fact, I just got some stuff stolen and had to get things reported to the police, so please pardon my delay in reply.
> [2] At a point, I thought perhaps it'd be best to have no disciplines. Then I realized not all disciplines are like "language", "linguistics", or "structural linguistics".
> That having been expressed, can having "no disciplines" be still a good thing? Possibly, but another debate, another time, perhaps.
> [3] But let's bear in mind: what one'd consider a "regular verb" (vs "irregular verb") is nothing but some sequence/utterance seen/heard more frequently than others.
> [4] esp. in the history of "transformational grammar" that was popular around the mid 20th century. "Grammar rules" might have been around for longer, but branding things as within the domain of "morphology" as a module of a bigger "structure"/"structural framework" of "linguistic analysis" is a matter that has become more popular only in the past half a century or so due to "transformational grammar" / "structural linguistics".
> But please do note that even in "structural linguistics", many patterns are explained away in terms of (the ranking of) constraints (i.e. no "transformation"). There are no/few reasons to posit the notion of "deep structure(s)", from/through which, in the case of morphological analyses, "stems"/"roots" often get to be held as the bases of inflection. That is, aside from "grammar rules" taught in e.g. schools and those inside researchers' minds, evidence for the existence of "rules" is actually rather little, if any. [N.B. this can be considered advanced for those who don't have a theoretical background in Linguistics.]
> [5] https://openreview.net/forum?id=-llS6TiOew
>
>
>
> On Thu, Oct 26, 2023 at 6:05 PM Anil Singh <anil.phdcl(a)gmail.com> wrote:
> I have also been carefully reading the exchanges. Although I was planning not to add to this exchange, at this point I am tempted to reply.
>
> Ada's early emails were adding something to the discussion and debate, but at this point they are simply saying 'I am right, you are wrong', without giving any explanation or evidence.
>
> I was also thinking of the same kind of examples as given by Gilles. Till Ada provides some very good reasoning and evidence, it is hard for me to completely agree with her, although as I said earlier, I do agree with her on many, perhaps most, things.
>
> Ada, I sincerely respect your learning and competence. However, you said earlier you are proposing an alternative computational phenomenology. That would be really interesting. Won't it be better to first propose it and argue in more specific terms and with more convincing arguments and evidence that it is the right one, or at least 'more right' than the existing ones (there are more than one)? Given that there is already Information Theory, it has to go beyond byte, which is an accidental unit of computation, and character, which is also not well-defined, sometimes even for one specific writing system. To give one such example, perhaps not the best one, I always thought of the Indic-script dependent vowel (maatraa) as a character, but I recently found that languages like Java and Python do not treat such written symbols as characters, so when I try to get the length of an Indic-script string, the in-built string length functions give only the number of consonant symbols and independent vowels in the string. We got wrong results using these functions and I only accidentally discovered that this is the case. The reason, of course, is that these functions and programming languages treat such dependent vowels as diacritics, which is also correct in some ways. I did not realize this earlier because in India we often use a Latin script-based notation called WX for Indic scripts in NLP due to the encoding and input method related problems that I referred to in one of my earlier replies. The WX notation, however, does not distinguish between dependent and independent vowels and treats both of them as the same character, which is how most of us, if not all, think of them in India to the best of my knowledge. On the other hand, the consonant symbol modifier 'halant' is not used in WX, but is used in Indic scripts and its presence might also cause disagreements about what the string length is. In other words, character as a unit does not work in your terms.
> In fact, who knows how many errors for Indic script text have made their way into computational results due to this simple fact. And perhaps they still do because it took me a long time to realize this, which at first led to consternation, because in text processing if you can't rely on the string length function, what can you rely on?
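What a built-in length function counts depends on the language and its version, so the discrepancy described above can be worth checking directly. The following is a minimal sketch, assuming Python 3 and its standard `unicodedata` module; the helper name `count_units` and the sample word are hypothetical choices made for illustration and do not come from the thread. It reports how many code points a string has, how many of them fall into the Unicode "mark" categories (which is where dependent vowel signs and the halant are classified), and how many do not.

```python
import unicodedata


def count_units(text):
    """Count code points, combining marks, and non-mark code points.

    Hypothetical helper for illustration only.
    """
    code_points = len(text)  # in Python 3, len() counts Unicode code points
    # General categories Mn, Mc and Me all start with "M" (marks).
    marks = sum(1 for ch in text if unicodedata.category(ch).startswith("M"))
    return {
        "code_points": code_points,
        "marks": marks,
        "non_marks": code_points - marks,
    }


# Devanagari sample: HA, VOWEL SIGN I, NA, VIRAMA (halant), DA, VOWEL SIGN II
word = "\u0939\u093f\u0928\u094d\u0926\u0940"
print(count_units(word))
```

Comparing the three numbers for a few known strings shows which notion of "length" a given function actually implements before any results are built on it.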
>
> As for phonemes, major ML researchers like Vincent Ng don't believe them to be real units of language. The argument is that we don't need phonemes for applications like speech recognition.
>
> If not byte and character, what are we left with in terms of computational phenomenology? At the very least there has to be such a well-argued and well-evidenced alternative in order to try to persuade others to agree to your views. I would be very much interested in thinking about such an alternative even if at present I don't think you are right about all your views. After all, to throw away millennia of work on language-science, very strong reasoning and evidence for an alternative is not an unrealistic expectation.
>
> On Thu, Oct 26, 2023 at 8:44 PM Gilles Sérasset via Corpora <corpora(a)list.elra.info> wrote:
> Hi Ada,
>
> When my niece was 3 years old, she said to her little brother “Maman, elle venira plus tard…” (Mum will come back later, in “incorrect” French).
>
> She made a “mistake” here by using “venira” (a wrong future form of the verb venir (to come)) instead of the “correct” “viendra”. It was wrong, but perfectly predictable using the most productive morphological rules of French future formation.
>
> She was 3 years old, so I doubt she really understood what morphology is; nevertheless, with this mistake, she clearly showed me that her way of learning languages did not consist in reading/listening to huge amounts of utterances but that she was able to learn some word formation rules from very few examples. And indeed, humans are still able to perfectly learn complex things with very little explanation and/or very few examples (something that is totally beyond ML based language models).
>
> In my humble opinion, this proves that morphology exists, if not in the LLM matrixes, at least in the human brain. Hence modelling such rules (and even using them to analyse or produce) is a valid approach, independently of any other (also valid) approaches.
>
> If I want to say it another way :
>
> There have been many scientific proofs that humans will not be able to fly… And these proofs were valid under their own hypotheses.
>
> Indeed, planes do not flap their wings… they are using other ways to perform a task that was performed by birds.
>
> Nevertheless, I have never been the witness of any plane (or pilot) trying to convince birds that their way of flying is obsolete (or issued from a colonialist point of view of Aves on the task at hand…) and asking them to renounce this oh so obsolete bad habit.
>
> Regards,
>
> Gilles,
>
>
>
> --
> - Anil
We are proud to announce the release of CreoleVal - a collection of benchmarks for 28 Creole languages. The collection of datasets spans tasks such as relation classification, machine comprehension, machine translation, named entity recognition, and use cases such as language modeling. We cover Haitian Creole, Bislama, Chavacano, Pitkern, Singlish, Tok Pisin, Papiamento, and others.
We hope the NLP community will include this collection of datasets in ongoing & future evaluations of methods directed at low-resource languages. Not only that, we also hypothesise that CreoleVal will open the door for controlled experimentation with transfer learning methodology.
This resource has been long in the making, and was made possible by a long list of collaborators.
For a pre-print, see: https://arxiv.org/abs/2310.19567
For code and data, see: https://github.com/hclent/CreoleVal
(Repository under construction)
INNOVATIONS IN SEARCH AND INFORMATION RETRIEVAL - BCS SEARCH SOLUTIONS - LONDON, UK
Search Solutions is the BCS Information Retrieval Specialist Group’s annual event focused on practitioner issues in the arena of search and information retrieval. It is a unique opportunity to bring together academic research and practitioner experience.
The Search Solutions event consists of a Tutorial day and a Conference day, each of which has a separate registration.
LOCATION
Both the Tutorials and the Conference take place at the BCS London Headquarters, Ground Floor, 25 Copthall Avenue, London EC2R 7BP, UK. This is a 10-minute walk from Liverpool Street Station (Elizabeth Line and London Underground) and a 20-minute walk from London Bridge Station.
Registration forms on Eventbrite
Tutorials:
https://www.eventbrite.co.uk/e/search-solutions-2023-tutorials-tickets-7392…
Conference:
https://www.eventbrite.co.uk/e/search-solutions-2023-conference-tickets-739…
TUTORIALS
Tutorial 1
November 21, 10am-2pm (half day)
How Large Language Models Can Improve Your Search Project (Half-day)
Alessandro Benedetti, Sease Ltd. Apache Lucene/Solr committer and PMC member and Director and R&D Software Engineer
Tutorial 2
November 21, 10am-4.30pm (full day)
Uncertainty Quantification for Text Classification (Full Day)
Dell Zhang, Thomson Reuters Labs, London, UK.
Murat Sensoy, Amazon Alexa AI, London, UK.
Lin Gui, King’s College London, London, UK.
Yulan He, King’s College London & Alan Turing Institute, London, UK.
CONFERENCE PROGRAMME OUTLINE
November 22, 2023 9.45am - 7pm
The objective of the conference this year is to explore the implications and opportunities of AI-based technologies in enhancing the user experience in enterprise, e-commerce and systematic search. The conference marks the first anniversary of the launch of ChatGPT on 30 November 2022.
09.45 Introductions
09.55 Understanding the Dangers of using LLMs. Professor Julie Weeds, University of Sussex
10.30 Using AI tools for discovery. Hong Zhou, Wiley Scientific
11.00 Break
11.20 IR, AI and ‘search’ – Creating synergy. Steve Zimmerman, Samy Ateia and Martin White consider the opportunities and the issues, with the assistance of the audience!
12.20 BCS Search Industry Awards – Introduction. Tony Russell-Rose
12.30 Panel Session looking back at the morning presentations
13.00 Lunch
13.45 Pragmatic AI-powered Search – Keeping it Simple, not Stupid. Charlie Hull, Open Source Connections
14.15 Retrieval-augmented text generation (RAG) for legal IR. Grace Lee
14.45 Ontologies in the age of AI-based discovery. Peter Winstanley
15.15 Break
15.50 Presentation of the BCS Search Industry Awards
16.00 New technologies for systematic reviews: are large language models really a gamechanger? James Thomas
16.30 Integrating ChatGPT into existing applications. Paul Cleverley
17.00 Panel session – agreeing the take-aways from the conference
17.30 SS2023 Best Paper Award
17.45 Reception
19.00 Close
For further details and enquiries, please contact irsg(a)bcs.org.uk.
--
Ingo Frommholz (he/him), PhD, FBCS, FHEA
Reader (~Associate Professor) in Data Science
ACM CIKM 2023 General Chair
Head of Data, AI, Interaction, Retrieval and Language Group http://dairel.org
Deputy Head Digital Innovations and Solutions Centre (DISC)
University of Wolverhampton, UK
Adjunct Professor, Bern University of Applied Sciences, Switzerland
Web: http://www.frommholz.org/ | Email: ifrommholz(a)acm.org
Twitter: @iFromm | Mastodon: @ingo@idf.social
PGP/GPG fingerprint: B74E A422 C7B2 A5BB 2BC2 523B 2790 216E F8F8 D166
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x2790216EF8F8D166
The NLP group at Linköping University, Sweden (https://www.ida.liu.se/divisions/aiics/nlp/), is now hiring a PhD student for research on multilingual LLMs for lower-resourced languages!
For this PhD position, you will work within the TrustLLM project, an EU-funded Horizon Europe project on developing open, trustworthy, and sustainable LLMs, initially targeting the Germanic languages. It involves consortium partners from Denmark, Germany, Iceland, The Netherlands, Norway, and Sweden. Specific research topics include:
* tokenization and embedding alignment techniques
* addressing grammatical correctness and bias in pre-training
* benchmarking and evaluation
You will also participate in the Graduate School in Computer Science (CUGS) at Linköping University.
Your PhD supervisors will be Marcel Bollmann (https://marcel.bollmann.me/) and Marco Kuhlmann (https://liu.se/en/employee/marku61). This PhD position is a fully-funded, full-time, salaried position with attractive employee benefits and pension contributions.
For more information about the position and a link to the application system, please see
https://www.ida.liu.se/divisions/aiics/nlp/phd-student-trustllm/
You are welcome to contact me or Marco for additional information.
The application deadline is 2023-11-21.
Best wishes,
Marcel Bollmann
--
Marcel Bollmann, Dr. phil.
Associate Professor in Natural Language Processing
Department of Computer and Information Science, Linköping University, Sweden
www: https://marcel.bollmann.me/
The Applied Computational Linguistics group at Potsdam University conducts research on different perspectives on discourse structure (coherence, coreference, etc.) and on applications of discourse processing, for example in argument mining and in collaborations with social sciences (educational, political science). See http://angcl.ling.uni-potsdam.de.
The position to be filled is in a new DFG-funded project with partners in Computer Science (Halle-Wittenberg) as well as Literary Science and E-Research (Berlin). The goal is to implement a solution for text alignment in scenarios where an author writes different versions of the "same" text in different languages; i.e., where portions of the text in the other language can be changed, shortened, extended, or reordered. In collaboration with the team in Halle-Wittenberg, the successful candidate will conduct experiments on building alignments on word, phrase and sentence level. A central basis for these alignments will be measures of semantic similarity, for example by utilizing multilingual word embeddings.
The position runs for three years and offers the possibility for obtaining a PhD.
Salary is according to the Brandenburg TV-L salary scale (75%), see for instance https://zbb.brandenburg.de/sixcms/media.php/9/Tabelle%20TV-L_2022.pdf
Candidates are expected to hold an MSc degree in Computational Linguistics or a related field, with experience in methods relevant to the project. Solid programming skills in Python are required; experience in working with (multilingual) corpora is a plus. Near-native knowledge of either English or German is required, and some knowledge of the other language is a plus.
Please send your application before Nov 17th to the email address below. It should consist of a CV, a statement of research interests and a summary of (or a sample chapter from) the MSc thesis; letters of reference may optionally be added.
Inquiries and applications to: Prof. Manfred Stede, stede(a)uni-potsdam.de
********************************************************
PROPOR 2024: 16th International Conference on Computational Processing of
Portuguese
Universidade de Santiago de Compostela (Santiago de Compostela - Galiza)
March 14th to 15th 2024
Call for Papers (Deadline approaching!)
https://propor2024.citius.gal/
********************************************************
*Important dates*
* Full and short paper submission deadline: *06/11/2023 (23:59 GMT-3)*
* Notification of paper acceptance or rejection: 07/12/2023
* Camera-ready papers due: TBA
* Conference: March 14th - 15th, 2024
The International Conference on Computational Processing of Portuguese
(PROPOR), whose next edition will take place for the first time in Galicia,
birthplace of the Portuguese language, is the main event in the area of
natural language processing that is focused on theoretical and
technological issues of written and spoken Portuguese and Galician
(considered as a local variety of the former). The meeting has been a very
rich forum for the exchange of ideas and partnerships for the research and
industry communities dedicated to the automated processing of this
language, promoting the development of methodologies, resources and
projects that can be shared among researchers and practitioners in the
field.
We call for papers describing work on any topic related to computational
language and speech processing of Portuguese/Galician by researchers in the
industry or academia. Topics of interest include, but are not limited to:
* Natural language processing tasks (e.g. parsing, word sense
disambiguation, coreference resolution)
* Natural language processing applications (e.g. question answering,
subtitling, summarization, sentiment analysis)
* Natural language generation
* Information extraction and information retrieval
* Speech technologies (e.g. spoken language generation, speech and speaker
recognition, spoken language understanding)
* Speech applications (e.g. spoken language interfaces, dialogue systems,
speech-to-speech translation)
* Resources, standardization and evaluation (e.g. corpora, ontologies,
lexicons, grammars)
* NLP-oriented linguistic description or theoretical analysis
* Distributional semantics and language modeling
* Portuguese language varieties and dialect processing (including the
language varieties of Angola, Brazil, Cape Verde, East Timor, Galicia,
Guinea-Bissau, Macau, Mozambique, Portugal, and Sao Tome and Principe)
* Multilingual studies, methods, applications and resources including
Portuguese/Galician
PROPOR 2024 will be held at the University of Santiago de Compostela
(Santiago de Compostela - Galicia, Spain) from March 14th to March 15th.
PROPOR 2024 will be the 16th edition of the biennial PROPOR conference,
hosted alternately in Brazil and in Europe (Portugal/Galicia). Past
meetings were held in Lisbon, PT (1993); Curitiba, BR (1996); Porto
Alegre, BR (1998); Évora, PT (1999); Atibaia, BR (2000); Faro, PT (2003);
Itatiaia, BR (2006); Aveiro, PT (2008); Porto Alegre, BR (2010); Coimbra,
PT (2012); São Carlos, BR (2014); Tomar, PT (2016); Canela, BR (2018);
Évora, PT (2020); and Fortaleza, BR (2022).
Submissions
Submissions should describe original, unpublished work. Authors are invited
to submit two kinds of papers:
* Full papers – Reporting substantial and completed work, especially those
that may contribute in a significant way to the advancement of the area.
Wherever appropriate, concrete evaluation results should be included. Full
papers may consist of up to 8 pages of content, plus unlimited pages of
references.
* Short papers – Reporting small, focused contributions such as ongoing
work, position papers, potential ideas to be discussed, or negative
results. Short papers may consist of up to 4 pages of content, plus
unlimited pages of references.
Both Full and Short papers will be published in the proceedings of the main
conference.
Each submission will be evaluated by at least three reviewers. As reviewing
will be double-blind, submitted papers must be anonymized, that is, they
should not contain the authors’ names and affiliations. Authors must avoid
self-references that reveal identity, like, “We previously showed (Smith,
1991) …”. Instead, they should prefer citations such as “Smith (1991)
previously showed …”. Separate author identification information will be
required as part of the submission process.
Submissions to PROPOR 2024 may not be made available online (e.g. via a
preprint server), and may not be submitted for review elsewhere while being
under review for this conference.
Submissions should be written in English. At submission time, only PDF
format is accepted. For the final versions, authors of accepted papers will
be given 1 extra content page to take the reviews into account. Authors of
accepted papers will be requested to send the source files for the
production of the proceedings. All submitted papers must conform to the
official ACL style guidelines. ACL provides style files for LaTeX and
Microsoft Word that meet these requirements. They can be found at:
* LaTeX stylesheet
* MS Word stylesheet
Papers should be submitted at the following URL:
https://easychair.org/my/conference?conf=propor2024
Publication
The proceedings of PROPOR 2024 will be published by ACL as a volume in ACL
Anthology (https://aclanthology.org/). They will be available online. To
ensure publication, at least one author of each accepted paper must
complete an adequate registration for PROPOR 2024 by the early registration
deadline.
Kindest regards,
António Teixeira, Livy Real & Marcos Garcia
PROPOR 2024 Program Chairs
Dear colleagues,
Last month, we shared the result of our collaborative work on a core metadata scheme for learner corpora with LCR2022 participants. Our proposal builds on Granger and Paquot (2017)'s first attempt to design such a scheme and during our presentation, we explained the rationale for expanding on the initial proposal and discussed selected aspects of the revised scheme.
Our proposal is available at https://docs.google.com/spreadsheets/d/1-RbX5iUCUtCBkZU9Rfk-kv-Vzc--F-eUW2O…
We firmly believe that our efforts to develop a core metadata scheme for learner corpora will only be successful to the extent that (1) the LCR community is given the opportunity to engage with our work in various ways (provide feedback on the general structure of the scheme, the list of variables that we identified as core and their operationalization; test the metadata on other learner corpora; use the scheme to start a new corpus compilation, etc.) and (2) the core metadata scheme is the result of truly collaborative work.
As mentioned at LCR2022, we will be collecting feedback on the metadata scheme until the end of October. The online feedback form is available at:
https://docs.google.com/document/d/1NeDUuxGJlPSJI9wHVA1xgGM-aV8jXTa8Qlb45K-…
We'd like to thank all the colleagues who have already got back to us (at LCR2022, by email or via the online form), and we thank them for their appreciation of and enthusiasm for our work! We'd also like to encourage more colleagues (particularly those of you who have experience in learner corpus compilation) to provide feedback. We need help in finalizing the core metadata scheme to make sure that it can be applied in all learner corpus compilation contexts. In short, we need you to make sure the scheme meets the needs of the LCR community at large.
With very best wishes,
Magali Paquot (also on behalf of Alexander König, Jennifer-Carmen Frey, and Egon W. Stemle)
Reference
Granger, S. & M. Paquot (2017). Towards standardization of metadata for L2 corpora. Invited talk at the CLARIN workshop on Interoperability of Second Language Resources and Tools, 6-8 December 2017, University of Gothenburg, Sweden.
Dr. Magali Paquot
Centre for English Corpus Linguistics
Institut Langage et Communication
UCLouvain
https://perso.uclouvain.be/magali.paquot/
UCLouvain Error Editor Version 2 (UCLEEv2) Now Available!
We are very pleased to announce that Version 2 of the UCLouvain Error Editor (UCLEEv2) has just been released under a Creative Commons Licence (CC BY-NC-ND 4.0). The program can be downloaded from UCLouvain’s Open Educational Resources platform using the following link: http://hdl.handle.net/20.500.12279/968
The software, which was developed at the Centre for English Corpus Linguistics (CECL) (Université catholique de Louvain, Belgium), aims to facilitate the insertion of error tags and corrections into learner texts, as well as their subsequent processing. It comes with the following functions:
* Tagset selection
* Menu-driven tag insertion
* Tag consistency checking
* Concordancing
* Error statistics
* Automatic exercise generator.
By default UCLEEv2 uses the freely available tagset described in the Louvain Error Tagging Manual Version 2.0<https://cdn.uclouvain.be/groups/cms-editors-cecl/cecl-papers/Granger%20et%2…> (Granger, Swallow & Thewissen 2022), but users can also design their own tagset and work with it in UCLEEv2. The initial version of the tagset was developed to tag the first version of the International Corpus of Learner English (ICLE) (Granger et al. 2002) and further refined in the context of the two subsequent versions of the ICLE (Granger et al. 2009 and 2020). The various versions have been used in many other learner corpus projects internationally. UCLEEv2 includes a built-in converter which allows users to work on texts tagged with previous versions of UCLEE.
The program is accompanied by a detailed user guide<https://cdn.uclouvain.be/groups/cms-editors-cecl/cecl-papers/UCLEE%20user%2…> (Granger, Swallow & Thewissen 2023).
We hope that the program will prove useful to researchers interested in assessing the accuracy of L2 writing, investigating errors produced by specific learner groups and designing materials focused on their attested difficulties.
Feedback welcome!
Sylviane Granger, Helen Swallow and Jennifer Thewissen