[sorry about cross-posting]
The 5th Summer Datathon on Linguistic Linked Open Data (SD-LLOD-23) will
be held physically from June 11th to June 16th 2023 at Castle Luznica,
Zaprešić, Croatia. See https://datathon2023.jezik.hr/. The event has no
registration fee and offers over twenty travelling grants for participants.
The SD-LLOD datathon has the main goal of providing practical knowledge
to people from industry and academia in the application of Linked Open
Data technology to Linguistics and Language Technology. The ultimate
goal is to enable participants to migrate their own (or others’)
linguistic data and publish them as Linked Data on the Web and/or
develop applications on top of Linguistic Linked Data (LLD). One of the
main focus points this year will be the use of deep learning and neural
approaches to/from LLD.
This datathon series is unique in its topic worldwide and continues from
the success of the previous editions in 2015 and 2017 in Cercedilla
(Spain), 2019 in Dagstuhl (Germany), and 2022 in Cercedilla again. This
edition is supported by COST (European Cooperation in Science and
Technology) https://cost.eu/ through NexusLinguarum,
the “European network for Web-centred linguistic data science” COST
Action (CA18209, https://nexuslinguarum.eu/).
During the datathon, participants will be able to:
* Generate their own Linguistic Linked Data from existing data sources,
using visual tools like VocBench and community standards like OntoLex lemon.
* Apply semantic technologies (linked data, knowledge graphs, RDF, SPARQL)
to the field of language resources and learn about their benefits and
applications for specific use cases, particularly those involving
multilingual and/or multimodal aspects.
* Explore the potential use of embeddings, machine learning, and deep
learning techniques in combination with Linguistic Linked Data. For instance:
* Neural machine translation from Natural Language to SPARQL
* Generating natural language from knowledge graphs
* Acquiring relations with neural language models
The program of the summer datathon will contain three types of sessions:
1. Seminars to explain theoretical aspects and discuss selected topics.
2. Hands-on sessions to introduce the basic foundations of each topic,
method, and technique, which participants will apply directly through
different practical assignments.
3. Datathon sessions, where participants will work, in groups of 3-5, on
miniprojects and where they will apply what they have learned, involving
the generation and/or use of Linguistic Linked Data.
Participants are invited to propose a “miniproject” related to the
topics of the datathon, which might include some datasets for their
conversion into linked data. In this edition, we particularly encourage
miniprojects that involve interaction with machine learning, deep
learning, or embeddings techniques. A selection of proposals will form
the basis for the miniprojects which the participants will work on
during the datathon sessions. Participants who do not propose a
miniproject, or whose miniproject is not selected, will be able to join
another miniproject. There will be an award for the best miniproject.
Registration
============
The datathon is a sponsored event, and it has no registration fee, but
participants are expected to cover the cost of their meals and
accommodation at the castle residence. Details about the registration
can be found at the datathon website: https://datathon2023.jezik.hr/
Registration will close on 1/05/2023. More than twenty travelling
grants will be provided by NexusLinguarum (covering accommodation, meals
and travel expenses). See the datathon website for more details.
COVID statement
==============
The datathon is planned as a physical event. The local organisation is
committed to guaranteeing a safe event. Note that there might be some
COVID rules to comply with at the time of the event.
These will be announced in due course.
Important dates
======================
Registration opens: 13/02/2023
Registration closes: 1/05/2023 (extended)
Notification: 5/05/2023
Datathon: 11/06/2023 to 16/06/2023
Organisers
=========
Jorge Gracia (University of Zaragoza, Spain)
Christian Chiarcos (University of Augsburg, Germany)
Dagmar Gromann (University of Vienna, Austria)
Thierry Declerck (DFKI, Germany)
Milan Dojchinovski (CTU in Prague, Czech Republic / DBpedia Association,
Germany)
Local organisers
=============
Ana Ostroški (Institute of Croatian Language and Linguistics, Croatia)
Kristina Despot (Institute of Croatian Language and Linguistics, Croatia)
Tutors and lecturers
=========================================
Mehwish Alam (Institut Polytechnique de Paris, France)
Christian Chiarcos (University of Augsburg, Germany)
Michael Cochez (Vrije Universiteit Amsterdam, The Netherlands)
Hugo Gonçalo Oliveira (University of Coimbra, Portugal)
Dagmar Gromann (University of Vienna, Austria)
Thierry Declerck (DFKI, Germany)
Milan Dojchinovski (CTU in Prague, Czech Republic / DBpedia Association,
Germany)
Katerina Gkirtzou (Athena Research Center, Greece)
Jorge Gracia (University of Zaragoza, Spain)
Max Ionov (University of Cologne, Germany)
Diego Moussallem (Paderborn University)
Armando Stellato (University of Rome Tor Vergata, Italy)
Andon Tchechmedjiev (IMT École des Mines d’Alès, France)
In this newsletter:
In memoriam: Christopher Cieri 1963-2023
New publications:
Penn Korean Universal Dependency Treebank<https://catalog.ldc.upenn.edu/LDC2023T05>
DEFT English Light and Rich ERE Annotation<https://catalog.ldc.upenn.edu/LDC2023T04>
________________________________
In memoriam: Christopher Cieri 1963-2023
With deep sadness, LDC announces the passing of Christopher Cieri, our Executive Director. Chris led the Consortium for over 25 years, guiding its evolution from a small data repository and research hub to a prominent global data center.
An accomplished linguist, computer scientist, and a well-read humanist, Chris embodied the best qualities for executing the wide range of duties demanded by his leadership role. He was a valued colleague and friend and will be sorely missed.
All are welcome to visit our remembrance page<https://www.ldc.upenn.edu/christopher-cieri-1963-2023> for Chris.
________________________________
New publications:
Penn Korean Universal Dependency Treebank<https://catalog.ldc.upenn.edu/LDC2023T05> contains 5010 sentences and 132,041 tokens annotated in dependency format under the Universal Dependencies framework<https://universaldependencies.org/>. It is a conversion of Korean Treebank Annotations Version 2.0 (LDC2006T09)<https://catalog.ldc.upenn.edu/LDC2006T09>, which was produced in constituency format.
The source text is newswire stories from LDC's Korean Press Agency collection contained in Korean Newswire (LDC2000T45)<https://catalog.ldc.upenn.edu/LDC2000T45>. Sentences were automatically converted for dependency annotation; the output was manually checked. The corpus contains 112 files in CoNLL-U format<https://universaldependencies.org/format.html>, the Universal Dependencies standard, with a mapping to their counterpart in LDC2006T09.
2023 members can access this corpus through their LDC accounts. Non-members may license this data for a fee.
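For readers unfamiliar with CoNLL-U: each token occupies one line of ten tab-separated fields (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC), comment lines start with "#", and a blank line ends a sentence. The following is a minimal Python sketch of a reader under those rules. The function name read_conllu and the two-token English sample are illustrative assumptions; they are not part of the corpus or of any LDC tooling.

```python
from typing import Dict, Iterable, Iterator, List

# The ten CoNLL-U columns, in order.
FIELDS = ["id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc"]


def read_conllu(lines: Iterable[str]) -> Iterator[List[Dict[str, str]]]:
    """Yield one sentence at a time as a list of token dicts."""
    sentence: List[Dict[str, str]] = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():            # a blank line closes a sentence
            if sentence:
                yield sentence
                sentence = []
        elif line.startswith("#"):      # sentence-level comment or metadata
            continue
        else:
            sentence.append(dict(zip(FIELDS, line.split("\t"))))
    if sentence:                        # input without a trailing blank line
        yield sentence


# Invented sample for illustration only; not taken from the treebank.
sample = (
    "# text = Dogs bark\n"
    "1\tDogs\tdog\tNOUN\t_\tNumber=Plur\t2\tnsubj\t_\t_\n"
    "2\tbark\tbark\tVERB\t_\t_\t0\troot\t_\t_\n"
    "\n"
)

for sent in read_conllu(sample.splitlines()):
    print([(tok["form"], tok["head"], tok["deprel"]) for tok in sent])
```

The sketch keeps every field as a string and does not treat multiword-token ranges or empty nodes specially; a dedicated CoNLL-U library would be the safer choice for real work.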
*
DEFT English Light and Rich ERE Annotation<https://catalog.ldc.upenn.edu/LDC2023T04> was developed by LDC and consists of 1190 English discussion forum, newswire, and proxy documents annotated for entities, relations, and events (ERE). Light ERE annotation labels entity mentions for the target set of entity, relation, and event types between and among those entities, including coreference. Rich ERE annotation expands types and tagging in the entities, relations, and events annotation tasks and replaces strict event coreference with a more loosely defined event hopper annotation.
902 documents were annotated following Light ERE annotation guidelines. 288 documents were labeled with Rich ERE annotation in a second pass after being annotated for Light ERE. The source data consists of English discussion forum web text collected by LDC for the DARPA BOLT program and contained in BOLT English Discussion Forums (LDC2017T11)<https://catalog.ldc.upenn.edu/LDC2017T11>; newswire documents published in various data sets released in the TAC KBP project (Text Analysis Conference Knowledge Base Population)<https://www.ldc.upenn.edu/collaborations/past-projects/tac-kbp>; and proxy documents intended to mimic government analysis reports of newswire content published in DEFT Narrative Text (LDC2016T07)<https://catalog.ldc.upenn.edu/LDC2016T07>.
2023 members can access this corpus through their LDC accounts. Non-members may license this data for a fee.
To unsubscribe from this newsletter, log in to your LDC account<https://catalog.ldc.upenn.edu/login> and uncheck the box next to "Receive Newsletter" under Account Options or contact LDC for assistance.
Membership Coordinator
Linguistic Data Consortium<ldc.upenn.edu>
University of Pennsylvania
T: +1-215-573-1275
E: ldc(a)ldc.upenn.edu<mailto:ldc@ldc.upenn.edu>
M: 3600 Market St. Suite 810
Philadelphia, PA 19104
Measuring Meanings | Computing Concepts:
Practices of Operationalization and their Implications for Text Studies
All details can also be found here: https://cretaverein.de/mmcc/
Since the work of physicist Percy Bridgman (1927, 5), ›operationalization‹ has been used to refer to the practice of determining or measuring concepts by means of a »set of operations«. In Bridgman’s strong variant of operationalization, the meaning of a concept is synonymous with the operations used to measure it. In Bridgman’s view, such operational definitions are fundamental to all research in physics. The concept of length, for instance, would thus be defined by the operations which are necessary for measuring the length of a physical object. Early on, this position was intensively discussed (cf. Frank 1956), and also criticized on the grounds that, in extreme cases, each new method of measuring a concept amounts to a new operational definition: »it becomes a tautology that any measurement operation is the correct one for the concept associated with it« (Chang, Cartwright 2008, 367).
Text-oriented DH projects seem to align with a weaker variant of operationalization in that their activities are structured by clearly delineable sub-steps (cf. Pichler, Reiter 2022; Krautter 2022). Thereby, operationalization can both contribute to the definitional refining of (humanities’) concepts, and facilitate opportunities for their empirical examination. The workshop aims to address these questions from scientific, computational, and praxeological perspectives, and thus attempts to provide an overview of the different theoretical positions and practical approaches; in particular with regard to operationalization in the field of digital humanities and digital text analysis. We especially solicit contributions that develop their theoretical reflections by means of concrete data. Please refrain from submitting textual analyses that do not include a theoretical reflection on their operationalization practice.
Guiding questions include, but are not limited to:
• What is referred to as a concept in the text studying fields of the humanities? What is the role of such concepts in theory building?
• What is the function of quantitative, formal or computational analysis in terms of conceptualization in text studying fields?
• How does the practice of operationalization relate to traditional and current approaches to conceptualization in philosophy, e.g., Carnapian explication and conceptual engineering?
• What is the practice of operationalization in text studying fields of the humanities?
• How does operationalization interact with established machine learning workflows? Which understanding of operationalization is inherent in these workflows?
• How does operationalizing engage with interpreting?
• How do we compare and evaluate operationalizations?
• How can we conceptualize the ›agent‹ that conducts the measurement (e.g., computer vs. human)? What impact do different agents and their capacities have on our understanding of operationalization?
• What are the differences between expressing measurement rules in natural (such as annotation guidelines) and formal language in relation to the operationalized concepts? How do these as well as their guiding background assumptions affect our understanding of operationalization?
• Does the advent of large language models (such as BERT and GPT) change our notion of operationalization -- and if so, how?
Submission
We invite the submission of abstracts (1 page) in English on any of the above-mentioned or closely related topics. Abstracts should be submitted in PDF format to axel.pichler(a)ts.uni-stuttgart.de; they do not need to be anonymised (non-blind).
Prior to the workshop, the accepted abstracts must be extended into full papers (5000–6000 words), which will be circulated before the workshop. At the workshop, each paper will be presented briefly, followed by an in-depth discussion.
The revised full papers will be published. Further details on publication will follow after acceptance. For the specific deadlines, please see the timeline below.
Timeline
Abstract submission deadline: May 1st, 2023
Notification: May 15 2023
Paper submission: August 31 2023
Workshop: September 25/26 2023
Venue
Cologne
Please contact Axel Pichler (axel.pichler(a)ts.uni-stuttgart.de) for further questions.
Organizers
Axel Pichler, University of Stuttgart
Benjamin Krautter, University of Cologne
Nils Reiter, University of Cologne
References
Bridgman, Percy W.: The Logic of Modern Physics. New York 1927.
Chang, Hasok / Cartwright, Nancy: Measurement. In: The Routledge Companion to Philosophy of Science, ed. by Stathis Psillos / Martin Curd. Abingdon, New York 2008, 367–375.
Frank, Philipp G. (ed.): The Validation of Scientific Theories. Boston 1956.
Krautter, Benjamin: Die Operationalisierung als interdisziplinäre Schnittstelle der Digital Humanities. In: Scientia Poetica 26 (2022), 215–244.
Pichler, Axel / Reiter, Nils: From Concepts to Texts and Back: Operationalization as a Core Activity of Digital Humanities. In: Journal of Cultural Analytics 7.4 (2022), https://doi.org/10.22148/001c.57195.
***Second Call for Papers***
The 7th Workshop on Online Abuse and Harms (WOAH)
Location: Toronto, Canada
Date: Thursday, July 13, 2023 (co-located with ACL 2023)
Website: https://www.workshopononlineabuse.com/
==========================================================
Important Dates
==========================================================
- Submission due: May 2, 2023
- ARR reviewed submission due: May 22, 2023
- Notification of acceptance: May 26, 2023
- Camera-ready papers due: June 2, 2023
- Workshop: July 13, 2023
All deadlines are 11:59 PM AoE time.
Overview
==========================================================
The Workshop on Online Abuse and Harms (WOAH) invites paper submissions from a wide range
of fields, including natural language processing, machine learning, computational social
sciences, law, politics, psychology, sociology and cultural studies. We explicitly
encourage interdisciplinary submissions, technical as well as non-technical submissions,
and submissions that focus on under-resourced languages. We also invite non-archival
submissions and civil society reports.
The topics covered by WOAH include, but are not limited to:
- New models or methods for detecting abusive and harmful online content;
- Biases and limitations of existing detection models or datasets for abusive and harmful
online content, particularly those in commercial use;
- New datasets and taxonomies for online abuse and harms;
- Dynamics of online abuse and harms, as well as their impact on different communities;
- Social, legal, and ethical implications of detecting, monitoring and moderating online
abuse.
In addition, we invite submissions related to the theme for this seventh edition of WOAH,
which will be *subjectivity and disagreement in abusive language data*. Hate speech and
other forms of abuse are highly subjective. By choosing this theme, we want to encourage
submissions that analyse, address or make use of this subjectivity. To match the theme and
complement thematic submissions, we have invited a strong lineup of relevant speakers.
Submission Guidelines
==========================================================
Submission is electronic, using the Softconf START conference management system.
*Submission link*: https://softconf.com/acl2023/WOAH/
The workshop will accept three types of papers.
1) Academic Papers (long and short): Long papers of up to 8 pages, excluding references,
and short papers of up to 4 pages, excluding references. Unlimited pages for references
and appendices. Accepted papers will be given an additional page of content to address
reviewer comments. Previously published papers cannot be accepted.
2) Non-Archival Submissions: Up to 2 pages, excluding references, to summarise and
showcase in-progress work and work published elsewhere.
3) Civil Society Reports: Non-archival submissions, with a minimum of 2 pages and no upper
limit. Can include work published elsewhere.
All submissions must use the official ACL 2023 style files. Submissions that do not
conform to the required styles, including paper size, margin width, and font size
restrictions, will be rejected without review. All submissions should adhere to the
workshop policies https://www.workshopononlineabuse.com/policies.html.
All submissions, except for civil society reports, must be fully anonymised.
Self-references that reveal the author's identity, e.g., "We previously showed
(Smith, 1991) ...", should be avoided. Instead, use citations such as "Smith
previously showed (Smith, 1991) ...".
Following the ACL 2023 guidelines, we believe that it is also important to discuss the
limitations of your work, in addition to its strengths. The “Limitations” section will
appear at the end of the paper, after the discussion/conclusions section and before the
references, and will not count towards the page limit.
Multiple Submissions Policy
==========================================================
The workshop allows for multiple submissions.
Papers that have been or will be presented at other venues may only be presented as
non-archival. Papers that are presented at the main conference (ACL 2023) can be presented
at the workshop as non-archival.
Organizers
==========================================================
Yi-Ling Chung, The Alan Turing Institute
Aida Mostafazadeh Davani, Google
Debora Nozza, Bocconi University
Paul Röttger, University of Oxford
Zeerak Talat, Digital Democracies Institute, Simon Fraser University
Please send any questions about the workshop to organizers(a)workshopononlineabuse.com
> [Apologies for cross-posting]
>
> ======================================================================
> CALL FOR PAPERS - SIMBig 2023
> ======================================================================
>
> SIMBig 2023 - 10th International Conference on Information Management and Big Data
> Where: Tecnológico de Monterrey, Mexico City, Mexico
> When: August 30 - September 01, 2023
> Website: https://simbig.org/SIMBig2023/
>
> ======================================================================
>
> OVERVIEW
> ----------------------------------
>
> SIMBig 2023 seeks to present new methods of Artificial Intelligence (AI), Data Science, and related fields, for analyzing, managing, and extracting insights and patterns from large volumes of data.
>
>
> KEYNOTE SPEAKERS
> ----------------------------------
>
> Mona Diab, Meta AI, USA
> Carlos Coello, TEC Monterrey, Mexico
> Finale Doshi-Velez, Harvard University, USA
> Huan Liu, Arizona State University, USA
>
> IMPORTANT DATES
> ----------------------------------
>
> June 24, 2023 --> Full papers and short papers due
> July 28, 2023 --> Notification of acceptance
> August 11, 2023 --> Camera-ready versions
> August 30 - September 01, 2023 --> Conference held in Mexico DF, Mexico
>
> PUBLICATION
> ----------------------------------
>
> All accepted papers of SIMBig 2023 (including the special tracks) will be published in the Springer CCIS Series <https://www.springer.com/series/7899>.
> Best papers of SIMBig 2023 (including the special tracks) will be invited to submit an extended version for publication in the Springer SN Computer Science journal <https://www.springer.com/journal/42979>.
>
> TOPICS OF INTEREST
> ----------------------------------
>
> SIMBig 2023 has a broad scope. We invite contributions on theory and practice, including but not limited to the following technical areas:
>
> Artificial Intelligence
> Data Science
> Machine Learning
> Natural Language Processing
> Semantic Web
> Healthcare Informatics
> Biomedical Informatics
> Data Privacy and Security
> Information Retrieval
> Ontologies and Knowledge Representation
> Social Networks and Social Web
> Information Visualization
> OLAP and Business intelligence
> Data-driven Software Engineering
>
> SPECIAL TRACKS
> ----------------------------------
>
> SIMBig 2023 proposes 5 special tracks in addition to the main conference:
>
> SNMAM <https://simbig.org/SIMBig2023/en/snmam.html> - Social Network and Media Analysis and Mining
> ANLP - Applied Natural Language Processing
> CIIN - Cybersecurity and IoT for Intelligent Networks
> DISE - Data-drIven Software Engineering
> EE-AI-HPC - Efficiency Enhancement for AI and High-Performance Computing
>
> CONTACT
> ----------------------------------
>
> SIMBig 2023 General Chairs
>
> Juan Antonio Lossio-Ventura, National Institutes of Health, USA (juan.lossio(a)nih.gov <mailto:juan.lossio@nih.gov>)
> Hugo Alatrista-Salas, Pontificia Universidad Católica del Perú, Peru (halatrista(a)pucp.pe <mailto:halatrista@pucp.pe>)
>
International Conference on Human-Informed Translation and Interpreting Technology (HiT-IT 2023)
Naples, Italy, 7, 8 and 9 July 2023
*** SUBMISSION DEADLINE EXTENDED TO 30 APRIL 2023***
The International Conference on Human-Informed Translation and Interpreting Technology (HiT-IT 2023) will take place in Naples, Italy between 7 and 9 July 2023. The conference will be preceded by tutorials on 6 July 2023.
HiT-IT seeks to act as a meeting point for (and invites) researchers working in translation and interpreting technologies, practicing technology-minded translators and interpreters, companies and freelancers providing services in translation and interpreting as well as companies developing tools for translators and interpreters. In addition to the accepted papers for presentation, HiT-IT will feature invited talks by prominent experts as well as presentations and panels hosted by practitioners.
For more details and for the main conference topics please visit the conference website
http://hit-it-conference.org/
Submissions and publication
The conference invites the following types of submissions reporting original unpublished work.
User papers for industry and practitioners ranging between 2 and 4 pages (without references). References to related work are optional.
Academic submissions, in three different categories (have to follow formatting requirements, references to related work are required):
• (academic) full papers: describing original completed research. Allowed paper length: maximum 12 pages (without references).
• (academic) work-in-progress papers – describing work in progress, late breaking research, papers at a more conceptual stage, and other types of papers that do not fit in the ‘full’ papers category. Allowed paper length: maximum 7 pages (without references).
• (academic) demo papers – describing working systems. Allowed paper length: maximum 5 pages (without references). In addition to the papers, the authors will be expected to demonstrate the systems at the conference.
The conference will not consider the submission and evaluation of abstracts only.
The accepted papers will be published in the conference proceedings and made available online on the conference website. We plan to invite the authors of the best papers to submit extended versions to a special issue of a prestigious journal.
Important dates
Submission deadline: 30 April 2023
Notification of acceptance: 31 May 2023
Final version due: 10 June 2023
Early fee deadline: 20 June 2023
Conference dates: 7, 8 and 9 July 2023
Tutorials: 6 July 2023
Keynote speakers
Jochen Hummel (Coreon)
Tharindu Ranasinghe (Aston University)
Invited tutorials
Felix do Carmo (University of Surrey): Neural Machine Translation
Alina Karakanta (Leiden University Centre for Linguistics): Automatic subtitling
Conference Chairs
Gloria Corpas Pastor (University of Malaga)
Ruslan Mitkov (University of Wolverhampton)
Johanna Monti (University of Naples L’Orientale)
Constantin Orasan (University of Surrey)
Organising Committee
Dayana Abuin Rios (University of Malaga)
Khadija Ait Elqih (University of Naples L’Orientale)
Anastasia Bezobrazova (University of Malaga)
Meriem Boulekhoukh (University of Oran)
Rocío Caro Quintana (University of Wolverhampton)
Amal El Farhmat (University of Malaga)
Lilit Kharatian (University of Malaga)
Alfiya Khabibullina (University of Malaga)
Nikolai Nikolov (INCOMA Ltd.)
Daria Sokova (New Bulgarian University)
Giulia Speranza (University of Naples L’Orientale)
Sponsors
Pangeanic, El-Translations and Juremy are the official sponsors of the conference.
Venue
The conference will take place at the Palazzo del Mediterraneo, University of Naples
Further information and contact details
Registration for HiT-IT 2023 is now open. To register, please complete the registration form.
See the conference website (http://hit-it-conference.org/home) for more details; you can also email 2023(a)hit-it-conference.org<mailto:2023@hit-it-conference.org>.
The deadline is approaching: April 25, 2023
*IACT’23: Human or AI? Calling for research papers on implicit authorship
disambiguation in IR *
*Call for Papers: The 1st International Workshop on Implicit Author
Characterization from Texts for Search and Retrieval (IACT’23) *
The workshop will be held in conjunction with the 46th International ACM
SIGIR Conference on Research and Development in Information Retrieval
Workshop website: https://en.sce.ac.il/news/iact23
July 27, 2023. Taipei, Taiwan.
*Paper submission deadline: April 25, 2023, AoE*
Submission link: https://easychair.org/conferences/?conf=iact23
To bring the research community's attention to the limitations of current
models in recognizing and characterizing AI vs. human authors, we organize
the first edition of IACT workshops under the umbrella of the SIGIR
conference. Research works submitted to the workshop should foster
scientific advances in all aspects of author characterization.
All papers must be original and not simultaneously submitted to another
journal or conference. The following paper categories are welcome:
- *Full research papers*: up to 8 pages. Original and high-quality
unpublished contributions to the theory and practical aspects of the
workshop topics.
- *Short research papers*: up to 5 pages. These can describe ongoing
research, resources, and demos.
- *Negative results papers*: up to 5 pages. Papers highlighting tested
hypotheses that did not yield the expected outcome are also welcome.
- *Position papers*: up to 5 pages. Discussing current and future
research directions.
The length constraints do not include references.
The submissions must be anonymous and will be peer-reviewed by at least two
program committee members.
The authors of accepted papers will be given 15 minutes for a short oral
presentation. The workshop will run as a hybrid event to allow virtual
attendance and meet the SIGIR format.
Research works submitted to the workshop should foster the scientific
advance on all aspects of implicit author information extraction from text,
including but not limited to the following:
- Differentiation between AI-generated content and human-generated
content and bot profiling
- Characterization of conversational agents
- Feature detection of authors for human vs. AI determination
- Prompt understanding and recognition in language models
- Personalized question answering and conversation generation
- Troll identification on social media
- Review authenticity estimation
- Multi-modal, multi-genre, and multilingual author analysis
- Character analysis, description, and representation in narrative texts
- Detecting implicit expressions of sentiment, emotion, opinion, and bias
- Transfer learning for implicit author characterization
- Implicit author characterization annotation schema
- Evaluation of implicit author characterization
- Author characterization in low-resource languages and under-studied
domains
- Accountability and regulation of AI-based information extraction,
retrieval, and content generation
- Copyright issues of AI-generated content
- Ethical and privacy implications of author characterization and
implicit information extraction
- Fairness and bias of AI-generated content
Organizing Committee:
- Marina Litvak - marinal(a)ac.sce.ac.il; Shamoon College of Engineering
Beer Sheva; Israel
- Irina Rabaev - irinar(a)ac.sce.ac.il; Shamoon College of Engineering
Beer Sheva; Israel
- Alípio Mário Jorge - amjorge(a)fc.up.pt; University of Porto; Porto,
Portugal
- Ricardo Campos - ricardo.campos(a)ipt.pt; Polytechnic Institute of Tomar
INESC TEC, Portugal; Porto, Portugal
- Adam Jatowt - adam.jatowt(a)uibk.ac.at; University of Innsbruck;
Innsbruck, Austria
Invited Speakers:
- Prof. Mark Last - Ben-Gurion University of the Negev, Israel
- Prof. Dr. Valia Kordoni - Humboldt-Universität Berlin, Germany
Contact:
- Dr. Marina Litvak: litvak.marina(a)gmail.com
- Dr. Irina Rabaev: irinar(a)ac.sce.ac.il
--
Best regards,
Marina Litvak
when it comes to corpora research the response time to queries such as:
* what is the character on the nth offset of a file
* which are all the other characters preceding and following that
one by m offsets or up to a certain char or pattern ...
* what is the intra- and inter-textuality of a given segment of characters
. . .
and many other related ones, should be "zero comma nada" (they should
run instantly), but I think this is virtually impossible because texts
these days (say, PDF files) are, basically, visually appealing
containers of streams of data displayed by rendering engines; HTML
files contain all that javascript cr@p, google goo, ads, insufferably
idiotic "we care about your privacy" road blocks, ...
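
For text that has already been stripped down to a plain string held in memory, the first two queries do reduce to simple indexing and slicing. A minimal Python sketch, assuming the extraction from PDF or HTML has been done beforehand (the names char_at and context are made up for illustration, and offsets here count Unicode code points, not bytes):

```python
from typing import Tuple


def char_at(text: str, n: int) -> str:
    """Return the character at 0-based offset n of an in-memory text."""
    return text[n]


def context(text: str, n: int, m: int) -> Tuple[str, str]:
    """Return up to m characters before and up to m characters after offset n."""
    start = max(0, n - m)               # clamp at the start of the text
    return text[start:n], text[n + 1:n + 1 + m]


sample = "en las letras de 'rosa' está la rosa"
print(char_at(sample, 18))
print(context(sample, 18, 3))
```

The hard part is still everything that comes before this step: obtaining a faithful plain-text string from the container format in the first place.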
I haven't found a convincing explanation as to why that is the case,
and I can't quite understand why it is that the MVC pattern is well
understood when it comes to software design, but people can't
apparently fully separate the text from its presentation when it comes
to documents.
"Web as corpus" folks:
https://www.researchgate.net/publication/276511711_Maristella_Gatto_Web_as_…
don't even attempt to address those issues. At the end of the day as
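Borges said:
" ... el nombre es arquetipo de la cosa en las letras de 'rosa' está
la rosa y todo el Nilo en la palabra 'Nilo'"
[" ... the name is the archetype of the thing, the rose is in the
letters of 'rose' and the whole Nile in the word 'Nile'"]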
so, let's get down to first manage to get one character in a text
after the other ...
lbrtchx
International workshop NLP for translation and interpreting applications (NLP4TIA)
Varna, Bulgaria 7/8 September 2023
https://nlp4tia.web.uah.es/
First Call for Papers
In the last two decades we have witnessed a technological turn in translation and interpreting studies, with Natural Language Processing (NLP) and deep learning playing an increasingly prominent part. A growing number of NLP applications are already used to support the work of translators and interpreters. In addition, recent advances in deep learning, and its latest models, have powered the further development and success of high-performing Neural Machine Translation (NMT) systems.
Translation technology has revolutionised the translation profession and nowadays most professional translators employ tools such as translation memory (TM) systems in their daily work. Latest advances of Neural Machine Translation (NMT) have resulted in NMT not only becoming an integral part of most state-of-the art TM tools but also typical for the translation workflow of many companies, organisations and freelance translators.
Although translation has benefited more from technological advances, interpreting has also experienced a technological turn. However, it has not been until some years ago that soft technology has permeated interpreting practice and research. Computer assisted translation, MT and NLP tools have been adapted to be used by interpreters. In addition, corpus-based studies have also underpinned dialogue interpreting.
The increasing interest in NLP, MT and the automation of processes has brought us to multidisciplinary projects that deal with the development of models for automated oral communication. Machine interpreting has already been developed and is being improved, focusing on speed and accuracy matters. Either domain-specific (commercial, military, humanitarian) or general (Skype Translator), there is still a long way to go to render machine interpreting more human-like.
Many of the above recent developments have to do with the employment of Natural Language Processing tools and resources to support the work of translators and interpreters. This workshop is expected to discuss the growing importance of NLP in different translation and interpreting scenarios.
Workshop topics
The workshop invites submissions reporting original unpublished work on topics including but not limited to:
- NLP and MT for under-resourced languages;
- Translation Memory systems;
- NLP and MT for translation memory systems;
- NLP for CAT and CAI tools;
- Integration of NLP tools in remote interpreting platforms;
- NLP for dialogue interpreting;
- Development of NLP based applications for communication in public service settings (healthcare, education, law, emergency services);
- Corpus-based studies applied to translation and interpreting;
- Machine translation and machine interpreting;
- Resources for translation and machine translation;
- Resources for interpreting and interpreting technology application;
- Quality estimation of human and machine translation;
- Post-editing strategies and tools;
- Automatic post-editing of MT;
- NLP and MT for subtitling;
- Technology acceptance by interpreters and translators;
- Machine Translation and translation tools for literary texts;
- Evaluation of machine translation and translation and interpreting tools in general;
- The impact of the technological turn in translation and interpreting;
- Cognitive effort and eye-tracking experiments in translation and interpreting;
- Development of models for research and practice of translation and interpreting;
- Multidisciplinary cooperation in NLP applied to translation and interpreting.
Submissions and publication
Submissions must consist of full-text papers and should not exceed 7 pages excluding references; they should be a minimum of 5 pages long. The accepted papers will be published as NLP4TIA workshop e-proceedings with an ISBN, will be assigned a DOI and will also be available at the time of the conference. The papers should be in English and should be submitted via the conference management system START.
Authors of accepted papers will receive guidelines regarding how to produce camera-ready versions of their papers for inclusion in the proceedings.
Each submission will be reviewed by at least two programme committee members. Accepted papers will be presented orally as part of the programme of the workshop.
Submissions should comply with the templates listed below and should be uploaded as PDF files in START (START is configured to accept PDF files only). The following templates should be used: LaTeX on Overleaf, LaTeX, MS Office.
Important dates
Deadline for paper submission: 10 July 2023
Acceptance notification: 5 August 2023
Final camera-ready version: 25 August 2023
Workshop camera-ready proceedings ready: 31 August 2023
NLP4TIA workshop: 7/8 September 2023
Workshop Chairs
Raquel Lázaro Gutiérrez (University of Alcala)
Antonio Pareja Lora (University of Alcala)
Ruslan Mitkov (University of Wolverhampton)
Programme Committee
Cristina Aranda (Big Onion)
Juanjo Arevalillo (Hermes Traducciones)
Silvia Bernardini (University of Bologna)
Gabriel Cabrera Méndez (Dualia Teletraducciones)
Matt Coler (University of Groningen)
Elena Davitti (University of Surrey)
Joanna Drugan (Heriot-Watt University)
Marie Escribe (LanguageWire)
Claudio Fantinuoli (Mainz University/KUDO Inc.)
Antonio García Cabot (University of Alcala)
Adriana Jaime Pérez (Migralingua Voze)
Miguel Ángel Jiménez Crespo (Rutgers University)
Óscar Luis Jiménez Serrano (University of Granada)
Koen Kerremans (Free University Brussel)
Maria Kunilovskaya (Saarland University)
Els Lefever (Ghent University)
Pilar León Arauz (University of Granada)
Johanna Monti (University of Naples L'Orientale)
Elena Montiel Ponsoda (Polytechnic University of Madrid)
Helena Moniz (University of Lisbon)
Elena Murgolo (Orbital 14)
Dora Murgu (Interprefy)
Constantin Orasan (University of Surrey)
María Teresa Ortego Antón (University of Valladolid)
Tharindu Ranasinghe (Aston University)
Celia Rico (Universidad Complutense de Madrid)
Caroline Rossi (University Grenoble les Alpes)
María del Mar Sánchez Ramos (University of Alcala)
Miriam Seghiri (University of Malaga)
Vilelmini Sosoni (Ionian University)
Rui Manuel Sousa Silva (University of Porto)
Nicoletta Spinolo (University of Bologna)
Venue
The workshop will take place at the Hotel Cherno More in Varna.
Further information and contact details
Registration for NLP4TIA is now open and is done via the RANLP main conference page. To register, please complete the registration form.
The conference website (https://nlp4tia.web.uah.es/) will be updated on a regular basis. For further information, please email nlp4tia@uah.es.
###################
########## Call for Papers
######## Special Session
###### EnGeoData'2023: Geospatial data analysis under the umbrella of One Health
#### https://simbig.org/engeodata/2023
###
## IEEE DSAA 2023
# The 10th IEEE International Conference on Data Science and Advanced Analytics
# October 9-13, 2023, Thessaloniki, Greece
### AIMS AND TOPICS
1. Abstract
In the health domain, the current context of urbanization, globalization, high mobility/trade, and climate change favors the (re-)emergence of known and unknown diseases. Thus, geospatial and environmental data analysis for One Health is crucial to provide insights into the connections between humans, animals, and the environment. This type of analysis allows us to identify and monitor health issues that arise due to the interactions between these three areas. However, it is challenging due to: (1) the multi-modality of the data (e.g., unstructured, imaging, semantic, spatial, temporal, among others); and (2) the difficulty in choosing the “most appropriate” knowledge discovery process according to specific field needs (e.g., animal, plant or human health; crisis and disaster surveillance).
EnGeoData 2023 aims to bring together high-quality research that addresses the challenges mentioned above with theoretical and/or experimental approaches.
2. Topics
Topics of interest include (but are not limited to):
- Pre- and post-processing of environmental data
- Geographical information retrieval
- Spatial data mining, spatial data warehousing, and spatial data lake
- Knowledge discovery use-cases applied to environmental data
- Spatial text mining
- Spatial ontology
- Spatial recommendation and personalization
- Visual analytics for geo-spatial data
- Dedicated applications:
* Spatio-temporal analytics platform
* Agricultural decision support systems
* Urban traffic systems
* Trajectory analysis
* Land-use and urban policies
* Land-use and urban planning analysis
* Spatio-temporal analysis in ecology and agriculture
* Disease surveillance systems (One Health)
### SUBMISSION
All papers should be submitted electronically via EasyChair (https://easychair.org/my/conference?conf=dsaa2023) under the “Special Session” track.
- Paper Submission Deadline: May 22, 2023
- Paper Notification: July 17, 2023
- Paper Camera Ready Due: August 7, 2023
The length of each paper submitted to the special session should be no more than ten (10) pages and should be formatted following the standard 2-column U.S. letter style of IEEE Conference template. For further information and instructions, see the IEEE Proceedings Author Guidelines.
All submissions will be blind reviewed by the Program Committee on the basis of technical quality, relevance to the session’s topics of interest, originality, significance, and clarity. Author names and affiliations must not appear in the submissions, and bibliographic references must be adjusted to preserve author anonymity. Submissions failing to comply with the paper formatting and author anonymity requirements will be rejected without review.
### CHAIRS
- Mathieu Roche, CIRAD, TETIS, France
- Antonio Lossio-Ventura, National Institutes of Health, USA
- Hamid Laga, Murdoch University, Australia
- Maguelonne Teisseire, INRAE, TETIS, France
For questions, please contact us at engeodata@teledetection.fr