We are delighted to invite you to ICNLSP 2024
<https://www.icnlsp.org/2024welcome/>, the 7th edition of the International
Conference on Natural Language and Speech Processing, which will be held at
the University of Trento from October 19th to 20th, 2024 (*HYBRID*).
*Topics*
- Signal processing, acoustic modeling.
- Speech recognition (Architecture, search methods, lexical modeling,
language modeling, language model adaptation, multimodal systems,
applications in education and learning, zero-resource speech recognition,
etc.).
- Speech Analysis.
- Paralinguistics in Speech and Language (Perception of paralinguistic
phenomena, analysis of speaker states and traits, etc.).
- Spoken Dialog Systems and Conversational Analysis.
- Speech Translation.
- Speech synthesis.
- Speaker verification and identification.
- Language identification.
- Speech coding.
- Speech enhancement.
- Speech intelligibility.
- Speech perception.
- Speech production.
- Brain studies on speech.
- Phonetics, phonology and prosody.
- Speech and hearing disorders.
- Paralinguistics of pathological speech and language.
- Speech technology for disordered speech/hearing.
- Cognition and natural language processing.
- Machine translation.
- Text categorization.
- Summarization.
- Sentiment analysis and opinion mining.
- Computational Social Web.
- Arabic dialects processing.
- Under-resourced languages: tools and corpora.
- Large language models.
- Arabic OCR.
- NLP tools for software requirements and engineering.
- Knowledge fundamentals.
- Knowledge management systems.
- Information extraction.
- Data mining and information retrieval.
- Lexical semantics and knowledge representation.
- Requirements engineering and NLP.
- NLP for Arabic heritage documents.
*Submission*
Papers must be submitted via:
https://cmt3.research.microsoft.com/ICNLSP2024/
Each submitted paper will be reviewed by three program committee members. The
reviewing process is double-blind. Authors can use the *ACL format*: *LaTeX
<https://www.icnlsp.org/ACL%202023%20Proceedings%20Template.zip>* or Word.
Authors have the choice to submit their paper as a long or a short
paper. Long papers consist of up to 8 pages of content + references; short
papers, up to 4 pages of content + references.
*Important dates*
*Submission deadline:* 30 June 2024, 11:59 PM (GMT)
*Notification of acceptance:* 15 September 2024
*Camera-ready paper due:* 25 September 2024
*Conference dates:* 19, 20 October 2024
*Publication*
1- All accepted papers will be published in the *ACL Anthology
<https://aclanthology.org/>*.
2- Selected papers will be published (after extension) in:
  2-a- A *SPECIAL ISSUE* of the Machine Learning and Knowledge Extraction
journal (MAKE) <https://www.mdpi.com/journal/make>, indexed in *Web of Science
<https://mjl.clarivate.com/search-results>*, *Scopus*
<https://www.scopus.com/sources.uri>, etc.
  *Special issue title*: Knowledge Graphs and Large Language Models
<https://www.mdpi.com/journal/make/special_issues/POB4VNE0QP>.
  2-b- Signals and Communication Technology (Springer), indexed in
*Scopus* <https://www.scopus.com/> and *zbMATH* <https://zbmath.org/>.
Dear all,
We are happy to invite you to participate in the Shared Task on Quality Estimation at WMT'24.
The details of the task can be found at: https://www2.statmt.org/wmt24/qe-task.html
New this year:
* We introduce a new language pair (zero-shot): English-Spanish
* Continuing from the previous edition, we will also analyse the robustness of submitted QE systems to a range of phenomena, from hallucinations and biases to localized errors, which can significantly impact real-world applications.
* We also introduce a new task, seeking not only to detect but also to correct errors: Quality-aware Automatic Post-Editing! We invite participants to submit systems capable of automatically generating QE predictions for machine-translated text and the corresponding output corrections.
2024 QE Tasks:
Task 1 -- Sentence-level quality estimation
This task follows the same format as last year but with fresh test sets and a new language pair: English-Spanish. We will test the following language pairs:
* English to German (MQM)
* English to Spanish (MQM)
* English to Hindi (MQM & DA)
* English to Gujarati (DA)
* English to Telugu (DA)
* English to Tamil (DA)
More details: https://www2.statmt.org/wmt24/qe-subtask1.html
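For teams getting started on the sentence-level task, here is a minimal, purely illustrative sketch of a reference-free baseline using the open-source CometKiwi model via the unbabel-comet package. It is not an official baseline and does not reflect the required submission format; the package API and the gated Unbabel/wmt22-cometkiwi-da checkpoint (which requires accepting its license on the Hugging Face Hub) are assumptions on our part.

```python
# Illustrative reference-free sentence-level QE baseline (NOT an official
# baseline for the shared task). Assumes `pip install unbabel-comet` and
# access to the gated Unbabel/wmt22-cometkiwi-da checkpoint.
from comet import download_model, load_from_checkpoint

model_path = download_model("Unbabel/wmt22-cometkiwi-da")
model = load_from_checkpoint(model_path)

# Each item pairs a source sentence with its machine translation; no
# reference translation is needed for quality estimation.
data = [
    {"src": "The cat sat on the mat.", "mt": "El gato se sentó en la alfombra."},
    {"src": "He signed the contract yesterday.", "mt": "Él firmó el contrato ayer."},
]

output = model.predict(data, batch_size=8, gpus=0)  # gpus=1 if a GPU is available
print(output.scores)        # one predicted quality score per segment
print(output.system_score)  # corpus-level average
```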
Task 2 -- Fine-grained error span detection
Sequence labelling task: predict the error spans in each translation and the associated error severity: Major or Minor.
We will test the following language pairs:
* English to German (MQM)
* English to Spanish (MQM)
* English to Hindi (MQM)
More details: https://www2.statmt.org/wmt24/qe-subtask2.html
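One common way to frame this task is token classification. The snippet below is a rough inference-only sketch under explicit assumptions: "your-org/qe-span-tagger" is a hypothetical checkpoint fine-tuned on the task data with an {O, MINOR, MAJOR} label scheme, and the mapping back to character offsets relies on a fast tokenizer. It is not an official baseline or the required submission format.

```python
# Rough sketch: error span detection framed as token classification.
# "your-org/qe-span-tagger" is a hypothetical checkpoint fine-tuned with
# labels {O, MINOR, MAJOR}; this is NOT an official baseline.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

model_name = "your-org/qe-span-tagger"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)

mt = "Das ist eine Beispielübersetzung mit Fehlern."
enc = tokenizer(mt, return_offsets_mapping=True, return_tensors="pt", truncation=True)
offsets = enc.pop("offset_mapping")[0]  # character offsets per token

with torch.no_grad():
    pred_ids = model(**enc).logits.argmax(dim=-1)[0]

# Report the character span and severity for every token flagged as an error.
for (start, end), pred in zip(offsets.tolist(), pred_ids.tolist()):
    label = model.config.id2label[pred]
    if start != end and label != "O":  # (0, 0) offsets mark special tokens
        print(f"{label}: chars [{start}, {end}) -> {mt[start:end]!r}")
```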
Task 3 -- Quality-aware Automatic Post-editing
We expect submissions of post-edits that correct the detected error spans in the original translation. Although the task focuses on quality-informed APE, we also allow participants to submit APE output without QE predictions in order to understand the impact of their QE system. Submissions without QE predictions will also be considered official.
We will test the following language pairs:
* English to Hindi
* English to Tamil
More details: https://www2.statmt.org/wmt24/qe-subtask3.html
Important dates:
1. Test sets will be released on July 15th.
2. Participants can submit their systems by July 23rd on CodaLab.
3. System paper submissions are due by August 20th (aligned with WMT deadlines).
Note: Like last year, we aligned with the General MT and Metrics shared tasks to facilitate cross-submission on the common language pairs: English-German, English-Spanish, and English-Hindi (MQM).
We look forward to your submissions. Feel free to contact us if you have any questions!
Best wishes,
on behalf of the organisers.
The original post is here: https://www.informatik.tu-darmstadt.de/ukp/ukp_home/jobs_ukp/2021_associate…
Are you passionate about making a difference in the field of mental health through cutting-edge research in AI and Natural Language Processing? Do you have a strong background in computer science, data science, or a related field? If so, we invite you to join our dynamic and interdisciplinary team at the Technical University of Darmstadt!
Position: Full-Time Research Assistant (i.e., doctoral candidate or PhD student)
Duration: 1 October 2024 (or soon thereafter) to 31 December 2027, with the possibility of extension.
Location: Department of Computer Science, Technical University of Darmstadt
Responsibilities:
- Conduct cutting-edge research in NLP with a focus on mental health applications.
- Focus on research topics, such as NLP and knowledge discovery for mental health, large language models for clinical applications, and multimodal clinical data analysis.
- Develop and implement algorithms for analyzing therapist-patient conversations.
- Collaborate with a diverse team of researchers from TU Darmstadt and other partner institutions.
Ecosystem: We are part of DYNAMIC, the newly approved interdisciplinary LOEWE-funded center “Dynamic Network Approach of Mental Health to Stimulate Innovations for Change.” Our mission is to advance the understanding and treatment of mental health disorders using AI, NLP, and multimodal data analysis.
Team: Dr. Shaoxiong Ji (https://www.helsinki.fi/~shaoxion/) will join TU Darmstadt this fall and establish a junior independent research group focusing on foundation models and their applications in areas such as healthcare. He has a wide range of research directions, including NLP for health, multilingual LLMs, and learning methods such as federated learning, multitask learning, and meta-learning. The newly established research group will collaborate closely with the research labs led by Prof. Iryna Gurevych and Prof. Kristian Kersting, as well as with partners under the umbrella of the DYNAMIC project.
Qualifications:
- A Master’s degree in Computer Science, Data Science, AI, NLP, or a related field.
- Strong programming skills in Python or other relevant languages.
- Experience with deep learning frameworks.
- Excellent problem-solving abilities and a passion for research.
- Previous experience in clinical NLP or multimodal data analysis is a plus but not required.
- Strong communication skills and the ability to work effectively in a collaborative environment.
What We Offer:
- An exciting opportunity to contribute to impactful research in mental health.
- A supportive and collaborative research environment.
- Opportunities for professional development and growth within the DYNAMIC project and beyond.
How to Apply: If you are enthusiastic about joining our team and contributing to groundbreaking research, please submit the following documents:
- Detailed CV
- Degree certificates and transcripts for your Bachelor’s and Master’s studies
- Cover letter outlining your motivation and relevant experience
- Contact information for at least two academic or professional references
Please send your application to Shaoxiong Ji <shaoxiong.ji(a)outlook.com> by July 31st, 2024; after that date, the positions will remain open until filled. Applications will be reviewed as soon as they are submitted.
Join us in making a real-world impact on mental health through the power of AI and NLP!
Shared task on Multilingual Grammatical Error Correction (MultiGEC-2025)
We invite you to participate in the shared task on Multilingual Grammatical Error Correction, MultiGEC-2025, covering over 10 languages, including Czech, English, Estonian, German, Icelandic, Italian, Latvian, Slovene, Swedish and Ukrainian.
The results will be presented on March 5 (or 2), 2025, at the NLP4CALL workshop, co-located with the NoDaLiDa conference (https://www.nodalida-bhlt2025.eu/conference) to be held in Tallinn, Estonia, on 2--5 March 2025.
The publication venue for system descriptions will be the proceedings of the NLP4CALL workshop.
Official system evaluation will be carried out on CodaLab.
* TASK DESCRIPTION
In this shared task, your goal is to rewrite learner-written texts to make them grammatically correct, or both grammatically correct and idiomatic; that is, you may either adhere to the "minimal correction" principle or apply fluency edits.
For instance, the text
> My mother became very sad, no food. But my sister better five months later.
can be corrected minimally as
> My mother became very sad, and ate no food. But my sister felt better five months later.
or with fluency edits as
> My mother was very distressed and refused to eat. Luckily, my sister recovered five months later.
For fair evaluation of both approaches to the correction task, we will provide two evaluation metrics, one favoring minimal correction, one suited for fluency-edited output (read more under Evaluation).
We particularly encourage development of multilingual systems that can process all (or several) languages using a single model, but this is not a mandatory requirement to participate in the task.
* DATA
We provide training, development and test data for each of the languages. The training and development dataset splits will be made available through GitHub. Evaluation will be performed on a separate test set.
See website for more detailed information: https://github.com/spraakbanken/multigec-2025/
* EVALUATION
During the shared task, evaluation will be based on cross-lingually applicable automatic metrics, primarily:
- GLEU score (reference-based)
- Scribendi score (reference-free)
For comparability with previous results, we will also provide F0.5 scores.
After the shared task, we also plan on carrying out a human evaluation experiment on a subset of the submitted results.
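As a rough illustration of the reference-based metric, the snippet below computes corpus-level GLEU with NLTK over whitespace-tokenized corrections. This is only a stand-in under our assumptions (NLTK's implementation, naive tokenization); the official CodaLab evaluation may tokenize and aggregate differently.

```python
# Rough illustration of reference-based GLEU scoring with NLTK; the official
# MultiGEC-2025 evaluation may use different tooling and tokenization.
from nltk.translate.gleu_score import corpus_gleu

# One or more reference corrections per segment, plus one system hypothesis.
references = [["My mother became very sad, and ate no food."]]
hypotheses = ["My mother became very sad and ate no food."]

ref_tokens = [[ref.split() for ref in refs] for refs in references]
hyp_tokens = [hyp.split() for hyp in hypotheses]

print(f"GLEU: {corpus_gleu(ref_tokens, hyp_tokens):.4f}")
```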
* TIMELINE (preliminary)
- June 18, 2024 - first call for participation
- September 20, 2024 - second call for participation
- October 20, 2024 - third call for participation. Training and validation data released, CodaLab opens for team registrations
- October 30, 2024 - reminder. Validation server released online
- November 13, 2024 - test data released
- November 20, 2024 - system submission deadline (system output)
- November 29, 2024 - results announced
- December 20, 2024 - paper submission deadline with system descriptions
- January 20, 2025 - paper reviews sent to the authors
- February 7, 2025 - camera-ready deadline
- March 5 (or March 2), 2025 - presentations of the systems at the NLP4CALL workshop
* PUBLICATION
We encourage you to submit a paper with your system description to the NLP4CALL workshop special track. We follow the same requirements for paper submissions as the NLP4CALL workshop, i.e. we use the same template and apply the same page limit. All papers will be reviewed by the organizing committee. Upon paper publication, we encourage you to share models, code, fact sheets, extra data, etc. with the community through GitHub or other repositories.
* ORGANIZERS
- Arianna Masciolini, University of Gothenburg, Sweden
- Andrew Caines, University of Cambridge, UK
- Orphée De Clercq, Ghent University, Belgium
- Murathan Kurfali, Stockholm University, Sweden
- Ricardo Muñoz Sánchez, University of Gothenburg, Sweden
- Elena Volodina, University of Gothenburg, Sweden
- Robert Östling, Stockholm University, Sweden
* DATA PROVIDERS (more languages to come)
- Czech: Alexandr Rosen, Charles University, Prague
- English: Andrew Caines, University of Cambridge
- Estonian:
-- Mark Fishel, University of Tartu, Estonia
-- Kais Allkivi-Metsoja, Tallinn University, Estonia
-- Kristjan Suluste, Eesti Keele Instituut, Estonia
- German:
-- Torsten Zesch, Fernuniversität in Hagen, Germany
-- Andrea Horbach, Fernuniversität in Hagen, Germany
- Icelandic: Isidora Glisič, University of Iceland
- Italian: Jennifer-Carmen Frey, Eurac Research Bolzano, Italy
- Latvian:
-- Roberts Darģis, University of Latvia
-- Ilze Auzina, University of Latvia
- Slovene: Špela Arhar Holdt, University of Ljubljana, Slovenia
- Swedish: Arianna Masciolini, University of Gothenburg, Sweden
- Ukrainian:
-- Oleksiy Syvokon, Microsoft
-- Mariana Romanyshyn, Grammarly
* CONTACT
Please join the MultiGEC-2025 Google group (https://groups.google.com/g/multigec-2025) to ask questions, hold discussions and browse previously answered questions.
Join Veeva Systems, a pioneer in cloud solutions for the life sciences
industry, as a Senior/Principal Data Scientist focusing on NLP.
Your role will primarily involve developing LLM-based agents that are
specialized in searching and extracting detailed information about Key
Opinion Leaders (KOLs) in the healthcare sector.
You will craft an end-to-end human-in-the-loop pipeline to sift through a
large array of unstructured medical documents—ranging from academic
articles to clinical guidelines and meeting notes from therapeutic
committees.
You will also collaborate with over 2000 data curators and a dedicated team
of software developers and DevOps engineers to refine these models and
deploy them into production environments.
*What You'll Do*
- Apply the latest NLP technologies and trends to the platform.
- Develop LLM-based agents capable of performing function calls and utilizing tools such as browsers for enhanced data interaction and retrieval.
- Apply Reinforcement Learning from Human Feedback (RLHF) methods such as Direct Preference Optimization (DPO) and Proximal Policy Optimization (PPO) to train LLMs based on human preferences.
- Design, develop, and implement an end-to-end pipeline for extracting predefined categories of information from large-scale, unstructured data across multi-domain and multilingual settings.
- Create robust semantic search functionality that effectively answers user queries related to various aspects of the data.
- Use and develop named entity recognition, entity linking, slot filling, few-shot learning, active learning, question answering, dense passage retrieval and other statistical techniques and models for information extraction and machine reading.
- Deeply understand and analyze our data model per data source and geo-region, and interpret model decisions.
- Collaborate with data quality teams to define annotation tasks and metrics, and perform qualitative and quantitative evaluation.
- Utilize cloud infrastructure for model development, ensuring seamless collaboration with our team of software developers and DevOps engineers for efficient deployment to production.
*Requirements*
- 4+ years of experience as a data scientist (or 2+ years with a Ph.D. degree).
- Master's or Ph.D. in Computer Science, Artificial Intelligence, Computational Linguistics, or a related field.
- Strong theoretical knowledge of Natural Language Processing, Machine Learning, and Deep Learning techniques.
- Proven experience working with large language models and transformer architectures, such as GPT, BERT, or similar.
- Familiarity with large-scale data processing and analysis, preferably within the medical domain.
- Proficiency in Python and relevant NLP libraries (e.g., NLTK, SpaCy, Hugging Face Transformers).
- Experience with at least one big-data framework (e.g., Ray, Spark) and one deep learning framework (e.g., PyTorch, JAX).
- Experience working with cloud infrastructure (e.g., AWS, GCP, Azure), containerization technologies (e.g., Docker, Kubernetes), and bash scripting.
- Strong collaboration and communication skills, with the ability to work effectively in a cross-functional team.
- Comfortable in start-up environments.
- Social competence and a team-player attitude.
- High energy and ambition.
- An agile mindset.
*Application Links*
You can work remotely from anywhere in the UK, the Netherlands or Spain; you
must be a resident of one of these countries and be legally authorized to
work there without requiring Veeva’s support for visa or relocation. *If you
do not meet this condition but think you are an exceptional candidate, please
explain this in a separate note and we will consider it.*
Spain: https://jobs.lever.co/veeva/2bf92570-a680-40e8-96b0-a8629e3feac7
<https://jobs.lever.co/veeva/61dc60d9-c888-4636-836e-2a75ff9f0567>
UK: https://jobs.lever.co/veeva/f0e989b5-9d14-4f82-baaa-2fc56a76ba16
Netherlands: https://jobs.lever.co/veeva/2bf92570-a680-40e8-96b0-a8629e3feac7
--
Ehsan Khoddam
Data Science Manager - Medical NLP
Link Data Science
Veeva Systems
m +31623213197
ehsan.khoddam(a)veeva.com
[apologies if you received multiple copies of this call]
We are pleased to invite abstract submissions for session 3, "Large
Language Models," at the upcoming "1st Conference of the German AI Service
Centers (KonKIS24)" with a focus on "Advancing Secure AI in Critical
Infrastructures for Health and Energy." Please visit the main event page
https://events.gwdg.de/event/615/ for more details.
We encourage submissions that align with the conference's theme,
particularly in the following areas:
- *Pretraining Techniques for LLMs*: Exploring foundational strategies
and algorithms.
- *Testing and Evaluating LLM Fitness*: Methods for assessing
performance on well-known tasks and benchmarks.
- *Application of LLMs in Scientific Research*: Case studies and
examples of LLMs driving discovery and innovation.
- *Innovative Insights Generation*: Strategies for leveraging LLMs to
generate novel insights and accelerate research outcomes.
- *Challenges and Solutions in LLM Application*: Discussing the
practical challenges and potential solutions in scientific research.
Accepted abstracts will be featured through short presentations during the
session. The conference will take place on September 18-19 in picturesque
Göttingen. For more information, to submit an abstract, book a stand, or
register, please visit the program homepage
https://events.gwdg.de/event/615/program.
Feel free to contact me (jennifer[dot]dsouza[at]tib[dot]eu) directly with
any questions about this session.
Dear all,
We are excited to announce the 7th FEVER workshop and shared task, co-located with EMNLP 2024. The full CFP is available at https://fever.ai/workshop.html; below are some highlights:
New Shared Task: In this year’s workshop we will organise a new fact-checking shared task, AVeriTeC: A Dataset for Real-world Claim Verification with Evidence from the Web. It will consist of claims that are fact-checked using evidence from the web. For each claim, systems must return a label (Supported, Refuted, Not Enough Evidence, Conflicting Evidence/Cherry-picking) and appropriate evidence. The evidence must be retrieved from the document collection provided by the organisers or from the Web (e.g. using a search API). For more information, see our shared task page: https://fever.ai/task.html
The timeline for it is as follows:
* Training/dev data release: April 2024
* Test data release: July 10, 2024
* Shared task deadline: July 20, 2024
* Shared task submission due: August 15, 2024
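For newcomers to the task, here is a toy sketch of how a single claim might be scored against one retrieved evidence passage with an off-the-shelf NLI model and mapped onto the task's label space. It is not the official AVeriTeC baseline: the model choice (facebook/bart-large-mnli), the three-way label mapping, and the single-evidence setup are simplifying assumptions, and the Conflicting Evidence/Cherry-picking label in particular requires aggregating over multiple evidence pieces.

```python
# Toy claim-vs-evidence scoring with an off-the-shelf NLI model. This is NOT
# the official AVeriTeC baseline; the label mapping is a simplification and
# ignores Conflicting Evidence/Cherry-picking, which needs evidence aggregation.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "facebook/bart-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

claim = "The Eiffel Tower is located in Berlin."
evidence = "The Eiffel Tower is a wrought-iron lattice tower in Paris, France."

# NLI convention: the retrieved evidence is the premise, the claim the hypothesis.
inputs = tokenizer(evidence, claim, return_tensors="pt", truncation=True)
with torch.no_grad():
    probs = model(**inputs).logits.softmax(dim=-1)[0]

label_map = {
    "entailment": "Supported",
    "contradiction": "Refuted",
    "neutral": "Not Enough Evidence",
}
nli_label = model.config.id2label[int(probs.argmax())]
print(nli_label, "->", label_map[nli_label.lower()], probs.tolist())
```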
We invite long and short papers on all topics related to fact extraction and verification, including:
* Information Extraction
* Semantic Parsing
* Knowledge Base Population
* Natural Language Inference
* Textual Entailment Recognition
* Argumentation Mining
* Machine Reading and Comprehension
* Claim Validation/Fact checking
* Question Answering
* Information Retrieval and Seeking
* Theorem Proving
* Stance detection
* Adversarial learning
* Computational journalism
* Descriptions of systems for the FEVER <http://fever.ai/2018/task.html>, FEVER 2.0 <http://fever.ai/2019/task.html>, FEVEROUS <https://fever.ai/2021/task.html> and AVeriTeC <https://fever.ai/dataset/averitec.html> shared tasks
Important dates:
* Submission deadline: August 15, 2024 (ARR and non-ARR submission deadline)
* Commitment deadline: September 23, 2024
* Notification: September 27, 2024
* Camera-ready deadline: October 4, 2024
* Workshop: November 15 or 16, 2024
All deadlines are 11.59 pm UTC -12h ("anywhere on Earth").
Feel free to contact us on our Slack channel <https://join.slack.com/t/feverworkshop/shared_invite/zt-4v1hjl8w-Uf4yg~dift…> or via email: fever-organisers(a)googlegroups.com with any questions.
Looking forward to your participation!
--
The FEVER workshop organizers
Hi everyone,
Please find below a request for participation in a very short study for one of my students' bachelor theses.
Best,
Dominik
-------- Forwarded Message --------
Subject: Searching for participants for my quick study
Date: Thu, 13 Jun 2024 09:38:22 +0000
From: Wolkober, Marcel <st163937(a)stud.uni-stuttgart.de>
To: dominik.schlechtweg(a)ims.uni-stuttgart.de <dominik.schlechtweg(a)ims.uni-stuttgart.de>
Hello!
For my bachelor thesis I need participants in my quick online study.
It will take approximately 5 to 10 minutes to complete and is in English. You can use your smartphone, but it's recommended to use a PC browser.
You can access the study here: https://semantic-nlp-captcha.de/study
Everything else will be explained there. If you have trouble on mobile, activate desktop mode.
It would be of great help if you could forward this study to others. Thanks!
Best wishes,
Marcel Wolkober
Apologies for crossposting.
Call for Papers
Information Processing & Management (IPM), Elsevier
- CiteScore: 14.8
- Impact Factor: 8.6
Guest editors:
- Omar Alonso, Applied Science, Amazon, Palo Alto, California, USA.
E-mail: omralon(a)amazon.com
- Stefano Marchesin, Department of Information Engineering, University of
Padua, Padua, Italy. E-mail: stefano.marchesin(a)unipd.it
- Gianmaria Silvello, Department of Information Engineering, University
of Padua, Padua, Italy. E-mail: gianmaria.silvello(a)unipd.it
Special Issue on “Large Language Models and Data Quality for Knowledge
Graphs”
In recent years, Knowledge Graphs (KGs), encompassing millions of
relational facts, have emerged as central assets to support virtual
assistants and search and recommendations on the web. Moreover, KGs are
increasingly used by large companies and organizations to organize and
comprehend their data, with industry-scale KGs fusing data from various
sources for downstream applications. Building KGs involves data management
and artificial intelligence areas, such as data integration, cleaning,
named entity recognition and disambiguation, relation extraction, and
active learning.
However, the methods used to build these KGs rely on automated components
that are far from perfect, resulting in KGs that are highly sparse and that
incorporate several inaccuracies and wrong facts. As a result, evaluating the KG
quality plays a significant role, as it serves multiple purposes – e.g.,
gaining insights into the quality of data, triggering the refinement of the
KG construction process, and providing valuable information to downstream
applications. In this regard, the information in the KG must be correct to
ensure an engaging user experience for entity-oriented services like
virtual assistants. Despite its importance, there is little research on
data quality and evaluation for KGs at scale.
In this context, the rise of Large Language Models (LLMs) opens up
unprecedented opportunities – and challenges – to advance KG construction
and evaluation, providing an intriguing intersection between human and
machine capabilities. On the one hand, integrating LLMs within KG
construction systems could trigger the development of more context-aware
and adaptive AI systems. At the same time, however, LLMs are known to
hallucinate and can thus generate mis/disinformation, which can affect the
quality of the resulting KG. In this sense, reliability and credibility
components are of paramount importance to manage the hallucinations
produced by LLMs and avoid polluting the KG. On the other hand,
investigating how to combine LLMs and quality evaluation has excellent
potential, as shown by promising results from using LLMs to generate
relevance judgments in information retrieval.
Thus, this special issue promotes novel research on human-machine
collaboration for KG construction and evaluation, fostering the
intersection between KGs and LLMs. To this end, we encourage submissions
related to using LLMs within KG construction systems, evaluating KG
quality, and applying quality control systems to empower KG and LLM
interactions on both research- and industrial-oriented scenarios.
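As a concrete, purely illustrative example of the kind of LLM-in-the-loop KG construction with a downstream quality check that this call has in mind, consider the sketch below. The model name, the prompt, and the naive "entities must appear in the source text" guard are our own assumptions, not a method prescribed by the special issue.

```python
# Purely illustrative sketch of LLM-assisted triple extraction with a naive
# quality gate; model name, prompt and checks are assumptions, not a method
# prescribed by this call.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

PROMPT = (
    "Extract factual (subject, relation, object) triples from the text below. "
    "Answer with a JSON list of 3-element lists and nothing else.\n\nText: {text}"
)

def extract_triples(text: str) -> list:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; any instruction-tuned LLM would do
        messages=[{"role": "user", "content": PROMPT.format(text=text)}],
        temperature=0,
    )
    return json.loads(response.choices[0].message.content)

def quality_gate(triples, source_text):
    """Keep only well-formed triples whose subject and object literally occur
    in the source text -- a crude guard against hallucinated entities."""
    kept = []
    for triple in triples:
        if len(triple) == 3 and all(isinstance(x, str) and x for x in triple):
            subj, _, obj = triple
            if subj.lower() in source_text.lower() and obj.lower() in source_text.lower():
                kept.append(tuple(triple))
    return kept

text = "Padua is a city in the Veneto region of northern Italy."
candidates = extract_triples(text)
print(quality_gate(candidates, text))  # e.g. [("Padua", "located in", "Veneto")]
```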
Topics include but are not limited to:
- KG construction systems
- Use of LLMs for KG generation
- Efficient solutions to deploy LLMs on large-scale KGs
- Quality control systems for KG construction
- KG versioning and active learning
- Human-in-the-loop architectures
- Efficient KG quality assessment
- Quality assessment over temporal and dynamic KGs
- Redundancy and completeness issues
- Error detection and correction mechanisms
- Benchmarks and Evaluation
- Domain-specific applications and challenges
- Maintenance of industry-scale KGs
- LLM validation via reliable/credible KG data
Submission guidelines:
Authors are invited to submit original and unpublished papers. All
submissions will be peer-reviewed and judged on originality, significance,
quality, and relevance to the special issue topics of interest. Submitted
papers should not have appeared in or be under consideration for another
journal.
Papers can be submitted up to 1 September 2024. The estimated publication
date for the special issue is 15 January 2025.
Paper submission via the IP&M electronic submission system:
https://www.editorialmanager.com/IPM
To submit your manuscript to the special issue, please choose the article
type:
"VSI: LLMs and Data Quality for KGs".
More info here:
https://www.sciencedirect.com/journal/information-processing-and-management…
Instructions for authors:
https://www.sciencedirect.com/journal/information-processing-and-management…
Important dates:
- Submissions close: 1 September 2024
- Publication date (estimated): 15 January 2025
References:
Weikum G., Dong X.L., Razniewski S., et al. (2021) Machine knowledge:
creation and curation of comprehensive knowledge bases. Found. Trends
Databases, 10, 108–490.
Hogan A., Blomqvist E., Cochez M. et al. (2021) Knowledge graphs. ACM
Comput. Surv., 54, 71:1–71:37.
B. Xue and L. Zou. 2023. Knowledge Graph Quality Management: A
Comprehensive Survey. IEEE Trans. Knowl. Data Eng. 35, 5 (2023), 4969 – 4988
G. Faggioli, L. Dietz, C. L. A. Clarke, G. Demartini, M. Hagen, C. Hauff,
N. Kando, E. Kanoulas, M. Potthast, B. Stein, and H. Wachsmuth. 2023.
Perspectives on Large Language Models for Relevance Judgment. In Proc. of
the 2023 ACM SIGIR International Conference on Theory of Information
Retrieval, ICTIR 2023, Taipei, Taiwan, 23 July 2023. ACM, 39 – 50.
S. MacAvaney and L. Soldaini. 2023. One-Shot Labeling for Automatic
Relevance Estimation. In Proc. of the 46th International ACM SIGIR
Conference on Research and Development in Information Retrieval, SIGIR
2023, Taipei, Taiwan, July 23-27, 2023. ACM, 2230 – 2235.
X. L. Dong. 2023. Generations of Knowledge Graphs: The Crazy Ideas and the
Business Impact. Proc. VLDB Endow. 16, 12 (2023), 4130 – 4137.
S. Pan, L. Luo, Y. Wang, C. Chen, J. Wang, and X. Wu. 2023. Unifying Large
Language Models and Knowledge Graphs: A Roadmap. CoRR abs/2306.08302 (2023).
--
Stefano Marchesin, PhD
Assistant Professor (RTD/a)
Information Management Systems (IMS) Group
Department of Information Engineering
University of Padua
Via Gradenigo 6/a, 35131 Padua, Italy
Home page: http://www.dei.unipd.it/~marches1/
For the full text: https://nllpw.org/workshop/call/
Following the success of the first five editions of the NLLP workshop (NAACL 2019, KDD 2020, EMNLP 2021, EMNLP 2022, EMNLP 2023), we aim to bring researchers and practitioners from NLP, machine learning and other artificial intelligence disciplines together with legal practitioners and researchers. We welcome submissions describing original work on legal data, as well as data with legal relevance, such as:
Applications of NLP to legal tasks including, but not limited to:
Legal Citation Resolution
Case Outcome Analysis and Prediction
Models of Legal Reasoning
E-Discovery
Lexical and other Data Resources for the Legal Domain
Bias and Privacy
Applications of Large Language Models (LLMs) to Legal Data and Tasks
Experimental results using and adapting NLP methods for legal data including, but not limited to:
Classification
Information Retrieval
Anomaly Detection
Clustering
Knowledge Base Population
Multimedia Search
Link Analysis
Entity Recognition and Disambiguation
Training and Using Embeddings
Parsing
Dialogue and Discourse Analysis
Text Summarization and Generation
Relation and Event Extraction
Anaphora Resolution
Question Answering
Query Understanding
Combining Text with Structured Data
Tasks:
Description of new legal tasks for NLP
Structured overviews of a specific task with the goal of identifying new areas for research
Position papers presenting new visions, challenges and changes to existing research practices
Resources:
Creation of curated and/or annotated data sets that can be publicly released and used by the community to advance the field
Demos:
Descriptions of systems which use NLP technologies for legal text
Industrial Research:
Industrial applications
Papers describing research on proprietary data
Interdisciplinary position papers:
Legal or socio-legal analyses relating to the role NLP can play in the legal field
Critical reflections on the legality and ethics of data collection and processing practices
Critical reflections about the benefits and challenges of Large Language Models (LLMs) from a legal and regulatory perspective
Submission
------------------------------------
We accept papers reporting original (unpublished) research of two types:
Long papers (max 8 pages + references)
Short papers (max 4 pages + references)
Appendices and acknowledgements do not count against the maximum page limit and should be formatted according to the guidelines below.
To submit a paper, please access the submission link https://softconf.com/emnlp2024/nllp/
Conference proceedings will be published on the ACL Anthology.
Shared Task
Together with Darrow.ai, we are organizing the LegalLens shared task. More information is provided here: https://www.codabench.org/competitions/3052/
Participants will be invited to describe their system in a paper for the NLLP workshop proceedings. The task organizers will write an overview paper that describes the task, summarizes the different approaches taken, and analyzes their results.
More information on the submission of description papers will follow.
Ethics section
The NLLP workshop adheres to the same standards regarding ethics as the EMNLP 2024 conference. Authors will be allowed extra space after the 8th page (4th for short papers) for an optional broader impact statement or other discussion of ethics. Note that an ethical considerations section is not required, but papers working with sensitive data or on sensitive tasks that do not discuss these issues will not be accepted.
Non-archival option
The authors have the option of submitting previously unpublished research as non-archival, meaning that only the abstract will be published in the conference proceedings. We expect these submissions to describe the same quality of work as archival submissions, and they will be reviewed following the same procedure as archival submissions. This option accommodates publication of the work, or a superset of it, at a later date in a conference or journal that does not allow previously archived work, and it encourages presentation of and feedback on mature yet unpublished work. Non-archival submissions should adhere to the same formatting and length constraints as archival submissions.
Dual Submission and Pre-print Policy
Papers that have been or will be submitted to workshops, conferences or journals during the review period must indicate so at submission time. Authors of papers accepted for presentation at the NLLP 2024 workshop must notify the organizers by the camera-ready deadline as to whether the paper will be presented or withdrawn.
If the preliminary version of a paper was posted in arXiv, the authors should NOT mention it as their own paper in the submission. Papers that violate the double-blind review requirements will be desk rejected.
Exception: Submissions with the non-archival option are excepted from these requirements.
ACL Rolling Review Submissions
Our workshop also welcomes submissions from ACL Rolling Review (ARR). Authors of any papers that have been submitted to ARR and have their meta-review ready may submit their papers and reviews for consideration for the workshop until 27 September 2024. This includes submissions to ARR for the 15 August deadline. Decisions will be announced by 8 October 2024. The commitment should be made via the workshop submission website: https://softconf.com/emnlp2024/nllp/ ("ACL Rolling Review Commitment" submission type)
EMNLP 2024 Submissions
Authors of any papers that were reviewed for EMNLP 2024 and rejected have the opportunity to submit their paper and reviews to be considered for publication in the NLLP workshop proceedings. The deadline for submitting papers and reviews is 27 September 2024. Decisions will be announced by 8 October 2024. The submission should be made via the workshop submission website: https://softconf.com/emnlp2024/nllp/ ("EMNLP 2024 Submission with reviews" submission type)
Double-Blind reviewing
The review process is double-blind. Submitted papers must not include author names and affiliations and they must be written in a way so that they do not break the double-blind reviewing process. If the preliminary version of a paper was posted in arXiv, the authors should NOT mention it as their own paper in the submission. Papers that violate the double-blind review requirements will be desk rejected.
Submission Style & Format Guidelines
Paper submissions must use the official ACL style templates, which are available here (LaTeX and Word). Please follow the general paper formatting guidelines for "*ACL" conferences, available here.
Authors may not modify these style files or use templates designed for other conferences. Submissions that do not conform to the required styles, including paper size, margin width, and font size restrictions, will be rejected without review.
All long, short and theme papers must follow the ACL Author Guidelines.
Important deadlines
Submission deadline ― 3 September 2024
Submission of EMNLP papers with reviews and ARR commitment ― 27 September 2024
Notification for direct submissions, ARR and EMNLP papers ― 8 October 2024
Camera ready due ― 15 October 2024 (tentative)
Workshop ― 15 or 16 November 2024
All deadlines are 11.59 pm UTC-12h ("anywhere on Earth")
Presentation
Presentation format for each paper and schedule will be announced between acceptance notification and the camera-ready deadline.
At least one author of each accepted paper must register for the NLLP 2024 workshop by the registration deadline in order for the submission to be published in the proceedings.