Job Opening for Data Scientist with a focus on natural language processing
Application link: https://bit.ly/3QAkC1M
Application deadline: 31 May 2024
The South African Centre for Digital Language Resources (SADiLaR) is looking for a data scientist with a focus on natural language processing (permanent position). As a Data Scientist at SADiLaR, you will have the opportunity to initiate and lead projects in Human Language Technology and Digital Humanities that stem from your own research interests. You will work closely with a team of researchers in SADiLaR's extended network, both on your own projects and on commissioned projects. Dissemination of project results at national and international conferences will be encouraged and supported. This position is crucial for research and development in Human Language Technology and Digital Humanities, the fields that form the core of SADiLaR, a national Research Infrastructure supported by the Department of Science and Innovation. Read more about SADiLaR at https://www.sadilar.org.
Key responsibilities:
- Research: Research in the area of Human Language Technology and Digital Humanities.
- Project work: Initiating and contributing to Human Language Technology and Digital Humanities projects.
- Teaching: Teaching in the area of Human Language Technology and Digital Humanities.
- Mentorship: Mentorship of researchers in the field of Human Language Technology and Digital Humanities.
Minimum requirements:
- A PhD (NQF level 10) in one of the following fields: Computational Linguistics, Natural Language Processing, Human Language Technology, Digital Humanities, Data Science, Computer Science, Information Technology, Artificial Intelligence, or related fields. The PhD should have a focus on computational aspects of linguistics.
- A minimum of five (5) years' experience in the use of Python (other programming languages used within the computational linguistics or Digital Humanities domain can also be considered).
- Evidence of peer-reviewed academic publications.
- A minimum of three (3) years' experience as a supervisor/co-supervisor of students or in a mentorship/supervisory role for individuals.
- A minimum of three (3) years' experience with using and/or developing computational tools.
- A minimum of three (3) years' experience related to research within the domain of Language Technology or Digital Humanities.
- A minimum of one (1) year's experience related to teaching or training within the domain of Language Technology or Digital Humanities.
More information can be found at the application link.
For informal inquiries please contact: Menno van Zaanen <menno.vanzaanen(a)nwu.ac.za>
Menno will be attending LREC-COLING, so please feel free to connect with him for a discussion.
--
Prof Menno van Zaanen menno.vanzaanen(a)nwu.ac.za
Professor in Digital Humanities
South African Centre for Digital Language Resources https://www.sadilar.org
________________________________
We are pleased to announce the Third and Final Call for Papers, Presentations, Tutorials, Workshops and Best Thesis Award Submissions for AMTA 2024.
https://amtaweb.org/amta-2024-call-for-proposals/
AMTA 2024 will be a three-day hybrid event under the auspices of the Association for Machine Translation in the Americas, held from Monday, 30 September 2024 through Wednesday, 2 October 2024 at the Renaissance Schaumburg Convention Center in Chicago, Illinois, USA.
Like all AMTA events, AMTA 2024 will bring together researchers, practitioners, and providers of MT and related cross-lingual technology from academia, industry, and government. It will include keynote talks, panel discussions, individual presentations, and demonstrations from technology providers.
The organizing committee of AMTA 2024 is seeking proposals for papers and presentations on all topics related to research, development, application, and evaluation of Machine Translation and cross-lingual technologies. Our goal is to have a broad and engaging program that appeals to the various constituents of the MT community (researchers, developers, users, and language professionals).
Accepted papers and presentations will be presented as 20-minute talks (15-minute presentations, plus 5 minutes for questions).
Please visit https://amtaweb.org/amta-2024-call-for-proposals/ for a sample list of topics of interest, additional conference details, and complete submission guidelines.
Workshops and tutorials
Traditionally, workshops and tutorials associated with the AMTA conference were held on a pre- and post-conference day. AMTA 2024, however, will feature a day of workshops and tutorials held completely virtually ahead of the main conference. For information about submitting a proposal for a workshop or tutorial, please visit https://amtaweb.org/amta-2024-cfp-workshops-tutorials/.
Best Thesis Award
The AMTA Board of Directors is also pleased to sponsor the first edition of the AMTA Best Thesis Award, which aims to highlight the achievements of a recent PhD graduate whose thesis has focused on a topic or topics related to machine translation. This award includes a $1000 USD prize, free conference registration and AMTA membership, among other benefits. For more information about eligibility and submission details, please visit https://amtaweb.org/amta-2024-cfp-best-thesis-award/.
Important dates
Submission deadline: 6 June 2024, 23:59 Anywhere on Earth (UTC-12)
Notification of acceptance for papers, presentations and tutorials: 19 July 2024
Notification of Best Thesis Award: 31 July 2024
Final “camera-ready” papers for proceedings and materials for tutorials: 16 August 2024
Please direct any questions you may have to organizing.committee(a)amtaweb.org.
We look forward to receiving your submission!
AMTA Board of Directors
Venue: ACL 2024
Location: Bangkok, Thailand
Date: August 15, 2024
Papers Due: May 14, 2024 (23:59 AOE)
Website: https://sites.google.com/view/textgraphs2024
OpenReview Site: https://openreview.net/group?id=aclweb.org/ACL/2024/Workshop/TextGraphs-17
Workshop Description
For the past seventeen years, the workshops in the TextGraphs series have published and promoted the synergy between the field of Graph Theory (GT) and Natural Language Processing (NLP). The mix between the two started small, with graph-theoretical frameworks providing efficient and elegant solutions for NLP applications. Graph-based solutions initially focused on single-document part-of-speech tagging, word sense disambiguation, and semantic role labeling. They became progressively larger to include ontology learning and information extraction from large text collections. Nowadays, graph-based solutions also target Web-scale applications such as information propagation in social networks, rumor proliferation, e-reputation, multiple entity detection, language dynamics learning, and future events prediction, to name a few.
We plan to encourage the description of novel NLP problems or applications that have emerged in recent years and that can be enhanced with existing and new graph-based methods. We widen the workshop topics beyond the familiar graph domain to encompass a broader range of less examined structured data domains. The seventeenth edition of the TextGraphs workshop extends this focus to the emerging topic of prompting large language models (LLMs), viewed from the perspective of GT. Our workshop therefore aims to foster stronger, mutually advantageous connections between NLP and structured data, tackling key challenges inherent in each field.
TextGraphs-17 invites submissions on (but not limited to) the following topics:
* Knowledge Graphs Meet LLMs. Proper use of graph-based methods for reasoning over a Knowledge Graph (KG) is a promising way to overcome critical limitations of existing LLMs, which lack interpretability and factual knowledge and are prone to hallucination. Conversely, incorporating LLM knowledge learnt from large textual collections may help many graph-related tasks, such as KG completion and graph representation learning. We are therefore highly interested in novel research on the joint use of KGs and LLMs to improve processing in either the NLP or the graph domain (preferably both).
* Chain Prompting of LLMs. Recent studies show that prompting strategies like Chain-of-Thought and Graph-of-Thought enhance language understanding and generation tasks compared to the traditional few-shot methods. We welcome submissions developing advanced prompting schemes and software for LLMs and other pre-trained machine learning models.
* Learning from Structured Data. We welcome novel efforts to build a bridge between NLP and various structured data formats, including relational and non-relational databases as well as standardized data formats (such as XML, JSON, RDF, etc.).
* Interpretability of NLP Systems. The question of interpretability poses a fundamental challenge for the practical application of NLP methods. We invite researchers to adopt structured data and employ graph-based methods to shed light on the decision-making and logic behind modern LLMs. We welcome any work that applies a KG or other structured knowledge to explore and evaluate factual awareness, that treats the interpretability problem from the GT perspective, or that otherwise uses graphs and other structured data to make LLMs more understandable.
Important dates
- Papers due: May 14, 2024
- Notification of acceptance: June 15, 2024
- Camera-ready papers due: July 1, 2024
- Conference date: August 15, 2024
Submission
We invite submissions of up to eight (8) pages, plus bibliography, for long papers and up to four (4) pages, plus bibliography, for short papers.
The ACL 2024 templates must be used; these are provided in LaTeX and also Microsoft Word format. Submissions will only be accepted in PDF format.
This year, TextGraphs submission is managed through OpenReview. Submit papers by the end of the deadline day (timezone is UTC-12; AoE) via the submission link on our site: https://openreview.net/group?id=aclweb.org/ACL/2024/Workshop/TextGraphs-17
Shared Task
We invite participation in the task of Knowledge Graph Question Answering (KGQA). We will ask the participants to analyze candidate answers using text and graph features. For each query-answer candidate, a graph characterizing the paths in Wikidata from the entities in the query to the answer entity will be given.
Contact
Please direct all questions and inquiries to our official e-mail address (textgraphsOC(a)gmail.com) or contact any of the organizers via their individual emails. You can also join us on Facebook: https://www.facebook.com/groups/900711756665369.
Organizers
- Dmitry Ustalov, JetBrains
- Arti Ramesh, Binghamton University
- Alexander Panchenko, Skolkovo Institute of Science and Technology
- Yanjun Gao, University of Wisconsin-Madison
- Andrey Sakhovskiy, Skolkovo Institute of Science and Technology
- Elena Tutubalina, Artificial Intelligence Research Institute
- Gerald Penn, University of Toronto
- Marco Valentino, Idiap Research Institute
Exciting Opportunity for Summer 2024 NLP Internship at Ulster University!
Are you passionate about Natural Language Processing (NLP) and eager to
join a cutting-edge team at Ulster University? We're thrilled to announce
an internship opportunity for our Core NLP team, and we're looking for
enthusiastic individuals to join us.
Key Details:
1. Duration: 2.5 months, remote mode, from mid-May to July 2024. Open to
graduates from both 2024 and 2025.
2. Focus: You'll be involved in building data pipelines, cleaning,
processing, and analyzing diverse datasets.
3. Requirements: Prior hands-on experience with Machine Learning frameworks
and models is essential.
4. Responsibilities: You'll contribute to the design, training,
integration, testing, and optimization of NLP (LLMs) and ML models.
5. Bonus Skills: Experience with LLM finetuning, RAG-based methods, and
deployments is highly desirable.
How to Apply:
If you're excited about research and scientific dissemination and meet the
qualifications above, we'd love to hear from you! Please apply directly
through the link
https://forms.gle/Unah7Nt522KcDYoC6
The University of Amsterdam invites applications for postdoctoral positions
at the intersection of ML, NLP, and Computer Vision. The research is
funded by an NWO (Dutch Science Foundation) grant of Ivan Titov
(http://ivan-titov.org/). The postdocs and PhD students will be employed by
the University of Amsterdam and will be members of the Institute for Logic,
Language, and Computation <https://www.illc.uva.nl/> and the Faculty of
Science. Collaboration is envisaged with researchers at the University
of Amsterdam (e.g., Efstratios Gavves and Wilker Aziz), as well as at the
University of Edinburgh (e.g., Edoardo Ponti, Hakan Bilen, Sidharth N.,
Pasquale Minervini and Kenny Smith).
The research will focus primarily on the following directions (or their
intersections):
1) Modular and decentralized learning in multimodal settings
2) Learning from language in grounded settings: exploiting knowledge
embedded in language and language models to help solve decision-making
applications and produce generalizable and interpretable models for these
tasks
3) Emergent communication and collaboration: developing agents that learn
to communicate with each other to solve problems, while
maintaining transparency to humans and improving with human feedback.
*Application deadline: May 31, 2024*
https://www.illc.uva.nl/NewsandEvents/News/Positions/newsitem/14960/Two-Pos…
For informal enquiries please contact: Ivan Titov (titov(a)uva.nl)
Ivan is attending ICLR, so please feel free to connect with him for a
discussion.
Call for Participation: The 4th Workshop on Human Evaluation of NLP Systems
(HumEval’24)
Date: 21 May 2024 (full day)
Venue: Lingotto Conference Centre, Turin, Italy
Registration
Registration is mandatory for attending the workshop. Find more info about
registration on the LREC-COLING 2024 website: https://lrec-coling-2024.org/
Workshop description
Human evaluation plays a central role in NLP, from the large-scale
crowd-sourced evaluations to the much smaller experiments routinely
encountered in conference papers. Yet there is growing unease about how
human evaluations are conducted in NLP. Researchers have pointed out the
less-than-perfect experimental and reporting standards that prevail (van
der Lee et al., 2019 <https://aclanthology.org/W19-8643/>; Gehrmann et al.,
2023 <https://www.jair.org/index.php/jair/article/view/13715/26927>), and
that low-quality evaluations with crowdworkers may not correlate well with
high-quality evaluations with domain experts (Freitag et al., 2021
<https://aclanthology.org/2021.tacl-1.87>). Only a small proportion of
papers provide enough detail for reproduction of human evaluations, and in
many cases the information provided is not even enough to support the
conclusions drawn (Belz et al., 2023
<https://aclanthology.org/2023.insights-1.1>).
The HumEval workshop (previously at EACL 2021, ACL 2022, and RANLP 2023)
aims to create a forum for current human evaluation research and future
directions, a space for researchers working with human evaluations to
exchange ideas and begin to address the issues human evaluation in NLP
faces in many respects, including experimental design, meta-evaluation and
reproducibility.
Programme
Find the detailed programme on the workshop website:
https://humeval.github.io/2024/programme
Invited speakers
Mark Diaz (Google Research)
Sheila Castilho (ADAPT/DCU)
Organising Committee
Simone Balloccu, Charles University, CZ
Anya Belz, ADAPT Centre, Dublin City University, Ireland
Rudali Huidrom, Dublin City University, Ireland
Ehud Reiter, University of Aberdeen, UK
João Sedoc, New York University
Craig Thomson, ADAPT Centre, Dublin City University, Ireland
For questions and comments regarding the workshop please contact Simone
Balloccu at balloccu(a)ufal.mff.cuni.cz and humeval.ws(a)gmail.com.
--
Kind regards, Simone Balloccu.
International Conference ‘New Trends in Translation and Technology’ (NeTTT’2024)
Varna, Bulgaria, 3-6 July 2024 (https://nettt-conference.com/)
Call for ‘Last minute results’ submissions
In view of the NeTTT'24 special track on the Future of Translation Technology in the Era of LLMs and Generative AI, and of the latest dynamic developments with LLMs, we would like to call on researchers and users/companies to submit ‘Last minute results’ of ongoing studies in the form of short 4-to-page submissions (abstract-only submissions will not be considered or evaluated). The idea is to fast-track the reviewing process for these submissions so that the results presented at the event are as up-to-date as possible.
The presentations can be either in oral or poster format.
Submission deadline: 5 June 2024
Notification: 12 June 2024
Submission is done via the Softconf START conference management system at https://softconf.com/n/nettt2024.
We invite the authors to comply with the Springer format, following the templates:
* LaTeX<https://resource-cms.springernature.com/springer-cms/rest/v1/content/192386…>,
* Overleaf<https://nettt-conference.com/wp-content/uploads/2024/03/Overleaf_Springer_C…>,
* Word<https://nettt-conference.com/wp-content/uploads/2024/03/Word_splnproc2311.p…>.
Registration
Conference registration is open on https://nettt-conference.com/fees-registration/
Venue
The conference will take place at Conference Hotel Cherno More (https://www.chernomorebg.com/en/conference-centre.html), Varna, situated only 200 m away from the fine sandy Black Sea beach.
Further information and contact details
The conference website is https://nettt-conference.com and will be updated on a regular basis. For further information, please contact us at nettt2024(a)nettt-conference.com.
*Apologies for cross-posting*
Fifth Workshop on Gender Bias in Natural Language Processing
Bangkok, Thailand, on August 16, 2024
https://genderbiasnlp.talp.cat/
Final Call for Papers and Updated Dates
Gender bias, among other demographic biases (e.g. race, nationality, religion), in machine-learned models is of increasing interest to the scientific community and industry. Models of natural language are highly affected by such biases, which are present in widely used products and can lead to poor user experiences. There is a growing body of research into improved representations of gender in NLP models. Key example approaches are to build and use balanced training and evaluation datasets (e.g. Webster et al., 2018), and to change the learning algorithms themselves (e.g. Bolukbasi et al., 2016). While these approaches show promising results, there is more to do to solve identified and future bias issues. In order to make progress as a field, we need to create widespread awareness of bias and a consensus on how to work against it, for instance by developing standard tasks and metrics. Our workshop provides a forum to achieve this goal.
Topics of interest
We invite submissions of technical work exploring the detection, measurement, and mediation of gender bias in NLP models and applications. Other important topics are the creation of datasets, identifying and assessing relevant biases or focusing on fairness in NLP systems. Finally, the workshop is also open to non-technical work addressing sociological perspectives, and we strongly encourage critical reflections on the sources and implications of bias throughout all types of work.
In addition, this year we are organising a Shared Task on Gender Bias Machine Translation evaluation.
Paper Submission Information
Submissions will be accepted as short papers (4-6 pages) and as long papers (8-10 pages), plus additional pages for references, following the ACL 2024 guidelines. Supplementary material can be added, but should not be central to the argument of the paper. Blind submission is required.
Each paper should include a statement which explicitly defines (a) what system behaviors are considered as bias in the work and (b) why those behaviors are harmful, in what ways, and to whom (cf. Blodgett et al. (2020)). More information on this requirement, which was successfully introduced at GeBNLP 2020, can be found on the workshop website. We also encourage authors to engage, in this statement and in their work in general, with definitions of bias and other relevant concepts such as prejudice, harm, and discrimination from outside NLP, especially from the social sciences and normative ethics.
Non-archival option
The authors have the option of submitting research as non-archival, meaning that the paper will not be published in the conference proceedings. We expect these submissions to describe the same quality of work and format as archival submissions.
Updated dates:
May 24, 2024: Workshop Paper Due Date
June 21, 2024: Notification of Acceptance
July 5, 2024: Camera-ready papers due
August 16, 2024: Workshop Dates
Keynote Speakers.
Isabelle Augenstein, University of Copenhagen
Hal Daumé III, University of Maryland and Microsoft Research NYC
Organizers.
Christine Basta, Alexandria University
Marta R. Costa-jussà, FAIR, Meta
Agnieszka Faleńska, University of Stuttgart
Seraphina Goldfarb-Tarrant, Cohere
Debora Nozza, Bocconi University
The Department of Computer Science at the IT University of Copenhagen is
offering a Postdoc position in Natural Language Processing/Computational
Linguistics, with a start date of *1 September 2024* or as soon as
possible. The *application deadline is 31 May 2024.* Applications for
the position can be submitted via the ITU job portal
<https://candidate.hr-manager.net/ApplicationInit.aspx?cid=119&ProjectId=181…>.
*Proposed project title:* Efficiency and Robustness in Language Model
Pre-training
*Proposed project description.* Recent generative systems based on
pre-trained language models are remarkably fluent, but this is achieved by
extreme volumes of computation and training data. This means not only high
energy costs, but also training on data that is problematic in various
ways: copyright, harmful social stereotypes, non-representative sampling,
misinformation, junk SEO texts, pornography, and contamination with NLP
datasets used for evaluation.
This project will create an ambitious resource for research on transfer
learning, in which pre-training data is held constant, and evaluation takes
into account how much similar data was observed in training, and in what
ways it was similar. This resource will encourage the development of more
efficient and robust approaches, since it will not be possible to improve
benchmark scores by simply training on more data.
The ideal candidate will have a strong background in Computational
Linguistics/Natural Language Processing and experience developing NLP
resources, as well as core skills in programming in Python and machine
learning.
The position is funded for 1 year, and it is our intention to find
additional funding to extend this postdoc to a 2- or 3-year position.
Besides research, the postdoc will gain experience with organization of an
international workshop and shared task and build up their international
network. For those interested in pursuing an academic career, it is also
possible to:
- gain experience in applying for external funding with professional
support (either for the continuation of the postdoc’s own position, e.g.
a Marie Curie postdoctoral fellowship, or by contributing to the PI’s grant
proposals);
- supervise Master’s students solo, and/or assist in supervising a PhD
student;
- undertake a formal teacher training program, including teaching guest
lectures in the relevant data science courses at the ITU computer science
department.
The successful candidate will be a member of the national Pioneer Centre
for Artificial Intelligence <https://aicentre.dk/>, a 5-university Danish
research endeavor, and of the NLPnorth <https://nlpnorth.github.io/> research
group at the IT University’s Computer Science Department. Both the centre
and research group are highly international and well-funded, working on a
broad range of research topics.
The project will be supervised by Associate Professor Anna Rogers
<https://annargrs.github.io/> (arog(a)itu.dk), to whom inquiries about the
project can be directed. Candidates attending LREC/COLING 2024 are
welcome to reach out and set up a meeting during the conference.
--
Best regards,
Anna Rogers
Associate Professor
IT University of Copenhagen
http://annargrs.github.io/