I’m sharing the following information about "Corpus linguistics & applied linguistics research 2023", a series of talks organised by the University of Murcia from 11 to 30 October 2023. The talks will take place via Zoom.
This is a free event. Registration links will be provided for each of the talks below. After registration, you’ll receive an email with the webinar link.
Contrastive approaches in corpus linguistics research
Dr Niall Curry, Manchester Metropolitan University
October 11, 18:00 (Madrid time) / 17:00 (UK time)
Registration link: https://umurcia.zoom.us/webinar/register/WN_d68rw3V_TnOGNWDg6sXHnw
Using Corpus Linguistics to Interpret the Law
Dr Jesse Egbert, Northern Arizona University
October 18, 18:00 (Madrid time) / 17:00 (UK time)
Registration link: https://umurcia.zoom.us/webinar/register/WN_uAmw72l1T-en-4y0LnZffg
Multiple correspondence analysis and corpus linguistics research
Dr Isobelle Clarke, Lancaster University
October 25, 17:30 (Madrid time) / 16:30 (UK time)
Registration link: https://umurcia.zoom.us/webinar/register/WN_s0aPEXAFTQe_App0qS7Erg
The Core Metadata Schema for L2 data: Collaborative efforts towards improved data findability, metadata quality and study comparability in L2 research
Dr Magali Paquot, UCLouvain
October 30, 18:00 (Madrid time) / 17:00 (UK time)
Registration link: https://umurcia.zoom.us/webinar/register/WN_a6Wkw7llSG2HrvJ9yIGKvQ
You can check out the 2021 and 2022 talks here:
https://www.youtube.com/channel/UCKjKIIQL6u1mXD2V9ZaT-_Q/featured
Please feel free to share this information with anyone interested in these research areas.
Best wishes,
Pascual Pérez-Paredes
https://webs.um.es/pascualf
[Apologies for cross-posting]
A postdoc-level position as Research Fellow in Natural Language Processing is available in the Language Technology Group (LTG) at the University of Oslo (UiO), Norway. The 3-year position is affiliated with a new research project focusing on event extraction in the domain of armed conflicts, a cross-disciplinary collaboration bridging NLP and political science / conflict research.
For more information, please see the full announcement here:
https://www.jobbnorge.no/en/available-jobs/job/251514/researcher-in-natural…
The closing date is November 2nd, 2023.
Please do not hesitate to contact me for any further information.
Best regards,
-erik
--
Erik Velldal
Language Technology Group
Section for Machine Learning
Department of Informatics, University of Oslo
Postdoctoral position in computational linguistics with specialisation in language grounding to vision, robotics, and beyond
University of Gothenburg, Sweden
Project Description: The broad focus of this position is computational modelling of language (computational linguistics, natural language processing or language technology) in the context of data from other modalities such as vision, perception and action from the perspective of human-centred AI. The topics relevant for this position are computational modelling of language and perception, human-robot interaction, situated spoken dialogue systems, and computational representation of meaning (semantics).
The work will be done within the Cognitive Systems group (led by Simon Dobnik, https://www.gu.se/en/about/find-staff/simondobnik ), which is one of the four research groups within the Centre for Linguistic Theory and Studies in Probability (CLASP), https://www.gu.se/clasp . CLASP is devoted to research and advanced training in the application of probabilistic modelling and machine learning methods to core issues in linguistic theory and cognition.
The postdoctoral candidate will have an opportunity to further their scientific skills and advance the scientific field by conducting research in collaboration with the research group, connecting ideas from several of the following areas:
- Computational semantics,
- Grounding language in action and perception,
- Generation and understanding of spatial language,
- Generation of image descriptions, visual question answering, visual dialogue,
- Referring in situated dialogue,
- Situated agents / robots and instruction generation and following,
- Machine learning with neural networks,
- Cross-domain model transfer,
- Learning from small data,
- Combining top-down (expert-driven) and bottom-up (dataset-driven) models,
- Reasoning and inference including Bayesian inference,
- Model interpretation, testing and evaluation of unwanted social bias,
- Crowd-sourcing for collection and evaluation of research data.
Application Deadline: November 21, 2023 (end of the day, GMT+1).
Formal announcement and application procedure: https://web103.reachmee.com/ext/I005/1035/job?site=7&lang=UK&validator=9b89… (English) and https://web103.reachmee.com/ext/I005/1035/job?site=6&lang=SE&validator=3038… (Swedish)
Contact: For more information about the research / project focus relevant to the position, send an email to Simon Dobnik, Professor of Computational Linguistics, simon.dobnik(a)gu.se
For other questions, please contact Sharid Loáiciga, Associate Senior Lecturer, +46(0) 31-786 59 42, sharid.loaiciga(a)gu.se
—
Simon Dobnik
Professor of Computational Linguistics
CLASP & FLoV, University of Gothenburg
https://www.gu.se/en/about/find-staff/simondobnik
BCS SEARCH INDUSTRY AWARDS 2023
*** FINAL CALL ***
We are delighted to announce this year's Search Industry Awards, celebrating the best search innovations of 2023. Presented by the Information Retrieval Specialist Group of the BCS, these awards recognize people, projects, and organisations around the world that have excelled in the design of search and information retrieval products and services.
If you know of any people, projects, or products that deserve recognition, let us know by submitting a nomination. Alternatively, if you're involved with something special yourself, you can submit an application today.
CATEGORIES
This year we are offering five awards:
1. Best search user experience recognises outstanding application of user-centred design principles (to a web site / mobile app / information resource, etc.). Previous winners include:
* Boston University/Reza Rawassizadeh, Yi Rong for their work on ODSearch
* LexisNexis for their work on Open question answering on Lexis+
2. Most promising start-up (or new enterprise) recognises the innovative and disruptive potential of a business model, technology, or solution. Previous winners include:
* Giotto AI, an all-in-one platform to automatize, digitalize, and standardize the data collection, analysis and writing of a Clinical Evaluation Report
* Resolute.AI, an AI driven platform to search major FDA databases in the public domain in a federated way
* Search|hub for their work on developing and commercializing a system that infuses human understanding into existing search applications
* ContextFlow, for their work on developing and commercializing the radiology image search technology developed in the EU FP7 Khresmoi project
3. Best open source project rewards the remarkable contributions of individuals and communities in fostering collaboration, transparency, and the democratization of knowledge. Previous winners include:
* CiteSeerX, one of the largest open source academic search engines with over 10 million documents
4. Search professional of the year is awarded to an individual who has made a significant contribution to the discipline through their work and professionalism. Previous winners include:
* Adam Tocock, Library Assistant at NHS
* Stuart Mackie, Lead Data Scientist at BiP Solutions
5. Best paper / presentation (at Search Solutions). Previous winners include:
* Filip Radlinski, Google: “Challenges with Really Understanding Natural Language in Conversational Recommendation”
* Olivia Foulds, University of Strathclyde: “Crossing the 49th Parallel in Data and Information Science”
* Mark Harwood, Elastic: “Tackling toxic content with the elastic stack”
* Frederic Fol Leymarie, DynAikon Ltd: “Human Visual Perception + Computer Vision to provide greater User Control for Shape-based Search”
The last award is open only to presenters at Search Solutions, and will be judged on the day of the event. For all others, apply today!
JUDGING PANEL
Winners will be selected by our panel of judges (details to be announced shortly).
AWARDS CEREMONY
The awards ceremony will take place during Search Solutions in November 2023. Winners will receive a framed certificate and a public listing on the IRSG Awards site.
APPLY
We’ve designed the application process to be simple to complete:
https://forms.gle/AUGsRBV9W2JE8S3x9
If you are unsure which category to apply for, or have questions about the application process, contact us via the address below. For further details, see: https://www.bcs.org/membership-and-registrations/member-communities/informa…
Nominations will remain open until 31st October.
CONTACT
If you have any questions on the above, please contact the IRSG vice chair at tgr2uk+irsg(a)gmail.com
ABOUT IRSG
The IRSG is a Specialist Group of BCS. Its mission is to provide a focus for the European IR community, facilitate communication between researchers and practitioners and promote the adoption of IR research within industry. We host a major European conference (ECIR) and provide an associated programme of workshops, seminars and events. The IRSG is free to join via the BCS website, which provides access to further IR articles, events and resources.
BCS is the industry body for IT professionals. With members in over 100 countries around the world, BCS is the leading professional and learned society in the field of computers and information systems.
*** First Call for Papers ***
21st International Conference on Software and Systems Reuse (ICSR 2024)
June 10-12, 2024, 5* St. Raphael Resort and Marina, Limassol, Cyprus
https://cyprusconferences.org/icsr2024/
(*** Submission Deadline: 12th February, 2024 AoE ***)
The International Conference on Software and Systems Reuse (ICSR) is a biannual conference
in the field of software reuse research and technology. ICSR is a premier event aiming to
present the most recent advances and breakthroughs in the area of software reuse and to
promote an intensive and continuous exchange among researchers and practitioners.
The guiding theme of this edition is Sustainable Software Reuse.
We invite submissions on new and innovative research results and industrial experience
reports dealing with all aspects of software reuse within the context of the modern software
development landscape. Topics include but are not limited to the following.
1 Technical aspects of reuse, including
• Reuse in/for Quality Assurance (QA) techniques, testing, verification, etc.
• Domain ontologies and Model-Driven Development
• Variability management and software product lines
• Context-aware and Dynamic Reuse
• Reuse in and for Machine Learning
• Domain-specific languages (DSLs)
• New language abstractions for software reuse
• Generative Development
• COTS-based development and reuse of open source assets
• Retrieval and recommendation of reusable assets
• Reuse of non-code artefacts
• Architecture-centric reuse approaches
• Service-oriented architectures and microservices
• Software composition and modularization
• Sustainability and software reuse
• Economic models of reuse
• Benefit and risk analysis, scoping
• Legal and managerial aspects of reuse
• Reuse adoption and transition to software reuse
• Lightweight reuse approaches
• Reuse in agile projects
• Technical debt and software reuse
2 Software reuse in industry and in emerging domains
• Reuse success stories
• Reuse failures, and lessons learned
• Reuse obstacles and success factors
• Return on Investment (ROI) studies
• Reuse in hot topic domains (Artificial Intelligence, Internet of Things, Virtualization,
Network functions, Quantum Computing, etc.)
We welcome research (16 pages) and industry papers (12 pages) following the Springer
Lecture Notes in Computer Science format. Submissions will be handled via
EasyChair (https://easychair.org/my/conference?conf=icsr2024). Submissions will be
**double-blindly** reviewed, meaning that authors should:
• Omit all authors’ names and affiliations from the title page
• Omit the acknowledgement section, if any, from the submitted paper
• Refer to your own work in the third person
• Use anonymous GitHub, Zenodo, FigShare or equivalent to provide access to artefacts
without disclosing your identity
Both research and industry papers will be reviewed by members of the same program
committee (check the website for details). Proceedings will be published by Springer in
their Lecture Notes in Computer Science (LNCS) series. An award will be given to the best
research and the best industry papers.
The authors of selected papers from the conference will be invited to submit an extended
version (containing at least 30% new material) to a special issue in the Journal of Systems and
Software (Elsevier). More details will follow.
IMPORTANT DATES
• Abstract submission: January 22, 2024, AoE
• Full paper submission: January 29, 2024, AoE
• Notification: March 8, 2024, AoE
• Camera Ready: March 22, 2024, AoE
• Author Registration: March 22, 2024, AoE
ORGANISATION
Steering Committee
• Eduardo Almeida, Federal University of Bahia, Brazil
• Goetz Botterweck, Lero, University of Limerick, Ireland
• Rafael Capilla Sevilla, Universidad Rey Juan Carlos, Spain
• John Favaro, Trust-IT, Italy
• William B. Frakes, IEEE TCSE committee on software reuse, USA
• Martin L. Griss, Carnegie Mellon University, USA
• Oliver Hummel, University of Applied Sciences, Germany
• Hafedh Mili, Université du Québec à Montréal, Canada
• Nan Niu, University of Cincinnati, USA
• George Angelos Papadopoulos, University of Cyprus, Cyprus
• Claudia M.L. Werner, Federal University of Rio de Janeiro, Brazil
General Chair
• George A. Papadopoulos, University of Cyprus, Cyprus
Program Co-Chairs
• Achilleas Achilleos, Frederick University, Cyprus
• Lidia Fuentes, University of Malaga, Spain
We seek a candidate for a fully funded PhD position in Multilingual Natural Language Processing (NLP) at the Institute of Formal and Applied Linguistics (ÚFAL), Charles University, Prague. This opportunity offers the chance to work alongside Jindřich Libovický on an innovative project focused on enhancing multilingual language representations by fostering language neutrality and expanding language coverage.
The prospective PhD thesis topics include:
* Investigating the encoding of cultural aspects of meaning, such as moral values, in language representations, including methodologies to promote cultural awareness in multilingual NLP systems.
* Improving zero- or few-shot cross-lingual transfer between languages with and without using parallel text data.
* Developing new subword tokenization methods that will be more suitable for cross-lingual alignment than current statistical heuristics.
The specific research topic is open for negotiation, allowing you to contribute your unique ideas and interests.
ÚFAL boasts a vibrant community at the forefront of cutting-edge natural language processing research. You will have ample opportunities for collaboration and knowledge exchange with esteemed researchers, particularly the groups led by Ondřej Bojar, Daniel Zeman, Pavel Pecina, Zdeněk Žabokrtský, and Jan Hajič. Charles University is the premier institution in Czechia, attracting exceptional talents from both the local and international academic spheres.
To apply for this opportunity, kindly submit the following documents to <surname> at ufal dot mff dot cuni dot cz:
* A compelling cover letter introducing yourself and explaining your interest in the position.
* An up-to-date CV highlighting your academic achievements and relevant experience.
* A concise research plan addressing your past, current, and future research interests.
* The names and contact details of two references who can provide insights into your qualifications and potential.
Fluency in English is essential; knowledge of Czech is not required. While candidates with a strong background in natural language processing and machine learning are preferred, we also welcome applicants with diverse academic backgrounds.
Please ensure that your application reaches us by October 15, 2023. The expected starting date for the position is March 1, 2024.
Thanks,
Jindřich Libovický
------
Charles University, Institute of Formal and Applied Linguistics
Malostranské náměstí 25
118 00 Praha
Czech Republic
Email: <my_surname> at ufal dot mff dot cuni dot cz
Web: https://ufal.mff.cuni.cz/jindrich-libovicky
Dear Colleagues,
We are glad to announce the call for papers of JADT 2024, the 11th
International Conference on the Statistical Analysis of Textual Data,
which will be held in Brussels (Belgium) from June 25 to 27, 2024. It is
organised by the SeSLA – Séminaire des Sciences du Langage of the
UCLouvain Saint-Louis, Brussels, with the collaboration of the LASLA –
Laboratoire d’Analyse statistique des Langues anciennes of the University
of Liege.
This biennial conference, which has steadily gained importance since it
was first held in Barcelona (1990), is open to all scholars and
researchers working in the field of textual data analysis, ranging from
lexicography to the analysis of political discourse, from information
retrieval to marketing research, from computational linguistics to
sociolinguistics, and from text mining to content analysis. After the
success of the previous meetings (http://lexicometrica.univ-paris3.fr),
the three-day conference in Brussels will continue to provide a
workshop-style forum through technical paper sessions, invited talks, and
panel discussions.
The conference website is https://jadt-2024.sciencesconf.org/?lang=en.
*Conference topics*
The themes of interest of the conference concern the application of
statistical models and tools in the following domains:
• Textometry, Statistical Analysis of Textual Data
• Exploratory Textual Data Analysis
• Corpus and Quantitative Linguistics
• Natural Language Processing
• Text Corpora Encoding
• Statistical Analysis of Unstructured and Structured Data
• Text Categorisation, Fuzzy Classification and Visualization
• Information Retrieval and Information Extraction
• Text Mining, Web Mining, Semantic Web
• Stylometry, Discourse analysis
• Software for Textual Data Analysis
• Machine Learning for Textual Data Analysis
• Multilingual and parallel corpora
*Important dates*
• Title and Abstract (max 20 rows): November 30, 2023
• First Version of Paper (Full-text): February 1, 2024
• Notification to Authors and opening of the registration: March 1, 2024
*Contacts*
For further information or enquiries, contact jadt-2024-slb(a)uclouvain.be
Hoping to see your active participation in the conference, we send you our
best regards,
Anne Dister
Dominique Longrée
--
Dr. Serge Heiden, slh(a)ens-lyon.fr, https://www.textometrie.org
ENS de Lyon / UMR IHRIM 5317
Dear all,
We are offering a task on Multimodal Understanding of Smells in Texts and
Images (MUSTI) within the MediaEval benchmark on the detection of
smell-oriented relations in historical images and historical texts in
English, French, German, and Italian. Our focus is on determining whether
an image and a text refer to the same smell source(s) and which smell
sources make the pair related. The data sets contain several thousand
pairs of images and texts annotated by native speakers who are experts in
olfactory information processing.
This is the second iteration of MUSTI. We are adding a zero-shot task this
year. Please refer to the following publications for an overview [1] and
baselines [2].
Registration to participate is now open, and submissions will be due in
late November (the exact submission deadline may still be adjusted). You
will have the chance to present your work at the 14th Annual MediaEval
Workshop on 1-2 February 2024, which is co-located with MMM 2024 in
Amsterdam, the Netherlands, with remote participation also possible.
More information can be found on
MUSTI: https://multimediaeval.github.io/editions/2023/tasks/musti/
MediaEval: https://multimediaeval.github.io/editions/2023/
MMM: https://mmm2024.org
We would like to invite you to participate in this task. Please let us know
if you have any questions. We are looking forward to hearing from you!
Best regards,
Ali Hürriyetoğlu (on behalf of the organizing committee)
[1] Hürriyetoğlu, A., Paccosi, T., Menini, S., Mathias, Z., Pasquale, L.,
Kiymet, A., ... & van Erp, M. (2022). MUSTI-Multimodal Understanding of
Smells in Texts and Images at MediaEval 2022. In Proceedings of MediaEval
2022 CEUR Workshop. URL:
https://cris.fbk.eu/retrieve/a8af3795-9055-4cee-8465-f4194a05c5d5/paper9634…
[2] Akdemir, K., Hürriyetoğlu, A., Troncy, R., Paccosi, T., Menini, S.,
Zinnen, M., Christlein, V. (2022). Multimodal and Multilingual
Understanding of Smells using VilBERT and mUNITER. In Proceedings of
MediaEval 2022 CEUR Workshop. URL:
https://cris.fbk.eu/retrieve/6112053b-b228-41e9-b763-c33acc876da9/paper6505…
Dear all,
In August 2022, the UC Berkeley Library and Internet Archive were awarded a
grant from the National Endowment for the Humanities (NEH) to study legal
and ethical issues in cross-border text and data mining (TDM).
The project, entitled Legal Literacies for Text Data Mining – Cross-Border
(“LLTDM-X”), supported research and analysis to address law and policy
issues faced by U.S. digital humanities practitioners whose text data
mining research and practice intersects with foreign-held or -licensed
content, or involves international research collaborations. Information in
this email is culled from their blog post announcement
(https://buildinglltdm.org/2023/10/02/wrapping-up-our-neh-funded-project-to-…).
In early 2023, we hosted a series of three online round tables with
U.S.-based cross-border TDM practitioners and law and ethics experts from
six countries.
The round table conversations were structured to illustrate the empirical
issues that researchers face, and also for the practitioners to benefit
from preliminary advice on legal and ethical challenges. Upon the
completion of the round tables, the LLTDM-X project team created a
hypothetical case study that (i) reflects the observed cross-border LLTDM
issues and (ii) contains preliminary analysis to facilitate the development
of future instructional materials.
We also charged the experts with providing responsive and tailored written
feedback to the practitioners about how they might address specific
cross-border issues relevant to each of their projects. This project led
to two significant outcomes:
*1: Case study* The Project Team developed a hypothetical case study
reflective of “typical” cross-border LLTDM issues that U.S.-based
practitioners encounter. The CASE STUDY
<https://docs.google.com/document/d/1KUK9HoNrLwEMcV_lNDKjxSnH3ipvhUWU2jbJj2g…>
examines
needs and concerns regarding cross-border copyright, contracts, and privacy
& ethics variables across two distinct paradigms: first, a situation where
U.S.-based researchers perform all TDM acts in the U.S., and second, a
situation where U.S.-based researchers engage with collaborators abroad, or
otherwise perform TDM acts in both the U.S. and abroad.
*2: White paper* The project team developed a WHITE PAPER
<https://docs.google.com/document/d/1KR0sXEg3M2eJDV0ZvZDafv-6J2XY2lA9vF9GHYz…>
providing
a comprehensive description of the project, including origins and goals,
contributors, activities, and outcomes. Of particular note are several
project takeaways and recommendations, which they hope will help inform
future research and action to support cross-border text data mining.
This is a particularly US-centric perspective on an important topic, but
there was significant input from legal experts across the world, including:
- Andrew Charlesworth, University of Bristol (privacy)
- Juan Carlos Fernández-Molina, Universidad de Granada (licensing)
- Sean Fiil-Flynn, American University Washington College of Law
(copyright)
- Lucie Guibault, Dalhousie University (copyright, licensing)
- Heidi McKee, Miami University of Ohio (ethics)
- Argyri Panezi, IE Law School & Stanford University (privacy)
- James Porter, Miami University of Ohio (ethics)
- Matthew Sag, Emory University School of Law (copyright)
- Ben White, Bournemouth University (copyright)
- Fernando Esteban de la Rosa, Universidad de Granada (licensing)
- João Quintais, University of Amsterdam (copyright)
- Ryan Calo, University of Washington (privacy)
While I understand this may not be of interest to all on the corpora-list,
I do hope this can provide some useful starting points for those of us
worried and/or stymied by the copyright question in our research,
especially for those of us with collaborators or research interests that
exist beyond the USA.
Thank you,
Heather Froehlich
--
Dr Heather Froehlich
w // http://hfroehli.ch
t // @heatherfro