May 2023 - Corpora - ELRA lists

CFP: 1st Workshop on Readability for Low Resourced Languages (RLRL 2023)
by El-Haj, Mo 11 May '23


Free registration is now open: https://bit.ly/3pwUwlG (tickets are limited and open to non-authors).

Call for Abstracts: 1st Workshop on Readability for Low Resourced Languages (RLRL 2023)

Dear all,

Join us for an exciting online workshop where experts in natural language processing will come together to discuss the latest research and innovative approaches to assessing the readability of low-resource languages. The workshop will take place as a free online event on September 5, 2023, and is being hosted jointly by Lancaster University, Sheffield Hallam University and King Saud University.

We welcome researchers and practitioners to submit abstract proposals of up to 500 words for talks related to the development of a Readability Framework for low-resource languages. The extended versions of the accepted abstracts will appear in the Computing Research Repository (CoRR), subject to the number of abstracts received and provided they are written in English.

Although the workshop will be conducted in English, for the first time we are accepting submissions in a different language, starting with Arabic. Arabic-speaking authors are encouraged to submit their abstracts in Arabic. Presentations will be recorded with subtitles pre-added by the authors. The organisers will live-translate the Q&As. We plan to extend this option to other languages in future events. Due to a lack of resources we are unable to provide live translation from English to Arabic during the workshop, but we are open to free-of-charge solutions (please send suggestions to the organisers directly).

The ultimate goal of the workshop is to discuss best practices and state-of-the-art AI-based approaches to create mathematical representations of expected readability levels at different school grade or cognitive ability levels. The workshop will also focus on utilising classifiers that are intuitive for humans to understand and adjust, enabling the analysis and improvement of the decision-making criteria.

We welcome abstracts on work that is still in progress or that does not yet have conclusive results. We encourage authors to share their work at various stages of development to facilitate discussions and collaboration during the workshop.

Important Dates:
- Due date for workshop abstract submission: July 17, 2023
- Notification of abstract acceptance to authors: August 1, 2023
- Workshop date: September 5, 2023 (online event)

We are pleased to announce the following keynote speakers for the workshop:
- Professor Laurence Anthony - Faculty of Science and Engineering at Waseda University, Japan.
- Dr Violetta Cavalli-Sforza - School of Science and Engineering at Al Akhawayn University, Morocco.

The main objectives of the workshop are three-fold:
1- Increase awareness of the importance of readability in low-resource languages and its impact on language learning and literacy.
2- Discuss the challenges of readability in low-resource languages, such as limited resources and lack of standardization, and brainstorm strategies for addressing these challenges.
3- Foster a community of practice among participants, allowing them to share their experiences and best practices for addressing readability issues in low-resource languages.

Abstract submission: The abstract submission page is now open; please submit abstracts of no more than 500 words in either English or Arabic: https://easychair.org/conferences/?conf=rlrl2023

Topics of interest include, but are not limited to:
- Machine learning for text readability
- Applications of readability assessment
- Readability in low-resource languages
- Comprehensibility measures
- Mathematical representations of readability levels
- Text simplification for low-resource languages
- Readability and comprehensibility in language learning
- The effects of text simplification on readability
- Readability frameworks for indigenous languages
- Updating readability representations

We look forward to your contributions and to a productive and enlightening workshop on September 5, 2023.

RLRL 2023 Organisers:
- Dr Mo El-Haj (SCC/DSI/UCREL, Lancaster University)
- Dr Abdel-Karim Al Tamimi (CSSE, Sheffield Hallam University)
- Prof. Hend Al Khalifa (iWAN, King Saud University)

https://wp.lancs.ac.uk/acc/rlrl2023/

Best wishes,
Mahmoud
---------------------
Dr Mo El-Haj
Senior Lecturer in NLP
Co-Director of UCREL NLP Group
Strategic Lead of Arabic and Financial NLP Research
School of Computing and Communications, Lancaster University
https://www.lancaster.ac.uk/staff/elhaj


[Deadline Extended]: The SIGIR '23 Workshop on Knowledge Discovery from Unstructured Data in Financial Services (KDF)
by Xiaomo Liu 11 May '23


*Updated the deadlines*
- Paper submission deadline: May 21, 2023 AoE
- Submission notification date: May 31, 2023

*==============================================================================================*

*The SIGIR '23 Workshop on Knowledge Discovery from Unstructured Data in Financial Services (KDF)*

Artificial intelligence (AI) and information retrieval (IR) systems and techniques have been widely adopted in financial services to tackle various tasks, such as information retrieval from business documents, retrieval from non-textual content like tables and graphs, recommending financial products and services to customers, providing decision support for investment practices, automating due diligence protocols, detecting fraudulent transactions, financial sentiment analysis on social media, and understanding Environmental, Social and Governance (ESG) impact on investment practices. Knowledge from IR systems can help augment human intelligence. However, discovering and extracting the knowledge conveyed inside unstructured financial data, like SEC filings, prospectuses, business reports, and other enterprise documents, is extremely challenging due to the massive volume of data, large variation in the data format, low signal-to-noise ratio, scarcity of expert annotated datasets, task ambiguity, hurdles regarding data integrity and privacy, robustness against domain shift, and high-performance requirements set by industry and regulatory standards. Manual extraction of knowledge is usually inefficient, error-prone, and inconsistent, so it is one of the key technical bottlenecks that financial services companies face in accelerating their operating productivity. These challenges and issues call for robust artificial intelligence, information retrieval, and machine learning algorithms and systems to help.
The automated processing of unstructured data to discover knowledge from complex financial documents requires bringing together a suite of techniques such as natural language processing, information retrieval, semantic analysis, and complex reasoning. In addition, how knowledge is captured and represented, synthesized across diverse sources, and used within AI systems, is crucial to developing effective solutions in financial services. Furthermore, based on the reflections and feedback from our past KDF workshops, the 2023 workshop is particularly interested in multi-modal understanding of financial documents, retrieving and reasoning over tabular data within financial documents, and financial domain-specific representation learning. The workshop will be composed of three components: invited talks, paper presentations, along with a shared task competition. We cordially welcome researchers, practitioners, and students from academic and industrial communities who are interested in the topics to participate and/or submit their original work. 
*The workshop will be a hybrid event, supporting both in-person and virtual participation.*

Topics of Interest

The topics of the workshop include, but are not limited to, the following areas:
- AI and IR technologies for business document understanding for financial corpora, including searching and question answering systems, understanding and reasoning over non-textual content such as tables and graphs;
- representation learning, and distributed representation learning and encoding in natural language processing for financial documents;
- language modeling on financial corpora including tabular and numerical data, and multi-modal modeling;
- multi-source knowledge integration and fusion, and knowledge alignment and integration from heterogeneous data;
- reconciling unstructured knowledge with structured knowledge and human expertise;
- named-entity disambiguation, recognition, resolution, relationship discovery, ontology learning and extraction in financial and business documents;
- AI-assisted domain data tagging, labeling, and annotation for IR tasks; automatic data extraction from financial filings and quality verification;
- corporate ESG event discovery, evaluation, and impact assessment;
- event discovery from alternative data and impact on corporate equity pricing;
- AI and IR systems for financial risk assessment on financial legal documents such as contracts and prospectuses;
- verifying facts and statements generated by large pre-trained language models using IR and knowledge discovery;
- IR or QA techniques and applications on financial documents leveraging large language models.

Submission Guidelines

We invite submissions of relevant work that would be of interest to the workshop. All submissions must be original contributions that have not been previously published and that are not currently under review by other conferences or journals. Submissions will undergo single-blind peer review.
Submissions will be assessed based on their novelty, technical quality, significance of impact, interest, clarity, relevance, and reproducibility. All submissions must be in PDF format and follow the current ACM two-column conference format https://www.acm.org/publications/proceedings-template. We accept two types of submissions:
- full research paper: no longer than 9 pages (including references, proofs, and appendixes).
- short/poster paper: no longer than 4 pages (including references, proofs, and appendixes).

Submissions will be accepted via Microsoft CMT https://cmt3.research.microsoft.com/KDF2023/. All accepted submissions will be presented in the workshop. Submissions will be non-archival, and the authors may post their work on arXiv or other online repositories.

Important Dates
- Paper abstract due (optional): May 1, 2023 AoE
- Paper submission deadline: May 21, 2023 AoE
- Submission notification date: May 31, 2023
- Workshop: July 27, 2023

Organizing Committee
- Sameena Shah - JPMorgan AI Research
- Xiaodan Zhu - Queen's University
- Wenhu Chen - University of Waterloo
- Manling Li - University of Illinois Urbana-Champaign
- Armineh Nourbakhsh - JPMorgan AI Research
- Xiaomo Liu - JPMorgan AI Research
- Zhiqiang Ma - JPMorgan AI Research
- Charese Smiley - JPMorgan AI Research
- Yulong Pei - JPMorgan AI Research
- Akshat Gupta - JPMorgan AI Research

Workshop Website
http://kdf-workshop.github.io/kdf23

*Contact*
For general inquiries about KDF, please write to the organizers at kdf.workshop(a)gmail.com.

--
Best,
Xiaomo


eLex 2023: Invisible Lexicography - Call for papers
by Miloš Jakubíček 10 May '23


(apologies for multiple postings) *CALL FOR PAPERS* <https://elex.link/elex2023/call-for-papers/> *eLex 2023: Electronic lexicography in the 21st century.* The topic of next year's conference is Invisible Lexicography. Dates: 27-29 June 2023 (with workshops on June 26th) Venue: Hotel Passage, Brno, Czechia Deadline for abstract submissions: January 31st 2023 Conference website: https://elex.link/elex2023/ Language of the conference: English Format: The conference will be organized as a hybrid event and while we encourage everyone to participate on-site, we plan to provide live streaming and recording of the event for registered participants. Looking forward to seeing you all in Brno, Miloš Jakubíček in the name of the organising committee


IACT’23@SIGIR: deadline extension to May 23
by Marina Litvak 10 May '23


[Apologies if you receive multiple copies of this CFP]

=====================================

*Call for Papers: The 1st International Workshop on Implicit Author Characterization from Texts for Search and Retrieval (IACT’23)*

The workshop will be held in conjunction with the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval.

Workshop website: https://en.sce.ac.il/news/iact23
July 27, 2023. Taipei, Taiwan.

*Paper submission deadline: Extended to May 23, 2023, AoE*
Submission link: https://easychair.org/conferences/?conf=iact23

To bring the research community's attention to the limitations of current models in recognizing and characterizing AI vs. human authors, we are organizing the first edition of the IACT workshop under the umbrella of the SIGIR conference. Research works submitted to the workshop should foster scientific advances in all aspects of author characterization. All papers must be original and not simultaneously submitted to another journal or conference.

The following paper categories are welcome:
- *Full research papers*: up to 8 pages. Original and high-quality unpublished contributions to the theory and practical aspects of the workshop topics.
- *Short research papers*: up to 5 pages. These can describe ongoing research, resources, and demos.
- *Negative results papers*: up to 5 pages. Papers highlighting tested hypotheses that did not produce the expected outcome are also welcome.
- *Position papers*: up to 5 pages. Discussing current and future research directions.

The length constraints do not include references. The submissions must be anonymous and will be peer-reviewed by at least two program committee members. The authors of accepted papers will be given 15 minutes for a short oral presentation. The workshop will run as a hybrid event to allow virtual attendance and meet the SIGIR format.

Research works submitted to the workshop should foster scientific advances in all aspects of implicit author information extraction from text, including but not limited to the following:
- Differentiation between AI-generated content and human-generated content and bot profiling
- Characterization of conversational agents
- Feature detection of authors for human vs. AI determination
- Prompt understanding and recognition in language models
- Personalized question answering and conversation generation
- Troll identification on social media
- Review authenticity estimation
- Multi-modal, multi-genre, and multilingual author analysis
- Character analysis, description, and representation in narrative texts
- Detecting implicit expressions of sentiment, emotion, opinion, and bias
- Transfer learning for implicit author characterization
- Implicit author characterization annotation schema
- Evaluation of implicit author characterization
- Author characterization in low-resource languages and under-studied domains
- Accountability and regulation of AI-based information extraction, retrieval, and content generation
- Copyright issues of AI-generated content
- Ethical and privacy implications of author characterization and implicit information extraction
- Fairness and bias of AI-generated content

Organizing Committee:
- Marina Litvak - marinal(a)ac.sce.ac.il; Shamoon College of Engineering Beer Sheva; Israel
- Irina Rabaev - irinar(a)ac.sce.ac.il; Shamoon College of Engineering Beer Sheva; Israel
- Alípio Mário Jorge - amjorge(a)fc.up.pt; University of Porto; Porto, Portugal
- Ricardo Campos - ricardo.campos(a)ipt.pt; Polytechnic Institute of Tomar INESC TEC, Portugal; Porto, Portugal
- Adam Jatowt - adam.jatowt(a)uibk.ac.at; University of Innsbruck; Innsbruck, Austria

Invited Speakers:
- Prof. Mark Last - Ben-Gurion University of the Negev, Israel
- Prof. Dr. Valia Kordoni - Humboldt-Universität Berlin, Germany

Contact:
- Dr. Marina Litvak: litvak.marina(a)gmail.com
- Dr. Irina Rabaev: irinar(a)ac.sce.ac.il

--
Best regards,
Marina Litvak


PhD position: Realistic Conversational Agent, Paris area, France
by Gaël de Chalendar 10 May '23


The goal of this PhD will be to create a realistic conversational agent with a "personality". The generation model will have to take into account a description of the agent's knowledge about the world, about itself and about the conversation in progress in order to generate coherent responses. This external knowledge must be easy to replace or update. The conversational agent will be developed within the framework of the Cortex² European project, which aims, among other things, to produce tools that facilitate the experience of online meetings.

- Supervisor: Gaël de Chalendar (CEA)
- Director: Nasredine Semmar (CEA)
- Employer: CEA, French public research organization
- Location: Nano Innov, Paris-Saclay University, Palaiseau, France
- Working conditions: full time, telecommuting (100 days/year, max 3 days/week), 5 weeks of annual leave + 23 other days, subsidized meals, partial reimbursement of public transportation, social security coverage
- Detailed description of the subject: https://instn.cea.fr/these/agent-conversationnel-realiste/
- Deadline for applications: June 4, 2023
- Start of the PhD: from September 1st, 2023
- Duration of the PhD: 3 years

To apply: gael.de-chalendar(a)cea.fr

--
Gael de Chalendar
CEA LIST
Laboratoire d'Analyse Sémantique Texte et Image (Text and Image Semantic Analysis Laboratory)
CEA Saclay Nano-INNOV DRT/LIST/DIASI/SIALV BAT 861 PC 184
F-91191 Gif-sur-Yvette Cedex
Tél.:+33.1.69.08.01.50
Fax:+33.1.69.08.01.15
Email : Gael.D.O.T.de-Chalendar.A(a)T.cea.D.O.T.fr


Job Opening for Senior Data Scientist (Remote - NLP for medical domain and life sciences)
by Ehsan Khoddam Mohammadi 10 May '23


*Title:* Senior Data Scientist (Remote)

Application links
*Netherlands*: https://jobs.lever.co/veeva/a6c967ac-5bbb-412b-9c3d-b72d709b8da7
*Germany*: https://jobs.lever.co/veeva/e73b2147-5e3c-41cf-8f9e-db64dcdd1d3a
LinkedIn: https://www.linkedin.com/posts/activity-7061693573410254848-wJ6R

*What You'll Do*
- Bring the latest technologies and trends in NLP to the platform
- Experience with training, fine-tuning, and serving Large Language Models
- Design, develop, and implement an end-to-end pipeline for extracting predefined categories of information from large-scale, unstructured data across multi-domain and multilingual settings
- Create a robust semantic search functionality that effectively answers user queries related to various aspects of the data
- Use and develop named entity recognition, entity-linking, slot-filling, few-shot learning, active learning, question/answering, dense passage retrieval, and other statistical techniques and models for information extraction and machine reading
- Deeply understand and analyze our data model per data source and geo-region and interpret model decisions
- Collaborate with data quality teams to define annotation tasks and metrics and perform a qualitative and quantitative evaluation. We have more than 1900 curators!
- Utilize cloud infrastructure for model development, ensuring seamless collaboration with our team of software developers and DevOps engineers for efficient deployment to production

*Requirements*
- 4+ years of experience as a data scientist (or 2+ years with a Ph.D. degree)
- Master's or Ph.D. in Computer Science, Artificial Intelligence, Computational Linguistics, or a related field
- Strong theoretical knowledge of Natural Language Processing, Machine Learning, and Deep Learning techniques
- Proven experience working with large language models and transformer architectures, such as GPT, BERT, or similar
- Familiarity with large-scale data processing and analysis, preferably within the medical domain
- Proficiency in Python and relevant NLP libraries (e.g., NLTK, SpaCy, Hugging Face Transformers)
- Experience in at least one framework for Big Data (e.g., Ray, Spark) and one framework for Deep Learning (e.g., PyTorch, JAX)
- Experience working with cloud infrastructure (e.g., AWS, GCP, Azure) and containerization technologies (e.g., Docker, Kubernetes), and experience with Bash scripting
- Strong collaboration and communication skills, with the ability to work effectively in a cross-functional team
- Used to start-up environments
- Social competence and a team player
- High energy and ambitious
- Agile mindset

You can work remotely anywhere in Germany or The Netherlands, but you have to live in Germany or The Netherlands and be legally authorized to work there without requiring Veeva's support for a visa or relocation. If you do not meet this condition but you think you are an exceptional candidate, please clarify it in a separate note, and we will consider it.

*About Link:* Our product offers real-time academic, social, and medical data to build comprehensive profiles. These profiles help our life-science industry partners find the right experts to accelerate the development and adoption of new therapeutics. We accelerate clinical trials and equitable care. We are proud that our work helps patients receive their most urgent care sooner.

*About Veeva:* Veeva is a mission-driven organization that aspires to help our customers in Life Sciences and Regulated industries bring their products to market, faster. We are shaped by our values: Do the Right Thing, Customer Success, Employee Success, and Speed. Our teams develop transformative cloud software, services, consulting, and data to make our customers more efficient and effective in everything they do. Veeva is a work anywhere company. You can work at home, at a customer site, or in an office on any given day. Veeva is a Public Benefit Corporation, so you will also work for a company focused on making a positive impact on its customers, employees, and communities.

Ehsan Khoddam
Data Science Manager at Veeva Systems Inc.


Last CFP and Extended Deadline: CLEF 2023 - Conference and Labs of the Evaluation Forum
by Anastasia Giachanou 10 May '23


Last Call for papers and Extended Deadline for CLEF 2023: Conference and Labs of the Evaluation Forum 18-21 September 2023, Thessaloniki, Greece https://clef2023.clef-initiative.eu Important Dates (Time zone: Anywhere on Earth) - Submission of Long, Short, Best of 2023 Labs Papers: 21 May, 2023 (extended from 12 May, 2023) - Notification of Acceptance: 9 June, 2023 - Camera Ready Copy due: 30 June, 2023 - Conference: 18-21 September, 2023 Aim and Scope The CLEF Conference addresses all aspects of Information Access in any modality and language. The CLEF conference includes presentation of research papers and a series of workshops presenting the results of lab-based comparative evaluation benchmarks. CLEF 2023 is the 14th CLEF conference, continuing the popular CLEF campaigns, which have run since 2000, contributing to the systematic evaluation of information access systems, primarily through experimentation on shared tasks. The CLEF conference has a clear focus on experimental IR as carried out within evaluation forums (e.g., CLEF Labs, TREC, NTCIR, FIRE, MediaEval, RomIP, SemEval, and TAC) with special attention to the challenges of multimodality, multilinguality, and interactive search, also considering specific classes of users such as children, students, and impaired users in different tasks (e.g., academic, professional, or everyday-life). We invite paper submissions on significant new insights demonstrated on IR test collections, on analysis of IR test collections and evaluation measures, as well as on concrete proposals to push the boundaries of the Cranfield style evaluation paradigm. All submissions to the CLEF main conference will be reviewed on the basis of relevance, originality, importance, and clarity. CLEF welcomes papers that describe rigorous hypothesis testing regardless of whether the results are positive or negative. CLEF also welcomes past runs/results/data analysis and new data collections. 
Methods are expected to be written so that they are reproducible by others, and the logic of the research design is clearly described in the paper. The conference proceedings will be published in the Springer Lecture Notes in Computer Science (LNCS). Topics Relevant topics for the CLEF 2023 Conference include but are not limited to: - Information access in any language or modality: information retrieval, image retrieval, question answering, information extraction and summarisation, search interfaces and design, infrastructures, etc. - Analytics for information retrieval: theoretical and practical results in the analytics field that are specifically targeted for information access data analysis, data enrichment, etc. - User studies either based on lab studies or crowdsourcing. - Past results/run deep analysis both statistically and fine grain based. - Evaluation initiatives: conclusions, lessons learned, impact and projection of any evaluation initiative after completing their cycle. - Evaluation: methodologies, metrics, statistical and analytical tools, component based, user groups and use cases, ground-truth creation, impact of multilingual/multicultural/multimodal differences, etc. - Technology transfer: economic impact/sustainability of information access approaches, deployment and exploitation of systems, use cases, etc. - Interactive information retrieval evaluation: interactive evaluation of information retrieval systems using user-centered methods, evaluation of novel search interfaces, novel interactive evaluation methods, simulation of interaction, etc. - Specific application domains: information access and its evaluation in application domains such as cultural heritage, digital libraries, social media, health information, legal documents, patents, news, books, and in the form of text, audio and/or image data. 
- New data collection: presentation of new data collection with potential high impact on future research, specific collections from companies or labs, multilingual collections. - Work on data from rare languages, collaborative, social data. Format Authors are invited to electronically submit original papers, which have not been published and are not under consideration elsewhere, using the LNCS proceedings format: http://www.springer.com/it/computer-science/lncs/conference-proceedings-gui… Two types of papers are solicited: - Long papers: 12 pages max (excluding references). Aimed to report complete research works. - Short papers: 6 pages max (excluding references). Position papers, new evaluation proposals, developments and applications, etc. Review Process Authors of long and short papers are asked to submit the following TWO versions of their manuscript: Methodology version: This version does NOT report anything related to the results of the study. At this stage, the manuscripts will be evaluated based on the importance of the problem addressed and the soundness of the methodology. Manuscripts can include an introduction, description of the proposed methodology and datasets used. However, there should be no result and discussion sections. The authors should also remove mentions of results in the included sections (e.g., abstract, introduction). Experimental version: This is the full version of the manuscript that contains all the sections of the paper including the experiments and results. Papers will be peer-reviewed by 3 members of the program committee in two stages. At the first stage, the members will review the methodology version of the manuscripts based on originality and methodology. At the second stage, the full version of the manuscripts that passed the first stage will be reviewed. Selection will be based on originality, clarity, and technical quality. The deadline for the submission of both versions is the 21st of May (extended from the 12th of May). 
Paper submission Papers should be submitted in PDF format to the following address: https://easychair.org/my/conference?conf=clef2023 - Submit the methodology version at the Methodology Track - Submit the experimental version at the Experimental Track Organisation General Chairs Evangelos Kanoulas, University of Amsterdam, the Netherlands Theodora Tsikrika, Information Technologies Institute, CERTH, GR Stefanos Vrochidis, Information Technologies Institute, CERTH, GR Avi Arampatzis, Democritus University of Thrace, Greece Program Chairs Anastasia Giachanou, Utrecht University, the Netherlands Dan Li, Elsevier Evaluation Lab Chairs Mohammad Aliannejadi, University of Amsterdam, the Netherlands Michalis Vlachos, University of Lausanne, Switzerland Lab Mentorship Chair Jian-Yun Nie, University of Montreal, Canada


Job: Senior Manager, Digital Humanities Outreach
by heather froehlich 10 May '23


Hi all, Sharing a job that was shared with me... this is a good role for people who are interested in bringing text analysis methods to a wide range of stakeholders. *Please contact Amy Kirchhoff (amy.kirchhoff(a)ithaka.org) with any questions that you might have.* Short summary: ITHAKA, the non-profit that brought you JSTOR, is hiring a Senior Manager for Digital Humanities Outreach. The position is integrated into our Outreach team (the staff who talk to librarians about ITHAKA services) and will work very closely with Constellate (constellate.org). Constellate helps users across all disciplines learn essential text analysis and data skills. More details: *Senior Manager, Digital Humanities Outreach* *Location: Remote* *Link for info / application:* https://www.ithaka.org/job/4247846005/?gh_jid=4247846005 *The Role* ITHAKA is seeking a tech-savvy and teamwork-oriented higher education professional to be our Outreach subject matter expert for ITHAKA’s emerging technologies that serve digital humanities research and instruction within academic libraries. ITHAKA’s mission is to expand access to knowledge and education around the world. Our services (Artstor, JSTOR, Portico, and Ithaka S+R) enable people everywhere to learn, to grow, and to overcome historical barriers to education. The ITHAKA Senior Manager for Digital Humanities & Primary Sources, Outreach role provides an opportunity to have a transformative impact on the lives of students, teachers, researchers, and on the scholarly communications landscape more generally. *Responsibilities* As Senior Manager for Digital Humanities Resources Outreach you will be the internal and external subject matter expert responsible for achieving participation goals, helping to drive adoption and usage for new programs and services in the Digital Humanities, focusing on text and data analytics education and support. 
You will collaborate internally with colleagues to contribute to the ongoing development for the products and services you support as well as provide training and knowledge sharing for those products across the ITHAKA Outreach team. - Through direct faculty and library outreach, seek opportunities to integrate ITHAKA’s emerging technologies and services into digital humanities programs (through faculty and library outreach), supporting teaching with primary sources, information and data literacy, and text analytics. - Lead efforts to achieve participation goals for assigned products and services through direct customer activities such as: meetings (virtual, on-site visits, and industry trade shows) for new or at-risk participating institutions; following up on marketing-generated and inbound leads, and supporting the broader Outreach team through reviewing opportunities and internal training for staff. - Act as the subject matter expert for the Outreach team on products/services as assigned and how they can be used to support digital scholarship and teaching with primary sources. - Support introduction of assigned new products and services by contributing to beta and trial evaluation programs, collaborating with business owners to provide feedback on business models and validating end-user workflows. - Collaborate with ITHAKA Marketing, Outreach leadership, and business owners to determine the most effective value proposition for new products and services related to digital humanities; support effective go-to-market (GTM) planning, including the creation of materials to support launch. - Primary Outreach liaison to the JSTOR Labs team, helping to develop and provide feedback for incubated ideas. - Develop and maintain a network of relationships with digital humanities library staff, teaching faculty, and other key members of the academic community to stay abreast of important trends and best practices in digital and information literacy. 
- Ability to travel is required for this role, expectations in the 20-30% range. Experience and Skills - Minimum 5 years of experience in higher education or libraries, either at an academic institution or vendor. - Demonstrated expertise in digital humanities scholarship and primary source research. - An advanced degree (MLS, M.A. or PhD in a humanities field) and/or professional experience in teaching and research is a plus. - Excellent relationship building and communication skills, including the ability to effectively demonstrate digital content and platform tools via webinar and in person workshops. - Ability to work well as a team member; a willingness to actively participate in team projects and share ideas. - Strong attention to detail and a keen eye for accuracy. - Ability to perform with minimal supervision and to prioritize diverse work assignments. - Committed to our organizational values of belonging, evidence, speed, teamwork, and trust - Comfortable working with data (via tools such as Tableau) to extract learnings, make decisions, and demonstrate evidence. - Familiarity working with the standard Microsoft office suite (including intermediate Excel skills) and working in a CRM (such as Salesforce). Compensation & Benefits At ITHAKA we believe in openness and equity. Part of living those values is our commitment to clarity about salary ranges, so candidates know what to expect. The starting salary for this position ranges from $86,184 to $107,730 per year. Starting pay may vary with job-related knowledge, skills, and experience. Our total compensation package for benefits-eligible employees includes employer-paid medical, dental, and vision plans, an employer-paid 10% retirement contribution, paid parental and caregiver leave, 22 days of paid time off, 11 paid holidays, up to 12 sick days, gym reimbursement, and more. -- Dr Heather Froehlich w // http://hfroehli.ch t // @heatherfro


8th Law and Corpus Linguistics Conference
by jesse.egbert＠yahoo.com 09 May '23


The 8th annual Law and Corpus Linguistics conference will be held on October 13, 2023, at Brigham Young University in Provo, Utah, USA. The keynote address will be delivered by Dean Gordon Smith. We look forward to panels on a wide range of topics. One of our panels will focus on the intersection between intellectual property law and corpus linguistics. The papers for this panel will be published in a symposium issue of the BYU Law Review.

Proposals are invited for individual papers and panels. We are open to submissions on a broad range of topics, including but not limited to:

- applications of corpus linguistics to constitutional, statutory, contract, patent, copyright, trademark, probate, administrative, and criminal law in any state or nation;
- philosophical, normative, and pragmatic commentary on the use of corpus linguistics in the law;
- triangulation between corpus linguistics and other empirical methods in legal interpretation;
- the relationship between corpus linguistics and pragmatics (e.g. implicature, presupposition, sociolinguistic context);
- corpus-based analysis of legal discourse or topics;
- best practices in corpus design and corpus linguistic methods in legal settings.

The proposal deadline is May 31, 2023. Proposals should include an abstract of no more than 750 words and complete contact information for presenters. Please send proposals to byulawcorpus(a)law.byu.edu. More information can be found at: https://corpusconference.byu.edu/2023-home/.


First Call for Demos: PROPOR 2024 - 16th International Conference on Computational Processing of Portuguese [Apologies for cross-postings]
by Iria de Dios Flores 09 May '23


********************************************************
*PROPOR 2024: 16th International Conference on Computational Processing of Portuguese*
Universidade de Santiago de Compostela (Santiago de Compostela - Galicia)
March 14th to 15th 2024
https://propor2024.citius.gal/
********************************************************

The International Conference on Computational Processing of Portuguese (PROPOR), whose next edition will take place for the first time in Galicia, birthplace of the Portuguese language, is the main event in the area of natural language processing that is focused on theoretical and technological issues of written and spoken Portuguese and Galician (considered as a local variety of the former). The meeting has been a very rich forum for the exchange of ideas and partnerships for the research and industry communities dedicated to the automated processing of this language, promoting the development of methodologies, resources and projects that can be shared among researchers and practitioners in the field.

The PROPOR 2024 demonstration program committee *invites submissions for demonstrations*. Following the spirit of previous PROPOR editions, the demonstration track aims at bringing together academia and industry, creating a forum where more than written or spoken descriptions of research are available. Thus, demos should allow attendees to try and test them during their presentation in a dedicated session that will provide a more informal and interactive setting. Products, systems, or tools are examples of acceptable demos. Both early-research prototypes and mature systems may be considered.
*Important dates:*

- Demos Submission: *December 10 2023*
- Notification of acceptance or rejection: *January 21 2024*
- Camera-ready demo paper: *January 28 2024*
- Conference: *March 14 and 15 2024*

*Topics:*

The areas of interest include all topics related to theoretical and applied issues of written and spoken Portuguese and Galician, including, but not limited to, the same topics as for conference paper submissions:

- Natural language processing tasks (e.g. parsing, word sense disambiguation, coreference resolution)
- Natural language processing applications (e.g. question answering, subtitling, summarization, sentiment analysis)
- Natural language generation
- Information extraction and information retrieval
- Speech technologies (e.g. spoken language generation, speech and speaker recognition, spoken language understanding)
- Speech applications (e.g. spoken language interfaces, dialogue systems, speech-to-speech translation)
- Resources, standardization and evaluation (e.g. corpora, ontologies, lexicons, grammars)
- NLP-oriented linguistic description or theoretical analysis
- Distributional semantics and language modeling
- Portuguese language varieties and dialect processing (including the language varieties of Angola, Brazil, Cape Verde, East Timor, Galicia, Guinea-Bissau, Macau, Mozambique, Portugal, and São Tomé and Príncipe)
- Multilingual studies, methods, applications, and resources including Portuguese/Galician

The systems may be of the following kinds:

- Natural Language Processing systems or system components
- Application systems using language technology components
- Software tools for computational linguistics research
- Software for demonstration or evaluation
- Development tools

*Submissions:*

Submissions should consist of a non-anonymous brief description document of up to *three pages* of content, including references. Developers must outline the main characteristics of their system/product/tool, provide sufficient details to allow its evaluation, and give information on how they plan to demonstrate it. Developers are encouraged to focus their description on the relevance of the computational processing component of Portuguese or Galician in the proposed system.

Submissions should be written in English. At submission time, only PDF format is accepted. For the final versions, authors of accepted papers will be given one extra content page to take the reviews into account. Authors of accepted papers will be requested to send the source files for the production of the proceedings.

All submitted papers must conform to the official ACL style guidelines. ACL provides style files for LaTeX and Microsoft Word that meet these requirements. They can be found at:

- LaTeX stylesheet: https://github.com/acl-org/acl-style-files/tree/master/latex
- MS Word stylesheet: https://github.com/acl-org/acl-style-files/tree/master/word

*The URL for paper submission will be available soon.*

*Publication:*

Accepted demo papers are expected to be published by ACL as a volume in ACL Anthology (https://aclanthology.org/) as part of the PROPOR 2024 proceedings. They will be available online. To ensure publication, at least one author of each accepted paper must complete an adequate registration for PROPOR 2024 by the early registration deadline.

*Presentation format:*

Accepted demos will be presented at a designated demo session with an optional accompanying poster. Developers should make sure they can run their demos properly: it is the authors’ responsibility to provide the necessary technical conditions (i.e. equipment) for the demo at the conference. Note that the local organizers will not provide any hardware or software. Free high-speed Internet access will be available.

There will be *a best demo award* for the best-presented project.
Further details on the date, time, and instructions of the demonstration session(s) will be determined and provided at a later date.

*Demo chairs*:

- Marlo Souza (Universidade Federal da Bahia, Brazil)
- Iria de-Dios-Flores (Universidade de Santiago de Compostela, Spain)

--
*Iria de-Dios-Flores (PhD)*
https://sites.google.com/view/iriadediosflores/

