Dear Colleagues,
Please find below the descriptions of two internship positions for second-year Master's (M2) students at the voice technology startup Vivoka, located in Metz, France. The internships typically last 5-6 months, starting from March 2024.
*About Vivoka*
Founded in 2015 and winner of two CES Innovation Awards, Vivoka (https://vivoka.com/en/) created and sells the Voice Development Kit (VDK), the very first solution allowing a company to design a voice interface in a simple, autonomous, and quick way. Moreover, this interface is embedded: it can be deployed on devices without an Internet connection and fully preserves privacy. Accelerated by the COVID-19 health crisis and the need for "no-touch" interfaces, Vivoka is now optimizing this technology by developing its own speech and language processing solutions that are able to compete with the most efficient current technologies. The internships will be carried out within Vivoka's R&D team. Interns will benefit from Vivoka's startup spirit and will interact with the researchers and Ph.D. students of the R&D team, as well as the engineers responsible for integrating their results into the VDK.
Internship Requirements:
- M2 in Computer Science with a specialization in Machine Learning (ML) or Natural Language Processing (NLP)
- Prior knowledge and/or experience with ML/NLP
- Experience with Python programming and frameworks like PyTorch
*1. Robust Dialogue State Tracking for Dialog Management in Conversational AI*
Context:
Conversational systems improve user experience by steering interactions to understand users’ needs and respond with informed answers, assistance, service invocations, etc. Unlike non-task-oriented dialogue systems, which focus on open-domain conversations such as chit-chat, task-oriented conversational systems enable users to accomplish specific tasks using the information provided during the conversation. One of the critical aspects of conversational systems is the design of dialogue management that allows robust, intelligent, and engaging conversations [1, 2, 3]. The focus of this internship is dialogue management in task-oriented conversational systems.
In task-oriented dialogue systems, the dialogue state is the component of a dialogue manager that serves as a summary of the entire conversation up to the present turn. It maintains all the essential information that the system needs to give informed responses to the user’s queries. This information comprises mainly the user’s intents (e.g. flight_booking), slots, i.e. the information needed to fulfill the intent (e.g. departure and arrival cities), and dialogue acts, i.e. hidden actions in user utterances that indicate their specific communicative function (e.g. request, statement, etc.) [3]. The dialogue states are estimated and tracked by the Dialogue State Tracking (DST) model [4]. Based on the dialogue states, the conversational agent generates subsequent actions to sustain the ongoing conversation. In real-world conversations, the range of potential values for slots such as movie_titles or usernames is often dynamic and unbounded. Consequently, in recent years, there has been an active focus on open-vocabulary approaches to DST [3]. These approaches estimate the possible values for slots from the ongoing conversation and language understanding results, without relying on a predefined set of categories. This research area represents a critical advancement toward DST with zero-shot generalization, which means that new intents and slots can be added without collecting new data or extensive retraining.
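As an illustration only, the dialogue state described above can be sketched as a small Python data structure. The class name `DialogueState`, its fields, and the `update` method are hypothetical names chosen for this example and are not part of the VDK; slot values are kept as free strings to reflect the open-vocabulary setting, and the rule that a later value overwrites an earlier one is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DialogueState:
    """Summary of the conversation up to the current turn (hypothetical schema)."""

    intent: Optional[str] = None                         # e.g. "flight_booking"
    slots: Dict[str, str] = field(default_factory=dict)  # open-vocabulary values
    dialogue_acts: List[str] = field(default_factory=list)

    def update(
        self,
        intent: Optional[str] = None,
        slots: Optional[Dict[str, str]] = None,
        dialogue_act: Optional[str] = None,
    ) -> None:
        """Merge the language-understanding output of one turn into the state."""
        if intent is not None:
            self.intent = intent
        if slots:
            # Assumption: the most recent value for a slot replaces the old one.
            self.slots.update(slots)
        if dialogue_act is not None:
            self.dialogue_acts.append(dialogue_act)


state = DialogueState()
state.update(intent="flight_booking",
             slots={"departure_city": "Metz"},
             dialogue_act="request")
state.update(slots={"arrival_city": "Paris"}, dialogue_act="statement")
print(state)
```

A real DST model would estimate these fields from the conversation rather than receive them directly; the sketch only shows what is being tracked.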
This internship aims to explore dialogue management in conversational systems with a particular focus on robust DST approaches that can achieve few-shot or zero-shot generalization. In real use cases, the disfluent nature of spontaneous conversations poses an additional set of challenges for Dialogue Management. The internship will focus on the challenges that are encountered while building robust task-oriented DST approaches meant for real-world applications of conversational systems.
Objectives and Expected Outcomes:
- Perform a literature review of Dialogue Management
- Implement a state-of-the-art Dialogue State Tracking approach in PyTorch
- Improve the implemented DST approach to perform few/zero-shot generalization
- Perform experiments to examine the challenges with real-world conversations for dialogue management
- Perform experiments to examine the generalizability of the implemented DST approach
*References:*
1. M. McTear, Z. Callejas, and D. Griol, “The Conversational Interface: Talking to Smart Devices,” 1st ed. Springer Publishing Company, Incorporated, 2016. https://link.springer.com/book/10.1007/978-3-319-32967-3
2. Z. Zhang, M. Huang, Z. Zhao, F. Ji, H. Chen, and X. Zhu, “Memory-augmented dialogue management for task-oriented dialogue systems,” ACM Transactions on Information Systems (TOIS), 2019. https://dl.acm.org/doi/abs/10.1145/3317612
3. H. Brabra, M. Báez, B. Benatallah, W. Gaaloul, S. Bouguelia, and S. Zamanirad, “Dialogue Management in Conversational Systems: A Review of Approaches, Challenges, and Opportunities,” in IEEE Transactions on Cognitive and Developmental Systems, vol. 14, no. 3, pp. 783-798, 2022. https://ieeexplore.ieee.org/document/9447005
4. J. Williams, A. Raux, D. Ramachandran, and A. Black, “The Dialog State Tracking Challenge,” in Proceedings of the SIGDIAL 2013 Conference, pp. 404–413, Association for Computational Linguistics, 2013. https://aclanthology.org/W13-4065/
*2. Data Augmentation for Low Resource Slot Filling and Intent Classification*
Context:
Neural models have achieved outstanding performance on slot filling and intent classification when fairly large in-domain training data is available. However, new domains are added frequently, and creating a sizable dataset for each of them is expensive. Some approaches [1, 2] propose a set of augmentation methods based on word-, span-, and sentence-level operations to alleviate the data scarcity problem.
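To give a flavor of such word-level operations, here is a minimal sketch of two of the operations described in EDA [1], random swap and random deletion, applied to a tokenized utterance. The function names and parameters are chosen for this example only. The sketch ignores slot labels, so it is suited to intent classification as written; for slot filling, the per-token labels would have to be moved or dropped together with their tokens.

```python
import random
from typing import List


def random_swap(tokens: List[str], n_swaps: int, rng: random.Random) -> List[str]:
    """Return a copy of `tokens` in which two random positions are swapped, n_swaps times."""
    tokens = list(tokens)
    if len(tokens) < 2:
        return tokens
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens


def random_deletion(tokens: List[str], p: float, rng: random.Random) -> List[str]:
    """Drop each token independently with probability p, keeping at least one token."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    # If everything was deleted, fall back to a single randomly chosen token.
    return kept if kept else [rng.choice(tokens)]


rng = random.Random(0)  # fixed seed so that runs are repeatable
utterance = "book a flight from Metz to Paris".split()
print(random_swap(utterance, n_swaps=1, rng=rng))
print(random_deletion(utterance, p=0.2, rng=rng))
```

Each augmented utterance keeps the intent label of the original, which is what makes these operations cheap to apply when little in-domain data is available.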
We target more complex, state-of-the-art augmentation approaches that allow models to achieve competitive performance on small English and French datasets. Furthermore, we will investigate the use of pretrained Large Language Models such as [3] for data augmentation, and how this affects slot filling and intent classification performance for those languages.
Objectives and Expected Outcomes:
- Run experiments on low-resource and high-resource data
- Implement different approaches to augment data for slot filling and intent classification
- Evaluate the quality of the generated data
- Evaluate the effect of data augmentation on slot filling and intent classification
- Integrate the tool into our NLU system
- Develop a Python module for Data Augmentation dedicated to the task
- Evaluate the module on several real use cases
References:
1. Jason W. Wei and Kai Zou. 2019. "EDA: easy data augmentation techniques for boosting performance on text classification tasks". In Kentaro Inui, Jing Jiang, Vincent Ng, and Xiaojun Wan, editors, Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP 2019, Hong Kong, China, November 3-7, 2019, pages 6381–6387. Association for Computational Linguistics. https://aclanthology.org/D19-1670/
2. Marzieh Fadaee, Arianna Bisazza, and Christof Monz. 2017. "Data augmentation for low-resource neural machine translation". In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 567–573, Vancouver, Canada, July. Association for Computational Linguistics. https://aclanthology.org/P17-2090/
3. Partha Pratim Ray. 2023. "ChatGPT: A comprehensive review on background, applications, key challenges, bias, ethics, limitations and future scope". Internet of Things and Cyber-Physical Systems. https://www.sciencedirect.com/science/article/pii/S266734522300024X
Please submit your applications to tulika.bose@vivoka.com or firas.hmida@vivoka.com. Please feel free to share this call for applications with any interested students.
Best regards,
Tulika Bose
AI Researcher, Vivoka