3-year postdoctoral position in NLP - University of Oslo, Norway - Corpora

8 Mar 2024


(Apologies for cross-posting)
A postdoctoral position in NLP (large language models in particular) is 
available at the University of Oslo, Norway. This position is funded 
jointly by the Integreat research center (https://www.integreat.no/) and 
the DSTrain program (https://www.uio.no/dscience/english/dstrain/).
We offer 3-year contracts. The application deadline is April 14, 2024. 
Applicants should prepare a short project description and get in touch 
with the contact person beforehand to discuss it.
Please apply here: 
https://www.jobbnorge.no/en/available-jobs/job/255679/dstrain-msca-postdocto...
I am the contact person, and the project should fall roughly within 
the topic of "Separation of Linguistic and Factual Knowledge in Large 
Language Models". See the description below:
During training, modern large language models learn both language 
structure and knowledge about the world as one indivisible whole. 
Decoupling these two components is a challenging but extremely 
promising field of research. Having more control over what is stored in 
the model weights should make it possible to optimize the model better. 
In particular, it might be possible to avoid learning world knowledge 
from scratch every time a model is trained for a particular language.
The question is whether it is possible to develop an architecture where 
a neural network is limited to learning the linguistic structure, while 
the knowledge about the world is stored in and retrieved from an 
external knowledge graph. Will such a model focus on learning 
language-specific linguistic skills without spending its parameters on 
time-dependent and language-agnostic factual knowledge? Is it also true 
that such language models will be less prone to hallucinations and bias?
The selected applicants will start their postdoctoral projects no later 
than October 1, 2024. The Postdoctoral Research Fellows will be employed 
at the Faculty of Mathematics and Natural Sciences. The appointments 
will be full-time positions lasting for three years, with 10% of the 
time dedicated to mandatory duties, typically teaching. It is not 
possible to be appointed for more than one Postdoctoral Research 
Fellowship at the University of Oslo.
References:
[1] Lewis, P., Perez, E., Piktus, A., Petroni, F., Karpukhin, V., Goyal, 
N., Küttler, H., Lewis, M., Yih, W.-t., Rocktäschel, T., et al. (2020). 
Retrieval-augmented generation for knowledge-intensive NLP tasks. In 
Advances in Neural Information Processing Systems, volume 33, pages 
9459–9474.
[2] Borgeaud, S., Mensch, A., Hoffmann, J., Cai, T., Rutherford, E., 
Millican, K., Van Den Driessche, G. B., Lespiau, J.-B., Damoc, B., 
Clark, A., et al. (2022). Improving language models by retrieving from 
trillions of tokens. In International Conference on Machine Learning, 
pages 2206–2240. PMLR.
[3] Pan, S., Luo, L., Wang, Y., Chen, C., Wang, J., and Wu, X. (2023). 
Unifying large language models and knowledge graphs: A roadmap.
-- 
Andrey
Language Technology Group (LTG)
University of Oslo