Postdoctoral researcher in LLMs

3 weeks ago


Jouques, France CEA Full-time

Job description

You will be recruited to work on the EUROfusion project "Towards Tokamak operations Conversational AI Interface Using Multimodal Large Language Models".

The main objectives are:

Develop an LLM workflow capable of processing diverse data formats (textual, visual, and numerical) to support session leaders/coordinators before and during tokamak experiments, continuously integrating information from new experiments.

Create an interactive conversational AI tool that assists session leaders/coordinators during tokamak operations by providing quick access to critical information and operational insights.

The scientific objectives and the main tasks you will be working on are:

Build a Tokamak_LLM conversational model workflow that integrates multi-source data from tokamak operations, including experimental logs in various formats.

Carry out a comprehensive training plan to select suitable open-source LLMs, focusing on model-data compatibility, computational demands, and hyperparameter optimization.

Enhance the model's ability to interact with a variety of databases and APIs in order to fetch and synthesize information from multiple sources seamlessly.

Improve the reliability and accuracy of outputs, with a focus on minimizing hallucinations (false or misleading information generated by the models) through advanced detection mechanisms and reliance on verified data sources.

Resources / Methods / Software

Linux systems, PyTorch/TensorFlow, Python

Candidate profile

PhD in computer science or a related field, with a focus on machine learning, natural language processing (NLP), large language models (LLMs), multimodal foundation models, and generative AI.

Strong theoretical and practical knowledge of natural language processing, including experience with state-of-the-art architectures.

Mastery of deep learning frameworks (e.g. PyTorch, TensorFlow) and of the libraries commonly used in natural language processing and generative AI.

Strong programming skills in Python and the ability to write clean, efficient, and well-documented code.

Excellent problem-solving skills, an analytical mind, and perseverance.

Strong communication skills and the ability to work effectively in a collaborative research environment and in an international network of collaborators.

A good knowledge of English is essential.

You will work in a group with expertise in a wide range of fields, from IT infrastructure administration to scalable AI model development.

You will interact with research engineers who are experts in the fields required to operate a tokamak in an international context.

You will benefit from 100 days of paid teleworking.

In line with the CEA's commitment to the integration of people with disabilities, this position is open to all. The CEA offers workplace accommodations and/or organizational arrangements to support the integration of workers with disabilities.