Post-doctoral fellow in model-based reinforcement learning

Posted 6 hours ago


Palaiseau, Île-de-France CHEManager International Full-time

Who are we?
Télécom Paris, part of the IMT (Institut Mines-Télécom) and a founding member of the Institut Polytechnique de Paris, is one of France's top 5 general engineering schools.

The driving purpose of Télécom Paris is to train, imagine, and undertake in order to design digital models, technologies, and solutions for a society and economy that respect people and their environment.

We are looking for our future postdoctoral researcher in model-based reinforcement learning to join the Computer Science and Networks (INFRES) department at Télécom Paris.

Reinforcement learning (RL) has emerged as a useful paradigm for training agents to perform complex tasks. Model-based RL (MBRL), in particular, promises greater sample efficiency and sophisticated planning capabilities by enabling an agent to learn a predictive model of its environment. However, the direct application of current MBRL methods to safety-critical domains, such as autonomous robotics, transportation, or industrial control, is hindered by unresolved challenges.

The core scientific challenge: The limitations of current world models. Standard approaches to MBRL typically learn a monolithic, "black-box" world model, often using large neural networks as function approximators. While these models can be highly effective for prediction within their training distribution, they suffer from two key limitations for deployment in sociotechnical systems:

  • Brittleness and unpredictable failures: Learned models are prone to unpredictable failures when the agent encounters unseen states or dynamics (i.e., distributional shift). These failures are difficult to anticipate and can lead to unsafe behavior, as the model's predictions are no longer reliable.
  • Lack of verifiability: The learned models are opaque and do not come with formal guarantees. It is not possible to prove that the model will consistently respect fundamental constraints of the real world, such as physical laws, safety rules, or logical invariants, or that it will remain aligned with expected values. This lack of verifiable correctness is a major barrier to building trustworthy and well-calibrated autonomous systems.

Research focus: Verifiable world models. The research will focus on developing a new class of structured, verifiable world models that integrate the flexibility of deep learning with the rigor of formal methods and compositional reasoning. The core research thrusts of this position are:

  • Structured, neurosymbolic models: The research will investigate model architectures that are not learned from a blank slate. Instead, they will be designed to incorporate explicit symbolic knowledge. This could include known physical laws, logical rules, or safety constraints, which are treated as fixed, verifiable components of the model. The learning process then focuses on modeling the more complex, unknown aspects of the environment around these established truths.
  • Compositional reasoning for safety: We will explore how a complex world model can be constructed by composing smaller, more specialized sub-models. A key research question is how to formally verify properties of the composite model based on the known properties of its individual components. This provides a modular and scalable path to certifying that the agent's internal model of the world is, and remains, consistent with its safety specifications.
  • Model adaptation: A truly intelligent agent must be able to adapt its understanding of the world from experience. This research will develop a framework for safe model adaptation. This involves creating MBRL algorithms where the agent can propose updates to its own world model structure, but these updates are only accepted after a formal verification step confirms that the new model still adheres to its core safety properties.
  • Multitask learning: Task decomposition allows agents to learn transferable skills that are useful in different contexts. Shared representations and multitask and multiobjective RL paradigms improve generalization. The research in this area will explore how to capture task decomposition in world models to enable multitask specifications with verifiable guarantees.
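
To make the model adaptation thrust concrete, here is a minimal, purely illustrative sketch of a verification-gated model update. Everything in it is an assumption made for illustration and not part of the project description: the toy point-mass dynamics, the names `make_model`, `satisfies_accel_bound`, and `gated_update`, and the bounded-acceleration invariant. The check is an exhaustive enumeration over a small finite grid, which stands in for a real formal verification step.

```python
from itertools import product
from typing import Callable

State = tuple[int, int]                 # (position, velocity) on an integer grid
Residual = Callable[[State, int], int]  # learned correction to the velocity
Model = Callable[[State, int], State]

# Hypothetical finite domain and safety bound, chosen only for this example.
POSITIONS = range(0, 5)
VELOCITIES = range(-3, 4)
ACTIONS = (-1, 0, 1)
MAX_ACCEL = 2  # assumed invariant: |velocity change| per step never exceeds 2


def make_model(residual: Residual) -> Model:
    """Compose fixed, known kinematics with a learned residual term."""
    def step(state: State, action: int) -> State:
        pos, vel = state
        new_vel = vel + action + residual(state, action)
        return (pos + new_vel, new_vel)
    return step


def satisfies_accel_bound(step: Model) -> bool:
    """Check the invariant on every (state, action) pair of the finite grid."""
    for pos, vel, action in product(POSITIONS, VELOCITIES, ACTIONS):
        _, new_vel = step((pos, vel), action)
        if abs(new_vel - vel) > MAX_ACCEL:
            return False
    return True


def gated_update(current: Model, proposed: Model) -> Model:
    """Accept the proposed model only if it passes the check; else keep current."""
    return proposed if satisfies_accel_bound(proposed) else current


def sign(x: int) -> int:
    return (x > 0) - (x < 0)


no_residual = make_model(lambda s, a: 0)
drag = make_model(lambda s, a: -sign(s[1]))   # residual opposing the motion
runaway = make_model(lambda s, a: 2 * s[1])   # residual that breaks the bound

model = gated_update(no_residual, drag)  # accepted: invariant still holds
model = gated_update(model, runaway)     # rejected: the drag model is kept
```

In a realistic setting the grid enumeration would be replaced by a verifier that can reason about continuous state spaces and learned components; the sketch only shows where the acceptance gate sits in the update loop.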

The successful candidate will lead the work on these open problems through the development and implementation of RL algorithms. They will have the opportunity to make a significant impact in the field of trustworthy and well-calibrated artificial intelligence (AI) through international collaborations (e.g., UT Austin, MIT).

Your Main Responsibilities

  • To carry out research in the field of model-based RL.
  • To carry out supervision and tutoring duties.
  • To contribute to the reputation of the School, the Institut Mines-Télécom, and the Institut Polytechnique de Paris.

We are looking for a candidate with a solid theoretical understanding of reinforcement learning, accompanied by a strong foundation in mathematics. You must also have proven experience in programming reinforcement learning agents, particularly with tools such as JAX, PyTorch, Gym, etc.

A proven ability to publish in leading scientific conferences and journals is essential, as is an aptitude for sharing and disseminating your knowledge within the team. Finally, you must be fluent in English, at a professional level, in order to thrive in an international environment. You hold a PhD or equivalent.

Why join us?
You'll be working in a fast-growing, pleasant, green, and accessible environment (especially for people with disabilities) just 20 km from Paris (RER B and C suburban train lines, close to major roads, shared shuttle departing from Porte d'Orléans). You will benefit from:

  • 49 days annual leave (CA + RTT)
  • flexible working hours (depending on department activity)
  • telecommuting 1 to 3 days/week possible
  • 75% public transport pass reimbursement
  • proximity to numerous sports facilities, concierge service, underground parking, in-house catering, etc.
  • Good to know: our social security contributions are lower than in the private sector

Other Information
Application deadline: January 10, 2026
Job type: 24-month fixed-term contract
Scientific contact person: Georgios Bakirtzis

Administrative contact person: Najoua Kharmaze

Funding: This postdoctoral position is partially supported by the chair Architecture of Complex Systems - Dassault Aviation, Naval Group, Dassault Systèmes, KNDS France, Agence de l'Innovation de Défense, Institut Polytechnique de Paris.
Our recruitment is based on skills, without distinction of origin, age, gender identity, or sexual orientation, and all our positions are open to individuals with disabilities.


