Senior ML Systems Engineer, Frameworks

2 days ago


Paris, Île-de-France Cohere Full-time

Who are we?

Our mission is to scale intelligence to serve humanity. We're training and deploying frontier models for developers and enterprises who are building AI systems to power magical experiences like content generation, semantic search, RAG, and agents. We believe that our work is instrumental to the widespread adoption of AI.

We obsess over what we build. Each one of us is responsible for helping increase the capabilities of our models and the value they drive for our customers. We like to work hard and move fast to do what's best for our customers.

Cohere is a team of researchers, engineers, designers, and more, who are passionate about their craft. Each person is one of the best in the world at what they do. We believe that a diverse range of perspectives is a requirement for building great products.

Join us on our mission and shape the future

We're looking for a senior engineer to help build, maintain and evolve the training framework that powers our frontier-scale language models. This role sits at the intersection of large-scale training, distributed systems, and HPC infrastructure. You will design and maintain the core components that enable fast, reliable, and scalable model training — and build the tooling that connects research ideas to thousands of GPUs.

If you enjoy working across the full stack of ML systems, this role gives you the opportunity and autonomy to have massive impact.

What You'll Work On
  • Build and own the training framework responsible for large-scale LLM training.

  • Design distributed training abstractions (data/tensor/pipeline parallelism, FSDP/ZeRO strategies, memory management, checkpointing).

  • Improve training throughput and stability on multi-node clusters (e.g., GB200/300, AMD, H200/100).

  • Develop and maintain tooling for monitoring, logging, debugging, and developer ergonomics.

  • Collaborate closely with infra teams to ensure Slurm setups, container environments, and hardware configurations support high-performance training.

  • Investigate and resolve performance bottlenecks across the ML systems stack.

  • Build robust systems that ensure reproducible, debuggable, large-scale runs.

You Might Be a Good Fit If You Have
  • Strong engineering experience in large-scale distributed training or HPC systems.

  • Deep familiarity with JAX internals, distributed training libraries, or custom kernels/fused ops.

  • Experience with multi-node cluster orchestration (Slurm, Ray, Kubernetes, or similar).

  • Comfort debugging performance issues across CUDA/NCCL, networking, IO, and data pipelines.

  • Experience working with containerized environments (Docker, Singularity/Apptainer).

  • A track record of building tools that increase developer velocity for ML teams.

  • Excellent judgment around trade-offs: performance vs complexity, research velocity vs maintainability.

  • Strong collaboration skills — you'll work closely with infra, research, and deployment teams.

Nice to Have
  • Experience with training LLMs or other large transformer architectures.

  • Contributions to ML frameworks (PyTorch, JAX, DeepSpeed, Megatron, xFormers, etc.).

  • Familiarity with evaluation and serving frameworks (vLLM, TensorRT-LLM, custom KV caches).

  • Experience with data pipeline optimization, sharded datasets, or caching strategies.

  • Background in performance engineering, profiling, or low-level systems.

Bonus: papers at top-tier venues (such as NeurIPS, ICML, ICLR, AISTATS, MLSys, JMLR, AAAI, Nature, COLING, ACL, EMNLP).

Why Join Us
  • You'll work on some of the most challenging and consequential ML systems problems today.

  • You'll collaborate with a world-class team working fast and at scale.

  • You'll have end-to-end ownership over critical components of the training stack.

  • You'll shape the next generation of infrastructure for frontier-scale models.

  • You'll build tools and systems that directly accelerate research and model quality.

Sample Projects:

  • Build a high-performance data loading and caching pipeline.

  • Implement performance profiling across the ML systems stack.

  • Develop internal metrics and monitoring for training runs.

  • Build reproducibility and regression testing infrastructure.

  • Develop a performant, fault-tolerant distributed checkpointing system.

If some of the above doesn't line up perfectly with your experience, we still encourage you to apply.

We value and celebrate diversity and strive to create an inclusive work environment for all. We welcome applicants from all backgrounds and are committed to providing equal opportunities. Should you require any accommodations during the recruitment process, please submit an Accommodations Request Form, and we will work together to meet your needs.

Full-Time Employees at Cohere enjoy these Perks:

An open and inclusive culture and work environment 

Work closely with a team on the cutting edge of AI research 

Weekly lunch stipend, in-office lunches & snacks

Full health and dental benefits, including a separate budget to take care of your mental health 

100% Parental Leave top-up for up to 6 months

Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement

Remote-flexible, offices in Toronto, New York, San Francisco, London and Paris, as well as a co-working stipend

6 weeks of vacation (30 working days)

