DevOps Engineer, HPC Services

Posted 1 week ago


Paris, France | Mistral AI | Full-time

**About Mistral**

At Mistral AI, we believe in the power of AI to simplify tasks, save time, and enhance learning and creativity. Our technology is designed to integrate seamlessly into daily working life.

We democratize AI through high-performance, optimized, open-source and cutting-edge models, products and solutions. Our comprehensive AI platform is designed to meet enterprise needs, whether on-premises or in cloud environments. Our offerings include le Chat, the AI assistant for life and work.

We are a dynamic, collaborative team passionate about AI and its potential to transform society.

Our diverse workforce thrives in competitive environments and is committed to driving innovation. Our teams are distributed between France, USA, UK, Germany and Singapore. We are creative, low-ego and team-spirited.

**Role Summary**

We are building one of Europe’s largest AI infrastructure offerings, which will provide our customers with a private and integrated stack in every form factor they may need, from bare-metal servers to fully managed PaaS. As a DevOps Engineer, you will join a fast-growing team to help build, scale and automate our computing management stack. You will be responsible for building fault-tolerant and reliable infrastructure to support both our internal processes and our customer platform.

Location: France

Reporting line: Software Architect, HPC

**What you will do**

As a DevOps Engineer in the HPC services team, your primary responsibility will be to engineer robust and dependable infrastructure that supports both our internal operations and customer-facing platforms.

Key Responsibilities:

- Design, build and maintain scalable, highly available and fault-tolerant infrastructures
- Build, scale and automate the full lifecycle of compute nodes, from bootstrapping to decommissioning
- Design and develop new workflows and tooling to improve the reliability, availability and performance of our systems (automation scripts, API-based features, web apps, dashboards, etc.)
- Drive continuous improvement in infrastructure automation, deployment, and orchestration (CI/CD, containerization, orchestration, monitoring, logging and alerting systems...)
- Operate systems and troubleshoot issues in production environments (interrupts, on-call responses, user administration, data extraction, infrastructure scaling, etc.)
- Participate occasionally in on-call rotations to respond to incidents and perform root cause analysis to prevent future occurrences
- Collaborate closely with R&D to streamline build systems, scale testing workflows and make sure our inference and model training environments are always highly available and seamlessly replicable across several HPC clusters
- Collaborate with the security team to ensure infrastructure adheres to best security practices and compliance requirements

**About you**
- 7+ years of experience in a DevOps/SRE role
- Exposure to highly available distributed systems and site reliability issues in critical environments (issue root cause analysis, in-production troubleshooting, on-call rotations...)
- Proficiency in scripting languages (Python, Go, Bash...) and knowledge of software development best practices
- Hands-on experience with CI/CD, containerization and orchestration tools (Docker, Kubernetes...)
- Proven experience troubleshooting complex K8s cluster issues and performing system upgrades
- Familiarity with infrastructure-as-code tools like Terraform or CloudFormation
- Knowledge of monitoring, logging, alerting and observability tools (Prometheus, Grafana, ELK Stack, Datadog...)
- Experience working against reliability KPIs (observability, alerting, SLAs)
- Strong understanding of networking, security, and system administration concepts
- Excellent problem-solving and communication skills
- Self-motivated and able to work well in a fast-paced startup environment

Now, it would be ideal if you also had experience with:

- HPC workload managers (Slurm)
- Distributed storage systems (Lustre, Ceph)

**Location & Remote**

This position is based primarily at our Paris HQ, and we encourage going to the office regularly to build bonds and keep communication smooth. Our remote policy aims to provide flexibility, improve work-life balance, and increase productivity.
- Onboarding: a stronger in-person presence during the first months is key to ensuring strong cohesion and a good onboarding.
- Run mode: a minimum onsite presence (at least 4 days per month) is necessary to maintain connection and create serendipity.

**What we offer**

- Competitive cash salary and equity
- Health insurance
- Transportation allowance
- Sport allowance
- Meal vouchers
- Generous parental leave policy

