Observability Administrator

2 days ago


Paris, Île-de-France | Blackfluo | Full-time

Job Description:

  • Location: Fully remote, Central European Time (CET) zone
  • Start date: To be defined
  • Languages: English is mandatory

The primary objective of this role is to design, implement, upgrade and maintain a robust observability infrastructure using Elastic, Prometheus and Grafana, with complementary capabilities provided by SCOM and Checkmk. The administrator will work closely with our DevOps, IT, and development teams to ensure comprehensive monitoring, alerting, and visualization of our systems, and should have advanced experience in complex enterprise environments. Canonical Observability Stack (COS) will be used, so advanced experience with COS would be ideal.

Duties and Responsibilities:

  • Assess current monitoring and observability setup and identify gaps.
  • Design, implement and upgrade Prometheus-based monitoring solutions in an on-premises setup designed for multiple tenants and multiple support teams.
  • Configure and maintain Grafana dashboards for real-time visualization, designed for multiple tenants and multiple support teams.
  • Integrate Prometheus with other systems and tools (e.g., Loki, Mimir, Tempo, Thanos).
  • Design, implement and upgrade Elastic (ELK Stack) for on-premises setups.
  • Develop and document monitoring and logging strategies and best practices.
  • Set up alerts and notification mechanisms to preemptively address system issues.
  • Train internal staff on the use and maintenance of Prometheus, Grafana, and Elastic.
  • Provide ongoing support and improvements to the observability framework.
  • Ensure high availability and performance of the monitoring and logging systems.
  • Provide stand-by services on a rotation basis during weekends, holidays and outside normal working hours.
  • Perform other duties as required.

Required Qualifications & Experience:

  • At least 5 years in a similar role
  • Proven experience in deploying and managing Elastic, Prometheus and Grafana in an on-premises setup designed for multiple tenants and multiple support teams.
  • Strong understanding of observability concepts and best practices, including APM.
  • Experience with related technologies (e.g., Kubernetes, Docker, Kibana, Mimir, Loki, Tempo, Thanos, on-premises infrastructure).
  • Proficiency in scripting and automation (e.g., Bash, Python).
  • DevOps experience and practice.
  • Familiarity with infrastructure-as-code tools (e.g., Ansible, Terraform).
  • Experience with log management and tracing solutions (e.g., Loki, ELK stack, Jaeger).
  • Knowledge of other monitoring tools is desirable, especially SCOM and Checkmk.
  • Programming skills are desirable, especially .NET C# and Python.

Education and Certifications:

  • Bachelor's or master's degree in information technology is desirable.
  • Monitoring certifications in SCOM, Checkmk, Elastic, Prometheus or Grafana are desirable.
  • Linux and/or Windows System Administration
  • Network Administration

