Senior Software Engineer
20 hours ago
At Datadog, we are building a next-generation AI platform that enables seamless training, tracking, and deployment of ML models and LLMs at scale. The Training & Registry team is responsible for the infrastructure and tooling that allows applied scientists to iterate rapidly and reliably: managing training jobs, tracking experiments, and versioning model artifacts across distributed systems.
Our work is foundational to AI development at Datadog, powering everything from classic ML workflows to large-scale LLM fine-tuning and embedding generation. We build deeply technical infrastructure: distributed systems for job orchestration, model lifecycle management, and training observability. This is a high-impact team working on problems critical to Datadog's AI evolution.
We're looking for a Senior Software Engineer to design and build robust backend and platform systems that drive model experimentation and registry workflows. In this role, you'll collaborate with platform teams, applied scientists, and infra stakeholders to shape the future of AI infrastructure at Datadog.
At Datadog, we place value in our office culture - the relationships that it builds, the creativity it brings to the table, and the collaboration of being together. We operate as a hybrid workplace to ensure our employees can create a work-life harmony that best fits them.
What You'll Do:
- Design and implement scalable, reliable systems for training orchestration, artifact tracking, and model registration across multiple data centers and cloud regions.
- Improve and streamline ML experimentation workflows by integrating tooling like Ray, Airflow, and interactive notebooks.
- Develop APIs and services that enable applied scientists to seamlessly launch, debug, and track training jobs.
- Ensure reproducibility and traceability by building robust version control and metadata systems for model artifacts.
- Collaborate with AI infra teams (LLMObs, Compute, etc.) to deliver consistent user experiences and integrated telemetry.
- Mentor engineers and help drive architectural decisions and technical standards.
Who You Are:
- You have 6+ years of experience in backend, distributed systems, or platform engineering roles.
- You have worked on ML platforms or infrastructure, ideally supporting real-world training or model lifecycle workflows.
- You're comfortable designing APIs, managing data at scale, and architecting systems for reliability and observability.
- You're fluent in Python or Go and have experience with cloud-native tools (e.g., Kubernetes, object stores, queueing systems).
- You're comfortable navigating cross-functional environments and translating scientific requirements into reliable systems.
- Bonus points: experience with model registries, experiment tracking tools (e.g., MLflow, Weights & Biases), or distributed training frameworks.
Datadog values people from all walks of life. We understand not everyone will meet all the above qualifications on day one. That's okay. If you're passionate about technology and want to grow your skills, we encourage you to apply.
Benefits and Growth:
- New hire stock equity (RSUs) and employee stock purchase plan (ESPP)
- Continuous professional development, product training, and career pathing
- Intradepartmental mentor and buddy program for in-house networking
- An inclusive company culture, with the ability to join our Community Guilds (Datadog employee resource groups)
- Access to Inclusion Talks, our internal panel discussions
- Free, global mental health benefits for employees and dependents age 6+
- Competitive global benefits
Benefits and Growth listed above may vary based on the country of your employment and the nature of your employment with Datadog.
About Datadog:
Datadog (NASDAQ: DDOG) is a global SaaS business, delivering a rare combination of growth and profitability. We are on a mission to break down silos and solve complexity in the cloud age by enabling digital transformation, cloud migration, and infrastructure monitoring of our customers' entire technology stacks. Built by engineers, for engineers, Datadog is used by organizations of all sizes across a wide range of industries. Together, we champion professional development, diversity of thought, innovation, and work excellence to empower continuous growth. Join the pack and become part of a collaborative, pragmatic, and thoughtful people-first community where we solve tough problems, take smart risks, and celebrate one another. Learn more about #DatadogLife on Instagram, LinkedIn, and Datadog Learning Center.
Equal Opportunity at Datadog:
Datadog is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and other characteristics protected by law. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. Here are our Candidate Legal Notices for your reference.
Datadog endeavors to make our Careers Page accessible to all users. If you would like to contact us regarding the accessibility of our website or need assistance completing the application process, please complete this form. This form is for accommodation requests only and cannot be used to inquire about the status of applications.
Privacy and AI Guidelines:
Any information you submit to Datadog as part of your application will be processed in accordance with Datadog's Applicant and Candidate Privacy Notice. For information on our AI policy, please visit Interviewing at Datadog AI Guidelines.