GPU Engineer

2 days ago

Paris, Île-de-France Kog AI Full-time

KOG:

Kog is a European VC-funded startup and real-time AI frontier lab building the world's fastest AI execution layer, part of the 2030 French Tech cohort.

We are not just optimizing existing libraries; we are bypassing inefficient abstraction layers to rewrite the rules of AI inference. By coding at the Assembly level on high-end GPUs (starting with the AMD MI300X), we unlock raw performance that standard stacks leave on the table.

Our Mission: To enable true real-time AI. We are targeting 10x performance gains through a combination of low-level GPU mastery and novel model architecture. Our goal is to build the sovereign infrastructure that will power the next generation of collaborative AI agents.

Why join now? We have already achieved a 3x to 10x speedup compared to state-of-the-art alternatives (vLLM, TensorRT-LLM) by making breakthroughs in:

Inter-GPU communication & Grid synchronization
Aggressive Kernel fusion
Low-level Memory Access Optimization

We've built an inference engine optimized at the Assembly level, bypassing the inefficient abstraction layers, and we've made significant advancements in several areas:

Inter-GPU communication
Kernel fusion
Grid synchronization
Memory access optimization

The inference engine runs 3 to 10 times faster than the best GPU alternatives, starting on the AMD MI300X.

What you'll do:

We want to strengthen our world-class team with technically brilliant people who are eager to take on this challenge. Your responsibilities will include:

Implementing cutting-edge AI models in low-level C++ code and Assembly on high-end AMD and NVIDIA GPUs
Reverse-engineering subtle GPU features (such as memory page mappings, memory channels, hash functions, cache behaviors, credit assignment logic, etc.)
Leveraging this knowledge to find and implement creative optimization ideas
Optimizing the Kog inference engine to make AI inference incredibly fast (target: 10x compared to vLLM, SGLang, or TensorRT-LLM; we are already at 3x)

Who we'd like to work with:

World-class talents with 5+ years of experience
Proficiency in CUDA or ROCm
Start-up mindset
Team player attitude
PhD or degree from a top engineering school
Side projects or other clear signs of passion for and interest in the field

What we offer:

Top-Tier Compensation: We offer a highly competitive salary package (top of the market) tailored to match your expertise and leadership level.
Real Ownership (BSPCE): You aren't just an employee; you are a partner. We offer significant equity to ensure you share in the startup's success.
Unrivaled Technical Playground: Work on the bleeding edge of AI hardware. You will have access to the compute power you need (high-end clusters) to perform your magic.
A World-Class Environment: Join a high-density talent team of 12 engineers (including 5 PhDs). We value peer-to-peer learning, high autonomy, and zero bureaucracy.
Impact & Autonomy: As a Lead, you will have a direct seat at the table to shape our engineering culture and roadmap alongside the CEO.
Prime Location & Flexibility: WeWork offices in the 13th district (near Station F), the heart of Paris' tech scene. We operate with a hybrid model, punctuated by our "Paris Weeks" for deep work and team bonding (and great afterworks).

Feel free to apply if you feel you're up to the task.
