Lead GPU Engineer

55 minutes ago


Paris, Île-de-France · Kog AI · Full-time

KOG:

Kog is a European VC-funded startup and real-time AI frontier lab building the world's fastest AI execution layer. As part of the 2030 French Tech cohort, we are on a mission to redefine the boundaries of artificial intelligence by enabling true real-time interaction at a scale never seen before.

While the industry often settles for incremental software updates, we are engineering a radical, vertically integrated solution. Our approach is built on three deeply interconnected streams that form the core of our competitive advantage:

  • GPU Engineering: We are building the Kog Inference Engine, a proprietary runtime purpose-built for Dense and MoE LLMs. We develop our own kernels directly in Assembly on AMD Instinct accelerators, entirely bypassing standard libraries to extract the theoretical maximum throughput from the hardware.

  • Model Architecture: We do not just run models; we reinvent them. Our researchers design model architectures specifically optimized for our engine, creating a hardware-software co-design loop that enables massive performance leaps.

  • Product & Software Engineering: We prove our technology through extreme use cases, starting with high-performance software engineering to power real-time generative video games directly in the web browser.

Why join now?

We aim to achieve 100x faster token generation compared to current industry standards, targeting 10,000+ tokens per second to unlock truly instantaneous AI.

While our GPU team extracts maximum throughput from the hardware, we know that raw compute is only half the equation. To reach this scale, we must fundamentally rethink how models are built. We are moving beyond standard architectures to create models that are natively designed for our execution engine.

We are achieving this through Hardware-Software Co-design, focusing on breakthroughs in:

  • Architecture-Hardware Alignment: Designing model layers and dimensions that map perfectly to our proprietary kernels and memory hierarchy.

  • Next-Gen Architectures: Moving beyond standard Transformers to explore and implement linear attention, SSMs (Mamba), and highly optimized Mixture-of-Experts (MoE).

  • Extreme Sparsity & Quantization: Implementing aggressive quantization-aware training and structured sparsity to minimize memory bandwidth usage without sacrificing intelligence.

  • Training for Inference: Designing training pipelines that directly prioritize final inference velocity as a core objective.

Our work is recognized by industry leaders, as evidenced by our recent performance benchmark, which was published on AMD's official blog.

Now, we need the architectural vision to multiply that leverage.

The Culture:

You will join a high-intensity environment where velocity is a core virtue.

We treat research as an iterative engineering process: we prioritize execution cycles over theoretical perfection, shipping over talking, and technical truth over corporate consensus.

At Kog, you will not just be an engineer; you will be a pioneer building the foundational infrastructure for the next generation of collaborative, real-time AI agents.

What You'll Do:

We are seeking a Lead GPU Engineer with strong managerial experience or senior expertise to act as a strategic partner to the CEO.

You will bridge the gap between high-level architectural vision and concrete execution, turning ambitious ideas into a production-ready reality.

Your role is a hybrid of high-impact technical leadership and hands-on engineering. You will be expected to:

1/ Technical Strategy & Execution (The "Owner")

  • Own the Roadmap: Take high-level directions and abstract concepts from the CEO to define concrete objectives and convert them into a structured, actionable technical roadmap.

  • Architect the 10x leap: Lead the engineering efforts to optimize the Kog inference engine, aiming to surpass current state-of-the-art solutions by an order of magnitude.

  • Hardware-Software Co-design: Collaborate deeply with the Model Architecture stream to influence model design. You will provide the feedback loop that allows researchers to structure models specifically to exploit your latest kernel optimizations and memory hierarchy breakthroughs.

  • Accountability: You are fully accountable for the delivery. You ensure that we don't just have a plan, but that we execute it with precision.

2/ Hands-on Engineering (The "Expert")

  • Strategic Contribution: You are capable of diving into the code to unblock the team or tackle critical optimizations when necessary.

  • Lead by Example: You maintain a deep understanding of the codebase so you can conduct low-level code reviews and make both macro and micro decisions, ensuring quality without becoming the bottleneck.

  • Deep-dive optimization: Spearhead the reverse-engineering of subtle AMD GPU features (memory page mappings and hashing functions, L2 and L3 cache behaviors, wavefront scheduling, register file exploitation, etc.) to unlock raw performance.

  • System-level creativity: Leverage your deep understanding of GPU hardware and AI algorithms to find and implement creative optimization strategies that junior engineers might miss.

3/ Team Leadership & Culture (The "Captain")

  • Manage and Coach: Manage a team of brilliant engineers and PhDs. Whether you are an experienced manager or a senior expert stepping up, your goal is to foster a culture of technical excellence and "no-ego" collaboration.

  • Efficient Communication: You promote a culture of deep work and asynchronous communication to minimize interruptions while keeping the team aligned.

  • Entrepreneurial Ownership: Act not just as an employee, but as a builder of the company. Make decisions that prioritize the startup's long-term success, velocity, and scalability.

Whom we'd like to work with:

  • We are seeking a unique profile: a deeply technical engineer who enjoys the craft of coding, but also possesses the maturity to lead a team and develop a roadmap.

You are a "force multiplier" and make everyone around you better.

1/ Technical Expertise:

  • Low-Level Authority: You have experience in modern C++ and Assembly. You are comfortable working close to the metal.

  • GPU Intimacy: You understand exactly how GPUs work under the hood (memory hierarchy, cache coherency, thread scheduling). You are familiar with NVIDIA (CUDA/PTX) and/or AMD (ROCm/HIP/CDNA) architectures.

  • Optimization Obsession: You have a proven track record of extracting every ounce of performance from hardware. You know how to use profilers and debuggers to solve complex bottlenecks.

2/ Leadership & Accountability:

  • Transform Directions into Concrete Plans: You can take high-level, sometimes abstract scientific directives and turn them into a concrete, executed engineering plan.

  • Flexible Experience Level:

    • Option A: You are already an Engineering Manager / Tech Lead with experience managing a high-performance team.

    • Option B: You are a Senior/Staff Engineer at a top-tier tech company, looking to take the next step in your career and shoulder managerial responsibilities.

  • Delivery Focus: You prioritize shipping. You understand the trade-offs between "perfect code" and "market timing."

3/ Mindset:

  • Superstar without the Ego: You are confident in your skills but humble in your interactions. You are hands-on and willing to do the "grunt work" when necessary.

  • Entrepreneurial Drive: You treat the company as if it were your own. You are resilient, adaptable, and motivated by the ambition of building a European AI giant.

  • Curiosity: You are not afraid of the unknown. If the documentation doesn't exist, you reverse-engineer it.

What we offer:

  • Top-Tier Compensation: We offer a highly competitive salary package (top of the market) tailored to match your expertise and leadership level.

  • Real Ownership (BSPCE): You aren't just an employee; you are a partner. We offer significant equity to ensure you share in the startup's success.

  • Unrivaled Technical Playground: Work on the bleeding edge of AI hardware. You will have access to the compute power you need (high-end clusters) to perform your magic.

  • A World-Class Environment: Join a high-density talent team of 12 engineers (including 5 PhDs). We value peer-to-peer learning, high autonomy, and zero bureaucracy.

  • Impact & Autonomy: As a Lead, you will have a direct seat at the table to shape our engineering culture and roadmap alongside the CEO.

  • Remote-First & Team Bonding: We operate as a remote-first company, valuing autonomy and deep work. Our culture is punctuated by monthly "Paris Weeks": one week per month, the whole team gathers at our WeWork offices in the 13th district (near Station F), at the heart of Paris's tech scene. These weeks are dedicated to strategic alignment, intense collaboration, and team bonding.

Ready to build the 10,000+ tokens/sec stack? Apply directly to start the conversation.

