Data Engineer – Spark Specialist

2 days ago


Paris, France Dataiku Full-time

Dataiku is The Universal AI Platform, giving organizations control over their AI talent, processes, and technologies to unleash the creation of analytics, models, and agents. Providing no-, low-, and full-code capabilities, Dataiku meets teams where they are today, allowing them to begin building with AI using their existing skills and knowledge.

About the Role

Dataiku is looking for a Data Engineer specialized in Spark (PySpark) to join our Field Engineering team. In this role, you will work closely with our clients to troubleshoot and optimize complex data pipelines within the Dataiku platform. This includes both reactive support (advanced issues reported via the support portal) and proactive services (performance reviews and architecture advisory missions we propose to clients).

You will serve as a technical expert in data processing, leveraging SQL and Python frameworks. You will specialize in Spark-based distributed data processing and lakehouse architecture. You will help our clients succeed whether they are working with SQL-based workflows or processing data on Kubernetes, Databricks, or other modern data platforms.

What You'll Do

  • Help customers design, build, and optimize Flows in Dataiku, improving overall project performance and maintainability.
  • Debug and enhance complex Spark code and data pipelines for better performance and reliability.
  • Guide clients in tuning and scaling Spark environments such as Kubernetes and Databricks, including providing architectural guidance and best practices to enhance performance and reliability.
  • Optimize SQL-based data pipelines to ensure efficient and robust data workflows within Dataiku.
  • Advise clients on integrating different data pipelines (Spark, SQL, Python) into optimized solutions.
  • Collaborate with internal teams to resolve technical issues and contribute to the knowledge base.

Who You Are

You have deep hands-on experience building, debugging, and tuning Spark pipelines in production environments.
Specifically, you have:

Spark & PySpark Expertise

  • Proficiency in writing and debugging PySpark code for large-scale data processing.
  • Experience with Parquet, Delta Lake, and columnar file formats.
  • Understanding of Spark's interaction with metastores (e.g. Hive, Unity Catalog).
  • Deep understanding of resource management: Spark executors, cores, memory, and relevant configurations (e.g. ).
  • Expertise in tuning Spark jobs: partitioning, caching, broadcast joins, and avoiding unnecessary shuffles.

Lakehouse & Orchestration

  • Familiarity with lakehouse architectures and ACID-compliant data layers (Delta Lake, Iceberg, Hudi).
  • Experience working with Databricks, including Databricks Connect and Databricks Workflows.
  • Experience automating and scheduling Spark jobs using tools like Apache Airflow or native orchestration tools.

Core Data Engineering Skills

  • Proven experience developing, optimizing, and troubleshooting SQL-based data pipelines for efficient ETL and data transformation processes.
  • Proficiency in building and managing data transformation workflows in Python, leveraging frameworks such as pandas.
  • Familiarity with data modeling concepts and data quality best practices.
  • Experience integrating data from a variety of sources, including databases, APIs, and cloud storage.
  • Ability to communicate technical concepts effectively to both technical and non-technical stakeholders.

What does the hiring process look like?

  • Initial call with a member of our Technical Recruiting team
  • Video call with the Field Engineer Hiring Manager
  • Technical Assessment to show your skills (Home Test)
  • Debrief of your Tech Assessment with FE Team members
  • Final Interview with the VP Field Engineering

What are you waiting for?

At Dataiku, you'll be part of a journey to shape the ever-evolving world of AI. We're not just building a product; we're crafting the future of AI.
If you're ready to make a significant impact in a company that values innovation, collaboration, and your personal growth, we can't wait to welcome you to Dataiku! And if you'd like to learn even more about working here, you can visit our Dataiku LinkedIn page.

Protect yourself from fraudulent recruitment activity

Dataiku will never ask you for payment of any type during the interview or hiring process. Other than our video-conference application, Zoom, we will also never ask you to make purchases or download third-party applications during the process. If you experience something out of the ordinary or suspect fraudulent activity, please review our page on identifying and reporting fraudulent activity here.

Required Experience: IC
Key Skills: GIS, Computer Data Entry, Facilities Management, ADMA, Fleet, Key Account
Employment Type: Full-Time
Experience: years
Vacancy: 1

Our practices are rooted in the idea that everyone should be treated with dignity, decency, and fairness. Dataiku also believes that a diverse identity is a source of strength and allows us to optimize across the many dimensions that are needed for our success. Therefore, we are proud to be an equal opportunity employer. All employment practices are based on business needs, without regard to race, ethnicity, gender identity or expression, sexual orientation, religion, age, neurodiversity, disability status, citizenship, veteran status, or any other aspect which makes an individual unique or protected by laws and regulations in the locations where we operate. This applies to all policies and procedures related to recruitment and hiring, compensation, benefits, performance, promotion and termination, and all other conditions and terms of employment.

If you need assistance or an accommodation, please contact us at :


  • Data Engineer Spark/Scala

    1 week ago


    Paris, Île-de-France Worldwide People Full-time

    Data Engineer Spark/Scala. Within the Finance IT unit of the IT department (DSI), the position calls for an experienced Spark/Scala Data Engineer (5-6 years of experience). Objectives and deliverables. Objectives: The contractor will join the integration project, in coordination with the reference accounting base (assiette comptable de référence) workstream, as a Data Engineer. The...

  • data engineer spark

    2 weeks ago


    Paris, Île-de-France UCASE CONSULTING Full-time

    Hello, On behalf of our client, we are looking for a Spark / Scala / PySpark / Databricks / Azure data engineer. Main responsibilities: Take part in developing User Stories (US) and carrying out the associated tests. Produce and maintain project documentation: data mapping, pipeline modeling, operations documents. Put...


  • Paris, France Digistrat consulting Full-time

    **Position**: Data Engineer Scala Spark Hadoop **Strategic sectors**: Investment banking **Start**: ASAP **Context/Objectives**: Our client is seeking assistance with the implementation of a common base (assiette commune) for Finance and Risk use cases, specifically for Event data. **In this context, the...

  • data engineer spark/scala

    5 days ago


    Paris, Île-de-France UCASE CONSULTING Full-time

    I am looking, on behalf of a client in the banking/insurance sector, for a Data Engineer. Mission context: As part of a team reinforcement, we are looking for an experienced profile with strong data engineering (Big Data) skills to work on topics related to the marketing scope. The contractor...

  • Data Engineer

    2 weeks ago


    Paris, Île-de-France MP DATA Full-time

    As a Senior Data Engineer, you will play a key role in building, optimizing, and hardening our large-scale data pipelines, at the heart of our analytics platform. Your expertise in Databricks and the Spark environment will be essential to ensure high-performance, secure, and scalable processing. Your...

  • Experienced Data Engineer

    1 week ago


    Paris, Île-de-France Mp Data Full-time

    We are looking for an experienced Data Engineer to work on the production rollout, reliability, and evolution of a modern data platform based on AWS, Spark, and Dataiku. You will actively take part in building and optimizing large-scale data processing environments, working closely with the teams...

  • Data Engineer

    3 days ago


    Paris, France MP DATA Full-time

    HR Generalist and Recruiter. As a Senior Data Engineer, you will play a key role in building, optimizing, and hardening our large-scale data pipelines, at the heart of our analytics platform. Your expertise in Databricks and the Spark environment will be essential to ensure...

  • Experienced Data Engineer

    2 weeks ago


    Paris, Île-de-France MP DATA Full-time

    We are looking for an experienced Data Engineer to work on the production rollout, reliability, and evolution of a modern data platform based on AWS, Spark, and Dataiku. You will actively take part in building and optimizing large-scale data processing environments, working closely with the Data...

  • Data Engineer Spark

    7 days ago


    Paris, France UCASE CONSULTING Full-time

    I am looking, on behalf of a client in the Banking/Insurance sector, for a Data Engineer. Context: As a Data Engineer, you will contribute to the continued deployment of our Big Data architecture. Missions: - Maintain existing code - Design and develop new features - Set up data pipelines. - Put in...


  • Paris, France Dataiku Full-time

    A leading AI platform company in Paris is seeking a skilled Data Engineer specialized in Spark (PySpark). You will work closely with clients to troubleshoot and enhance data pipelines, ensuring efficient performance and reliability. Responsibilities include optimizing Spark environments and integrating various data workflows. The ideal candidate has...