Data Engineer
Posted 2 weeks ago
Big Data Engineer - Pipelines
Location: Paris or Toulouse
Salary: €60, ,000
Our mission:
Only 1% of freshwater on Earth is usable for humanity's needs, and climate change is exacerbating both water scarcity and water hazards – such as floods, droughts, and water pollution.
BWI is an impact startup on a mission to use data and machine learning to provide governments and companies with river flow forecasting services that help them adapt to the impact of climate change on water management and availability. BWI subscribers benefit from near real-time hydrological forecasts based on dozens of data sources – such as space sensors, weather radars, and other ground sensors.
The role:
After recently securing a major new contract, we're now looking for three Big Data Engineers to join our engineering team in either our Paris or Toulouse office.
Working as part of either our satellite or water simulation team, you will be responsible for building and operating data ingestion and processing pipelines for terabytes of weather and hydrological data.
You'll focus on ensuring scalable, reproducible, and production-grade delivery of model inputs and outputs.
Responsibilities:
- Implement ingestion (batch/streaming), ELT/ETL steps, and data publishing workflows.
- Handle scientific formats (netCDF, GRIB2) and columnar storage (Parquet); optimize I/O and algorithms.
- Design storage with eventual‑consistency patterns (atomic publishes, manifests, versioned paths) and a metadata catalog.
- Partition and parallelize workloads for distributed compute; compact small files and tune for cost/performance.
- Build and run containerized services and orchestrated workflows; ensure observability, retries, idempotency, and runbooks.
- Collaborate with scientists to define data models and validation rules.
Top paradigms & architectural patterns required:
- ELT-first with ETL where needed; streaming/micro-batch for low-latency sources
- Data lake + metadata catalogue (object storage), data model design for cataloguing
- Partitioned columnar storage & distributed data-parallel processing
- Idempotent, restartable workflows with orchestration
- Versioned datasets, atomic publish patterns, and catalogue as source of truth
- Observability-driven ops and infra-as-code
Essential tools & technologies:
- Python (xarray, netCDF4, pyarrow), PySpark or Dask
- S3-compatible object storage; Parquet format
- PostgreSQL / PostGIS
- Kubernetes + Docker for deployment
- AWS (S3, EKS, EC2) or equivalent cloud; Terraform for IaC
Must-haves:
- 3+ years relevant experience building/operating large data pipelines
- Strong software engineering in Python with testing and CI/CD
- Practical experience with partitioning and parallelism
- Computer science education
- Understanding of architectural patterns in storing & processing large volumes of data
- Humility & eagerness to take on either a fully hands-on coding position or a technical leadership role, depending on circumstances
- Clear communication, teamwork, and analytical problem-solving
Nice-to-haves:
- Hydrology, meteorology, remote-sensing or space-ground-segment experience
- STAC, GeoTIFF, PostGIS, Numba/Cython performance tuning
- Understanding of low-level topics (memory management, HTTP, S3 implementation)
- Willingness to develop managerial skills
Our culture:
- Dynamic, fast-paced start-up working on an important problem for humanity.
- Multicultural Team - our team members come from countries including France, Poland, India, Lebanon and Morocco.
- Offices in Central Paris and Toulouse
- Diverse Domains - work across software engineering, hydrology, climate and satellite.
- Hybrid working with 4 days in the office and 1 day remote.