Internship: Python Data Processing On
1 week ago
The job description below is in English
**Contract type**: Internship agreement
**Required degree level**: Bac + 4 (four years of higher education) or equivalent
**Role**: Research intern
**About the research centre or functional department**:
The Centre Inria de l’Université Grenoble Alpes groups together almost 600 people in 23 research teams and 9 research support departments.
Staff are present on three campuses in Grenoble, working in close collaboration with other research and higher education institutions (Université Grenoble Alpes, CNRS, CEA, INRAE) and with key economic players in the area.
The Centre Inria de l’Université Grenoble Alpes is active in the fields of high-performance computing, verification and embedded systems, modeling of the environment at multiple levels, and data science and artificial intelligence. The center is a top-level scientific institute with an extensive network of international collaborations in Europe and the rest of the world.
**Context and advantages of the position**:
The internship lasts _4 months minimum_ and the start date is flexible, but a 2-month delay is needed before starting the internship due to administrative constraints. The DataMove team is a friendly and stimulating environment that gathers professors, researchers, PhD students, and Master's students, all conducting research on High-Performance Computing. Grenoble is a student-friendly city surrounded by the Alps, offering a high quality of life and all kinds of mountain-related outdoor activities.
**Assigned mission**:
Without a significant change in practices, the increased computing capacity of the next generation of computers will lead to an explosion in the volume of data produced by numerical simulations. Managing this data, from production to analysis, is a major challenge.
The use of simulation results is based on a well-established calculation-storage-calculation protocol. The difference in capacity between computers and file systems makes it inevitable that the latter will be clogged. For instance, the Gysela code in production mode can produce up to 5 TB of data per iteration. Storing 5 TB of data at high frequency is clearly not feasible. What's more, loading this quantity of data for later analysis and visualization is also a difficult task. To bypass this difficulty, we choose to rely on the in situ data analysis approach.
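As a rough illustration of why this volume is a problem, the following back-of-envelope sketch estimates the time spent writing one iteration to disk. The 5 TB figure comes from the text above; the file system bandwidth is a purely hypothetical value chosen for the example, not a measurement of any real machine.

```python
# Back-of-envelope estimate; the bandwidth below is an assumed, illustrative figure.
TB = 10**12  # bytes in a terabyte (decimal convention)
GB = 10**9   # bytes in a gigabyte (decimal convention)


def write_time_seconds(data_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Time needed to write data_bytes at a sustained bandwidth."""
    return data_bytes / bandwidth_bytes_per_s


data_per_iteration = 5 * TB   # volume quoted for Gysela in production mode
assumed_bandwidth = 100 * GB  # hypothetical sustained file system bandwidth per second

seconds = write_time_seconds(data_per_iteration, assumed_bandwidth)
print(f"{seconds:.0f} s of pure I/O per iteration")
```

Under this assumption every stored iteration costs tens of seconds of I/O, before any analysis has even started.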
In situ processing consists of coupling the parallel simulation code (Gysela, for instance) with a data analytics code that processes the data online, as soon as it is produced. In situ processing reduces the amount of data to write to disk, limiting the pressure on the file system. This approach is mandatory to run massive simulations like Gysela on the latest Exascale supercomputers.
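The idea of processing data as soon as it is produced can be sketched with a streaming reduction. The example below is a minimal, single-process illustration: `fake_simulation` is a hypothetical stand-in for a real simulation code, and the reduction (Welford's online mean and variance) is just one possible analytics kernel, not the one used with Gysela.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Welford's online algorithm: mean and variance without storing samples."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the current mean

    def update(self, value: float) -> None:
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    @property
    def variance(self) -> float:
        """Population variance of the values seen so far."""
        return self.m2 / self.count if self.count else 0.0


def fake_simulation(steps: int):
    """Hypothetical stand-in for a simulation producing one small field per step."""
    for step in range(steps):
        yield [float(step + i) for i in range(4)]


stats = RunningStats()
for field in fake_simulation(steps=3):
    for value in field:
        stats.update(value)  # processed as soon as produced, then discarded

print(stats.count, stats.mean, stats.variance)
```

Only three numbers survive the run, whatever the number of iterations, which is the kind of reduction that relieves the file system.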
We developed an in situ data processing approach, called Deisa, relying on Dask, a Python environment for distributed tasks. Dask defines tasks that are executed asynchronously on workers once their input data are available. The user defines a graph of tasks to be executed. This graph is then forwarded to the Dask scheduler. The scheduler is in charge of (1) optimizing the task graph and (2) distributing the tasks for execution on the different workers according to a scheduling algorithm aiming at minimizing the graph execution time.
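To make the notion of a task graph concrete, here is a toy sketch written with the Python standard library only. The graph uses a dictionary layout in the style of a Dask graph (a key maps to a literal value or to a tuple made of a function and its input keys), but the `run` function, the worker names, and the round-robin placement are simplifying assumptions for illustration; the real Dask scheduler is far more elaborate.

```python
from graphlib import TopologicalSorter
from operator import add, mul

# Toy task graph: each key maps either to a literal value or to a tuple
# (function, *input_keys).
graph = {
    "a": 1,
    "b": 2,
    "sum": (add, "a", "b"),
    "double": (mul, "sum", "b"),
}


def dependencies(task):
    """Keys that a task needs as input (empty for literal values)."""
    return set(task[1:]) if isinstance(task, tuple) else set()


def run(graph, workers=("worker-0", "worker-1")):
    """Execute tasks in dependency order, assigning them round-robin (toy policy)."""
    order = TopologicalSorter({k: dependencies(t) for k, t in graph.items()})
    results, placement = {}, {}
    for i, key in enumerate(order.static_order()):
        task = graph[key]
        if isinstance(task, tuple):
            func, *args = task
            # A task runs only once all of its inputs have been computed.
            results[key] = func(*(results[arg] for arg in args))
        else:
            results[key] = task
        placement[key] = workers[i % len(workers)]
    return results, placement


results, placement = run(graph)
print(results["double"])  # → 6
```

Replacing the round-robin rule with a policy that accounts for data location and task cost is exactly where a scheduling algorithm aiming at minimal execution time comes in.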
Deisa extends Dask so that it becomes possible to couple an MPI-based parallel simulation code with Dask. Deisa enables the simulation code to send newly produced data directly into the workers' memory and to notify the Dask scheduler that these data are available for analysis, so that the associated tasks can be scheduled for execution.
Compared to previous in situ approaches that are mainly MPI-based, our approach relying on Python tasks makes for a good tradeoff between programming ease and runtime performance.
The goal of this internship is to investigate solutions that improve task placement, and thus performance, by enabling tasks to be scheduled in process (inside the simulation processes), in situ (on external processes running on the same compute nodes as the simulation code), or in transit (on dedicated nodes distinct from the simulation nodes). Running closer to the simulation reduces the need for data movements, but can potentially steal resources (CPU, GPU, network, memory, cache) from the simulation and slow it down. Dask task graph optimization is a good starting point to develop such approaches.
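The trade-off between data movement and interference with the simulation can be sketched with a deliberately simple cost model. Everything in this example is hypothetical: the `Placement` class, the bandwidth and slowdown figures, and the additive cost function are assumptions made for illustration, not part of Deisa or Dask.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    name: str
    transfer_bandwidth: float   # bytes/s to move data to the task (inf = no copy)
    simulation_slowdown: float  # seconds lost by the simulation per second of task


# Hypothetical figures, chosen only to make the trade-off visible.
PLACEMENTS = [
    Placement("in process", float("inf"), 1.0),
    Placement("in situ", 50e9, 0.3),
    Placement("in transit", 10e9, 0.0),
]


def cost(placement: Placement, data_bytes: float, task_seconds: float) -> float:
    """Toy cost: data transfer time plus time stolen from the simulation."""
    transfer = data_bytes / placement.transfer_bandwidth
    interference = placement.simulation_slowdown * task_seconds
    return transfer + interference


def best_placement(data_bytes: float, task_seconds: float) -> Placement:
    """Pick the placement with the lowest toy cost."""
    return min(PLACEMENTS, key=lambda p: cost(p, data_bytes, task_seconds))


print(best_placement(100e9, 1.0).name)  # large data, short task
print(best_placement(20e9, 2.0).name)   # intermediate case
print(best_placement(1e9, 60.0).name)   # small data, long task
```

With these assumed numbers, large data with cheap analysis favors running in process, while small data with expensive analysis favors dedicated nodes; a real placement policy would need measured costs rather than constants.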
**Main activities**:
**Skills**:
Expected skills include:
- Knowledge of distributed and parallel computing and of numerical simulations
- Python, NumPy, parallel programming (MPI)
- English (working language)
**Benefits**:
- Subsidized meals
- Partial reimbursement of public transport costs
- Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
- Possibility of teleworking (90 days /