Internship: Python Data Processing On Supercomputers for Large Parallel Numerical Simulations
Posted 1 week ago
The job description below is in English.
**Contract type**: Internship agreement
**Required degree level**: Bac + 4 or equivalent
**Role**: Research intern
**About the centre or functional department**:
The Centre Inria de l’Université de Grenoble groups together almost 600 people in 23 research teams and 9 research support departments.
Staff is present on three campuses in Grenoble, in close collaboration with other research and higher education institutions (Université Grenoble Alpes, CNRS, CEA, INRAE), but also with key economic players in the area.
The Centre Inria de l’Université Grenoble Alpes is active in the fields of high-performance computing, verification and embedded systems, modeling of the environment at multiple levels, and data science and artificial intelligence. The center is a top-level scientific institute with an extensive network of international collaborations in Europe and the rest of the world.
**Context and advantages of the position**:
The internship lasts _at least 4 months_. The start date is flexible, but administrative constraints require a 2-month delay before the internship can begin. The DataMove team is a friendly and stimulating environment that brings together professors, researchers, and PhD and Master's students, all conducting research on High-Performance Computing. Grenoble is a student-friendly city surrounded by the Alps, offering a high quality of life and all kinds of mountain-related outdoor activities.
**Assigned mission**:
Without a significant change in practices, the increased computing capacity of the next generation of computers will lead to an explosion in the volume of data produced by numerical simulations. Managing this data, from production to analysis, is a major challenge.
The use of simulation results is based on a well-established calculation-storage-calculation protocol. The difference in capacity between computers and file systems makes it inevitable that the latter will become clogged. For instance, the Gysela code in production mode can produce up to 5 TB of data per iteration. Storing 5 TB at high frequency is clearly not feasible, and loading that much data back for later analysis and visualization is just as difficult. To bypass this difficulty, we rely on the in situ data analysis approach.
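To give a feel for the orders of magnitude, here is a back-of-the-envelope sketch in Python. Only the 5 TB per iteration comes from the text above. The file system bandwidth is a purely hypothetical value chosen for illustration, not a measurement of any real machine.

```python
# Rough estimate of the time needed to write one iteration to disk.
DATA_PER_ITERATION_TB = 5.0      # from the text: up to 5 TB per iteration
ASSUMED_BANDWIDTH_GB_S = 100.0   # hypothetical aggregate write bandwidth (GB/s)


def write_time_seconds(data_tb, bandwidth_gb_per_s):
    """Time to write `data_tb` terabytes at `bandwidth_gb_per_s` GB/s (1 TB = 1000 GB)."""
    return data_tb * 1000.0 / bandwidth_gb_per_s


seconds = write_time_seconds(DATA_PER_ITERATION_TB, ASSUMED_BANDWIDTH_GB_S)
print(f"{seconds:.0f} s of pure I/O per iteration")  # 50 s under the assumed bandwidth
```

Under this assumption, every stored iteration would cost close to a minute of I/O, and the same volume would have to be read back before any analysis could start.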
We developed an in situ data processing approach, called Deisa, relying on Dask, a Python environment for distributed tasks. Dask defines tasks that are executed asynchronously on workers once their input data are available. The user defines a graph of tasks to be executed. This graph is then forwarded to the Dask scheduler. The scheduler is in charge of (1) optimizing the task graph and (2) distributing the tasks for execution on the different workers according to a scheduling algorithm aiming at minimizing the graph execution time.
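The task-graph execution model described above can be illustrated with a minimal sketch that uses only the Python standard library. This is not Dask and performs no graph optimization. The graph layout, the task names and the `run_graph` helper are assumptions made for illustration.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter

# Toy task graph: each key maps to (function, list of input keys).
graph = {
    "a": (lambda: 2, []),
    "b": (lambda: 3, []),
    "sum": (lambda a, b: a + b, ["a", "b"]),
    "square": (lambda s: s * s, ["sum"]),
}


def run_graph(graph, n_workers=2):
    """Run each task on a worker thread as soon as all its inputs are available."""
    sorter = TopologicalSorter({key: deps for key, (_, deps) in graph.items()})
    sorter.prepare()
    results = {}
    running = {}  # future -> task key
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        while sorter.is_active():
            # Submit every task whose inputs have all been computed.
            for key in sorter.get_ready():
                func, deps = graph[key]
                args = [results[d] for d in deps]
                running[pool.submit(func, *args)] = key
            # Wait for at least one task to finish, then record its result.
            done, _ = wait(running, return_when=FIRST_COMPLETED)
            for future in done:
                key = running.pop(future)
                results[key] = future.result()
                sorter.done(key)  # may unlock dependent tasks
    return results


results = run_graph(graph)
print(results["square"])  # 25
```

Here `a` and `b` can run concurrently, while `sum` and `square` only start once their inputs exist, which is the behavior a real scheduler provides at much larger scale.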
Deisa extends Dask so that an MPI-based parallel simulation code can be coupled with Dask. Deisa enables the simulation code to send newly produced data directly into the workers' memory and to notify the Dask scheduler that these data are available for analysis, so that the associated tasks can be scheduled for execution.
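The coupling idea can be pictured with a small single-process toy model. It is not the Deisa API and involves neither MPI nor Dask. The `ToyScheduler` class, its `submit` and `publish` methods and the key names are all hypothetical. The sketch only shows analysis tasks being declared first and released once the simulation side has published every block they depend on.

```python
class ToyScheduler:
    """Holds analysis tasks until all of their input keys have been published."""

    def __init__(self):
        self.memory = {}   # stands in for worker memory: key -> data
        self.pending = []  # list of (output key, function, input keys)

    def submit(self, out_key, func, in_keys):
        """Declare an analysis task; it runs once its inputs exist."""
        self.pending.append((out_key, func, in_keys))
        self._run_ready()

    def publish(self, key, data):
        """Called from the simulation side: store the data, then notify."""
        self.memory[key] = data
        self._run_ready()

    def _run_ready(self):
        # Keep running tasks until no pending task has all its inputs.
        progressed = True
        while progressed:
            progressed = False
            for task in list(self.pending):
                out_key, func, in_keys = task
                if all(k in self.memory for k in in_keys):
                    self.memory[out_key] = func(*(self.memory[k] for k in in_keys))
                    self.pending.remove(task)
                    progressed = True


def mean_of_blocks(block0, block1):
    values = block0 + block1
    return sum(values) / len(values)


scheduler = ToyScheduler()
# The analysis is declared before the simulation has produced anything.
scheduler.submit("mean_t0", mean_of_blocks, ["rank0_t0", "rank1_t0"])
# Each (simulated) rank publishes its block for time step 0.
scheduler.publish("rank0_t0", [1.0, 2.0])
waiting_after_first_block = "mean_t0" not in scheduler.memory  # True: one block missing
scheduler.publish("rank1_t0", [3.0, 6.0])
print(scheduler.memory["mean_t0"])  # 3.0
```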
Compared to previous in situ approaches, which are typically MPI-based, our approach, relying on Python tasks, strikes a good balance between programming ease and runtime performance.
But Dask has one major limitation: the scheduler is centralized, creating a performance bottleneck at large scale. To circumvent this limitation, we developed a variation of Deisa (Deisa-on-Ray or Doreisa) that relies on the Ray runtime. Ray is a framework for distributed tasks and actors that is very popular in the AI community. Ray is more flexible than Dask and supports a distributed task scheduler, making it a more suitable runtime than Dask when targeting large scale.
What Dask-on-Ray achieves is:
- The Dask task graph is split into sub-graphs that are distributed to different Ray actors.
- Each of these Ray actors implements a local Dask scheduler. Each Dask task to be executed is turned into a Ray task and handed to the local Ray scheduler. The execution of the Dask task graph is then distributed, showing significant performance gains.
- If a task requires data produced by another task handled by another, remote Ray scheduling actor, the Ray scheduler fetches it automatically by relying on the Ray reference mechanism (which can be seen as a kind of distributed smart pointer).
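The three points above can be sketched with the standard library alone. This toy version uses threads in place of Ray actors and `concurrent.futures.Future` objects in place of Ray references, and each actor runs its tasks inline instead of handing them to a Ray scheduler. The graph, the partition and the `scheduling_actor` function are hypothetical and only illustrate the idea.

```python
from concurrent.futures import Future, ThreadPoolExecutor

# Toy task graph: each key maps to (function, list of input keys).
graph = {
    "x": (lambda: 10, []),
    "y": (lambda: 4, []),
    "x2": (lambda x: x * 2, ["x"]),
    "y2": (lambda y: y + 1, ["y"]),
    "total": (lambda x2, y2: x2 + y2, ["x2", "y2"]),
}

# Hypothetical split into sub-graphs, one per scheduling actor.
# Keys are listed in dependency order within each actor.
partition = {"actor0": ["x", "x2", "total"], "actor1": ["y", "y2"]}

# One reference per task result, visible to every actor.
refs = {key: Future() for key in graph}


def scheduling_actor(keys):
    """Local scheduler: run owned tasks, resolving inputs through references."""
    for key in keys:
        func, deps = graph[key]
        # .result() blocks until the value exists, even if another actor produces it.
        args = [refs[d].result() for d in deps]
        refs[key].set_result(func(*args))


with ThreadPoolExecutor(max_workers=len(partition)) as pool:
    actors = [pool.submit(scheduling_actor, keys) for keys in partition.values()]
    for actor in actors:
        actor.result()  # re-raise any error from an actor

total = refs["total"].result()
print(total)  # 25
```

The `total` task lives on `actor0` but needs `y2`, which `actor1` computes. The shared reference lets it wait for that value and fetch it without any central scheduler being involved.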
Dask-on-Ray has demonstrated a significant performance improvement at scale (tested with up to 15 000 cores) over the pure Dask-based approach.
The goal of this internship is to investigate solutions for:
- Further improving performance. In situ analytics often repeats the execution of the same task graph at different iterations. So far the task graph is always processed, split and distributed at each iteration, while it could be