Internship: Development of a Deep Learning methodology for synthetic Spatial Metabolomics data generation
Posted 2 weeks ago
Background
Spatial metabolomics, based on mass spectrometry imaging (MSI), offers unique insights into tissue metabolite distributions. However, analyzing MSI data is challenging due to its high dimensionality, sparsity, and complex spatial dependencies. While spatial transcriptomics (ST) has benefited from deep learning to address similar challenges, MSI lacks comparable advances due to a scarcity of annotated datasets.
Project
This project is based on the hypothesis that distinct gene expression profiles at the spot level reflect distinct metabolic states, which would allow us to predict the corresponding metabolite quantifications. To test this, we will use paired ST and MSI data from brain cancer samples (Heiland et al., 2022) to develop a deep learning model that predicts metabolite quantification values from spot-level gene expression. In particular, we will use pre-processed ST data already partitioned into spatial clusters, each characterized by a unique gene expression signature (a defined set of genes).
We propose to employ a two-stage approach.
- First, a model to predict the presence/absence of each metabolite for each spot, generating a binary vector of length n, where n is the total number of metabolites (m1,...,mn). We will explore suitable multi-label classification strategies for this metabolite selection step. One option is to train an individual binary classifier for each mi (e.g., logistic regression, SVM with one-vs-rest, or single-output neural networks with sigmoid activation). We will also consider a single neural network with multiple sigmoid outputs, one for each mi, so that all presence/absence labels are predicted simultaneously. The input for these models will be the gene expression vector (g1,...,gk) and a signature mask (1 for the signature genes and 0 for the rest) for each spot.
- Second, a model to predict the intensities of all metabolite values (continuous outputs) for each spot based on the gene expression vector (g1,...,gk). To address this regression problem, we propose to begin with baseline models (e.g. Linear and/or Ridge Regression, Multilayer Perceptron) to establish a performance benchmark before exploring more complex architectures such as Transformers. To handle the varying sets of selected mi values across spots, the output (a vector of length n) is element-wise multiplied by the binary presence/absence vector from the first stage, masking the intensities of non-selected mi values to zero. The input is the gene expression vector but can potentially be combined with positional encodings for mi.
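The two stages above can be sketched in a few lines of NumPy. This is only a minimal illustration under stated assumptions, not the project's actual pipeline: the data are synthetic, the sizes and the names `G`, `S`, `Y`, `P`, `fit_presence` and `fit_ridge` are hypothetical, stage one is a single layer of sigmoid outputs trained by plain gradient descent, and stage two is the ridge regression baseline. No train/test split is made here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes: spots, genes (k), metabolites (n).
n_spots, k, n = 200, 30, 5

# Synthetic stand-ins for the paired ST/MSI data (assumption, not real data).
G = rng.normal(size=(n_spots, k))                         # gene expression per spot
S = (rng.random(size=(n_spots, k)) < 0.2).astype(float)   # signature mask per spot
W_true = rng.normal(size=(k, n))
noise = rng.normal(scale=0.1, size=(n_spots, n))
Y = np.maximum(G @ W_true + noise, 0.0)                   # intensities, 0 when absent
P = (Y > 0).astype(float)                                 # presence/absence labels


def add_bias(X):
    """Append a column of ones so the models can learn an intercept."""
    return np.hstack([X, np.ones((X.shape[0], 1))])


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_presence(X, P, lr=0.1, epochs=500):
    """Stage 1: n sigmoid outputs trained jointly with cross-entropy loss."""
    Xb = add_bias(X)
    W = np.zeros((Xb.shape[1], P.shape[1]))
    for _ in range(epochs):
        grad = Xb.T @ (sigmoid(Xb @ W) - P) / Xb.shape[0]
        W -= lr * grad
    return W


def fit_ridge(X, Y, alpha=1.0):
    """Stage 2 baseline: closed-form ridge regression (bias is penalized too)."""
    Xb = add_bias(X)
    A = Xb.T @ Xb + alpha * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ Y)


# Stage 1 input: gene expression concatenated with the signature mask.
X1 = np.hstack([G, S])
W1 = fit_presence(X1, P)
presence = (sigmoid(add_bias(X1) @ W1) >= 0.5).astype(float)

# Stage 2 input: gene expression only.
W2 = fit_ridge(G, Y)
intensity = add_bias(G) @ W2

# Element-wise masking: intensities of non-selected metabolites become zero.
masked = intensity * presence
print(masked.shape)
```

The final multiplication is the step that reconciles the two models: the regressor always emits n values per spot, and the binary vector from stage one decides which of them are kept.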
Model performance will be evaluated using appropriate metrics. For the metabolite selection task, we will use precision, recall, F1-score, and AUC. For the intensity prediction task, we will use mean squared error, R-squared, and Pearson correlation. Cross-validation will be used for model selection and hyperparameter tuning. Regularization techniques will be employed for deep learning models to prevent overfitting. Attention visualization will be explored to provide biological insights into the model's predictions.
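As a rough illustration of the metrics named above, the following dependency-free sketch spells out their definitions for a single metabolite. The function names and the toy label and intensity values are hypothetical, and AUC is left out for brevity; in practice a library such as scikit-learn would likely be used instead.

```python
import math


def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1 for binary presence/absence labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def mse(y_true, y_pred):
    """Mean squared error between observed and predicted intensities."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical toy values for one metabolite across a handful of spots.
labels_true = [1, 0, 1, 1, 0, 0]
labels_pred = [1, 0, 0, 1, 1, 0]
intens_true = [1.0, 2.0, 3.0, 4.0]
intens_pred = [1.5, 2.0, 2.5, 4.0]

precision, recall, f1 = precision_recall_f1(labels_true, labels_pred)
print(precision, recall, f1)
print(mse(intens_true, intens_pred),
      r_squared(intens_true, intens_pred),
      pearson(intens_true, intens_pred))
```

With n metabolites, these per-metabolite scores would typically be averaged (macro-averaging) or pooled over all spot-metabolite pairs (micro-averaging); which aggregation suits the task is a design choice the text leaves open.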
Candidate profile
We are looking for a Master 2 or final-year engineering school student (e.g., Computer Science, Data Engineering, etc.) who is motivated to work in an interdisciplinary environment.
Job type: Internship
Contract duration: 6 months
Compensation: €17.75 to €31.61 per hour
Benefits:
- Profit-sharing and incentive schemes
- Daily commuting costs covered
- RTT (reduced working time days)
Language:
- English (optional)
Work location: On-site
Application deadline: 15/02/2025