Internship: Development of a Deep Learning methodology for synthetic Spatial Metabolomics data generation

Posted 2 weeks ago


Bordeaux, Nouvelle-Aquitaine, France | Centre de Bioinformatique de Bordeaux | Full-time


Background

Spatial metabolomics, based on mass spectrometry imaging (MSI), offers unique insights into tissue metabolite distributions. However, analyzing MSI data is challenging due to its high dimensionality, sparsity, and complex spatial dependencies. While spatial transcriptomics (ST) has benefited from deep learning to address similar challenges, MSI lacks comparable advances due to a scarcity of annotated datasets.

Project

This project is based on the hypothesis that distinct gene expression profiles at the spot level reflect distinct metabolic states, allowing us to predict the corresponding metabolite quantifications. To test this, we will use paired ST and MSI data from brain cancer samples (Heiland et al., 2022) to develop a deep learning model that predicts metabolite quantification values from spot-level gene expression. In particular, we will use pre-processed ST data already partitioned into spatial clusters, each characterized by a unique gene expression signature (a defined set of genes).

We propose to employ a two-stage approach.

  • First, a model to predict the presence/absence of each metabolite for each spot, generating a binary vector of length n, where n is the total number of metabolites (m1,...,mn). We will explore suitable multi-label classification strategies for this metabolite selection step. One option is to train an individual binary classifier (e.g., logistic regression, SVM with one-vs-rest, or a single-output neural network with sigmoid activation) for each mi. We will also consider a neural network with multiple sigmoid outputs, one per mi, which predicts all metabolite presence/absence labels simultaneously. The input to these models will be the gene expression vector (g1,...,gk) and a signature mask (1 for the signature genes and 0 for the rest) for each spot.
  • Second, a model to predict the intensities of all metabolites (continuous outputs) for each spot based on the gene expression vector (g1,...,gk). To address this regression problem, we propose to begin with baseline models (e.g., Linear and/or Ridge Regression, Multilayer Perceptron) to establish a performance benchmark before exploring more complex architectures such as Transformers. To handle the varying sets of selected mi across spots, the output (a vector of length n) is multiplied element-wise by the binary presence/absence vector from the first stage, setting the intensities of non-selected mi to zero. The input is the gene expression vector, which can potentially be combined with positional encodings for mi.
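The two stages above can be sketched on synthetic data. This is only an illustrative sketch, not the project's implementation: the toy dimensions, the random data, the three hypothetical clusters, and the function names `fit_presence` and `fit_ridge` are all assumptions. Stage 1 is approximated by n independent logistic regressions (one sigmoid output per metabolite) and stage 2 by a closed-form ridge regression baseline.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions: 200 spots, k = 30 genes, n = 5 metabolites.
n_spots, k, n = 200, 30, 5
G = rng.normal(size=(n_spots, k))  # gene expression vector (g1,...,gk) per spot

# Assumed setup: 3 spatial clusters, each with its own signature gene set.
cluster = rng.integers(0, 3, size=n_spots)
signatures = (rng.random((3, k)) < 0.3).astype(float)
mask = signatures[cluster]  # 1 for signature genes of the spot's cluster, else 0

# Synthetic targets, for illustration only.
presence = (G @ rng.normal(size=(k, n)) > 0).astype(float)
intensity = np.abs(G @ rng.normal(size=(k, n))) * presence


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_presence(X, Y, lr=0.1, epochs=500):
    """Stage 1: one logistic regression per metabolite, fit by gradient descent."""
    W = np.zeros((X.shape[1], Y.shape[1]))
    b = np.zeros(Y.shape[1])
    for _ in range(epochs):
        grad = sigmoid(X @ W + b) - Y  # gradient of the cross-entropy loss
        W -= lr * X.T @ grad / len(X)
        b -= lr * grad.mean(axis=0)
    return W, b


def fit_ridge(X, Y, alpha=1.0):
    """Stage 2 baseline: ridge regression, W = (X^T X + alpha I)^-1 X^T Y."""
    A = X.T @ X + alpha * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ Y)


X1 = np.hstack([G, mask])  # stage 1 input: expression plus signature mask
W1, b1 = fit_presence(X1, presence)
W2 = fit_ridge(G, intensity)

# Binary presence/absence vector of length n for each spot.
present_hat = (sigmoid(X1 @ W1 + b1) >= 0.5).astype(float)
# Element-wise masking: intensities of non-selected metabolites become zero.
intensity_hat = (G @ W2) * present_hat
```

Here the models are fit and applied on the same toy data purely to show the data flow; a real experiment would hold out spots or samples for evaluation.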

Model performance will be evaluated using appropriate metrics. For the metabolite selection task, we will use precision, recall, F1-score, and AUC. For the intensity prediction task, we will use mean squared error, R-squared, and Pearson correlation. Cross-validation will be used for model selection and hyperparameter tuning. Regularization techniques will be employed for deep learning models to prevent overfitting. Attention visualization will be explored to provide biological insights into the model's predictions.
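As a rough illustration of the metrics listed above, the following sketch computes them with NumPy on small hand-made arrays. The function names and the example values are assumptions made for this sketch; in practice an established library implementation would likely be preferred.

```python
import numpy as np


def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1 for binary labels (1 = metabolite present)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    y_true, scores = np.asarray(y_true), np.asarray(scores, dtype=float)
    pos, neg = scores[y_true == 1], scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size


def regression_metrics(y_true, y_pred):
    """Mean squared error, R-squared and Pearson correlation."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mse = np.mean((y_true - y_pred) ** 2)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    pearson = np.corrcoef(y_true, y_pred)[0, 1]
    return mse, r2, pearson


# Toy example for one metabolite across five spots (values are made up).
labels = [1, 0, 1, 1, 0]
predicted = [1, 1, 1, 0, 0]
scores = [0.9, 0.6, 0.8, 0.4, 0.2]
p, r, f1 = precision_recall_f1(labels, predicted)
a = auc(labels, scores)
mse, r2, pearson = regression_metrics([1, 2, 3, 4], [1.5, 2.5, 3.5, 4.5])
```

In the regression example the predictions are shifted by a constant, so the Pearson correlation is perfect while MSE and R-squared still penalize the offset, which is one reason to report the three metrics together.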

Candidate profile

We are looking for a Master 2 or final-year engineering school student (e.g., Computer Science, Data Engineering) who is motivated to work in an interdisciplinary environment.

References:

Yilmaz, M., Fondrie, W.E., Bittremieux, W. et al. Sequence-to-sequence translation from mass spectra to peptides with a transformer model. Nat Commun 15,

Pizurica, M., Zheng, Y., Carrillo-Perez, F. et al. Digital profiling of gene expression from histology images with linearized attention. Nat Commun 15,

Job type: Internship

Contract duration: 6 months

Compensation: €17.75 to €31.61 per hour

Benefits:

  • Profit-sharing and incentive schemes
  • Daily commuting costs covered
  • RTT (additional days off under the French reduced working time scheme)

Language:

  • English (optional)

Work location: On-site

Application deadline: 15/02/2025


