INTERNSHIP - Benchmark Small Language Models (SLMs) in a semiconductor tool-calling context (F/M)

2 weeks ago

Paris, Île-de-France STMicroelectronics Full-time

At STMicroelectronics, we believe in the power of technology to drive innovation and make a positive impact on people, businesses, and society. As a global semiconductor company, our advanced technologies and chips form the hidden foundation of the world we live in today.

When you join ST, you will be part of a global business with more than 115 nationalities, present in 40 countries, and comprising over 50,000 diverse and dedicated creators and makers of technology around the world.

Developing technologies takes more than talent: it takes amazing people who understand collaboration and respect. People with passion and the desire to disrupt the status quo, drive innovation, and unlock their own potential.

Embark on a journey with us, where you can innovate for a future that we want to make smarter and greener, in a responsible and sustainable way. Our technology starts with you.

YOUR ROLE

You will help select and test Small Language Models (SLMs) that work well on CPUs and connect them with local tools through the Model Context Protocol (MCP). You will also set up simple workflows to manage data, track experiments, and evaluate these SLMs.

Select and benchmark CPU-friendly Small Language Models (SLMs) to measure their out-of-the-box performance.
Integrate SLMs with provided MCP servers; optionally create simple MCP servers for any missing tools.
Collect and curate a standardized dataset of instructions paired with tool/API usage examples from existing documentation.
Create a test setup to compare base models with fine-tuned versions using the same curated dataset.
Set up a lightweight LLMOps workflow that includes:
    Preparing a single, standardized dataset of instructions and tool/API usage
    Tracking experiments
    Managing versions of models and prompts
    Automatically generating evaluation reports showing differences between base and fine-tuned models
Run fine-tuning experiments on Azure ML using frameworks like Unsloth, Axolotl, or similar.
Build a comparison table covering model quality, speed, memory use, and fine-tuning improvements, and recommend which models to use.
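
To give a concrete idea of the base versus fine-tuned comparison described above, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not ST's actual setup: the tool names (`get_lot_status`, `read_sensor`), the JSON call format with `name` and `arguments` keys, the three metrics, and the sample model outputs are all hypothetical.

```python
import json
from dataclasses import dataclass

METRICS = ("valid_json", "tool_match", "args_match")


@dataclass
class Example:
    """One dataset record: an instruction and the tool call it should trigger."""
    instruction: str
    expected_tool: str
    expected_args: dict


def score_call(raw_output: str, example: Example) -> dict:
    """Score one raw model output against the expected tool call."""
    try:
        call = json.loads(raw_output)
    except json.JSONDecodeError:
        return dict.fromkeys(METRICS, False)
    if not isinstance(call, dict):
        return {"valid_json": True, "tool_match": False, "args_match": False}
    tool_match = call.get("name") == example.expected_tool
    args_match = tool_match and call.get("arguments") == example.expected_args
    return {"valid_json": True, "tool_match": tool_match, "args_match": args_match}


def evaluate(outputs: list[str], dataset: list[Example]) -> dict:
    """Average each metric over a dataset; outputs[i] answers dataset[i]."""
    if not dataset or len(outputs) != len(dataset):
        raise ValueError("outputs and dataset must be non-empty and aligned")
    scores = [score_call(o, ex) for o, ex in zip(outputs, dataset)]
    return {m: sum(s[m] for s in scores) / len(scores) for m in METRICS}


def report(results: dict) -> str:
    """Render a plain-text comparison table, one row per model."""
    lines = [f"{'model':<12}" + "".join(f"{m:>12}" for m in METRICS)]
    for name, res in results.items():
        lines.append(f"{name:<12}" + "".join(f"{res[m]:>12.2f}" for m in METRICS))
    return "\n".join(lines)


# Hypothetical curated dataset (tool names are invented for illustration).
dataset = [
    Example("What is the status of lot A12?", "get_lot_status", {"lot_id": "A12"}),
    Example("Read the temperature of chamber 3.", "read_sensor",
            {"chamber": 3, "metric": "temperature"}),
]

# Hypothetical raw outputs; in practice these would come from the models under test.
base_outputs = [
    '{"name": "get_lot_status", "arguments": {"lot_id": "A12"}}',
    "read_sensor(chamber=3)",  # not valid JSON, so it scores zero on every metric
]
tuned_outputs = [
    '{"name": "get_lot_status", "arguments": {"lot_id": "A12"}}',
    '{"name": "read_sensor", "arguments": {"chamber": 3, "metric": "temperature"}}',
]

results = {
    "base": evaluate(base_outputs, dataset),
    "fine-tuned": evaluate(tuned_outputs, dataset),
}
print(report(results))
```

Scoring both models against the same dataset keeps the comparison fair; speed and memory columns could be added to the same table once they are measured on the target CPU.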

YOUR SKILLS & EXPERIENCES

Final-year Master's or engineering student in Computer Science, AI, Data Science, or Embedded Systems
Strong Python programming skills
Familiar with model runtimes and formats such as GGUF and Ollama
Experience creating and curating datasets for fine-tuning language models
Basic knowledge of, or some experience with, Azure ML or other cloud ML environments
Understanding of MCP (Model Context Protocol) and tool/function calling
Basic knowledge of quantization and familiarity with concepts such as LoRA/QLoRA
Familiar with Git, the command line, and Docker
English technical communication skills (B2 minimum)

ST is proud to be one of the 17 companies certified as a 2025 Global Top Employer and the first and only semiconductor company to achieve this distinction. ST was recognized in this ranking thanks to its continuous improvement approach and stands out particularly in the areas of ethics & integrity, purpose & values, organization & change, business strategy, and performance.

At ST, we endeavor to foster a diverse and inclusive workplace, and we do not tolerate discrimination. We aim to recruit and retain a diverse workforce that reflects the societies around us. We strive for equity in career development, career opportunities, and equal remuneration. We encourage candidates who may not meet every single requirement to apply, as we appreciate diverse perspectives and provide opportunities for growth and learning. Diversity, equity, and inclusion (DEI) is woven into our company culture.

To discover more, visit
