Evaluation Scenario Writer
2 days ago
Overview
At Mindrift, innovation meets opportunity. We believe in using the power of collective human intelligence to ethically shape the future of AI. The Mindrift platform, launched and powered by Toloka, connects domain experts with cutting-edge AI projects from innovative tech clients. Our mission is to unlock the potential of GenAI by tapping into real-world expertise from across the globe.

Who we're looking for
We're looking for curious and intellectually proactive contributors who never miss an error and can think outside the box when brainstorming solutions. Are you comfortable with ambiguity and complexity? Does an async, remote, flexible opportunity sound exciting? Would you like to learn how modern AI systems are tested and evaluated?

This is a flexible, project-based opportunity well-suited for:
- Analysts, researchers, or consultants with strong critical thinking skills
- Students (senior undergrads / grad students) looking for an intellectually interesting gig
- People open to a part-time and non-permanent opportunity

About the project
We're on the hunt for an Evaluation Scenario Writer - QA for a new project focused on ensuring the quality and correctness of evaluation scenarios created for LLM agents. This project opportunity blends manual scenario validation, automated test thinking, and collaboration with writers and engineers. You will verify test logic, flag inconsistencies, and help maintain a high bar for evaluation coverage and clarity.

What you'll be doing
- Reviewing and validating test scenarios from Evaluation Writers
- Spotting logical inconsistencies, ambiguities, or missing checks
- Suggesting improvements to structure, edge cases, or scoring logic
- Collaborating with infrastructure and tool developers to automate parts of the review
- Creating clean and testable examples for others to follow

Although we're only looking for experts for this current project, contributors with consistent high-quality submissions may receive an invitation for ongoing collaboration across future projects.

How To Get Started
Apply to this post, qualify, and get the chance to contribute to a project aligned with your skills, on your own schedule. Shape the future of AI while building tools that benefit everyone.

Requirements
The ideal contributor will have:
- Strong QA background (manual or automation), preferably in complex testing environments
- Understanding of test design, regression testing, and edge case detection
- Ability to evaluate logic and structure of test scenarios (even if written by others)
- Experience reviewing and debugging structured test case formats (JSON, YAML)
- Familiarity with Python and JS scripting for test automation or validation
- Clear communication and documentation skills
- Willingness to occasionally write or refactor test scenarios

We also value applicants who have:
- Experience testing AI-based systems or NLP applications
- Familiarity with scoring systems and behavioral evaluation
- Git/GitHub workflow familiarity (PR review, versioning of test cases)
- Experience using test management systems or tracking tools

Benefits
- Contribute on your own schedule, from anywhere in the world. This opportunity allows you to get paid for your expertise, with rates that can go up to $47/hour depending on your skills, experience, and project needs
- Take part in a flexible, remote, freelance project that fits around your primary professional or academic commitments
- Participate in an advanced AI project and gain valuable experience to enhance your portfolio
- Influence how future AI models understand and communicate in your field of expertise

Job Details
- Seniority level: Internship
- Employment type: Part-time
- Job function: Other
- Industries: IT Services and IT Consulting
-
Solutions Support Specialist for AI/Copilot+ PC (M/F)
4 days ago
Roissy-en-France | Com Line Direct | Full-time
Com Line Direct is a springboard into the world of IT distribution, whether you want to join a wholesaler, a software publisher, or a manufacturer in the new-technologies sector. The Technical Sales Specialist will be responsible for customer engagement and technical activation around Windows 11 and Copilot+ PC solutions, leading...
-
Digital IC Design Engineer
2 weeks ago
France | microTECH Global Ltd | Full-time
Position: Digital IC Design Engineer. Location: Paris, France. About the role: The company's design team is seeking a dynamic and highly motivated Digital IC Design Engineer who will participate in the design of a state-of-the-art CMOS Transceiver ASIC for the communications market. The candidate will be particularly involved in the architecture...
-
Friulian Linguistic Projects
2 days ago
France | Sigma AI | Full-time
Overview: Friulian Linguistic Projects - Latin Script (Remote), Sigma AI. What you will do: Categories: Annotation, Categorization, Correction, Transcription, Evaluation, Conversational interactions, Voice recording, Content creation, Localization, Validation of audio,...
-
Senior Organization Consultant
1 day ago
Rue de la Part-Dieu, Lyon, France | Lindea | Full-time
As part of your assignments, you will:
- Carry out preliminary reorganization or site-layout studies and master plans for industrial and service activities, on behalf of operators or owners/investors, as well as developers or business-park managers
- Assess the associated investments...
-
BUSINESS ANALYST
2 days ago
France | Gravity Conseil | Full-time
Job title: Business Analyst - ticketing system. Location and work mode: Québec (100% remote). This position is hybrid. Context: As part of a review under way within our mutual insurers and our federation, we are beginning a study of how our ticketing tool should evolve. This initiative aims to better...
-
Financial Risk Analyst
2 days ago
France | Capijob | Full-time
Overview: Beobank's Enterprise Risk Management (ERM) function is responsible for setting up risk-steering tools and for sharing dashboards and analyses (risk appetite indicators and stress test analyses) with the bank's management. The Financial and Climate Risks team writes certain regulatory reports, in particular...
-
Occitan Linguistic Projects
2 days ago
France | Sigma AI | Full-time
Overview: Occitan Linguistic Projects - Latin Script (Remote), Sigma AI. Sigma AI is a global technology company specializing in data collection and annotation for Artificial Intelligence. What you will do: Categorization, Annotation, Correction, Transcription, Evaluation...
-
France | Gravity Conseil | Full-time
Start date: As soon as possible. Job location: Montréal (hybrid). We are looking for a Coordinator for the Partner Relations Team to join our team of advisors at the Montréal office. This position is hybrid. We are looking for a motivated, passionate, and dynamic candidate...
-
Data & Sustainable Real Estate - Internship (M/F)
1 week ago
Rue de Stockholm, Paris, France | Etyo | Full-time
Reporting to a consultant in the Sustainable Real Estate practice of Etyo Green Insight, you will take part in developing data innovation initiatives that support the energy and environmental performance of buildings. You will contribute to the design, testing, and promotion of tools intended to analyze, structure, and represent...
-
Business Intelligence Analyst (M/F)
2 days ago
France | SURAVENIR ASSURANCES | Full-time
Job description. Job title: Business Intelligence Analyst (M/F). Your mission: Within the Steering and Operations division, the Data Office department is responsible for developing a strategy for managing the data stored within the company, investing in data infrastructure, establishing standards and...