Data Engineer Intern, Data Enablement with AI for AI

4 days ago


Paris, France CAST Software Full-time

CAST, a software company based in Meudon, is the market leader in Software Intelligence. Working at CAST R&D means being an important part of a highly talented, fast-paced, multicultural, Agile team.

Overview
We're building the foundation to ground AI with AAA Software Intelligence (Aggregated, Accurate, and Augmented), sourced from real-world software and technology projects. This role goes beyond manual curation: it's about using AI to empower AI. You will leverage LLMs, embeddings, and NLP tools to clean, enrich, and validate data, enabling AI systems and autonomous agents to rely on it for training and contextual understanding.

Responsibilities
• Aggregate and structure data from software ecosystems (codebases, APIs, tickets, documentation, architecture specs).
• Apply LLMs, embeddings, and NLP tools to automate data cleaning, entity extraction, metadata tagging, and semantic annotation.
• Build and maintain semantic pipelines for LLM fine-tuning and RAG (Retrieval-Augmented Generation).
• Organize datasets into formats suitable for Agent-to-Agent (A2A) interactions: APIs, vector DBs, knowledge graphs, etc.
• Collaborate with AI teams to evolve schemas, prompts, labeling strategies, and evaluation data.
• Ensure strong data lineage, reproducibility, and version control.

Requirements
• Experience in data engineering, ML data ops, or structured data curation.
• Proficient in Python, with strong data pipeline skills (Pandas, PyArrow, regex, Airflow).
• Experience with LLMs or NLP tools (e.g. Hugging Face, spaCy, LangChain).
• Ability to use AI to clean, enrich, classify, and organize technical content.
• Strong understanding of tokenization, chunking, and model input preparation.
• Experience working with software project data: Git repos, APIs, technical documentation, etc.

Bonus Skills
• Knowledge of vector DBs (FAISS, Qdrant, Weaviate) or knowledge graphs (Neo4j, RDF, SPARQL).
Key Skills: Apache Hive, S3, Hadoop, Redshift, Spark, AWS, Apache Pig, NoSQL, Big Data, Data Warehouse, Kafka, Scala
Employment Type: Contract
Experience: years
Vacancy: 1


  • Data Engineer

    4 days ago


    Paris, France Harmattan AI Full-time

    About Us: At Harmattan AI, we are a next-generation defense prime building autonomous and scalable defense systems. Driven by rigorous engineering developments of new defense products based on recent robotics and AI developments, we are on a steep growth trajectory. If you are interested in a career in a highly technical environment, thrive on pushing...

  • Data Engineer

    7 days ago


    Paris, Île-de-France Harmattan AI Full-time

    About Us: At Harmattan AI, we are a next-generation defense prime building autonomous and scalable defense systems. Driven by rigorous engineering developments of new defense products based on recent robotics and AI developments, we are on a steep growth trajectory. If you are interested in a career in a highly technical environment, thrive on pushing...

  • Data Engineer

    1 week ago


    Paris, France Siena AI Full-time

    **About us**: At Siena we are revolutionizing the customer service industry with the world's first autonomous AI customer service agents. We are a remote-first startup that's passionate about enabling machines to engage in delightful and empathic conversations. Siena is the first of its kind, designed to work out-of-the-box to interact with customers across...


  • Paris, France Harmattan AI Full-time

    A technology company in Paris seeks a Data Engineer to transform data into high-quality datasets for analytics and product improvements. The role involves developing data pipelines, managing databases, and collaborating with multidisciplinary teams. Applicants should have 4+ years of experience and strong Python skills. This full-time position offers the...


  • Paris, France Capital Fund Management (CFM) Full-time

    Our Company Founded in 1991, we are a global quantitative and systematic asset management firm applying a scientific approach to finance to develop alternative investment strategies that create value for our clients. We value innovation, dedication, collaboration, and the ability to make an impact. Together, we create a stimulating environment for talented...

  • Senior AWS Data Engineer

    2 weeks ago


    Paris, Île-de-France Data Reply Full-time

    Senior AWS Data Engineer. Tasks: Implement new use cases and data pipelines on AWS. Map data and data flows across cloud platforms. Develop and industrialize data pipelines and processing workflows. Design and build dashboards and reporting tools. Perform unit and integration testing of data flows. Participate in Data Reply events (Reply Xchange, hackathons, AWS summits,...


  • Paris, Île-de-France CAPITAL FUND MANAGEMENT Full-time

    Paris, 75, FR. ABOUT CFM: Founded in 1991, we are a global quantitative and systematic asset management firm applying a scientific approach to finance to develop alternative investment strategies that create value for our clients. We value innovation, dedication, collaboration, and the ability to make an impact. Together, we create a stimulating environment for...

  • Graduate Data Engineer

    2 weeks ago


    Paris, Île-de-France Data Reply Full-time

    Graduate Data Engineer. Tasks: • Implementing new use cases • Mapping data and data flows • Implementing data analysis and processing pipelines • Industrializing data flows and their visualization through dashboards and reporting • Carrying out unit tests and integration tests. Benefits: • Structured career progression – at Reply, we encourage career...

  • Technical Staff, Data

    1 week ago


    Paris, France Reflection AI Full-time

    **Role Overview**: Data quality and diversity are among the most important factors in training state-of-the-art models. As a member of the technical staff focused on data at Reflection, you will play a pivotal role in shaping how we collect, process, and analyze human and internet data for training AI models. You will design and execute protocols,...

  • Senior Data Scientist

    1 week ago


    Paris, Île-de-France Monk AI Full-time

    ACV is a technology company that has revolutionized how dealers buy and sell cars online. We are transforming the automotive industry. ACV Auctions Inc. (ACV) has applied innovation and user-designed, data-driven applications and solutions. We are building the most trusted and efficient digital marketplace with data solutions for sourcing, selling and...