From Zero to One: Building An Autonomous and Open Data Scientist Agent from Scratch
Research
Published 6/12/2025
Authors
Federico Bianchi, Shang Zhu, Zain Hasan, Ben Athiwaratkun and James Zou
Links in this article
Open source codebase Together Code Interpreter Data Science Agent Cookbook
TLDR. This blog post shows how to build an effective data scientist agent from scratch using Together’s open-source models and Together Code Interpreter. While straightforward to implement, our agent performs well on several different use cases and benchmarks. We open source our codebase on GitHub.

Introduction

The daily work of data science requires us to sift through messy, incomplete information to extract actionable insights. This process often takes multiple steps, ranging from data cleaning to model training and data analysis. Given the explosion in capabilities of large language models, it is natural to ask how readily available open-source technologies can serve as a foundation for building agents that effectively manipulate and analyze data. While human data scientists need to remain at the heart of these processes, AI agents can take on part of the workload and reduce the burden of analytical tasks.

Today, we show how to implement a simple yet effective data scientist agent that can solve data science tasks. The implementation is end-to-end, from the design of the pipeline through implementation to testing. We use only models and tools that are accessible on the Together Cloud, which makes this development surprisingly accessible. The “recipe” we share today for data science can be extended to other agents in other settings.

In particular, we will follow a ReAct (Yao et al., 2023) pattern: the agent will first “think” and then “act”. Each action the agent generates is a Python snippet (e.g., import pandas as pd; pd.read_csv(...)). This is strongly inspired by the smolagents package, which is in turn inspired by CodeAct (Wang et al., 2024) and the original ReAct paper.
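To make the think-then-act cycle concrete, here is a minimal sketch of such a loop. It is not the implementation from our codebase: `Step`, `run_locally`, `react_loop`, and `scripted_policy` are hypothetical names invented for this illustration, the "model" is a scripted stand-in rather than a real language model call, and the executor runs code in-process with no sandboxing, which is exactly what a real agent should avoid.

```python
import contextlib
import io
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Step:
    thought: str          # natural-language reasoning for this step
    code: Optional[str]   # Python snippet to run, or None when the agent is done


def run_locally(code: str, namespace: dict) -> str:
    """Hypothetical stand-in executor: runs code in-process and captures stdout.

    NOT sandboxed. A real agent should send the code to an isolated interpreter.
    """
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, namespace)
    except Exception as exc:
        # Errors are returned as text so the agent can react to them.
        return f"{buffer.getvalue()}Error: {type(exc).__name__}: {exc}"
    return buffer.getvalue()


def react_loop(policy: Callable[[list], Step], task: str, max_steps: int = 5) -> list:
    """Alternate thought -> action -> observation until the policy stops acting."""
    history = [("task", task)]
    namespace: dict = {}  # shared across steps so variables persist
    for _ in range(max_steps):
        step = policy(history)
        history.append(("thought", step.thought))
        if step.code is None:
            break
        history.append(("action", step.code))
        history.append(("observation", run_locally(step.code, namespace)))
    return history


def scripted_policy(history: list) -> Step:
    """Replays fixed steps in place of a real language model call."""
    script = [
        Step(
            "Compute the mean of the values first.",
            "values = [1, 2, 3, 4]\nmean = sum(values) / len(values)\nprint(mean)",
        ),
        Step("The mean is 2.5, so the task is complete.", None),
    ]
    thoughts_so_far = sum(1 for kind, _ in history if kind == "thought")
    return script[thoughts_so_far]


history = react_loop(scripted_policy, "Report the mean of 1, 2, 3, 4.")
for kind, text in history:
    print(f"[{kind}] {text.strip()}")
```

Swapping `scripted_policy` for a function that calls a language model, and `run_locally` for a remote sandbox, would leave `react_loop` itself unchanged.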
A CLI for configuring and launching DeepSeek V3 to run automated analysis on the ESOL dataset.

The Open Data Scientist’s actions have to be executed. While conceptually this is an “easy” step, in practice it is not: code execution is, by nature, a very unsafe operation, even more so in the context of language models, where you do not know in advance what the generated code is going to look like. To solve this problem and to run code safely and efficiently, we will use the Together Code Interpreter (TCI). TCI abstracts away the complexity of sandboxed Python execution, providing a clean API that accepts code and returns structured results.

This architectural choice has strong implications for our agent's design: it becomes inherently modular and maintainable. Since the code execution layer is completely decoupled from the reasoning logic, we can modify prompts, adjust the ReAct patterns, or add new analytical capabilities without touching the execution infrastructure. The agent's behavior becomes highly tunable through prompt engineering alone, while TCI ensures consistent, reliable execution regardless of the complexity of the generated Python code.

After implementing the agent, we will focus on evaluation, providing quantitative and qualitative results on its performance. We will also discuss some best practices for building agents from scratch. Our data scientist agent implementation serves as a transparent example of how to use open source models and TCI to build an agent and to evaluate agentic pipelines. Since this entire solution can be composed using Together Cloud features with just a single API key, this blog post is a good reference for those looking to understand the fundamentals of building reasoning-driven AI assistants.

Why build agents from scratch?

This is a good question! Considering the number of frameworks out there, one might not need to dive into lower-level implementations.
However, in the context of agentic models, it is actually good to know how things work at a lower level of abstraction. Observing the steps a language model must take to interact with the external world helps us understand how the agent solves problems and how to improve general agent architectures. This matters all the more given how many edge cases one ends up encountering while building agent architectures. Another benefit is that the Open Data Scientist agent we are going to build here is “hackable” and adaptable to different use cases.

A CodeAct Data Scientist

ReAct

The ReAct framework was introduced by Yao et al., 2023 to improve language agents. The idea behind the ReAct framework is encapsulated in its name: Reasoning and Action. The agent's activity unfolds as a cycle: the agent is tasked with a goal, reasons about what the next step could be, and then prepares the inputs to an action (which is then executed). Each action generates an observation from the environment, which is used as a source of new information for the next ReAct step.

To clarify, the agent operates entirely through text generation. When we say the agent "reasons about the task," we mean it outputs natural language that articulates its step-by-step thinking process. When we say the agent "prepares the next action," we mean it generates structured text that specifies which function to call and with what parameters. Both reasoning and action preparation are fundamentally text generation tasks that enable the agent to interface with external tools and environments.

What does this mean in practice? In our "system prompt," we ask the language model first to write its…
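Since both reasoning and action are plain text, the surrounding program has to split one model response into the two parts. The sketch below illustrates that idea under loud assumptions: the `<code>...</code>` delimiter is a hypothetical output convention chosen only for this example (it is not the format prescribed by our system prompt), `ParsedStep` and `parse_step` are invented names, and `data.csv` is a placeholder file name. The snippet is only parsed here, never executed.

```python
import re
from typing import NamedTuple, Optional


class ParsedStep(NamedTuple):
    thought: str          # free-form reasoning text
    code: Optional[str]   # Python action, or None if the model proposed none


# Hypothetical output convention, chosen only for this example.
CODE_BLOCK = re.compile(r"<code>(.*?)</code>", re.DOTALL)


def parse_step(model_output: str) -> ParsedStep:
    """Split one model response into its reasoning and its (optional) action."""
    match = CODE_BLOCK.search(model_output)
    if match is None:
        # No action found: treat the whole response as reasoning.
        return ParsedStep(thought=model_output.strip(), code=None)
    thought = model_output[: match.start()].strip()
    return ParsedStep(thought=thought, code=match.group(1).strip())


# A made-up response in the assumed format.
raw = (
    "I should load the file and inspect the first rows.\n"
    "<code>\n"
    "import pandas as pd\n"
    "df = pd.read_csv('data.csv')\n"
    "print(df.head())\n"
    "</code>"
)

step = parse_step(raw)
print("THOUGHT:", step.thought)
print("ACTION:")
print(step.code)
```

Returning `code=None` when no action is present gives the caller an explicit signal, for example to stop the loop or to ask the model to retry in the expected format.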