microsoft/data-formulator 0.7-alpha


Data Formulator 0.7-alpha

Repository: microsoft/data-formulator

Tag: 0.7-alpha

Published: 2026-03-03T02:11:30Z

Prerelease: no

Release notes:

More Charts, New Experience, Enterprise-Ready

🚧 *This version is in fact a big redesign and probably deserves v1.0. For now, we're shipping it as 0.7-alpha for fun; a proper, detailed write-up on the new architecture is coming soon.*

> Version: 0.7.0a1 (alpha) · Files changed: ~282 · +84k / −16k lines

---

What's New

📊 Dramatically Expanded Visualization Support

The chart template system has been rebuilt with a new semantic engine, expanding from ~15 chart types to 30 Vega-Lite chart types:

| Category | Chart Types |
|---|---|
| Scatter & Point | Scatter Plot, Regression, Boxplot, Strip Plot *(new)*, Ranged Dot Plot |
| Bar | Bar Chart, Grouped Bar Chart, Stacked Bar Chart, Histogram, Lollipop Chart *(new)*, Pyramid Chart, Heatmap |
| Line & Area | Line Chart, Dotted Line Chart, Bump Chart *(new)*, Area Chart *(new)*, Streamgraph *(new)* |
| Part-to-Whole | Pie Chart *(new)*, Rose Chart *(new)*, Waterfall Chart *(new)* |
| Statistical | Density Plot *(new)*, Candlestick Chart *(new)*, Radar Chart *(new)* |
| Map | US Map *(new)*, World Map *(new)* |
| Custom | Custom Point, Custom Line, Custom Bar, Custom Rect, Custom Area |

Semantic field analysis automatically infers temporal, categorical, quantitative, and geographic types to recommend the right chart for the data.
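
To make the idea concrete, here is a minimal Python sketch of type inference feeding a chart recommendation. It is not Data Formulator's semantic engine: `infer_semantic_type`, `recommend_chart`, the sample state codes, and the rule table are all hypothetical names and heuristics chosen for illustration. Only the chart names are taken from the table above.

```python
from datetime import datetime

# Hypothetical illustration only; not Data Formulator's semantic engine.
US_STATE_CODES = {"CA", "NY", "TX", "WA"}  # tiny sample, assumed for the sketch


def infer_semantic_type(values):
    """Guess a coarse semantic type from a column's sample values."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        return "categorical"
    # bool is a subclass of int, so exclude it from the numeric check.
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in non_null):
        return "quantitative"
    if all(isinstance(v, str) and _is_iso_date(v) for v in non_null):
        return "temporal"
    if all(isinstance(v, str) and v.upper() in US_STATE_CODES for v in non_null):
        return "geographic"
    return "categorical"


def _is_iso_date(text):
    """Return True when text parses as a YYYY-MM-DD date."""
    try:
        datetime.strptime(text, "%Y-%m-%d")
        return True
    except ValueError:
        return False


def recommend_chart(x_type, y_type):
    """Map a pair of inferred types to a chart name via an assumed rule table."""
    rules = {
        ("temporal", "quantitative"): "Line Chart",
        ("categorical", "quantitative"): "Bar Chart",
        ("quantitative", "quantitative"): "Scatter Plot",
        ("geographic", "quantitative"): "US Map",
    }
    # Fall back to a generic custom mark when no rule matches.
    return rules.get((x_type, y_type), "Custom Point")
```

A real engine would look at far more signals (field names, cardinality, value ranges); the sketch only shows how inferred types can drive the choice of template.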

💬 Hybrid Chat + Data Thread & Enhanced Agent Mode

  • Redesigned Data Thread — Chat-based interaction is woven directly into the exploration thread. Users converse with agents inline alongside data transformations and chart results, replacing the separate chat panel.
  • Richer thread cards showing transformation lineage, chart previews, and agent reasoning in a unified timeline.
  • New agent mode — Agents autonomously plan multi-step explorations, generate chart recommendations, and produce data insights, all surfaced inline in the thread.
  • Conversational data loading via integrated chat-based data ingestion.

🤖 Redesigned Agent Architecture

The backend agent system has been significantly restructured — consolidating previously fragmented agents into a cleaner, more capable design:

  • Unified `DataAgent` replaces four separate agents (`agent_py_concept_derive`, `agent_py_data_rec`, `agent_sql_data_rec`, `agent_sql_data_transform`) with a single agent that handles both Python and SQL data transformations.
  • New `agent_data_transform` — Dedicated data transformation agent.
  • New `agent_data_rec` — Recommendation agent that suggests charts and exploration directions.
  • New `agent_chart_insight` — Generates natural-language insights from chart results.
  • Shared `semantic_types` — Type system used by both backend agents and frontend chart engine for consistent field inference.
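
The consolidation described above can be pictured as one entry point that dispatches on the transformation language. The sketch below is an assumption about the general pattern, not the actual `DataAgent` code: the class name `UnifiedTransformAgent`, its `transform` method, and the table name `data` are hypothetical, and `sqlite3` from the standard library stands in for whatever SQL engine the backend really uses.

```python
import sqlite3


class UnifiedTransformAgent:
    """Hypothetical single entry point for Python and SQL transformations."""

    def __init__(self):
        # One dispatch table instead of one agent class per language.
        self._executors = {"python": self._run_python, "sql": self._run_sql}

    def transform(self, rows, language, step):
        try:
            executor = self._executors[language]
        except KeyError:
            raise ValueError(f"unsupported language: {language}") from None
        return executor(rows, step)

    @staticmethod
    def _run_python(rows, step):
        # `step` is a callable that takes and returns a list of dict rows.
        return step(rows)

    @staticmethod
    def _run_sql(rows, step):
        # `step` is a SQL query against an in-memory table named `data`.
        if not rows:
            return []
        columns = list(rows[0])
        col_list = ", ".join(f'"{name}"' for name in columns)
        placeholders = ", ".join("?" for _ in columns)
        con = sqlite3.connect(":memory:")
        try:
            con.row_factory = sqlite3.Row
            con.execute(f"CREATE TABLE data ({col_list})")
            con.executemany(
                f"INSERT INTO data VALUES ({placeholders})",
                [tuple(row[name] for name in columns) for row in rows],
            )
            return [dict(r) for r in con.execute(step)]
        finally:
            con.close()
```

Callers pass the same rows either way, for example `agent.transform(rows, "sql", "SELECT region, SUM(sales) AS total FROM data GROUP BY region")` or `agent.transform(rows, "python", some_function)`.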

🏗️ Workspace / Data Lake Architecture (Enterprise-Ready)

A new persistent, identity-based Workspace layer replaces the previous in-memory DB approach:

  • `Workspace` manages per-user directories with a `workspace.yaml` metadata catalog tracking every table's lineage, schema, provenance, and source type.
  • Uploaded files (CSV, Excel, JSON, etc.) are preserved as-is; data-loader sources are stored as Parquet via PyArrow.
  • `CacheManager` and `FileManager` for efficient caching and file lifecycle.
  • Azure Blob and Cached Azure Blob workspace backends for cloud deployments.
  • `WorkspaceFactory` selects the correct workspace backend from configuration.
  • New modular route layer replaces monolithic app routes.
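
As a rough illustration of configuration-driven backend selection, the sketch below shows a factory function that returns a per-user workspace. Everything here is assumed for the example: the function `create_workspace`, the classes `LocalWorkspace` and `AzureBlobWorkspace`, and the config keys `backend`, `root`, and `container` are hypothetical and do not describe the real `WorkspaceFactory` interface.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class LocalWorkspace:
    """Hypothetical local backend: one directory per user."""
    root: PurePosixPath


@dataclass(frozen=True)
class AzureBlobWorkspace:
    """Hypothetical blob backend: one key prefix per user."""
    container: str
    prefix: str
    cached: bool = False


def create_workspace(config, user_id):
    """Pick a workspace backend from a config dict (keys are assumptions)."""
    backend = config.get("backend", "local")
    if backend == "local":
        return LocalWorkspace(PurePosixPath(config["root"]) / user_id)
    if backend in ("azure_blob", "cached_azure_blob"):
        return AzureBlobWorkspace(
            container=config["container"],
            prefix=user_id,
            cached=(backend == "cached_azure_blob"),
        )
    raise ValueError(f"unknown workspace backend: {backend!r}")
```

Keeping the choice behind one function means route handlers only ever ask for "the workspace for this user" and never branch on deployment type themselves.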

🔒 Security Hardening

  • Code signing for AI-generated Python code.
  • Sandboxed execution with local and docker backends.
  • Authentication layer for user identity.
  • Flask rate limiting to protect API endpoints.
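
The release notes do not say which signing scheme is used. As one plausible reading of "code signing for AI-generated Python code", the sketch below signs generated code with HMAC-SHA256 at generation time and refuses to run it if the signature no longer matches. The function names and the choice of HMAC are assumptions for illustration; `executor` is a placeholder for a sandboxed runner.

```python
import hashlib
import hmac


def sign_code(code: str, key: bytes) -> str:
    """Return a hex HMAC-SHA256 signature for a piece of generated code."""
    return hmac.new(key, code.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_code(code: str, signature: str, key: bytes) -> bool:
    """Check the signature using a constant-time comparison."""
    return hmac.compare_digest(sign_code(code, key), signature)


def run_if_signed(code: str, signature: str, key: bytes, executor):
    """Hand code to `executor` only when its signature verifies."""
    if not verify_code(code, signature, key):
        raise PermissionError("signature mismatch; refusing to execute")
    # `executor` is a hypothetical sandboxed runner supplied by the caller.
    return executor(code)
```

The point of the check is that code altered after the server produced it (for example, by a client that sends it back for execution) fails verification before it ever reaches the sandbox.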

📦 Other Notable Changes

  • UV-first build — Fully reproducible builds with `uv.lock`; `uv sync` + `uv run data_formulator` is now the recommended development workflow.
  • Unified data upload dialog and refresh data dialog.
  • Demo streaming routes for live data scenarios.
  • `api-keys.env.template` consolidated into `.env.template`.

---

Getting Started

# Recommended (uv)
uvx data_formulator

# Or via pip
pip install data_formulator==0.7.0a1
python -m data_formulator

---

Community Contributions

Thanks to our contributors:

  • @BAIGUANGMEI — Map projection support (projection types & centers) (#232)
  • @IAMkecheng — ECharts renderer: scatter plot & bar chart (#236)
  • @joshpoll — GoFish grouped bar sizing fix (#235)

---

> Alpha notice: This is a pre-release. APIs and features may change before the stable 0.7.0 release. Please report issues and share feedback!

Full Changelog: https://github.com/microsoft/data-formulator/compare/0.6...0.7.0a1
