The Interoperable Lakehouse: Agency Over Your Data
JUN 02, 2026 / 13 min read / Product and Technology
James Rowland-Jones +1
AI is testing every architecture decision. When teams can't act on data where it lives, they copy it. Pipelines sprawl, governance fragments, costs compound, and AI agents end up reasoning over stale, disconnected data instead of the governed, semantically rich data they need.
The open lakehouse promised to solve data fragmentation without forcing everyone onto a single platform. But for most organizations, the format arrived before governance and semantic fragmentation could be addressed. That changes today. Snowflake's Interoperable Lakehouse, built on Apache Iceberg™, Apache Polaris™ and Open Semantic Interchange (OSI), is generally available. It offers a new blueprint for connecting, accessing, governing and operating on a single governed copy of your data, wherever it lives and without lock-in. By giving control back to data owners, not vendors, it lets you create agency over your data and, in the process, cut architectural cost and ground every AI initiative in a foundation you can actually trust.
Act on data in place
Agency over your data starts with a connected data foundation: one place to act on every data set, for any operation, without copying it. With this launch, Snowflake advances that foundation across every layer of access. Snowflake's support for Apache Iceberg v3 is generally available and production-ready, providing the broadest set of v3 capabilities on the market today, deeply integrated throughout the platform for greater interoperability. Snowflake Storage for Apache Iceberg™ tables makes managed Iceberg as easy as CREATE TABLE. Zero-Copy Integrations bring your systems of record into the foundation with semantics intact. Horizon Context connects the business definitions every team and AI agent runs on. More data. More context. One governed copy.
Apache Iceberg was originally designed for huge analytical datasets, but it had suboptimal support for workloads involving semi-structured data, small updates, geospatial analytics, and change-tracking pipelines. Apache Iceberg v3 closes that gap. As of today, Snowflake brings the broadest set of v3 capabilities to production, including VARIANT support for semi-structured data, row lineage for change tracking across engines, deletion vectors for performant row-level deletes, nanosecond timestamps for high-frequency telemetry and financial workloads, default values and geospatial types. More workloads now have a clean path to interoperability.
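To make the idea behind deletion vectors concrete, here is a minimal conceptual sketch in Python. It is not Snowflake's or Iceberg's implementation: the `DataFile` class and its methods are hypothetical, and a plain Python set stands in for the compact bitmap a real deletion vector would use. The point it illustrates is that a row-level delete only records row positions, while the data file itself is left untouched.

```python
from dataclasses import dataclass, field


@dataclass
class DataFile:
    """Hypothetical immutable data file: rows are never rewritten in place."""
    rows: list
    # Positions of deleted rows. A set is an assumption made for readability;
    # a real deletion vector would be a compact bitmap.
    deleted_positions: set = field(default_factory=set)

    def delete_where(self, predicate):
        """Mark matching rows as deleted without modifying `rows`."""
        for pos, row in enumerate(self.rows):
            if predicate(row):
                self.deleted_positions.add(pos)

    def scan(self):
        """Yield live rows, skipping positions recorded in the deletion vector."""
        for pos, row in enumerate(self.rows):
            if pos not in self.deleted_positions:
                yield row


orders = DataFile(rows=[
    {"id": 1, "status": "open"},
    {"id": 2, "status": "cancelled"},
    {"id": 3, "status": "open"},
])
orders.delete_where(lambda r: r["status"] == "cancelled")
live_ids = [r["id"] for r in orders.scan()]
```

Because the delete touches only the small position set, a reader applies it as a filter at scan time and the writer avoids rewriting the whole file for a single-row change.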
A capable format, however, does not eliminate the operational tax of managing storage. Snowflake Storage for Apache Iceberg™ tables, generally available for AWS and Azure and coming soon to private preview for Google Cloud, delivers a fully managed Iceberg experience: open from the start, governed through Horizon Catalog, readable and writable by any Iceberg-compatible engine. For teams managing their own storage on Azure, Azure DFS support is generally available, delivering full interoperability through native Azure Data Lake Storage Gen2 endpoints.
Figure 1: Introducing Snowflake Storage for Apache Iceberg™, now generally available.
Bringing existing data in shouldn't require migration or conversion. Parquet Direct, in private preview with general availability coming soon, makes existing Parquet files queryable with Iceberg-class performance. Google Cloud Lakehouse integration is generally available, creating Catalog Linked Databases for Google's cross-cloud lakehouse environment with automatic table discovery and cross-cloud read and write access. Just-in-time refresh for externally managed Iceberg, in private preview, detects stale metadata at query time and refreshes it automatically, removing the need to configure scheduled refreshes.
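The just-in-time refresh pattern can be sketched in a few lines of Python. The post does not say how Snowflake detects staleness, so everything here is an assumption made for illustration: the `ExternalCatalog` and `CachedTable` classes are hypothetical, and staleness is modeled as a mismatch between a cached metadata location and the catalog's current one.

```python
from dataclasses import dataclass


@dataclass
class ExternalCatalog:
    """Hypothetical stand-in for the catalog that owns the table."""
    current_metadata_location: str

    def commit(self, new_location: str) -> None:
        # An external engine writes to the table and advances the pointer.
        self.current_metadata_location = new_location


@dataclass
class CachedTable:
    """Hypothetical query-engine view of an externally managed table."""
    catalog: ExternalCatalog
    cached_metadata_location: str
    refresh_count: int = 0

    def is_stale(self) -> bool:
        return self.cached_metadata_location != self.catalog.current_metadata_location

    def query(self) -> str:
        # Check freshness at query time instead of on a schedule.
        if self.is_stale():
            self.cached_metadata_location = self.catalog.current_metadata_location
            self.refresh_count += 1
        return self.cached_metadata_location


catalog = ExternalCatalog("v1.metadata.json")
table = CachedTable(catalog, "v1.metadata.json")
first = table.query()                 # cache is fresh, no refresh
catalog.commit("v2.metadata.json")    # external writer commits
second = table.query()                # stale cache is refreshed on read
third = table.query()                 # already fresh, no extra refresh
```

The design choice this illustrates is that refresh work happens only when a query actually encounters stale metadata, rather than on a fixed timer.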
Enterprise platforms are where the most valuable enterprise data lives, and where the pipeline tax has always been heaviest. Zero-copy integration makes critical business data available in your Snowflake ecosystem in near real time without ETL pipelines or the need to rebuild semantic context. These integrations exist now for SAP (GA), Salesforce, and Workday (private preview). New partnerships with AVEVA and IBM will extend this model further, adding operational technology and industrial data from AVEVA CONNECT and enterprise data platforms from IBM, and bringing business definitions and context together for more consistent, AI-ready data.
Having connected systems doesn't necessarily translate into connected meaning. Revenue, churn and customer counts still mean three different things in three different places until the definitions themselves live in one connected layer. Horizon Context is that layer. It links scattered business definitions across databases, data lakes and BI tools so that every team inside and outside of Snowflake, and every AI agent, reasons from the same definition of enterprise truth. Connect to external database, BI and data pipeline systems, including PostgreSQL, Microsoft SQL Server, Tableau, Microsoft Power BI and dbt, and enrich metadata with schemas, query logs, dashboard definitions and more (in private preview). Horizon Context enables this foundation through a set of integrated capabilities:
Out-of-the-box connectors: Connect to tools such as PostgreSQL, Microsoft SQL Server, Tableau, Microsoft Power BI and dbt to gather rich context (query logs, popularity, schemas and more) from many sources into one searchable catalog.
End-to-end column-level lineage: Lineage is key to understanding how data assets are related to one another. Horizon Context mines lineage information from Snowflake and external database query logs, BI systems and OpenLineage feeds, and stitches it all together to create a complete, end-to-end lineage graph.
Semantic Studio, in private preview, is an AI-assisted IDE within Snowflake Workspaces that lets teams define, test and publish shared business logic without SQL expertise, with Snowflake CoCo integration and Git sync for version control.
Semantic View Autopilot (generally available) analyzes existing query patterns to automatically generate and refine semantic views, helping ensure your context layer stays current as your data and usage evolve. CoCo now retrieves business context…
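The lineage stitching described in the list above can be illustrated with a small Python sketch. It is not Horizon Context's implementation: the function names, the column names and the representation of lineage as `(upstream, downstream)` pairs are all assumptions. It shows the general idea of merging edges from several sources into one graph and then walking it end to end.

```python
from collections import defaultdict


def stitch_lineage(*edge_sources):
    """Merge (upstream, downstream) column edges into one upstream index."""
    upstream_of = defaultdict(set)
    for edges in edge_sources:
        for upstream, downstream in edges:
            upstream_of[downstream].add(upstream)
    return upstream_of


def trace_upstream(upstream_of, column):
    """Return every column that `column` transitively depends on."""
    seen = set()
    stack = [column]
    while stack:
        current = stack.pop()
        for parent in upstream_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# Hypothetical edges, one list per source named in the text.
warehouse_log_edges = [("raw.orders.amount", "analytics.revenue.total")]
bi_edges = [("analytics.revenue.total", "dashboard.kpi.revenue")]
openlineage_edges = [("postgres.orders.amount", "raw.orders.amount")]

graph = stitch_lineage(warehouse_log_edges, bi_edges, openlineage_edges)
sources = trace_upstream(graph, "dashboard.kpi.revenue")
```

No single source knows the full path from the PostgreSQL column to the dashboard metric; it only appears once the three edge lists are combined.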