The Metadata Hub: Unify Your Data Estate
MAY 29, 2026 / 8 min read / Data Lake
The Metadata Hub: One Control Plane for Your Entire Data Estate
Purvaja Narayanaswamy +1
For years, the promise of a cohesive data platform has bumped into the same hard reality: Data lives everywhere. There is data in Snowflake, in AWS Glue, in Microsoft OneLake, in Databricks Unity Catalog, in Apache Polaris™, in homegrown REST catalogs no one fully remembers building.... Every platform came with its own rules, its own metadata, its own gravity. And with each new system, the distance between a common platform and your data grew wider.
The answer isn't to move everything into one place. That ship has sailed. The answer is to connect everything through one layer: a Metadata Hub.
The problem with "just pick one platform"
The conventional wisdom used to be consolidation. Pick a cloud. Pick a catalog. Migrate data and call it done. But the data landscape has moved beyond that vision. Migrating data is counterproductive to AI context. Acquisitions bring new platforms. Teams build on the tools they know. Regulatory requirements demand geographic separation. Multi-catalog/multi-cloud isn't a "mistake." It's the reality that most enterprises live in.
That means metadata fragmentation at a very broad scale. You might know what data you have in Snowflake. But do you know what's in your Glue catalog? Your Unity workspace? Your OneLake instance? Can you query across all of them without copying data? Can you govern them from a single policy? Can you see, in real time, everything your data estate contains?
For most organizations, the honest answer is no. This is why Snowflake is focusing on interoperability and positioning itself as a Metadata Hub.
What a Metadata Hub actually is
A Metadata Hub isn't another catalog competing for dominance. A Metadata Hub aggregates and federates metadata across your entire data estate into a single, queryable interface. Rather than forcing data migration, it brings the understanding of that data together, giving every user a unified window into structure, semantics, lineage, quality and ownership no matter which system holds the data.
Snowflake Horizon Catalog is the connective layer, a single plane of visibility and control that sits above your existing catalogs and makes them more useful.
Learn more about Metadata Management.
A Metadata Hub rests on three foundational ideas:
1. Open by design
The Apache Iceberg™ ecosystem has established something rare in the data world: a genuine open standard. Iceberg's table format and the Iceberg REST Catalog (IRC) specification create a shared protocol that any system can participate in. Snowflake, AWS, Databricks and the broader industry have all rallied around it. We publish and consume IRC through the embedded Apache Polaris instance in every Horizon Catalog. It's fully integrated and requires no additional configuration to unlock these interoperability capabilities.
We went further than aligning around the standard by investing in Apache Polaris™, now a Top-Level Project, to help advance what an open catalog should be. By embedding Polaris in every Horizon Catalog, we give you a genuinely open catalog foundation for your lakehouse.
Interoperability isn't a differentiator; it's the baseline. Any system that speaks Iceberg can exchange data, share metadata and participate in a broader data estate without custom connectors, proprietary bridges or vendor gatekeeping. No silos.
2. Native and bidirectional
Read-only access is a workaround. Native participation? That's the collaborative architecture and the one you want.
Snowflake's IRC integration goes beyond read-only access. When Horizon Catalog connects to a supporting catalog, such as AWS Glue, Databricks Unity Catalog¹, Polaris or any other supporting platform, it connects natively and bidirectionally. You're not querying a copy or a reflection. You're reading and writing data directly, in its native format, with full fidelity. AWS Glue or Databricks Unity sees every change Snowflake makes. Snowflake can immediately query data that any other platform is writing natively. Catalog entries are live, and you can configure the synchronization interval (we typically recommend 30 seconds). Consider this example for bringing in a Unity Catalog over IRC:
CREATE DATABASE my_unity_linked_db
  LINKED_CATALOG = (
    CATALOG = 'my_unity_catalog_int',
    SYNC_INTERVAL_SECONDS = 30  -- controls namespace/table discovery polling
  )
  CATALOG_CASE_SENSITIVITY = CASE_INSENSITIVE;
This lets you both read from and write to a Unity Catalog source. True cross-platform participation, not just access.
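Once the linked database exists, reading and writing might look like the sketch below. The namespace `sales`, the table `orders` and its columns are hypothetical placeholders for whatever the remote Unity Catalog actually exposes, and writes assume the catalog integration and your role permit them.

```sql
-- Hypothetical names: "sales" (namespace), "orders" (table) and the columns
-- are assumptions, not objects defined anywhere in this article.

-- Read: query a remote Iceberg table through the linked database.
SELECT order_id, amount
FROM my_unity_linked_db.sales.orders
WHERE order_date >= '2026-01-01';

-- Write: insert a row into the same remote table.
INSERT INTO my_unity_linked_db.sales.orders (order_id, amount, order_date)
VALUES (1001, 250.00, '2026-05-29');
```

Because the database was created with CATALOG_CASE_SENSITIVITY = CASE_INSENSITIVE, the unquoted identifiers above should resolve without double-quoting.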
3. Complete visibility
The third pillar turns connectivity into control. Using Horizon Catalog as your Metadata Hub gives you a single, real-time inventory of your entire data estate. Every table, every database, every catalog, on every platform, everywhere your data lives.
Complete visibility isn't a nice-to-have. It's the prerequisite for everything else. This matters for more than just basic discovery. You can enrich the metadata you're connected to with semantic definitions and incorporate business logic directly into the metadata layer. This layer is the foundation for governance. You can't apply a policy to data you can't see. You can't track lineage across a boundary you can't observe. You can't optimize costs on workloads you can't measure.
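As a rough illustration of that inventory, standard SHOW commands can list what a linked catalog has surfaced. This is a minimal sketch that reuses the `my_unity_linked_db` database from the earlier example. What you see depends on the privileges of the current role and on how recently the catalog was synchronized.

```sql
-- Namespaces discovered from the remote catalog appear as schemas.
SHOW SCHEMAS IN DATABASE my_unity_linked_db;

-- Iceberg tables surfaced in one linked database.
SHOW ICEBERG TABLES IN DATABASE my_unity_linked_db;

-- Iceberg tables across every database the current role can see.
SHOW ICEBERG TABLES IN ACCOUNT;
```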
This is already happening. Forward-thinking organizations are building on it today. Production teams run workloads across multiple catalogs simultaneously, from Snowflake to AWS Glue, Databricks Unity Catalog, Apache Polaris and others, as one coherent operating model. This isn't experimentation; it's their architecture.
Organizations across industries are not consolidating onto a single platform; instead, they are adopting an interoperable-first approach across multiple platforms. This federation relies on Iceberg as a common data language and a Metadata Hub to provide a unified view. The Snowflake Horizon Catalog is uniquely positioned to serve as this essential Metadata Hub.
What this unlocks
Cross-platform queries without copying data. Join an AWS Glue table with a Databricks Unity Catalog table…
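A cross-catalog join of that kind could be sketched as follows. It assumes a second, hypothetical linked database named `my_glue_linked_db` created over an AWS Glue catalog integration. All schema, table and column names are placeholders, not objects from the article.

```sql
-- Assumption: my_glue_linked_db is a hypothetical catalog-linked database
-- over AWS Glue; my_unity_linked_db is the Unity Catalog example database.
-- Schemas, tables and columns below are illustrative placeholders.
SELECT c.customer_id,
       c.region,
       SUM(o.amount) AS total_amount
FROM my_glue_linked_db.crm.customers AS c
JOIN my_unity_linked_db.sales.orders AS o
  ON o.customer_id = c.customer_id
GROUP BY c.customer_id, c.region;
```

Both tables are read in place through their respective catalogs, so no data is copied into a staging table first.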