Features
AI Studio (Collate - Beta)
Collate already hosted AI agents that users could control via Applications to generate descriptions, assign Tiers, and create Data Quality tests. AI Studio now provides visibility into and control over the AI agents powering the data platform.
With AI Studio, users can customize the prompts of these agents so their output aligns with organizational needs. Moreover, admins can create new AI agents with specific behavior, capabilities, and prompts, which can be executed directly or embedded in external AI platforms thanks to the Metadata AI SDK.
Metadata AI SDK
The Metadata AI SDK enables programmatic access to Collate's AI agents and semantic layer, allowing teams to build custom chatbots, automate governance tasks, and integrate metadata intelligence into external applications.
By creating Agents either via the UI in AI Studio or through the Metadata AI CLI, you can now access lineage information, quality metrics and business context through simple APIs based on Natural Language.
The Metadata AI SDK is available via CLI and programmatic access in Java, Python, and Node.js.
OpenMetadata users can also leverage the AI SDK to bring MCP tooling into their LangChain applications, adding all the necessary semantic intelligence to their agents!
Auto Classification with Custom Recognizers (Collate)
OpenMetadata already had an Auto Classification Metadata Agent that automatically tagged PII Sensitive data. With release 1.12, Collate brings the ability to create custom AI-powered recognizers for any classification using regex patterns, column names, and data content scanning.
Moreover, users can report false positives with explanations, creating a feedback loop that improves model accuracy and ensures that the agent does not make the same mistakes again.
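As a rough illustration of how a recognizer can combine column-name and content signals, here is a minimal Python sketch. The `Recognizer` class, its fields, the `Finance.IBAN` tag, and the 0.8 match threshold are all hypothetical and do not reflect Collate's actual recognizer schema.

```python
import re
from dataclasses import dataclass


@dataclass
class Recognizer:
    """Hypothetical recognizer definition, for illustration only."""
    tag: str
    column_name_pattern: str      # regex searched in the column name
    content_pattern: str          # regex that a sampled value must fully match
    min_match_ratio: float = 0.8  # share of sampled values that must match

    def matches(self, column_name: str, sample_values: list[str]) -> bool:
        # Signal 1: the column name itself looks like the classified concept.
        if re.search(self.column_name_pattern, column_name, re.IGNORECASE):
            return True
        # Signal 2: enough of the sampled content matches the pattern.
        if not sample_values:
            return False
        hits = sum(1 for v in sample_values if re.fullmatch(self.content_pattern, v))
        return hits / len(sample_values) >= self.min_match_ratio


iban = Recognizer(
    tag="Finance.IBAN",
    column_name_pattern=r"(^|_)iban(_|$)",
    content_pattern=r"[A-Z]{2}\d{2}[A-Z0-9]{11,30}",
)

by_name = iban.matches("customer_iban", [])
by_content = iban.matches("col1", ["DE89370400440532013000", "GB29NWBK60161331926819"])
no_match = iban.matches("notes", ["hello", "world"])
```

A reported false positive could then be modeled as an exclusion that is checked before either signal, which is one simple way to keep the same mistake from recurring.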
Data Quality Test Library
Release 1.12 takes Data Quality capabilities one step further than any other platform by letting admins create reusable, parameterized SQL-based test templates easily from the UI. Define tests once with parameters like table_name and column_name or any other custom parameter your users need, then apply consistently across multiple tables without rewriting SQL.
Users can then apply these new tests via the UI, giving them centralized governance and standardized definitions for critical business rules organization-wide.
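The idea of defining a test once and applying it to many tables can be sketched in Python as follows. The `$table_name` / `$column_name` placeholder syntax and the `render_test` helper are assumptions made for illustration; the actual template syntax used by the Test Library may differ.

```python
import re
from string import Template

# Hypothetical template: counts rows where the column is NULL.
NOT_NULL_TEST = Template(
    "SELECT COUNT(*) FROM $table_name WHERE $column_name IS NULL"
)

# Only plain (optionally dotted) identifiers are accepted, so that parameters
# cannot smuggle arbitrary SQL into the rendered query.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)*")


def render_test(template: Template, **params: str) -> str:
    for name, value in params.items():
        if not _IDENTIFIER.fullmatch(value):
            raise ValueError(f"invalid identifier for {name}: {value!r}")
    # substitute() raises KeyError if a required parameter is missing.
    return template.substitute(**params)


targets = [("sales.orders", "customer_id"), ("sales.refunds", "order_id")]
queries = [
    render_test(NOT_NULL_TEST, table_name=table, column_name=column)
    for table, column in targets
]
```

The same template yields one query per table, so the business rule lives in a single definition rather than in copies of the SQL.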
Data Diff Column/Row Analysis (Collate)
Granular visual comparison of the differences between source and target tables at the column, row, and character level. Identify which columns were added, removed, or modified with a side-by-side comparison. Drill down to specific rows and see character-level changes within fields. The visual diff interface accelerates troubleshooting and root cause analysis.
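To make the three levels of comparison concrete, here is a small Python sketch using only the standard library. It is not Collate's implementation; the function names and the sample rows are invented for illustration.

```python
import difflib


def diff_columns(source_cols, target_cols):
    """Column level: which columns exist on only one side."""
    src, tgt = set(source_cols), set(target_cols)
    return {"added": sorted(tgt - src), "removed": sorted(src - tgt)}


def diff_rows(source_rows, target_rows, key):
    """Row level: changed cells for rows present on both sides, matched by key."""
    target_by_key = {row[key]: row for row in target_rows}
    changes = []
    for row in source_rows:
        other = target_by_key.get(row[key])
        if other is None:
            continue
        for col in sorted(row.keys() & other.keys()):
            if row[col] != other[col]:
                changes.append((row[key], col, row[col], other[col]))
    return changes


def diff_chars(old, new):
    """Character level: the non-matching spans between two field values."""
    matcher = difflib.SequenceMatcher(None, old, new)
    return [
        (tag, old[i1:i2], new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


source = [{"id": 1, "email": "ana@example.com"}, {"id": 2, "email": "bob@example.com"}]
target = [
    {"id": 1, "email": "ana@example.com", "tier": "gold"},
    {"id": 2, "email": "bob@example.org", "tier": "basic"},
]

column_changes = diff_columns(source[0], target[0])
row_changes = diff_rows(source, target, key="id")
char_changes = [diff_chars(old, new) for _, _, old, new in row_changes]
```

Each level narrows the search: the column diff flags schema drift, the row diff pinpoints the affected records, and the character diff shows exactly what changed inside a field.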
GitHub Metadata Sink (Collate - Beta)
Bring metadata under version control with automated Git commits for every metadata change in Collate. If you are using separate development and production environments, routing metadata changes through GitHub pull requests gives you a human-in-the-loop experience for any event that you might want to promote to higher environments via CI/CD.
Human & AI Audit Logs
While every asset already supported version history, OpenMetadata now supports comprehensive audit logs that track all user and AI agent actions across the platform.
With six-month retention, filtering by user, agent, time range, or action type, and export capabilities, your governance teams can easily handle compliance reporting and security audits.
Column Bulk Operations
How can your teams keep up with an always growing and evolving Data Platform? Users can now aggregate identical column names across all asset types (tables, topics, containers, APIs, search indexes) in a single view to set descriptions, tags, and glossary terms for all instances simultaneously.
This aggregate view also helps users detect inconsistencies where the same column has different definitions, and dig into specific elements by filtering by domain, tags, or even metadata completeness.
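The grouping behind such an aggregate view can be sketched in a few lines of Python. The record shape (`asset`, `name`, `description`) and the helper names are hypothetical and chosen only to illustrate the idea.

```python
from collections import defaultdict

# Hypothetical column records coming from different asset types.
columns = [
    {"asset": "table:sales.orders", "name": "customer_id", "description": "Unique customer key"},
    {"asset": "topic:order_events", "name": "customer_id", "description": "Customer identifier"},
    {"asset": "table:sales.refunds", "name": "customer_id", "description": ""},
    {"asset": "table:sales.orders", "name": "order_ts", "description": "Order creation time"},
]


def group_by_name(cols):
    """Aggregate every occurrence of a column name across assets."""
    groups = defaultdict(list)
    for col in cols:
        groups[col["name"]].append(col)
    return dict(groups)


def find_inconsistencies(groups):
    """Column names whose non-empty descriptions disagree."""
    result = {}
    for name, cols in groups.items():
        descriptions = {c["description"] for c in cols if c["description"]}
        if len(descriptions) > 1:
            result[name] = sorted(descriptions)
    return result


def bulk_set_description(groups, name, description):
    """Apply one description to every occurrence; returns how many were updated."""
    occurrences = groups.get(name, [])
    for col in occurrences:
        col["description"] = description
    return len(occurrences)


groups = group_by_name(columns)
conflicts = find_inconsistencies(groups)
updated = bulk_set_description(groups, "customer_id", "Unique customer key")
```

After the bulk update, running `find_inconsistencies` again returns no conflict for `customer_id`, which mirrors how the aggregate view can be used to confirm a cleanup.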
Column Details Panel
Added an expandable details panel for columns. Users can now view column-level custom properties and metadata directly within the table view. The panel also includes a dedicated Data Quality section for quick visibility into column health. This eliminates the need to navigate away to access column details and quality insights.
Open Standards: ODCS 3.1 & OpenLineage Support
Import and export contracts in Open Data Contract Standard (ODCS) 3.1 format for interoperability with other tools. Collate's contract specification extends ODCS with terms of service, semantic relationships, and ownership details while maintaining compatibility.
OpenMetadata also accepts events from OpenLineage, so you can now easily bring any OpenLineage-compatible systems through native API integration and benefit from the broader metadata semantics available in the platform.
AskCollate Enhancements & MS Teams Integration (Collate)
- Expanded entity support for Metrics, Knowledge Center articles, and Dashboard Data Models.
- AskCollate now holds your company’s context from glossary terms, metrics, and knowledge center. Advancing your governance initiatives also improves your AI tooling and interactions!
- Enhanced thinking transparency, showing the detailed reasoning process.
- MS Teams integration alongside existing Slack integration, allowing your users to interact with AskCollate directly where they are without having to jump from tool to tool.
Kubernetes Orchestrator
OpenMetadata now brings a Kubernetes Orchestrator for users who don’t want to use Airflow to manage the Metadata Agents and other automations. With this new orchestrator, OpenMetadata doubles down on a simplified deployment experience while ensuring scalability and operational efficiency in production k8s environments.
MCP Tools & Semantic Search
We have added more tooling to OpenMetadata’s MCP, helping you to create lineage, as well as adding all necessary DQ tooling around test definitions, test case creation and Root Cause Analysis.
We have also been working on the feedback shared on MCP from the community, so keep sharing your thoughts on how we can make MCP even better.
Moreover, the 1.12 release brings Semantic Search into OpenMetadata! You can enable it in the configuration to create vector embeddings for your entities, supporting both Bedrock and OpenAI embeddings. On top of that, we have created an MCP tool for Semantic Search so that you can interact with these vectors in your applications!
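For readers new to vector search, the ranking step behind semantic search can be illustrated with a toy Python example. The three-dimensional vectors below are made up; real embeddings would come from the configured Bedrock or OpenAI model and have far more dimensions. This is a conceptual sketch, not OpenMetadata's search implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_search(query_vector, index, top_k=2):
    """Return the names of the top_k entities closest to the query vector."""
    ranked = sorted(
        index.items(),
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_k]]


# Toy embeddings keyed by entity name (values are invented for illustration).
index = {
    "sales.orders": [0.9, 0.1, 0.0],
    "hr.employees": [0.0, 0.2, 0.9],
    "sales.refunds": [0.8, 0.3, 0.1],
}

results = semantic_search([1.0, 0.2, 0.0], index)
```

Entities whose embeddings point in a similar direction to the query rank first, which is why a search can surface related assets even when no keyword matches.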
New Connectors
- SFTP: Catalog unstructured files alongside structured data
- Redshift Serverless: Native support for Amazon's serverless deployment option
- StarRocks: Support for open-source analytical database
- Microsoft Fabric (Collate - Beta): Connect to Microsoft's unified data platform including Data Warehouse, Power BI, and pipelines
- Dremio (Collate): Support for lakehouse platform with query engine and semantic layer integration
- Mulesoft (Collate): Integration with API management platform for API metadata and lineage
Breaking Changes
The OpenSearch and Elasticsearch clients have been upgraded. It is now necessary to use version 9.3.0 for Elasticsearch and 3.4.0 for OpenSearch.
Data Contract Schema Changes
Security, SLAs and Terms of Use can now be inherited from the Data Product’s Data Contract. To allow for this, we’ve added the ‘inherited’ property to Security and SLAs, and converted ‘termsOfUse’ from a simple Markdown field to an object that holds both the markdown information and the inheritance flag.
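If you keep data contracts as JSON or generate them from scripts, the shape of the ‘termsOfUse’ change can be pictured with the Python sketch below. Only ‘termsOfUse’ and ‘inherited’ are named in these notes; the `content` key used here for the Markdown text is a placeholder assumption, so check the published schema for the real field name before adapting this.

```python
def upgrade_terms_of_use(contract: dict) -> dict:
    """Wrap a plain Markdown termsOfUse string into the new object shape.

    NOTE: "content" is a placeholder key for the Markdown text (assumption);
    "inherited" is the inheritance flag described in the release notes.
    """
    upgraded = dict(contract)  # shallow copy, leaves the input untouched
    terms = upgraded.get("termsOfUse")
    if isinstance(terms, str):
        upgraded["termsOfUse"] = {"content": terms, "inherited": False}
    return upgraded


old_contract = {"name": "orders-contract", "termsOfUse": "Internal use **only**."}
new_contract = upgrade_terms_of_use(old_contract)
```

Contracts that already hold an object, or have no terms of use at all, pass through unchanged, so the helper is safe to run more than once.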
Helm
- Updated Airflow section
Previously, the ‘pipelineServiceClientConfig’ key hosted the Airflow configuration directly. We now also support a native k8s orchestration engine, so we’ve added a ‘pipelineServiceClientConfig.type’ key to switch between “airflow” and “k8s”, and moved orchestrator-specific configuration into another nested level: ‘pipelineServiceClientConfig.airflow’ and ‘pipelineServiceClientConfig.k8s’.
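The restructuring can be pictured as a transformation over the parsed Helm values, sketched here in Python. The keys inside the old block (`setting_a`, `setting_b`) are placeholders, and the sketch assumes every existing key moves under the `airflow` level; the chart's own values file remains the authority on which keys actually move.

```python
def migrate_pipeline_client_config(old_config: dict) -> dict:
    """Nest a pre-1.12 style block under the new type/airflow layout.

    Assumption: a block without a "type" key is an old, Airflow-only block
    and all of its keys belong under the nested "airflow" level.
    """
    if "type" in old_config:
        return dict(old_config)  # already in the new layout
    return {"type": "airflow", "airflow": dict(old_config)}


def orchestrator_settings(config: dict) -> dict:
    """Return the settings block for whichever orchestrator is selected."""
    orchestrator = config.get("type")
    if orchestrator not in ("airflow", "k8s"):
        raise ValueError(f"unsupported orchestrator type: {orchestrator!r}")
    return config.get(orchestrator, {})


# Placeholder keys standing in for real Airflow settings.
old_values = {"setting_a": "value-a", "setting_b": "value-b"}
new_values = migrate_pipeline_client_config(old_values)
active = orchestrator_settings(new_values)
```

Switching to the native orchestrator would then mean setting `type` to "k8s" and supplying a `k8s` block, with the `airflow` block no longer consulted.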
Changelog
Additional Enhancements
- Learning Resources (?): Contextual tutorials and videos throughout UI based on current page, with admin customization for organization-specific training materials
- Lineage Improvements: Column-only filtering, edge highlighting on hover, stored procedure support in edit mode, faster SQL parsing for complex lineages
- Explore Page Sidebar: Right-side navigation showing lineage, data quality details, and custom properties without leaving explore view
- Metadata Exporter - Entity History (Collate): Export complete change tracking to data warehouses for custom dashboarding, running within customer networks
- Test Case Import/Export: Bulk operations on data quality tests at table and multi-table levels
- Data Contracts at Data Product Level: Define contracts once at data product level with automatic inheritance to all assets for Semantics, Terms of Use, Security and SLAs.
- Distributed Search Indexing: Multiple application servers share indexing workload for improved scalability
- Data Product Input/Output Ports: Support port specifications with lineage visualization for data flow
- Timezone-Aware Freshness Tests: Set specific timezones on freshness tests to prevent UTC misalignment issues
- SQL Studio - Postgres & Redshift Support (Collate): Adds Postgres and Redshift to existing Snowflake, Trino, and BigQuery support
- Snowflake Dynamic Table System Metrics: Support for INSERT, UPDATE, DELETE metrics in profiler
- Column Custom Properties: Side panel drawer interface with improved navigation.
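To see why the timezone matters for the freshness tests mentioned in the list above, consider the Python sketch below. It uses a simplified "was the table updated today?" rule, which is an assumption for illustration and not the actual freshness test logic; the timestamps are invented.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def updated_today(last_updated: datetime, now: datetime, tz_name: str) -> bool:
    """True if both instants fall on the same calendar day in the given timezone."""
    tz = ZoneInfo(tz_name)
    return last_updated.astimezone(tz).date() == now.astimezone(tz).date()


# A load that finished at 18:30 New York time, checked at 21:00 the same evening.
last_updated = datetime(2024, 1, 14, 23, 30, tzinfo=timezone.utc)
now = datetime(2024, 1, 15, 2, 0, tzinfo=timezone.utc)

fresh_in_utc = updated_today(last_updated, now, "UTC")
fresh_in_new_york = updated_today(last_updated, now, "America/New_York")
```

Evaluated in UTC, the two instants fall on different calendar days and the table looks stale; evaluated in the business timezone, the same data is fresh. Setting the timezone on the test removes that false alarm.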