open-metadata/OpenMetadata 1.13.0-release on GitHub

Features

MCP Services

MCP (Model Context Protocol) is now a first-class service category with service entities, server entities, execution logs, test-connection support, REST resources, and UI pages.

Usage analytics expose summary, history, tool breakdown, user breakdown, and current-user usage
MCP OAuth now supports SAML SSO authentication
Client secrets are not issued to public clients
get_entity_details now surfaces custom properties in responses

Knowledge Graph and RDF

Requires Apache Jena. First-time users should run the RDF Knowledge Graph Index App after upgrading.

Distributed RDF indexing with state tables for jobs, partitions, locks, and server stats
Glossary membership scoping, relation cleanup, distributed mode, and compaction
Revamped graph with custom nodes, relation details, and distributed indexing status

Search Index Performance and Live Indexing

Tunable settings: refresh interval, replica count, translog durability, sync interval, and per-entity overrides
Per-stage reindex timing metrics for reader, process, sink, and vector stages
Live indexing retries on failure with a dead-letter queue for failed items
Search results can be exported to CSV from the Explore page under Tools
Search Index App schedule has moved to weekly — review before upgrading
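
The retry-then-dead-letter behavior described for live indexing can be pictured with a minimal, generic sketch. This is not OpenMetadata's implementation: the `LiveIndexer` class, the `index_fn` callable, and the `max_attempts` budget are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class LiveIndexer:
    """Hypothetical indexer: retries each item, then parks it in a dead-letter queue."""

    index_fn: Callable[[dict], None]  # assumed callable that writes one document
    max_attempts: int = 3             # assumed retry budget, not a real setting name
    dead_letters: List[Tuple[dict, str]] = field(default_factory=list)

    def submit(self, item: dict) -> bool:
        """Return True if the item was indexed, False if it was dead-lettered."""
        last_error = ""
        for _ in range(self.max_attempts):
            try:
                self.index_fn(item)
                return True
            except Exception as exc:  # a real indexer would catch narrower errors
                last_error = str(exc)
        # Every attempt failed: keep the item and its last error for later replay.
        self.dead_letters.append((item, last_error))
        return False
```

Items left in `dead_letters` can be inspected or resubmitted once the underlying failure is resolved, so one bad document does not block the rest of the stream.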

Ontology Explorer

New first-class governance page at /governance/ontology with graph filters, layout controls, side-panel entity details, and export controls.

Typed Glossary Term Relations

New relation types: relatedTo, synonym, antonym, broader, narrower, partOf, hasPart, calculatedFrom, usedToCalculate, seeAlso

New governance settings page to manage relation types
Relation badges, filters, and graph views throughout Glossary UI
Concept mappings for external IRIs and SKOS-style relation types
APIs for relation usage counts, asset counts, batch fetch, add/remove, and relation graph

Data Marketplace

New sidebar and routes at /data-marketplace, /data-marketplace/domains, /data-marketplace/data-products
Customizable landing page with widgets for domains, data products, announcements, and search

AI and Hybrid Search

Google Gemini embedding provider with configurable dimensions and endpoint override
OpenAI NLQ: modelId, request timeouts, max tokens, and temperature now configurable
Hybrid search tuning: keyword/semantic weights, RRF settings, semantic score threshold, highlight fragment size
textToLLMContext and vector body-text extension hooks
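
To give a feel for what the keyword/semantic weights and RRF settings control, here is a minimal sketch of weighted Reciprocal Rank Fusion in its generic textbook form. How OpenMetadata combines the two result lists internally is not stated in these notes; the function name, parameter names, and the default rank constant of 60 are assumptions, and the semantic score threshold is not modeled.

```python
from collections import defaultdict
from typing import Dict, List


def weighted_rrf(
    keyword_hits: List[str],
    semantic_hits: List[str],
    keyword_weight: float = 1.0,
    semantic_weight: float = 1.0,
    rank_constant: int = 60,
) -> List[str]:
    """Fuse two ranked lists of document ids into one ranking."""
    scores: Dict[str, float] = defaultdict(float)
    for weight, hits in ((keyword_weight, keyword_hits), (semantic_weight, semantic_hits)):
        for rank, doc_id in enumerate(hits, start=1):
            # Each list contributes weight / (rank_constant + rank) for a document.
            scores[doc_id] += weight / (rank_constant + rank)
    # Highest fused score first; ties are broken by id so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


print(weighted_rrf(["a", "b", "c"], ["b", "d"]))
print(weighted_rrf(["a", "b", "c"], ["b", "d"], semantic_weight=3.0))
```

Raising the semantic weight moves documents found only by the vector search ahead of keyword-only matches, which is the kind of trade-off these tuning settings expose.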

Data Quality and Profiler

Dynamic and static sampling via profileSampleConfig
Explicit metrics selection per profiler run
Top-dimension controls for dimensional test cases
Bulk add and select-all for logical and bundle test suites
Dashboard widgets and filters: data products, certification, incident status, tiers, entity health
Storage auto-classification for containers with language-aware recognizer selection
Deterministic MySQL median behavior

Governance and Workflows

Data-contract references across data assets and service entities
Workflow triggers extended: data product, data contract, glossary terms, input ports, output ports
Approval tasks show proposed changes with clickable entity links, domain stamped on creation
Self-approval prevention for workflow change requests
New Archived entity status

New Connectors

Google Drive — ingestion connector and example workflow
Pub/Sub — messaging connector with test-connection support
QuestDB — database connector
IOMETE — database connector
SAP SuccessFactors — database connector
SAP S/4HANA — dashboard connector
Matillion Data Cloud — pipeline connector
Airflow 3.x — API-based connector; constraints upgraded to 3.2.1

Connector Improvements

Snowflake — opt-in ACCESS_HISTORY lineage path; queries chunked by day to avoid timeouts
Unity Catalog — incremental metadata extraction, only fetching changed entities since last run
SSRS — report-to-dataset lineage
Metabase — chart-level lineage extraction
OpenLineage — Glue, Kusto, Cosmos DB naming; symlinks facet for Iceberg; pipeline node for single-sided lineage
Storage — compressed archive ingestion (ZIP, tar, gzip) in S3, ADLS, GCS; Redis caching for container ancestors
MySQL — queryHistoryTable option; GCP Cloud SQL IAM support
Athena — catalogId for S3 Tables and cross-account Glue
Oracle — preserveIdentifierCase and useDBATable options
S3, ADLS, GCS — profiling capability flags; REST connector S3 and SSL config
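
The idea behind chunking Snowflake lineage queries by day can be sketched as splitting a time range into windows of at most one day, so that each query covers a bounded slice. The `day_windows` helper below is hypothetical and is not taken from the connector's code.

```python
from datetime import datetime, timedelta
from typing import Iterator, Tuple


def day_windows(start: datetime, end: datetime) -> Iterator[Tuple[datetime, datetime]]:
    """Yield half-open [window_start, window_end) ranges of at most one day."""
    cursor = start
    while cursor < end:
        window_end = min(cursor + timedelta(days=1), end)
        yield cursor, window_end
        cursor = window_end


for window_start, window_end in day_windows(datetime(2024, 1, 1), datetime(2024, 1, 3, 12)):
    # A caller would run one bounded query per window here.
    print(window_start.isoformat(), "->", window_end.isoformat())
```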

Platform, Cache, and Operability

Read-bundle prefetch and cache warmup for tags, certifications, relationships, containers, ancestors
Redis: cache metrics, distributed warmup, per-command timeout defaulting to 300ms
Deadlock retry handling and reduced write deadlocks
JSON log format via LOG_FORMAT=json, streamable logs, non-blocking handlers
QoS request admission enabled by default via QOS_* settings
CSP nonce handling and web security headers: COEP, CORP, COOP
Regenerate-bot-tokens for JWT key rotation
db-tune ops subcommand and production RDS runbook
Diagnostics v2 framework — legacy ExecutionTimeTracker removed

Columns as Independent Entities

Columns are now indexed as independent entities. They appear in asset counts and are the default entity shown in Explore when selecting a database service. Previously tables were shown. This is a behavioral change.

Breaking Changes

Connector and Ingestion Changes

Iceberg connector removed — services migrated to CustomDatabase, pipelines hard-deleted. Update any YAML or automation referencing serviceType Iceberg
Databricks/Unity Catalog scheme changed from databricks+connector to databricks. Stored configs migrated; external YAMLs must be updated
Profiler sampling changed to profileSampleConfig. Old fields profileSample, profileSampleType, samplingMethodType, and computeMetrics are removed
randomizedSample now defaults explicitly to false in migrated configs
Python ingestion targets 3.10, 3.11, 3.12. Key deps: SQLAlchemy 2.x, pandas 2.1.x, pyodbc 5.3.x, Airflow 3.2.1, Databricks SQLAlchemy 2.x
Storage manifest partitionColumns uses a smaller partition-column shape
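
Before upgrading, external YAML files can be checked for the removed or renamed values listed above. The sketch below walks an already-parsed config (nested dicts and lists) and reports matches. The field names and values come from these notes, but the assumption that the Databricks scheme is stored under a key named `scheme`, and the sample config layout, are illustrative only.

```python
from typing import Any, List

# Profiler fields listed as removed in favor of profileSampleConfig.
REMOVED_PROFILER_FIELDS = {
    "profileSample",
    "profileSampleType",
    "samplingMethodType",
    "computeMetrics",
}


def find_upgrade_issues(node: Any, path: str = "$") -> List[str]:
    """Recursively collect human-readable findings from a parsed config."""
    issues: List[str] = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}"
            if key in REMOVED_PROFILER_FIELDS:
                issues.append(f"{child}: removed field, use profileSampleConfig")
            if key == "serviceType" and value == "Iceberg":
                issues.append(f"{child}: Iceberg connector removed")
            if key == "scheme" and value == "databricks+connector":  # key name assumed
                issues.append(f"{child}: scheme is now 'databricks'")
            issues.extend(find_upgrade_issues(value, child))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            issues.extend(find_upgrade_issues(item, f"{path}[{index}]"))
    return issues


# Hypothetical config shape, used only to exercise the scanner.
sample = {
    "source": {
        "serviceConnection": {"config": {"scheme": "databricks+connector"}},
        "sourceConfig": {"config": {"profileSample": 50}},
    }
}
for issue in find_upgrade_issues(sample):
    print(issue)
```

The scanner only flags candidates; the replacement profileSampleConfig values still need to be written by hand against the 1.13.0 schema.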

API and Schema Changes

Feed APIs no longer accept from in createThread or createPost — remove it from client payloads
The semanticSearch boolean was removed from search payloads
Application schemas renamed preview to enabled with inverted meaning — custom app manifests must use enabled
Webhook authentication moved from secretKey to an authType object (no auth / bearer / OAuth2)
Custom property names must start with an alphanumeric character and cannot contain / or ~
Glossary relatedTerms changed to typed TermRelation objects — existing data migrated to relatedTo
entity_relationship primary key now includes relationType
Logical-suite add endpoint deprecated — use PUT /api/v1/dataQuality/testCases/logicalTestCases/bulk
Bulk Assets dryRun now enforced for tag, glossary, dataProduct, and team removes
New Archived entity status — update any hard-coded status enums
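
Several of the changes above are mechanical enough to handle in client code. The helpers below are a sketch under stated assumptions: "alphanumeric" is read as ASCII letters and digits, "inverted meaning" is read as `enabled = not preview`, and all function names are hypothetical rather than part of any OpenMetadata SDK.

```python
import re
from typing import Any, Dict, Set

# First character is a letter or digit; "/" and "~" are not allowed anywhere.
_VALID_NAME = re.compile(r"[A-Za-z0-9][^/~]*")


def is_valid_custom_property_name(name: str) -> bool:
    """Check a name against the 1.13.0 custom property naming rule."""
    return _VALID_NAME.fullmatch(name) is not None


def migrate_app_manifest(manifest: Dict[str, Any]) -> Dict[str, Any]:
    """Replace 'preview' with 'enabled', inverting the boolean (assumed semantics)."""
    migrated = dict(manifest)
    if "preview" in migrated:
        migrated["enabled"] = not migrated.pop("preview")
    return migrated


def strip_removed_fields(payload: Dict[str, Any], removed: Set[str]) -> Dict[str, Any]:
    """Return a copy of the payload without fields the API no longer accepts."""
    return {key: value for key, value in payload.items() if key not in removed}


print(is_valid_custom_property_name("owner.team"))
print(is_valid_custom_property_name("a/b"))
print(migrate_app_manifest({"name": "demo", "preview": True}))
print(strip_removed_fields({"message": "hi", "from": "alice"}, {"from"}))
```

The same `strip_removed_fields` helper can be applied to search payloads with `{"semanticSearch"}` as the removed set.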

Operational Notes

Quartz tables cleared during migration — stop all instances before upgrading
Postgres fqnHash text_pattern_ops indexes added or replaced; a runbook is included in the migration file in case the build is interrupted
New tables for MCP services, servers, executions, RDF indexing jobs, partitions, locks, and server stats
SERVER_CHANGE_LOG historical gaps backfilled — missing entries caused data-insights timeline holes
Profiler pipeline cleanup force-executed on upgrade to clear stuck pre-1.13 state
LOG_FORMAT=json now supported — review any custom Dropwizard logging config
QoS admission enabled by default — check QOS_* settings if adjustment needed
Redis per-command timeout defaults to 300ms — tune for slow Redis deployments

Changelog

Search and Reindexing

Fixed nested children causing Elasticsearch/OpenSearch mapping-depth failures
Fixed stale file-extension aggregation on v1.13.0 upgrade causing 500 errors on file search
Fixed stale flattened-children highlight field on v1.13.0 upgrade causing 500 errors on container search
Fixed search_after silently dropping entities when sort value contains a comma
Fixed query, worksheet, and file reindexing missing relationship fields
Fixed search-index alias resolution for entity-specific and OpenSearch cluster prefixes
Fixed batch-prefetch of upstream lineage leaking Hikari connections during bulk reindex
Fixed soft-delete propagation to time-series child aliases
Fixed clean reindex jobs incorrectly marked failed when only warnings existed
Fixed text-field sorting and aggregation .keyword resolution
Fixed user index searches on nested owners queries
Fixed advanced-search Contains and Not Contains operators for description field

Glossary, Tags, and Governance

Fixed glossary relation rendering for multiple relation types between the same term pair
Fixed related-term tooltip sanitization and relation badge colors and icons
Fixed tag rename and relationship cache invalidation
Fixed TagLabel server fields lost when saving tags
Fixed certification tags leaking into regular tags and missing appliedBy audit trail
Fixed soft-deleted users appearing in experts and reviewers selectors
Fixed hyperlink workflow rules and Tags/Tier field ambiguity

Data Quality and Profiler

Fixed test-case suite search membership preservation
Fixed tier and certification filter queries in Data Quality dashboard
Fixed incident manager status and severity chip behavior
Fixed TableColumnCountToBeBetween API responses
Fixed column profile percentages showing 0% for zero proportions
Fixed tableCustomSQLQuery ignoring computePassedFailedRowCount flag
Fixed orphan test cases breaking search indexing
Fixed sample randomization at 100% sample

Ingestion and Connectors

Fixed single bad table aborting entire schema ingestion run
Fixed Snowflake and OpenMetadata socket waits causing silent hangs
Fixed Power BI lineage buffer flushing, TSQL Sql.Database parsing, and workspace cache scope
Fixed Databricks nested column descriptions and SQLAlchemy 2.x compatibility
Fixed Databricks and Unity Catalog valueless tags being silently dropped
Fixed Datalake JSON columns typed as string for empty dict values
Fixed MySQL profiler median query quoting and deterministic behavior
Fixed Redshift interval, numeric, and timestamp precision parsing, view definition, IAM auth, and LISTAGG errors
Fixed Oracle, MSSQL, Athena, and Redshift profiler under SQLAlchemy 2.0
Fixed dbt column tags, snapshot model patching, compiled-only test results, and test entity links
Fixed SQL Server temporal-table period columns classified as PII
Fixed SQLAlchemy engine resource leak on multi-database source iteration
Fixed ADLS object counts scoped to configured sub-path
Fixed PII recognizer selection based on configured language
Fixed runtime spaCy model loading for non-root containers

UI and UX

Fixed unknown service categories returning 404
Fixed Explore page column icon display, search term warnings, and text overflow
Fixed lineage edge misalignment, edge hover, temporary lineage table nodes, and service nodes
Fixed table constraints UI and cluster-key constraint display and editing
Fixed dotted custom-property names display
Fixed custom relation badge color handling and overlapping badges
Fixed activity feed, task notification refresh, and approval task rendering
Fixed MSAL and SAML token renewal and Safari SSO session loss
Fixed copy-to-clipboard in non-secure contexts
Fixed charts not deleted when parent dashboard or service is deleted
Fixed column.extension values silently dropped on entity creation

Security and Dependencies

AWS SDK pinned to 2.41.30 — clears CloudFront CVE
Airflow upgraded to 3.2.1 — clears 7 CVEs
gnutls, libcap, openssh, and rsync CVEs closed in ingestion Docker images
Test-connection workflow triggers now require authorization
Python ingestion: explicit jsonify at route level to break XSS taint chain
Axios, dompurify, follow-redirects, and related UI CVE fixes
Jetty and pac4j upgraded for Java-side CVEs