github open-metadata/OpenMetadata 1.13.0-release

pre-release, 8 hours ago

Features

MCP Services

MCP (Model Context Protocol) is now a first-class service category with service entities, server entities, execution logs, test-connection support, REST resources, and UI pages.

  • Usage analytics expose summary, history, tool breakdown, user breakdown, and current-user usage
  • MCP OAuth now supports SAML SSO authentication
  • Client secrets are not issued to public clients
  • get_entity_details now surfaces custom properties in responses

Knowledge Graph and RDF

Requires Apache Jena. First-time users should run the RDF Knowledge Graph Index App after upgrading.

  • Distributed RDF indexing with state tables for jobs, partitions, locks, and server stats
  • Glossary membership scoping, relation cleanup, distributed mode, and compaction
  • Revamped graph with custom nodes, relation details, and distributed indexing status

Search Index Performance and Live Indexing

  • Tunable settings: refresh interval, replica count, translog durability, sync interval, and per-entity overrides
  • Per-stage reindex timing metrics for reader, process, sink, and vector stages
  • Live indexing retries on failure with a dead-letter queue for failed items
  • Search results can be exported to CSV from the Explore page under Tools
  • Search Index App schedule has moved to weekly — review before upgrading
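The retry and dead-letter behaviour described above can be pictured with a small generic sketch. This is not OpenMetadata's implementation; the function name, the `max_attempts` parameter, and the in-memory queue are all assumptions chosen for illustration.

```python
from collections import deque
from typing import Callable, Deque, Iterable, List, Tuple


def index_with_retries(
    items: Iterable[str],
    index_one: Callable[[str], None],
    max_attempts: int = 3,
) -> Tuple[List[str], Deque[Tuple[str, str]]]:
    """Index each item, retrying on failure; park exhausted items in a dead-letter queue."""
    indexed: List[str] = []
    dead_letters: Deque[Tuple[str, str]] = deque()
    for item in items:
        last_error = ""
        for _attempt in range(max_attempts):
            try:
                index_one(item)
                indexed.append(item)
                break
            except Exception as exc:  # broad on purpose: any failure counts as a retry
                last_error = str(exc)
        else:
            # The loop never hit `break`, so every attempt failed.
            dead_letters.append((item, last_error))
    return indexed, dead_letters
```

A real system would normally add backoff between attempts and persist the dead-letter queue so failed items can be replayed later.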

Ontology Explorer

New first-class governance page at /governance/ontology with graph filters, layout controls, side-panel entity details, and export controls.

Typed Glossary Term Relations

New relation types: relatedTo, synonym, antonym, broader, narrower, partOf, hasPart, calculatedFrom, usedToCalculate, seeAlso

  • New governance settings page to manage relation types
  • Relation badges, filters, and graph views throughout Glossary UI
  • Concept mappings for external IRIs and SKOS-style relation types
  • APIs for relation usage counts, asset counts, batch fetch, add/remove, and relation graph
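For client code that needs to reason about the new relation types, a lookup table is often enough. The sketch below lists the ten types named above; the inverse pairings are an assumption inferred from the names alone (the release notes do not state which types are inverses), and the helper `inverse_relation` is hypothetical.

```python
from typing import Dict, Optional

# Relation type names as listed in the release notes.
RELATION_TYPES = (
    "relatedTo", "synonym", "antonym", "broader", "narrower",
    "partOf", "hasPart", "calculatedFrom", "usedToCalculate", "seeAlso",
)

# Assumption: these pairs are inverses of each other, judged by name only.
_INVERSE_PAIRS = (
    ("broader", "narrower"),
    ("partOf", "hasPart"),
    ("calculatedFrom", "usedToCalculate"),
)

INVERSE_OF: Dict[str, str] = {}
for left, right in _INVERSE_PAIRS:
    INVERSE_OF[left] = right
    INVERSE_OF[right] = left


def inverse_relation(relation_type: str) -> Optional[str]:
    """Return the assumed inverse of a relation type, or None if this sketch defines none."""
    if relation_type not in RELATION_TYPES:
        raise ValueError(f"unknown relation type: {relation_type}")
    return INVERSE_OF.get(relation_type)
```

Check the relation-type settings page for the authoritative definitions before relying on any pairing.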

Data Marketplace

  • New sidebar and routes at /data-marketplace, /data-marketplace/domains, /data-marketplace/data-products
  • Customizable landing page with widgets for domains, data products, announcements, and search

AI and Hybrid Search

  • Google Gemini embedding provider with configurable dimensions and endpoint override
  • OpenAI NLQ: modelId, request timeouts, max tokens, and temperature now configurable
  • Hybrid search tuning: keyword/semantic weights, RRF settings, semantic score threshold, highlight fragment size
  • textToLLMContext and vector body-text extension hooks
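To give a feel for what the hybrid search knobs control, here is a minimal sketch of weighted reciprocal rank fusion (RRF) over a keyword result list and a semantic result list. It is a generic illustration, not the OpenMetadata implementation: the function name, the parameter names, and the choice to fold the weights into the RRF sum are all assumptions.

```python
from typing import Dict, List, Sequence


def weighted_rrf(
    keyword_ranked: Sequence[str],
    semantic_ranked: Sequence[str],
    keyword_weight: float = 0.5,
    semantic_weight: float = 0.5,
    rank_constant: int = 60,
) -> List[str]:
    """Fuse two ranked lists of document ids: each list adds weight / (rank_constant + rank)."""
    scores: Dict[str, float] = {}
    for weight, ranked in (
        (keyword_weight, keyword_ranked),
        (semantic_weight, semantic_ranked),
    ):
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (rank_constant + rank)
    # Highest fused score first; ties broken by id so the order is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

A larger `rank_constant` flattens the difference between top and lower ranks, while the two weights shift influence between keyword and semantic matches.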

Data Quality and Profiler

  • Dynamic and static sampling via profileSampleConfig
  • Explicit metrics selection per profiler run
  • Top-dimension controls for dimensional test cases
  • Bulk add and select-all for logical and bundle test suites
  • Dashboard widgets and filters: data products, certification, incident status, tiers, entity health
  • Storage auto-classification for containers with language-aware recognizer selection
  • Deterministic MySQL median behavior

Governance and Workflows

  • Data-contract references across data assets and service entities
  • Workflow triggers extended: data product, data contract, glossary terms, input ports, output ports
  • Approval tasks show proposed changes with clickable entity links; the domain is stamped on creation
  • Self-approval prevention for workflow change requests
  • New Archived entity status

New Connectors

  • Microsoft Fabric — database and pipeline with lineage, usage, and profiler
  • Google Drive — ingestion connector and example workflow
  • Pub/Sub — messaging connector with test-connection support
  • QuestDB — database connector
  • IOMETE — database connector
  • SAP SuccessFactors — database connector
  • SAP S/4HANA — dashboard connector
  • Matillion Data Cloud — pipeline connector
  • Airflow 3.x — API-based connector; constraints upgraded to 3.2.1

Connector Improvements

  • Snowflake — opt-in ACCESS_HISTORY lineage path; queries chunked by day to avoid timeouts
  • Unity Catalog — incremental metadata extraction, only fetching changed entities since last run
  • SSRS — report-to-dataset lineage
  • Metabase — chart-level lineage extraction
  • OpenLineage — Glue, Kusto, Cosmos DB naming; symlinks facet for Iceberg; pipeline node for single-sided lineage
  • Storage — compressed archive ingestion (ZIP, tar, gzip) in S3, ADLS, GCS; Redis caching for container ancestors
  • MySQL — queryHistoryTable option; GCP Cloud SQL IAM support
  • Athena — catalogId for S3 Tables and cross-account Glue
  • Oracle — preserveIdentifierCase and useDBATable options
  • S3, ADLS, GCS — profiling capability flags; REST connector S3 and SSL config
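The day-by-day chunking mentioned for the Snowflake lineage path can be sketched as splitting a time range at midnight boundaries, so that each query covers at most one day. The helper below is hypothetical and says nothing about how the connector builds its actual queries.

```python
from datetime import datetime, timedelta
from typing import Iterator, Tuple


def day_windows(start: datetime, end: datetime) -> Iterator[Tuple[datetime, datetime]]:
    """Yield (window_start, window_end) pairs covering [start, end), split at midnight."""
    cursor = start
    while cursor < end:
        # Midnight at the start of the day after `cursor`, in the same timezone (if any).
        next_midnight = datetime.combine(
            cursor.date() + timedelta(days=1),
            datetime.min.time(),
            tzinfo=cursor.tzinfo,
        )
        window_end = min(next_midnight, end)
        yield cursor, window_end
        cursor = window_end
```

Each window can then drive one bounded query, which keeps any single request short and lets a failed day be retried without redoing the whole range.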

Platform, Cache, and Operability

  • Read-bundle prefetch and cache warmup for tags, certifications, relationships, containers, ancestors
  • Redis: cache metrics, distributed warmup, per-command timeout defaulting to 300ms
  • Deadlock retry handling and reduced write deadlocks
  • JSON log format via LOG_FORMAT=json, streamable logs, non-blocking handlers
  • QoS request admission enabled by default via QOS_* settings
  • CSP nonce handling and web security headers: COEP, CORP, COOP
  • Regenerate-bot-tokens for JWT key rotation
  • db-tune ops subcommand and production RDS runbook
  • Diagnostics v2 framework — legacy ExecutionTimeTracker removed

Columns as Independent Entities

Columns are now indexed as independent entities. They appear in asset counts and are the default entity shown in Explore when selecting a database service. Previously tables were shown. This is a behavioral change.
Clients that assume Explore results for a database service are tables should be reviewed.

Breaking Changes

Connector and Ingestion Changes

  • Iceberg connector removed — services migrated to CustomDatabase, pipelines hard-deleted. Update any YAML or automation referencing serviceType Iceberg
  • Databricks/Unity Catalog scheme changed from databricks+connector to databricks. Stored configs migrated; external YAMLs must be updated
  • Profiler sampling changed to profileSampleConfig. Old fields profileSample, profileSampleType, samplingMethodType, and computeMetrics are removed
  • randomizedSample now defaults explicitly to false in migrated configs
  • Python ingestion targets 3.10, 3.11, 3.12. Key deps: SQLAlchemy 2.x, pandas 2.1.x, pyodbc 5.3.x, Airflow 3.2.1, Databricks SQLAlchemy 2.x
  • Storage manifest partitionColumns uses a smaller partition-column shape
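Before upgrading, externally stored ingestion configs can be scanned for the removed items listed above. The sketch below walks an already-parsed config (for example the result of loading a YAML or JSON file) and reports matches. The key names `serviceType` and `scheme` follow the wording of these notes, but where they sit in a given file is an assumption, and the function name is hypothetical.

```python
from typing import Any, List

# Profiler fields that the release notes list as removed in favour of profileSampleConfig.
REMOVED_PROFILER_FIELDS = frozenset(
    {"profileSample", "profileSampleType", "samplingMethodType", "computeMetrics"}
)


def find_upgrade_issues(config: Any, path: str = "$") -> List[str]:
    """Recursively collect human-readable warnings for settings removed or renamed in 1.13.0."""
    issues: List[str] = []
    if isinstance(config, dict):
        for key, value in config.items():
            child = f"{path}.{key}"
            if key in REMOVED_PROFILER_FIELDS:
                issues.append(f"{child}: removed, use profileSampleConfig")
            if key == "serviceType" and value == "Iceberg":
                issues.append(f"{child}: Iceberg connector removed")
            if key == "scheme" and value == "databricks+connector":
                issues.append(f"{child}: scheme is now databricks")
            issues.extend(find_upgrade_issues(value, child))
    elif isinstance(config, list):
        for index, value in enumerate(config):
            issues.extend(find_upgrade_issues(value, f"{path}[{index}]"))
    return issues
```

The function only reports findings; it deliberately does not rewrite anything, because the new profileSampleConfig layout is not described in these notes.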

API and Schema Changes

  • Feed APIs no longer accept the from field in createThread or createPost; remove it from client payloads
  • The semanticSearch boolean has been removed from search payloads
  • Application schemas renamed preview to enabled with inverted meaning — custom app manifests must use enabled
  • Webhook moved from secretKey to authType object (no auth / bearer / OAuth2)
  • Custom property names must start with an alphanumeric character and cannot contain / or ~
  • Glossary relatedTerms changed to typed TermRelation objects — existing data migrated to relatedTo
  • entity_relationship primary key now includes relationType
  • Logical-suite add endpoint deprecated — use PUT /api/v1/dataQuality/testCases/logicalTestCases/bulk
  • Bulk Assets dryRun now enforced for tag, glossary, dataProduct, and team removes
  • New Archived entity status — update any hard-coded status enums
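Several of the changes above can be handled client-side with small helpers. The sketch below is illustrative only: the helper names are hypothetical, the regular expression treats "alphanumeric" as ASCII letters and digits (the server's exact rule may be broader), and the manifest helper assumes the flag sits at the top level of the manifest.

```python
import re
from typing import Any, Collection, Dict

# Assumption: "alphanumeric" means ASCII letters and digits.
_CUSTOM_PROPERTY_NAME = re.compile(r"[A-Za-z0-9][^/~]*")


def is_valid_custom_property_name(name: str) -> bool:
    """True if the name starts with an alphanumeric character and has no '/' or '~'."""
    return _CUSTOM_PROPERTY_NAME.fullmatch(name) is not None


def migrate_app_manifest(manifest: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy where 'preview' is replaced by 'enabled' with the inverted value."""
    migrated = dict(manifest)
    if "preview" in migrated:
        preview = migrated.pop("preview")
        # Keep an explicit 'enabled' if the manifest already has one.
        migrated.setdefault("enabled", not preview)
    return migrated


def strip_removed_fields(payload: Dict[str, Any], removed: Collection[str]) -> Dict[str, Any]:
    """Return a copy of a request payload without fields the API no longer accepts."""
    return {key: value for key, value in payload.items() if key not in removed}
```

For example, `strip_removed_fields(payload, {"from"})` could be applied to feed payloads and `strip_removed_fields(payload, {"semanticSearch"})` to search payloads before sending them.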

Operational Notes

  • Quartz tables cleared during migration — stop all instances before upgrading
  • Postgres fqnHash text_pattern_ops indexes added or replaced — runbook included in migration file if build is interrupted
  • New tables for MCP services, servers, executions, RDF indexing jobs, partitions, locks, and server stats
  • SERVER_CHANGE_LOG historical gaps backfilled — missing entries caused data-insights timeline holes
  • Profiler pipeline cleanup force-executed on upgrade to clear stuck pre-1.13 state
  • LOG_FORMAT=json now supported — review any custom Dropwizard logging config
  • QoS admission enabled by default — check QOS_* settings if adjustment needed
  • Redis per-command timeout defaults to 300ms — tune for slow Redis deployments
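As a quick pre-upgrade check, it can help to list which of the environment settings named above are already set on a host. The helper below only filters by the names given in these notes (LOG_FORMAT and the QOS_ prefix); it does not know which individual QOS_* variables exist, and its name is hypothetical.

```python
import os
from typing import Dict, Mapping


def upgrade_relevant_env(environ: Mapping[str, str]) -> Dict[str, str]:
    """Return LOG_FORMAT and any QOS_* variables from the given environment, sorted by name."""
    return {
        name: environ[name]
        for name in sorted(environ)
        if name == "LOG_FORMAT" or name.startswith("QOS_")
    }


if __name__ == "__main__":
    # Print the settings found in the current process environment.
    for name, value in upgrade_relevant_env(os.environ).items():
        print(f"{name}={value}")
```

An empty result means the server will run with the defaults for these settings after the upgrade.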

Changelog

Search and Reindexing

  • Fixed nested children causing Elasticsearch/OpenSearch mapping-depth failures
  • Fixed stale file-extension aggregation on v1.13.0 upgrade causing 500 errors on file search
  • Fixed stale flattened-children highlight field on v1.13.0 upgrade causing 500 errors on container search
  • Fixed search_after silently dropping entities when sort value contains a comma
  • Fixed query, worksheet, and file reindexing missing relationship fields
  • Fixed search-index alias resolution for entity-specific and OpenSearch cluster prefixes
  • Fixed batch-prefetch of upstream lineage leaking Hikari connections during bulk reindex
  • Fixed soft-delete propagation to time-series child aliases
  • Fixed clean reindex jobs incorrectly marked failed when only warnings existed
  • Fixed text-field sorting and aggregation .keyword resolution
  • Fixed user index searches on nested owners queries
  • Fixed advanced-search Contains and Not Contains operators for description field
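The changelog does not explain how the search_after bug arose. One plausible mechanism, assumed here purely for illustration, is a pagination cursor serialized as a comma-joined string: any sort value that itself contains a comma is split into the wrong number of parts. The function names below are hypothetical, and JSON is just one possible safe encoding.

```python
import json
from typing import Any, List


def naive_encode(sort_values: List[Any]) -> str:
    """Join sort values with commas (lossy when a value contains a comma)."""
    return ",".join(str(value) for value in sort_values)


def naive_decode(token: str) -> List[str]:
    """Split on commas; cannot tell separators from commas inside values."""
    return token.split(",")


def safe_encode(sort_values: List[Any]) -> str:
    """Serialize the cursor as JSON so value boundaries and types survive."""
    return json.dumps(sort_values)


def safe_decode(token: str) -> List[Any]:
    """Recover the original sort values from a JSON cursor."""
    return json.loads(token)
```

With a cursor of `["Sales, EMEA", 42]`, the naive round trip yields three string parts instead of two values, while the JSON round trip returns the original list.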

Glossary, Tags, and Governance

  • Fixed glossary relation rendering for multiple relation types between the same term pair
  • Fixed related-term tooltip sanitization and relation badge colors and icons
  • Fixed tag rename and relationship cache invalidation
  • Fixed TagLabel server fields lost when saving tags
  • Fixed certification tags leaking into regular tags and missing appliedBy audit trail
  • Fixed soft-deleted users appearing in experts and reviewers selectors
  • Fixed hyperlink workflow rules and Tags/Tier field ambiguity

Data Quality and Profiler

  • Fixed test-case suite search membership preservation
  • Fixed tier and certification filter queries in Data Quality dashboard
  • Fixed incident manager status and severity chip behavior
  • Fixed TableColumnCountToBeBetween API responses
  • Fixed column profile percentages showing 0% for zero proportions
  • Fixed tableCustomSQLQuery ignoring computePassedFailedRowCount flag
  • Fixed orphan test cases breaking search indexing
  • Fixed sample randomization at 100% sample

Ingestion and Connectors

  • Fixed single bad table aborting entire schema ingestion run
  • Fixed Snowflake and OpenMetadata socket waits causing silent hangs
  • Fixed Power BI lineage buffer flushing, TSQL Sql.Database parsing, and workspace cache scope
  • Fixed Databricks nested column descriptions and SQLAlchemy 2.x compatibility
  • Fixed Databricks and Unity Catalog valueless tags being silently dropped
  • Fixed Datalake JSON columns typed as string for empty dict values
  • Fixed MySQL profiler median query quoting and deterministic behavior
  • Fixed Redshift interval, numeric, and timestamp precision parsing, view definition, IAM auth, and LISTAGG errors
  • Fixed Oracle, MSSQL, Athena, and Redshift profiler under SQLAlchemy 2.0
  • Fixed dbt column tags, snapshot model patching, compiled-only test results, and test entity links
  • Fixed SQL Server temporal-table period columns classified as PII
  • Fixed SQLAlchemy engine resource leak on multi-database source iteration
  • Fixed ADLS object counts scoped to configured sub-path
  • Fixed PII recognizer selection based on configured language
  • Fixed runtime spaCy model loading for non-root containers

UI and UX

  • Fixed unknown service categories returning 404
  • Fixed Explore page column icon display, search term warnings, and text overflow
  • Fixed lineage edge misalignment, edge hover, temporary lineage table nodes, and service nodes
  • Fixed table constraints UI and cluster-key constraint display and editing
  • Fixed dotted custom-property names display
  • Fixed custom relation badge color handling and overlapping badges
  • Fixed activity feed, task notification refresh, and approval task rendering
  • Fixed MSAL and SAML token renewal and Safari SSO session loss
  • Fixed copy-to-clipboard in non-secure contexts
  • Fixed charts not deleted when parent dashboard or service is deleted
  • Fixed column.extension values silently dropped on entity creation

Security and Dependencies

  • AWS SDK pinned to 2.41.30 — clears CloudFront CVE
  • Airflow upgraded to 3.2.1 — clears 7 CVEs
  • gnutls, libcap, openssh, and rsync CVEs closed in ingestion Docker images
  • Test-connection workflow triggers now require authorization
  • Python ingestion: explicit jsonify at route level to break XSS taint chain
  • Axios, dompurify, follow-redirects, and related UI CVE fixes
  • Jetty and pac4j upgraded for Java-side CVEs
