github datahub-project/datahub v0.10.3

11 months ago

Release Highlights

User Experience

  • Define Data Products via YAML and manage associated entities within a Domain
  • Search experience: quickly apply a filter at time of search
  • Form-based PowerBI ingestion

Developer Experience

  • Progress toward removing the Confluent Schema Registry requirement; Helm & Quickstart simplifications to follow
  • Delete CLI - correctly handles deleting timeseries aspects
  • Ongoing improvements to Quickstart stability
  • Support entity types filter in get_urns_by_filter
  • Search customization
    • regex-based query matching
    • full control over scoring functions (usable on any document field, e.g. tags, deprecated flags, etc.)
    • enable/disable fuzzy, prefix, and exact match queries
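
To make the search customization items above more concrete, here is a minimal, self-contained Python sketch of the general idea: a list of rules is checked in order, the first rule whose regex matches the query decides which match types are enabled, and per-field multipliers adjust a document's score. This is an illustration only and not DataHub's implementation or configuration format; QueryRule, pick_rule, adjust_score, and all field names are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class QueryRule:
    """Hypothetical rule: which match types apply to queries matching `pattern`."""
    pattern: str
    fuzzy: bool = True
    prefix: bool = True
    exact: bool = True
    tag_boost: float = 1.0           # multiplier for documents that carry tags
    deprecated_penalty: float = 1.0  # multiplier for deprecated documents


def pick_rule(query: str, rules: List[QueryRule]) -> Optional[QueryRule]:
    """Return the first rule whose regex matches the whole query, else None."""
    for rule in rules:
        if re.fullmatch(rule.pattern, query):
            return rule
    return None


def adjust_score(base: float, doc: Dict[str, Any], rule: QueryRule) -> float:
    """Apply the rule's field-based multipliers to a base relevance score."""
    score = base
    if doc.get("tags"):
        score *= rule.tag_boost
    if doc.get("deprecated"):
        score *= rule.deprecated_penalty
    return score


# Assumed example rules: identifier-like queries skip fuzzy and prefix
# matching; everything else uses the defaults and down-weights deprecated docs.
RULES = [
    QueryRule(pattern=r"[a-z]+_id", fuzzy=False, prefix=False),
    QueryRule(pattern=r".*", deprecated_penalty=0.5),
]
```

With these assumed rules, a query such as customer_id would be handled by the first rule (exact matching only), while a free-text query falls through to the catch-all rule, where a deprecated document's score is halved.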

Ingestion

  • BigQuery - Improve ingestion disk usage & speed; extract dataset usage from Views
  • Unity Catalog - Capture create/last modified timestamps; extract usage; data profiling support
  • PowerBI - Update workspace concept mapping; support modified_since, extract_dataset_schema, and more
  • Superset - Support stateful ingestion
  • Business Glossary - Simplify ingestion source
  • Kafka - Add description in dataset properties
  • S3 - Support stateful ingestion & last_updated
  • CSV Enricher - Support updating more types
  • PII Classification - Configurable sample size
  • Nifi - Support Kerberos authentication

Full Changelog: v0.10.2...v0.10.3
