DataHub Release 0.8.18 is here!
Release Highlights
- Metadata Service Authentication: Make authenticated requests to the Metadata Service APIs (GraphQL + Rest.li)
- Redshift Lineage: Out-of-the-box support for ingesting Dataset->Dataset lineage from Redshift system tables. Includes Tables, Views, and COPY from S3
- Apache NiFi Connector (Beta): Integration with Apache NiFi to extract DataJobs and DataFlows! Read the source docs here. This source is currently incubating in beta.
- Mode Connector (Beta): Integration with Mode Analytics to extract reports, charts, and more! Read the source docs here. This source is currently incubating in beta.
- Add Aspects without a fork: This is a major milestone towards No-Code UI
  - Watch the No Code UI Sneak Peek
- Glossary Term Transformer: Allows users to add tags or glossary terms to entities based on a regex match filter (Shoutout to Community Member ecooklin!)
- Bug Fixes:
  - [metadata service] Fix issue where an empty search query failed to resolve
  - [metadata service] Log4j vulnerability (CVE-2021-44228) addressed! We highly recommend upgrading to the latest release.
  - [metadata ingestion] [bigquery] Fix handling of partitioned & snapshotted tables for lineage, usage, and basic table indexing.
  - [metadata service] [recommendations] Fix issue where recently viewed and most popular recommendations were not showing up when the user urn contains special characters.
  - [metadata ingestion] Add config to specify CA certificate path for datahub-rest sink
  - [metadata ingestion][snowflake] Handling for special characters in Snowflake databases and schemas.
  - [ui] Fix Groups page not showing asset ownership correctly
  - [ui] Fix issue where markdown links were not clickable.
  - [metadata service] Improve search & recommendations performance by ~50%, homepage load by ~50%.
  - [cli] Fix issue where deletes by search could not accept an auth token
  - [metadata service][policies] Fix invalid Tag creation policy
  - [metadata service][upgrade] Fix Spring injection of Entity Client inside datahub-upgrade
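
To make the Glossary Term Transformer highlight concrete, here is a minimal sketch of the idea of attaching glossary terms based on a regex match. It is not the DataHub implementation: the function name `terms_for_urn`, the `RULES` mapping, and the example URNs are all hypothetical, and matching with `re.match` (anchored at the start of the string) is an assumption made for illustration.

```python
import re
from typing import Dict, List


def terms_for_urn(urn: str, rules: Dict[str, List[str]]) -> List[str]:
    """Return the glossary terms whose regex rule matches the given entity URN.

    `rules` maps a regex pattern to the list of term URNs to attach.
    Terms are returned in rule order, without duplicates.
    """
    matched: List[str] = []
    for pattern, terms in rules.items():
        # re.match anchors at the start of the string (an assumption of this sketch).
        if re.match(pattern, urn):
            for term in terms:
                if term not in matched:
                    matched.append(term)
    return matched


# Hypothetical rules and term URNs, for illustration only.
RULES: Dict[str, List[str]] = {
    r".*customer.*": ["urn:li:glossaryTerm:CustomerData"],
    r".*\.pii\..*": [
        "urn:li:glossaryTerm:Sensitive",
        "urn:li:glossaryTerm:CustomerData",
    ],
}

if __name__ == "__main__":
    example_urn = (
        "urn:li:dataset:(urn:li:dataPlatform:snowflake,"
        "analytics.pii.customer_emails,PROD)"
    )
    print(terms_for_urn(example_urn, RULES))
```

An entity that matches several rules receives the union of their terms; one that matches none is left unchanged. See the transformer source docs for the actual configuration format.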
Backwards Incompatible Changes
- The standalone Spring GraphQL Service has been removed. (Replaced in full by Metadata Service GraphQL API)
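
Clients that previously called the standalone GraphQL service would now target the Metadata Service GraphQL API, which can require authentication as of this release. The sketch below only builds such a request, without sending it. The `/api/graphql` path, the `Authorization: Bearer <token>` header scheme, the `localhost:8080` address, and the helper name `build_graphql_request` are assumptions for illustration; check the Metadata Service Authentication docs for the exact details of your deployment.

```python
import json
import urllib.request


def build_graphql_request(base_url: str, token: str, query: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated GraphQL POST request.

    Assumes the Metadata Service serves GraphQL at <base_url>/api/graphql
    and accepts an access token as a Bearer credential.
    """
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/graphql",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder host and token; `{ __typename }` is a generic GraphQL query.
    request = build_graphql_request(
        "http://localhost:8080", "<my-access-token>", "{ __typename }"
    )
    print(request.get_method(), request.full_url)
    # To actually send it: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes the header and URL handling easy to check before pointing it at a live Metadata Service.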
New Contributors
- @robscriva made their first contribution in #3600
- @adriangb made their first contribution in #3582
- @bartlomiejolma made their first contribution in #3650
- @anshbansal made their first contribution in #3653
- @ecooklin made their first contribution in #3657
What's Changed
- style(react-app): add default monospace font to font-family by @robscriva in #3600
- feat(boot): Ingest datahub root user info on boot by @jjoyce0510 in #3603
- [refactor] - Remove GMS GraphQL Service by @arunvasudevan in #3605
- feat(auth): Metadata Service Authentication! by @jjoyce0510 in #3598
- docs:remove hubspot form and instead link to acryldata.io by @jeffmerrick in #3488
- fix(docs): Move transformers to be under metadata ingestion by @aseembansal-gogo in #3591
- fix(bigquery-usage): Fix filters and event joining logic. by @varunbharill in #3610
- feat(cli): adding a put command and docs by @swaroopjagadish in #3614
- feat(elastic): adding es logo by @gabe-lyons in #3611
- feat(profiler): dynamically combine queries by @hsheth2 in #3572
- doc(components): Adding DataHub components overview by @jjoyce0510 in #3606
- fix(java client): Fix Profiling NPE + misc improvements by @jjoyce0510 in #3621
- fix(docs-website): fix incorrect managed url by @jeffmerrick in #3618
- fix(ingest): rectify platform urn in kafka connect source by @mayurinehate in #3624
- docs(okta): Added Okta Logout Settings by @serefacet in #3627
- fix(search): Fix issue when query is empty by @dexter-mh-lee in #3620
- fix(redshift-usage): Add docs for redshift usage ingestion. by @varunbharill in #3617
- fix(ci): pin great expectations version by @swaroopjagadish in #3629
- fix(delete): Remove logic that adds an invalid filter for platform field by @dexter-mh-lee in #3619
- feat(metadata-service): support for custom model extensions without forks by @shirshanka in #3630
- fix(kafka-producer): fix debug logging by @claudio-benfatto in #3626
- fix(tests): fix typo in test name by @adriangb in #3582
- feat(cfg): Add configurable GCP log page size by @jjoyce0510 in #3556
- fix(recommendations): Fix issue with recently viewed and most popular recs not showing up by @dexter-mh-lee in #3631
- fix(ingestion): Add config to specify ca certificate path for datahub-rest sink by @dexter-mh-lee in #3632
- fix(ingest): workaround great-expectations compatibility issue by @hsheth2 in #3634
- fix(ingestion): Handling for special characters in snowflake databases and schemas. by @rslanka in #3635
- fix(group ownership): Fixing Groups Profile ownership by @jjoyce0510 in #3638
- feat(autorender): Auto render aspects that don't have frontend components in the UI by @gabe-lyons in #3597
- docs(business glossary): document the business glossary file format by @gabe-lyons in #3639
- fix(ingestion): Enhance supported and unsupported base_objects_accessed for Snowflake Usage by @rslanka in #3608
- feat(quickstart): Simplify docker generate and compare script by @EnricoMi in #3434
- fix(docs): small fixes to docs and docker images for custom metadata … by @swaroopjagadish in #3640
- fix(mongodb): enable version check for document size filter. by @varunbharill in #3644
- docs: Update to DataHub Adopter logos & Townhall details by @maggiehays in #3648
- feat(build): adds support for incremental build in ingestion by @swaroopjagadish in #3647
- fix(description): fix issue where markdown links are unclickable by @gabe-lyons in #3646
- fix(schema): fix bug where key/value toggle would appear on schema tabs with no fields by @gabe-lyons in #3643
- feat(build): Preflight script for metadata ingestion setup on m1 by @treff7es in #3652
- docs(graphql) Adding additional GraphQL docs by @jjoyce0510 in #3649
- docs: correct title of postgres gms by @bartlomiejolma in #3650
- fix(cli): fix for deletion cli by @anshbansal in #3653
- fix(metadata-io) Adds docker engine configuration checks before running docker-based tests by @pedro93 in #3654
- fix(model): Remove unused PDL from pre-nocode days by @dexter-mh-lee in #3659
- fix(docs): fix docs build on m1 by @anshbansal in #3662
- feat(ingest): add --strict-warnings option by @hsheth2 in #3665
- fix(search): Improve search and recs performance by @dexter-mh-lee in #3660
- feat(metadata-model): adding metadata model doc generation and upload… by @swaroopjagadish in #3667
- fix(ingestion): black formatting by @hsheth2 in #3676
- fix(metadata-ingestion): fix requirements for m1 preflight checks by @gabe-lyons in #3677
- fix(kafka): Add back changes to centralize kafka config by @dexter-mh-lee in #3675
- feat(ingestion): anonymous usage stats by @kevinhu in #3668
- docs(scheduling): re-arrange docs related to scheduling, lineage, CLI by @anshbansal in #3669
- feat(delete): support deleting by search w/ tokens by @gabe-lyons in #3684
- docs: change roadmap link in docs by @jeffmerrick in #3685
- docs(business glossary): fix specification of the file by @anshbansal in #3679
- refactor(profiling): clean up SQL query analysis by @hsheth2 in #3674
- fix(tags): Fixing Tag Create Privileges (issue #3609) by @jjoyce0510 in #3683
- fix(elasticsearch): Use auth tokens to authorize curl requests in dockerize by @dexter-mh-lee in #3596
- fix(snowflake): support geo types by @gabe-lyons in #3686
- feat(profiler): add query combiner report statistics by @hsheth2 in #3678
- feat(transformer) Adds glossary terms transformer by @ecooklin in #3657
- fix(deletes): Fixing system metadata index deletes by @jjoyce0510 in #3693
- feat(ingest): add nifi source in metadata-ingestion by @mayurinehate in #3681
- feat(bigquery): support snapshot and partition tables in bigquery ingest & lineage by @gabe-lyons in #3695
- fix(ingest): refactor urn deletion by @kevinhu in #3694
- fix(perf-test): fix for M1 by @anshbansal in #3689
- fix(bootstrap): revert accidental change to file_to_datahub_rest.yml by @gabe-lyons in #3698
- feat(ingestion): Add lineage support for Redshift source by @gabe-lyons in #3697
- fix(ingestion): Disable query parser failure reporting to Datahub in redshift lineage by default by @treff7es in #3699
- docs(airflow): add some troubleshooting for error by @anshbansal in #3687
- docs(redshift): Adding requirements for redshift permissions by @treff7es in #3707
- fix(nifi): add env in nifi config, add unit tests, fix nifi doc by @mayurinehate in #3703
- feat(mode): add mode analytics ingestion source by @gabe-lyons in #3710
- fix(url encoding): also encode square brackets by @gabe-lyons in #3709
- fix(datahub-upgrade): Fix Spring injection issue with datahub-upgrade by @dexter-mh-lee in #3688
- docs(guide): add example for adding user in DataHub by @anshbansal in #3682
- fix(home): Change docs count to not count removed datasets by @dexter-mh-lee in #3711
- Fix CVE-2021-44228 by @frsann in #3716
Full Changelog: v0.8.17...v0.8.18