Release Highlights
User Experience
- Improvements to UI-based ingestion: view live logs during execution, view an ingestion summary (i.e., number of entities ingested), and roll back ingestion runs. CLI-run ingestion jobs are now surfaced in the UI as well.
- New look on Homepage: Domains have been promoted to the top of the fold, so they are listed above Entity cards and Platform cards
- Improvements to searching for Looker resources: when searching for a measure or dimension, we now surface Looks & Dashboards that reference those fields
- The DataHub Docs Site has a new look! We are reorganizing content to make it easier and more intuitive for DataHub Developers and End-Users alike to navigate our resources.
- Improved Error Handling on the UI: much nicer messaging when exceptions are caught by the frontend application.
- Misc minor bug fixes and improvements
Developer Experience
- Eternal personal access tokens are now supported
- Deprecated support for Python 3.6 (we expect this to have little-to-no impact on the Community based on pip download data)
Metadata Ingestion
- Improved documentation for Domains transformer
- Stateful Ingestion now supported for Glue
- data-lake Source has been deprecated in favor of s3 source
- Chart Entity now supports chartUsageStatistics
- dbt ingestion supports auto-extracting owner from the meta block
- Improved Snowflake Connector is now available; we expect this to provide a reduction in ingestion run-time and lower levels of complexity
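As a rough illustration of the dbt owner extraction mentioned above, the sketch below builds a recipe-shaped dictionary with the enable_owner_extraction option (named in the changelog below) switched on. Everything else is an assumption for illustration only: the file paths, the target platform, the sink address, and the other config keys should all be checked against the dbt source documentation for your DataHub version.

```python
import json

# Minimal sketch of an ingestion recipe, expressed as a Python dict.
# Only "enable_owner_extraction" comes from these release notes; the
# remaining keys and values are illustrative assumptions.
recipe = {
    "source": {
        "type": "dbt",
        "config": {
            # Hypothetical paths to dbt artifacts.
            "manifest_path": "./target/manifest.json",
            "catalog_path": "./target/catalog.json",
            # Assumed warehouse platform for the dbt models.
            "target_platform": "snowflake",
            # Option added in this release (see #5707).
            "enable_owner_extraction": True,
        },
    },
    "sink": {
        "type": "datahub-rest",
        # Hypothetical local DataHub endpoint.
        "config": {"server": "http://localhost:8080"},
    },
}

# Print the recipe so it can be inspected or copied into a recipe file.
print(json.dumps(recipe, indent=2))
```

Setting the flag to False in the same position would be the natural way to opt out, assuming the option behaves as a plain boolean toggle.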
What's Changed
- chore(ingest): remove orderedset dependency by @hsheth2 in #5591
- refactor(ingest): simplify upgrade version stats by @hsheth2 in #5588
- feat(metadata-service-auth): add support for eternal personal access tokens by @ksrinath in #5433
- fix(ci): paths for github workflows by @anshbansal in #5595
- fix(ingest): Fix ingest Clickhouse without password by @liyuhui666 in #5511
- fix(ci): cleanup sleeps to instead use retries by @anshbansal in #5597
- Kafka form Addition and resolved confilict by @Ankit-Keshari-Vituity in #5598
- fix(ingest): Fix minor logging bug in the glue source. by @rslanka in #5605
- fix(ci): use different image for smoke base image by @anshbansal in #5607
- fix(ci): cancel docker-unified workflow only on PRs on new commits by @anshbansal in #5608
- fix(ci): add env variable for creds smoke test by @anshbansal in #5609
- fix(ui) Followups to recent changes to UI ingestion forms by @chriscollins3456 in #5602
- docs(transformers): Add domain transformer documentation in transformers readme by @mohdsiddique in #5606
- feat(model): adding status aspect to assertions by @shirshanka in #5612
- fix(ingest): use default telemetry ID when config is unwritable by @hsheth2 in #5614
- chore(ingest): drop python 3.6 support by @hsheth2 in #5521
- fix(ui): Split based on Data Platform delimiter in Lineage viz by @jjoyce0510 in #5613
- feat(search): Sticky search filters + misc bug fixes & improvements by @jjoyce0510 in #5601
- fix(graphql): handle null source values in ml features & primary keys by @gabe-lyons in #5626
- fix(graph service): only query for entities that should have lineage [Breaking Change] by @gabe-lyons in #5539
- feat(model): Add optional message field to auditstamp by @gabe-lyons in #5611
- fix(ingest): fix indenting issue in azure ad connector by @aditya-radhakrishnan in #5627
- feat(tokens) Create and display non-expiring tokens on the frontend by @chriscollins3456 in #5630
- Schema tab: Fixed the header issue by @Ankit-Keshari-Vituity in #5622
- build(docs-website): only show release notes for recent releases by @hsheth2 in #5621
- docs(README): update links and reorg content by @maggiehays in #5618
- perf(operations): performance improvement to operations tab via reduced fetching by @gabe-lyons in #5632
- feat(ui) Retrieve last ingested timestamp and display on frontend by @chriscollins3456 in #5600
- Update README.md and maintaining consistency by @hemanthkotaprolu in #5623
- fix(ingest): fix delta-lake dict iteration bug by @hsheth2 in #5625
- fix(ingest): okta - make async loop init more robust by @shirshanka in #5640
- fix(ingest): cli - handle exception in upgrade check by @shirshanka in #5641
- build(docs-website): make codegen script idempotent by @hsheth2 in #5620
- docs(airflow): fix formatting by @hsheth2 in #5617
- fix(ui): Fixing minor search redirect filtering issue introduced by sticky filters by @jjoyce0510 in #5643
- fix(ingestion): Update developer docs by @szalai1 in #5644
- feat(ui): Adding slack handle to corp group info by @jjoyce0510 in #5645
- fix(delta-table): allow env, credential file based s3 auth by @MugdhaHardikar-GSLab in #5636
- feat(GraphQL API): Add "browsePaths" field to browsable entity types by @jjoyce0510 in #5646
- feat(ingest): generate a list of aspects in codegen by @hsheth2 in #5633
- feat(ingestion): Glue stateful ingestion by @amanda-her in #5553
- feat(ingest): add snowflake-beta source by @mayurinehate in #5517
- fix(ingest): remove alphabet field from allow/deny config by @hsheth2 in #5629
- feat(mssql): add multi database ingest support by @MugdhaHardikar-GSLab in #5516
- chore(ingest): drop data-lake source in favor of s3 source by @hsheth2 in #5628
- fix(ingest): use mongodb ping command to test connection by @hsheth2 in #5650
- fix(ingest): remove profile_sql_table event by @hsheth2 in #5616
- fix(ci): use graphql instead of restli by @anshbansal in #5610
- feat(ingest): rest_emitter - Adding option to disable ssl by @szalai1 in #5642
- feat(ingest): GE Profile/Action Trino support by @aezomz in #5361
- Stats Tab: Table and column stats hide when there is no data by @Ankit-Keshari-Vituity in #5651
- fix(ingest): redash - fix redash dashboard url bug by @de-kwanyoung-son in #5500
- Glossary: Worked on the refetching data issue by @Ankit-Keshari-Vituity in #5638
- feat(ingestion) Fetch live logs on an ingestion run from UI by @chriscollins3456 in #5653
- fix(spark-lineage): Create application setup on sqlevent start by @MugdhaHardikar-GSLab in #5657
- fix(ui) Remove constraint for searching with less than 3 characters by @chriscollins3456 in #5654
- docs: adds ABLY as DataHub adopter by @de-kwanyoung-son in #5656
- fix(siblings): set sleep after checking if the restore step should run by @gabe-lyons in #5660
- fix(users): add origin aspect to corpuser by @aditya-radhakrishnan in #5662
- feat(domains): highlighting domain recommendation cards on homepage by @gabe-lyons in #5655
- feat(ingestion) Followups to live ingestion logs in UI by @chriscollins3456 in #5676
- feat(test): add option to send to slack thread by @anshbansal in #5673
- chore(ingest): set min stackprinter version by @hsheth2 in #5666
- docs(airflow): fix note formatting by @hsheth2 in #5679
- docs: fixes typos in Business Glossary docs by @topleft in #5615
- fix(docs) Fix link from Business Glossary ingestion page by @chriscollins3456 in #5680
- Worked on the Hive ingestion form by @Ankit-Keshari-Vituity in #5661
- feat(ingestion): Support for displaying history of CLI ingestion runs in the "Manage Ingestion" UI by @rslanka in #5639
- Search Page: Pagination Issue by @Ankit-Keshari-Vituity in #5685
- feat(ingestion-ui) Display CLI-based ingestion sources in UI by @chriscollins3456 in #5681
- fix(schema-history): make latestVersion field on result optional by @aditya-radhakrishnan in #5689
- feat(ingest): file - add support for folders, large files, improve co… by @shirshanka in #5692
- feat(ingest): rest-sink - stability improvements to handle large inpu… by @shirshanka in #5693
- Add UP_FOR_RETRY DPI run result by @divyamanohar-stripe in #5664
- feat(ingest): add support for a event failure log + reporting cancelled runs on cli by @shirshanka in #5694
- fix(doc): Fixing boolean type in datahub rest emitter's json schema by @treff7es in #5695
- fix(ui) Refresh executions on Ingestion page when they are visible by @chriscollins3456 in #5698
- fix(ingest): emit status aspect for entities ingested from okta and azure_ad by @aditya-radhakrishnan in #5700
- feat(kafka-setup): Adds SASL SSL support in kafka setup docker image by @pedro93 in #5697
- fix(ingest): refactor sync-async config, thread-safety for sink repor… by @shirshanka in #5705
- feat(ingest): add enable_owner_extraction option to dbt by @hsheth2 in #5707
- feat(ingestion): add github_info config for dbt by @remisalmon in #5648
- docs(ingest): add info about datahub auth tokens with airflow by @hsheth2 in #5703
- fix(airflow): Stable tag order in DataFlow/DataJobs by @treff7es in #5696
- fix(ingest): add pymongo srv extra by @hsheth2 in #5701
- fix(ui): Long overdue - Fix red error screens during OIDC login, logout exception scenarios by @jjoyce0510 in #5708
- feat(ingest): better reporting for file source, friendlier stats names by @shirshanka in #5710
- Worked on postgres ingestion form integration by @Ankit-Keshari-Vituity in #5671
- feat(ingest): Add mode option to presto-on-hive source by @szalai1 in #5659
- Worked on the alignment of all data in domain list by @Ankit-Keshari-Vituity in #5713
- feat(retention) Enable retention and set max versions for executionRequests by @chriscollins3456 in #5704
- fix(ingestion): Fix nifi integration tests. by @rslanka in #5718
- build(deps): bump nbconvert from 6.5.0 to 6.5.1 in /docker/datahub-ingestion by @dependabot in #5716
- feat(ingest): remove nulls during serialization by @shirshanka in #5719
- feat(looker): index looker charts and dashboards by business term by @gabe-lyons in #5649
- fix(GMS): No such classes directory file:///etc/datahub/plugins/auth/r… by @mohdsiddique in #5720
- fix(ingestion): ingest tables from dba_tables in oracle source by @mohdsiddique in #5592
- fix(ingest): redshift-usage: check full table/schema names with AllowDenyPattern by @hsheth2 in #5702
- Worked on the scroll to top of the page after pagination change by @Ankit-Keshari-Vituity in #5714
- feat(ingest): round time to 2 decimal places by @anshbansal in #5721
- fix(superset): do not crash when display_uri is not set by @daha in #5711
- fix(deps): remove tdigest dependency and associated code by @shirshanka in #5729
- fix(ingest): bigquery - Not setting ge config schema when profiling with temp table by @treff7es in #5737
- feat(ingest): file - allow filter by aspect and get stats by @anshbansal in #5738
- fix(ingest): looker - soft-deleted charts should re-emerge on re-disc… by @shirshanka in #5732
- feat(elasticsearch): Add nested type display by @liyuhui666 in #5524
- fix(docs): fixes issue with auto-generated ingestion doc by @shirshanka in #5733
- feat(mysql): support multiple database in single recipe by @MugdhaHardikar-GSLab in #5684
- fix(ingest): tweak mongodb schema inference to fix test by @hsheth2 in #5744
- fix(bootstrap): Remove malformed test in bootstrap.json by @jjoyce0510 in #5747
- docs(site redesign): Overhaul Docs Site by @maggiehays in #5731
- fix(ingestion): Fix SQL Lineage Parser to handle special tokens with a hyphen in table and column names. by @rslanka in #5748
- Snowflake beta improvements by @mayurinehate in #5736
- chore(ingest): update mixpanel api endpoint by @hsheth2 in #5750
- feat(model): add chartUsageStatistics to the chart entity by @shirshanka in #5753
- fix(ui): Improve Error Messaging on the UI by @jjoyce0510 in #5752
- chore(ingest): add vulture config and remove some dead code by @hsheth2 in #5745
- fix(doc): presto-on-hive - Removing new lines from docs to fix doc generation by @treff7es in #5755
- feat(restore-indices): add multithreading and add aspectName, urn filter by @anshbansal in #5712
- fix(GMS): fix no such classes directory file:///etc/datahub/plugins/auth/resources by @mohdsiddique in #5743
- feat(ingestion) Add ability to rollback ingestion from UI - BE PR by @chriscollins3456 in #5739
- feat(ingestion-ui) Add ability to set debug_mode on UI ingestion sources by @chriscollins3456 in #5762
- fix(search): validate entities exist before returning search results in EntityClient by @aditya-radhakrishnan in #5751
- feat(ingestion-ui) Add ability to rollback ingestion runs from the UI - FE only by @chriscollins3456 in #5740
- fix(ingest): proper null skip logic in serialization by @hsheth2 in #5749
- fix(ingest): snowflake-beta fix missing initialization of variable by @mayurinehate in #5757
- fix(ingest): add databricks dep for hive by @hsheth2 in #5764
- feat(ingest): add config to extractor interface by @hsheth2 in #5761
- chore: update server-side telemetry endpoint by @hsheth2 in #5759
- feat(ingestion): bigquery - Bigquery beta connector - first cut by @treff7es in #5663
- feat(ingestion): looker chart usage statistics by @mohdsiddique in #5652
- feat(restore-indices): add urn like filter by @anshbansal in #5770
- feat(restore-indices): add timing info by @anshbansal in #5773
- feat(simplified homepage): adding option to show limited entity types on homepage by @gabe-lyons in #5678
- fix(ingest): add pydantic version upper bound by @hsheth2 in #5775
- Worked on the Secret Fields in ingestion form by @Ankit-Keshari-Vituity in #5727
- feat(cli): add spinner to indicate progress by @shirshanka in #5769
- feat(roles): add roles feature to DataHub by @aditya-radhakrishnan in #5767
- feat(model): add storage size to dataset profiles by @shirshanka in #5777
- docs(roles): add documentation about roles by @aditya-radhakrishnan in #5778
- fix(ui): Remove add limit on Entity Profile for glossary terms and tags by @jjoyce0510 in #5780
- fix(ci): Attempting to fix failing smoke tests by @jjoyce0510 in #5760
- fix(tags) Add creator of tag as the owner of it by @chriscollins3456 in #5787
- docs(lookml): updating github_info in lookml docs by @gabe-lyons in #5779
- fix(audit logs) Set actor urn on audit stamp through Java Entity Client by @chriscollins3456 in #5788
- feat(ingestion-ui) Add test connection button to Looker form by @chriscollins3456 in #5794
- fix(ingestion): fix looker chart-usage by @mohdsiddique in #5791
- fix(ingest): Fix oauth config validation in snowflake. by @rslanka in #5796
- fix(bootstrap): Creating dedicated thread pool for executing async bootstrap steps + misc fixes by @jjoyce0510 in #5798
- feat(previews): add previews for glossary terms, tags, and domains by @gabe-lyons in #5784
New Contributors
- @hemanthkotaprolu made their first contribution in #5623
- @szalai1 made their first contribution in #5644
- @amanda-her made their first contribution in #5553
- @de-kwanyoung-son made their first contribution in #5500
- @topleft made their first contribution in #5615
- @divyamanohar-stripe made their first contribution in #5664
Full Changelog: v0.8.43...v0.8.44