github datahub-project/datahub v0.10.1
DataHub v0.10.1

12 months ago

Release Highlights

User Experience

  • The Queries Tab has a new look - supports manually adding and annotating queries directly from the UI, making it easier to share trusted SQL logic with others
  • Glossary Terms now show "Contained by" and "Inherited by" relationships
  • Resolved issues with Download to CSV for large volumes of entities
  • Update to the Analytics tab - view Monthly Active Users to keep track of DataHub adoption and activity within your organization
  • Ongoing UI optimizations focused on improving the navigation experience

Metadata Ingestion

BigQuery

  • Improvements to memory usage during metadata extraction
  • Ingestion now captures Dataset Labels
  • Emit cross-project usage

PowerBI

  • Support for Platform Instance, to uniquely identify multiple instances of the same platform
  • Support for PowerBI <> (Redshift, BigQuery) lineage extraction
  • Extract entity descriptions

Miscellaneous

  • DataHub Integrations Catalog to quickly filter and search for supported integrations
  • Kafka Connect - support for stateful ingestion & lowercasing URNs
  • Snowflake: improvements to memory usage during metadata extraction
  • Postgres: supports estimated row counts during profiling
  • Fix to dbt ingestion to address inconsistent upper/lower casing
  • S3 ingestion now supports path_specs of multiple buckets in the same recipe
  • Looker: Upgrade Looker API from 3.1 to 4.0
  • Great Expectations: support for lowercasing URNs
  • Tableau: Support for Project Path & Containers; ingestion more resilient to timeout exceptions
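
Several items above concern URN casing (Kafka Connect, dbt, Great Expectations). As a rough illustration of why casing matters, the sketch below builds dataset URN strings by hand and shows that two spellings of the same table only resolve to one URN once the name is lowercased. The helper `make_urn`, its `lowercase` flag, and the table names are hypothetical and written for this example; they are not DataHub's API.

```python
def make_urn(platform: str, name: str, env: str = "PROD", lowercase: bool = False) -> str:
    """Build a dataset URN string by hand (hypothetical helper, not DataHub's API)."""
    if lowercase:
        name = name.lower()
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


# The same table reported with different casing by two tools (made-up names).
from_warehouse = "ANALYTICS.PUBLIC.ORDERS"
from_dbt = "analytics.public.orders"

# Without lowercasing, the two spellings become two distinct URNs.
distinct_urns = {make_urn("snowflake", n) for n in (from_warehouse, from_dbt)}
# With lowercasing, they collapse into a single URN.
merged_urns = {make_urn("snowflake", n, lowercase=True) for n in (from_warehouse, from_dbt)}

print(len(distinct_urns), len(merged_urns))
```

Whether lowercasing is appropriate depends on the source system; case-sensitive platforms may need the original casing preserved.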

Developer Experience

Miscellaneous

  • Neo4j support for lineage time filter
  • Metadata model support for JSON schemas stored in Files, Directories, and Kafka Schema Registry
  • Timeline API now supports Glossary Terms
  • Improvements to startup time for DataHub CLI

API Docs & Guides

  • Table of contents to understand DataHub APIs at a glance
  • Guides:
    • Add Tags, Terms, Owners to entities
    • Create datasets
    • Manage Lineage

Search Improvements

  • searchAcrossEntities/Lineage improvements
  • support searchAfter
  • advanced query, identity autocomplete, exact match weight
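
For readers unfamiliar with the idea behind searchAfter: instead of a numeric offset, each page request passes the sort key of the last hit already seen. The sketch below simulates this over an in-memory list. The function name `search_after`, the `(score, urn)` sort key, and the sample data are assumptions for illustration only and do not reflect DataHub's actual API.

```python
from typing import List, Optional, Tuple

Hit = Tuple[int, str]  # (score, urn): illustrative sort key


def search_after(hits: List[Hit], size: int, after: Optional[Hit] = None) -> List[Hit]:
    """Return the next `size` hits that sort strictly after `after`."""
    ordered = sorted(hits, key=lambda h: (-h[0], h[1]))  # score desc, urn asc
    if after is not None:
        cursor = (-after[0], after[1])
        ordered = [h for h in ordered if (-h[0], h[1]) > cursor]
    return ordered[:size]


hits = [(5, "urn:a"), (9, "urn:b"), (5, "urn:c"), (7, "urn:d"), (1, "urn:e")]

pages = []
cursor = None
while True:
    page = search_after(hits, size=2, after=cursor)
    if not page:
        break
    pages.append(page)
    cursor = page[-1]  # the last hit becomes the cursor for the next request

print(pages)
```

The sort key needs a unique tiebreaker (here the URN) so that hits with equal scores are neither skipped nor repeated across pages.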

Breaking Changes

What's Changed

New Contributors

Full Changelog: v0.10.0...v0.10.1

v0.10.0

Release Highlights

Potential Downtime

This release introduces substantial improvements to search functionality which require reindexing indices.

During the reindexing:

  • a system-update job will set indices to read-only and create a backup/clone of each index
  • new components will be prevented from start-up until the reindex completes
  • Helm deployments will go into read-only mode and new ingestion runs will fail

This process can take anywhere from 5 minutes to multiple hours; as a rough estimate, please expect it to take 1 hour for every 2.3 million entities. After the reindex is complete, please check your ingestion runs and re-run any that did not complete.
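
To plan a maintenance window, the rough estimate above can be turned into a quick calculation. This is only a linear extrapolation of the "1 hour per 2.3 million entities" figure; the function name and the sample entity counts are made up for illustration, and actual durations will depend on your cluster.

```python
ENTITIES_PER_HOUR = 2_300_000  # from the rough estimate: 1 hour per 2.3 million entities


def estimate_reindex_hours(entity_count: int) -> float:
    """Linear extrapolation of the rough estimate; real durations will vary."""
    if entity_count < 0:
        raise ValueError("entity_count must be non-negative")
    return entity_count / ENTITIES_PER_HOUR


# Sample entity counts (illustrative only).
for count in (200_000, 2_300_000, 10_000_000):
    print(f"{count:>12,} entities -> ~{estimate_reindex_hours(count):.1f} h")
```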

User Experience

We have some really exciting improvements to the DataHub user experience in this release!

Improved documentation editor, contributed by @ngamanda and the Grab Team.
This work provides a much more intuitive documentation editing experience within the UI, providing “what you see is what you get” formatting & removing the need for markdown expertise.

Additionally, you can easily:

  • Add links to other entities/users within DataHub
  • Embed and resize tables & images
  • Toggle between font sizes and formats
  • Embed syntax-highlighted code blocks

Filter lineage graphs based on time windows
You can now easily see the full lineage graph of an entity at a specific point in time. This makes it much easier to understand how interdependencies have evolved over time and to troubleshoot data issues in the past.
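
Conceptually, a time-window filter keeps only the lineage edges observed within the chosen window. The sketch below shows that idea on a toy edge list. The `Edge` class, the `last_seen_ms` field, and the dataset names are hypothetical; these notes do not describe which timestamps DataHub actually uses.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Edge:
    upstream: str
    downstream: str
    last_seen_ms: int  # hypothetical: when the edge was last observed


def edges_in_window(edges: List[Edge], start_ms: int, end_ms: int) -> List[Edge]:
    """Keep only edges whose last observation falls inside [start_ms, end_ms]."""
    return [e for e in edges if start_ms <= e.last_seen_ms <= end_ms]


edges = [
    Edge("raw.orders", "stg.orders", last_seen_ms=1_000),
    Edge("stg.orders", "mart.revenue", last_seen_ms=5_000),
    Edge("legacy.orders", "mart.revenue", last_seen_ms=200),
]

# The legacy edge falls outside the window and is dropped.
recent = edges_in_window(edges, start_ms=900, end_ms=6_000)
print([f"{e.upstream} -> {e.downstream}" for e in recent])
```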

Improvements in Search
As noted above, we have rolled out substantial improvements to Search functionality, making it easier than ever for end users to find the entities that matter most. This release includes:

  • Stemming & Synonyms
  • Search by full or partial URN
  • Autocomplete improvements
  • Quoted search analyzer for exact & prefix match

Metadata Ingestion

Here are some of the most notable ingestion-related improvements:

  • Redshift: You can now extract lineage information from unload queries – thanks for the contrib, @mmmeeedddsss
  • PowerBI: Ingestion now maps Workspaces to DataHub Containers – thanks for the contrib, @looppi
  • BigQuery: You can now extract lineage metadata from the Catalog API – thanks for the contrib, @PatrickfBraz
  • Glue: Ingestion now uses table name as the human-readable name – thanks for the contrib, @danielcmessias

Developer Experience

  • This release introduces DataHub Lite - a new experimental lightweight implementation of DataHub. It is intended to enable local developer tooling use-cases such as simple access to metadata for scripts and other tools. DataHub Lite is compatible with the DataHub metadata format and all the ingestion connectors that DataHub supports. Check out the docs here.

Breaking Changes

#7103 This should only impact users who have configured explicit non-default names for DataHub's Kafka topics. The environment variables used to configure Kafka topics in the kafka-setup docker image have been updated to be in line with other DataHub components; for more info, see our docs on Configuring Kafka in DataHub. These variables were previously suffixed with _TOPIC; the correct suffix is now _TOPIC_NAME. This change should not affect any user who is using default Kafka topic names.
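
If you maintain custom topic names in an env file, the suffix change can be sketched as a simple key rename. The variable names below are placeholders, not the real kafka-setup variable names, and `migrate_topic_env` is a hypothetical helper; consult the Kafka configuration docs for the actual list before renaming anything.

```python
from typing import Dict

OLD_SUFFIX = "_TOPIC"
NEW_SUFFIX = "_TOPIC_NAME"


def migrate_topic_env(env: Dict[str, str]) -> Dict[str, str]:
    """Rename keys ending in _TOPIC to end in _TOPIC_NAME; leave other keys alone."""
    migrated = {}
    for key, value in env.items():
        if key.endswith(OLD_SUFFIX):
            key = key[: -len(OLD_SUFFIX)] + NEW_SUFFIX
        migrated[key] = value
    return migrated


# Placeholder variable names for illustration only.
old_env = {"EXAMPLE_EVENT_TOPIC": "custom_events_v1", "OTHER_SETTING": "unchanged"}
new_env = migrate_topic_env(old_env)
print(new_env)
```

Keys that already end in _TOPIC_NAME are left untouched, so running the rename twice is harmless. A blanket suffix match could also catch unrelated variables, so review the result by hand.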

What's Changed

  • fix(ci): only scan on master branch by @anshbansal in #7047
  • fix(ci): use trivy offline scanning by @anshbansal in #7050
  • docs(get-started) Simplify copy on Get Started landing page by @maggiehays in #7043
  • fix(ingest/kafka): fix ResourceType import error for confluent_kafka<1.9.0 by @mayurinehate in #7046
  • docs(dbt): fix indentation in dbt meta mapping docs by @jx2lee in #7045
  • fix(ingest): temporarily disable vertica tests by @hsheth2 in #7059
  • feat(editor): improve documentation editor using Remirror by @ngamanda in #6631
  • fix(bootstrap): add EDIT_LINEAGE privilege to some default policies by @aditya-radhakrishnan in #7060
  • feat(ingest): add entity registry in codegen by @hsheth2 in #6984
  • feat(ingest): extract powerbi endorsements to tags by @looppi in #6638
  • feat(ingestion): pull metabase database, schema names from raw query and api by @remisalmon in #7039
  • fix(ingest): support multiple entity_registry sections by @hsheth2 in #7066
  • ci(ingest): add flag to skip tests but run codegen during release by @hsheth2 in #7067
  • fix(ingest): preserve dbt column name casing by @hsheth2 in #7063
  • fix(ingest/tableau): fix node limit exceeded error for workbooks query by @mayurinehate in #7068
  • fix(build/airflow): Fixing gradlew path by @treff7es in #7069
  • feat(ingest): support snapshots in dbt and dbt-cloud by @hsheth2 in #7062
  • fix(ui) Fix duplicate schema field rendering with siblings by @chriscollins3456 in #7057
  • refactor(ingest/athena): Replace s3_staging_dir parameter in Athena source with query_result_location by @bossenti in #7044
  • feat(ingest): fix handling of unions with aliases in post restli conversion by @hsheth2 in #7058
  • fix(ui) Make checkboxes in ingestion forms easier to see by @chriscollins3456 in #7061
  • fix(ingest): support git clone of non-github repos by @hsheth2 in #7065
  • feat(ingest): reporting revamp, part 1 by @hsheth2 in #7031
  • fix(secret-service): fix default encrypt key by @david-leifker in #7074
  • feat(datahub-lite): introduces a new experimental lightweight impleme… by @shirshanka in #7052
  • feat(datahub-lite): adding tab completion, small serialization fixes by @shirshanka in #7079
  • docs: add docs for managed DataHub v0.1.72 by @anshbansal in #7070
  • docs(readme): add inovex as adopter by @DSchmidtDev in #7077
  • docs: add warning about clearing cookies for login by @anshbansal in #7084
  • feat(cache): add hazelcast distributed cache option by @RyanHolstien in #6645
  • docs(datahub-lite): small improvement for zsh tab completion by @shirshanka in #7085
  • fix(ingest/bigquery): clear stateful ingestion correctly by @hsheth2 in #7075
  • fix(graphql): Return with appropriate status code instead of stacktrace by @szalai1 in #7086
  • fix(sso): Clear cookies on SSO redirect error by @aditya-radhakrishnan in #7088
  • fix(docs): add missing mutation literal by @ruedigerblock in #7082
  • fix(ui): display the correct access token expiry in AccessTokenModal by @ngamanda in #7078
  • fix(cli/lite): fix datahub lite serve command by @hsheth2 in #7089
  • fix(profiling): Fix syntax for APPROX_COUNT_DISTINCT on bigquery and snowflake by @feljen in #7087
  • fix(ingest): fix logic error of google protobuf wrapper type. by @wngus606 in #7076
  • feat(ui): Documentation Editor Improvements by @jjoyce0510 in #7072
  • fix(uri): marks uri field as deprecated, removes problem code, and adds coercer for usages of URI typeref by @RyanHolstien in #7093
  • fix(build): postgres docker secret by @david-leifker in #7092
  • fix(ingest/snowflake): handle corrupted snowflake OCSP cache file by @hsheth2 in #7095
  • refactor(ingest): Refactoring container creation to common place by @treff7es in #6877
  • feat(ingest): move datahub-lite to optional dep and add shim when missing by @hsheth2 in #7097
  • fix(docker): support non amd64 dockerize in setup containers by @tonycsoka in #7091
  • test(ingest): fix kafka admin client mocking by @hsheth2 in #7098
  • fix(build): Fix postgres setup gha by @david-leifker in #7104
  • fix(ingest/profile): properly quoting approx_count_distinct by @treff7es in #7101
  • style(models): Replaces non-ASCII charactes in pdl files with ASCII c… by @nmbryant in #7105
  • feat(ingest): hide cartesian product warnings in GE profiler by @hsheth2 in #7096
  • feat(ingest): add removing partition pattern in spark lineage by @ssilb4 in #6605
  • feat(redshift): Fetch lineage from unload queries by @mmmeeedddsss in #7041
  • fix(ci): do not confirm on force for deletion by @anshbansal in #7106
  • fix(analytics): add missing usage events causing warning in logs by @anshbansal in #7109
  • feat(quickstart): Remove kafka-setup as a hard deployment requirement by @pedro93 in #7073
  • fix(tests): Fixing add_users smoke test by @jjoyce0510 in #7116
  • chore(deps): bump ua-parser-js from 0.7.32 to 0.7.33 in /docs-website by @dependabot in #7122
  • docs(gms): clarify behavior of soft deletion in UI by @aditya-radhakrishnan in #7117
  • fix(kafka-setup): Make topic name consistent with other images by @pedro93 in #7103
  • chore(deps): bump ua-parser-js from 0.7.32 to 0.7.33 in /datahub-web-react by @dependabot in #7123
  • feat(ingest): powerbi # add powerbi workspaces to containers by @looppi in #6532
  • fix(diffMode): prevent misconfiguration of diff mode by @RyanHolstien in #7127
  • fix(ui) Display glossary term name in analytics page properly by @chriscollins3456 in #7128
  • fix(ui): only use visible and enabled tabs for selected tab and routing in entity profiles by @Masterchen09 in #6629
  • fix(htrace): remove htrace jar by @szalai1 in #7126
  • feat(datahub-lite): simplify get response by @shirshanka in #7131
  • fix(doc/biquery): Updating bigquery capability doc by @treff7es in #7136
  • fix(ci): do not fail fast for matrix runs by @anshbansal in #7132
  • refactor(ui): refactor capitalization of platform name and sub types by @Masterchen09 in #7099
  • refactor(cli): extract method, change wording by @anshbansal in #7134
  • docs(lineage): Updating Lineage feature guide by @maggiehays in #6257
  • removing WIP by @laulpogan in #7140
  • docs(oidc): Updating + improving docs around OIDC configuration by @jjoyce0510 in #7141
  • fix(ingest): add message proto check by @tinolyu in #7130
  • fix(ingest): use snowflake median function in profiling by @hsheth2 in #6987
  • feat(ui): allow removing parentNodes of Glossary Nodes and Glossary Terms by @ngamanda in #7135
  • feat(ui) Add new embedded profile to be displayed in extension by @chriscollins3456 in #7113
  • feat(ingest): add --log-file option and show CLI logs in UI report by @hsheth2 in #7118
  • fix(misc): NPE and GraphQL case fixes by @david-leifker in #7149
  • fix(ingest/snowflake): fix regression in approx count distinct by @hsheth2 in #7146
  • [docs] fix typo / add missing line for docker compose / attach overwriting system action config for confluent. by @kdongho in #7142
  • reordering sidebar and adding homepage to apis by @laulpogan in #7139
  • fix(ingestion): powerbi # Not all arguments converted to string by @mohdsiddique in #7157
  • fix(ui): Sort top users by their query count in datasets stats tab by @jaykadambi in #7148
  • refactor(ui): Updates to Manual Lineage search by @jjoyce0510 in #7151
  • feat(ui) Build entity doesn't exist page for entity profiles by @chriscollins3456 in #7150
  • ci(ingest): fix broken CI workflow for metadata-ingestion by @hsheth2 in #7161
  • fix(ingest): azuread group mapping do not stop ingestion by @anshbansal in #7169
  • fix(docs): Fixes links to docs templates by @viniciusdsmello in #7171
  • refactor(ui ingest): Allow enabling / disabling ingestion schedule easily by @jjoyce0510 in #7162
  • fix(ingest): switch various sources to auto_stale_entity_removal helper by @hsheth2 in #7158
  • docs(townhall) Update Townhall History doc by @maggiehays in #7180
  • test(ingest/delta-lake): fix spurious directory creation by @hsheth2 in #7179
  • feat: add a linter for github actions workflows by @hsheth2 in #7178
  • fix(quickstart): adding back kafka-setup by @szalai1 in #7181
  • fix(docs) Fix broken links in ingestion docs by @chriscollins3456 in #7183
  • fix(ingest/GX): fix snowflake urn generated from connection string by @mayurinehate in #7173
  • feat(ingest): switch dbt to use auto_stale_entity_removal by @hsheth2 in #7160
  • fix(ingest): fix issue in glue tests by @hsheth2 in #7185
  • fix(log): logging timestamp in ISO8601 format instead of time by @anshbansal in #7188
  • feat(ingest): bigquery - extracts lineage metadata from catalog api by @PatrickfBraz in #7137
  • fix(ingest/tableau): show warning about token expiry for PATs by @hsheth2 in #7187
  • fix(ingest/vertica): Fixing missing container properties by @treff7es in #7197
  • chore(deps): bump Netty from 4.1.85.Final to 4.1.86.Final by @janhicken in #7191
  • docs(ingestion): powerbi # Add permission for DAX and mashup expressions by @mohdsiddique in #7195
  • feat(elasticsearch): Elasticsearch improvements by @david-leifker in #6894
  • fix(test): spark-lineage # build task as dependency of integrationTest by @mohdsiddique in #7189
  • chore(sample): add status removed aspect for sample data by @anshbansal in #7203
  • docs(managed datahub): release notes for v0.1.73 by @anshbansal in #7194
  • fix(bootstrapdata): update timestamp to be in the last 1 year by @szalai1 in #7206
  • fix(ingest/bigquery): quoting for APPROX_COUNT_DISTINCT in BigQuery by @mryorik in #7207
  • fix(versioning): Ensure that CLI version is always dot-delimited even in minor release versions by @jjoyce0510 in #7200
  • fix(test): missing variables in test causing error in logs by @anshbansal in #7210
  • feat(mlModel): mark downstream jobs as ml model downstreams lineage by @mayurinehate in #7205
  • ci(): fix datahub-upgrade quickstart regression by @hsheth2 in #7217
  • feat(ingest): Add custom properties to the ldap ingestion by @bda618 in #7125
  • fix(ingest): upgrade feast to avoid build issues by @hsheth2 in #7218
  • fix(ui) Increase the number of assertions that we query for in tab by @chriscollins3456 in #7215
  • fix(ci): trivy code scanning fix by @anshbansal in #7232
  • feat(glue): Use table name as human-readable name for Glue ingestion by @danielcmessias in #7213
  • feat(ui): Supporting display of columns and storage count in previews by @jjoyce0510 in #7198
  • fix(gms): Fixes delete references for single relationship aspects by @pedro93 in #7211
  • docs(ingest/lineage): clarify name field in entity config for file based lineage by @mayurinehate in #7225
  • fix(ui): typo 'Documenataion' by @vojtechneradatos in #7227
  • fix(cli/delete): skip references prompt if deleting an aspect by @hsheth2 in #7220
  • fix(ingest/tableau): implement workbook_page_size parameter by @hsheth2 in #7216
  • fix(gms): Corrects MCP generation in async mode by @pedro93 in #7214
  • fix(ingest): redshift # build late binding view lineage when sql written in upper case by @looppi in #7223
  • fix(siblings) Fix editing of schema fields for siblings with unequal schemas by @chriscollins3456 in #7199
  • fix(ingest-idp): emit empty GroupMembership when there are no groups by @aditya-radhakrishnan in #7196
  • feat(lineage): add time filtering for lineage edges by @aditya-radhakrishnan in #7159
  • chore(deps): bump http-cache-semantics from 4.1.0 to 4.1.1 in /docs-website by @dependabot in #7230
  • refactor(docs): Minor language updates for kafka source doc header by @jjoyce0510 in #7237
  • docs(website): fix feature availability dark mode styles by @jeffmerrick in #7233
  • chore(log/docs): improve error log, docs by @anshbansal in #7239
  • fix(dev.sh): Add context to kafka-setup build by @szalai1 in #7234
  • feat(cli): improve docker quickstart by @hsheth2 in #7184
  • fix(elasticsearch): fix orphan index clean up pattern, consistent top… by @david-leifker in #7242
  • chore(deps): bump http-cache-semantics from 4.1.0 to 4.1.1 in /datahub-web-react by @dependabot in #7231

New Contributors

  • @bossenti made their first contribution in #7044
  • @ruedigerblock made their first contribution in #7082
  • @feljen made their first contribution in #7087
  • @tonycsoka made their first contribution in #7091
  • @tinolyu made their first contribution in #7130
  • @kdongho made their first contribution in #7142
  • @jaykadambi made their first contribution in #7148
  • @viniciusdsmello made their first contribution in #7171
  • @mryorik made their first contribution in #7207
  • @danielcmessias made their first contribution in #7213
  • @vojtechneradatos made their first contribution in #7227

Full Changelog: v0.9.6...v0.9.7

v0.9.6.1

Release Highlights

Please disregard release v0.9.6 in favor of this release v0.9.6.1

Bug fix for secrets encryption

  • Prevents decryption errors for existing secrets
  • Affects reading ingestion secret created with a previous release
  • Affects native user password validation

What's Changed

Full Changelog: v0.9.6...v0.9.6.1

v0.9.6

Release Highlights

User Experience

We now support embedding Dashboards, Charts, and Datasets. This allows us to do things like directly embed Looker / Tableau / Mode / Redash Looks, Dashboards, Explores into the Dataset pages themselves.

[Experimental] You can now customize the number of queries displayed on the Query tab of a Dataset entity

Improved error messaging for bulk editing via the UI

Metadata Ingestion

  • Update to data profiling to allow configurable number of sample values to be returned
  • Postgres ingestion now supports emitting lineage edges for Views - shoutout to @LucasRoesler for the contribution!
  • Snowflake ingestion now supports extracting tags - shoutout to @frsann for the contribution!
  • Vertica ingestion now supports projections and lineage - thanks for the contribution, @vishalkSimplify!
  • Glue ingestion now emits an s3 lineage edge when data was written with an s3a/s3n client - thanks for the contribution, @danielli-ziprecruiter!
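
The Glue item comes down to treating `s3a://` and `s3n://` locations as plain `s3://` locations when building lineage. A minimal sketch of that normalization, assuming a simple scheme rewrite is all that is needed (the function name and sample paths are made up, and this is not the connector's actual code):

```python
S3_ALIASES = {"s3a", "s3n"}


def normalize_s3_uri(uri: str) -> str:
    """Rewrite s3a:// and s3n:// URIs to s3://; return anything else unchanged."""
    scheme, sep, rest = uri.partition("://")
    if sep and scheme.lower() in S3_ALIASES:
        return "s3://" + rest
    return uri


# Sample table locations (illustrative only).
for location in ("s3a://my-bucket/warehouse/orders", "s3n://my-bucket/raw", "s3://my-bucket/ok"):
    print(location, "->", normalize_s3_uri(location))
```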

Developer Experience

  • Fixes quickstart/docker compose issues for M1 machines
  • Improvements in reliability and performance of the Restli Service endpoints for ingestion:
    • Scale Restli Service thread pool based on CPU
    • Add retry (exp backoff) to Restli Entity Client
    • MCE no longer relies on GMS for Restli service
    • Converted Restli Service from standalone servlet to Spring injectable
  • Docker build externalized (significantly faster on m1, <7 minute build times, based on this)
  • Frontend asset generation refactor (causing tests to fail intermittently)
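
For context on the "retry (exp backoff)" item: exponential backoff means waiting twice as long after each failed attempt before trying again. The sketch below is a generic illustration of that technique in Python, not the Restli Entity Client's implementation; the function name, default delays, and the fake client call are all assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 4,
    base_delay_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `call`, doubling the wait after each failure, up to `max_attempts` tries."""
    for attempt in range(max_attempts):
        try:
            return call()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay_s * (2 ** attempt))
    raise ValueError("max_attempts must be at least 1")


# Usage: a fake client call that fails twice before succeeding.
attempts = {"n": 0}


def flaky_call() -> str:
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


delays = []  # record the waits instead of actually sleeping
result = retry_with_backoff(flaky_call, sleep=delays.append)
print(result, delays)
```

Injecting `sleep` keeps the example fast to run; production code would typically also add jitter and a cap on the maximum delay.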

What's Changed

  • feat(ingest): add pydantic helper for removed fields by @hsheth2 in #6853
  • chore(0.9.5): Bump defaults for release v0.9.5 by @jjoyce0510 in #6856
  • Revert "fix(ci): remove warnings due to deprecated action" by @anshbansal in #6857
  • refactor(restli-mce-consumer) by @david-leifker in #6744
  • fix(ci): reduce smoke test run time by @anshbansal in #6841
  • fix(security): require signed/encrypted jwt tokens by @david-leifker in #6565
  • feat(ingest): update profiling to fetch configurable number of sample values by @mayurinehate in #6859
  • feat(ingest/airflow): support raw dataset urns in airflow lineage by @hsheth2 in #6854
  • refactor(graphql): make graphqlengine easier to use by @anshbansal in #6865
  • fix(kafka): datahub-upgrade job by @david-leifker in #6864
  • feat(ingest): pass timeout config in kafka admin client api calls by @mayurinehate in #6863
  • chore(ingest): loosen requirements file by @hsheth2 in #6867
  • feat(ingest): upgrade pydantic version by @cccs-eric in #6858
  • fix(elasticsearch): fixes out of order runId writes by @david-leifker in #6845
  • chore(ingest): loosen additional requirements by @hsheth2 in #6868
  • feat(ingest): bigquery/snowflake - Store last profile date in state by @treff7es in #6832
  • docs(google-analytics): Correct grammatical error in README.md by @jx2lee in #6870
  • feat(CI): add venv caching by @szalai1 in #6843
  • feat(ingest/snowflake): handle failures gracefully and raise permission failures by @mayurinehate in #6748
  • fix(runid): always update runid, except when queued by @david-leifker in #6876
  • fix(ingest): conditionally include env in assertion guid by @hsheth2 in #6811
  • chore(ci): update dependencies docs-website by @anshbansal in #6871
  • feat(ui) - Add a custom error message for bulk edit to add clarity by @mkamalas in #6775
  • docs(adding users): Refreshing the docs for adding new DataHub Users by @jjoyce0510 in #6879
  • test(mce-consumer): mockbeans by @david-leifker in #6878
  • feat(ingest): avoid embedding serialized json in metadata files by @hsheth2 in #6742
  • refactor(gradle): move the local docker registry to common location by @david-leifker in #6881
  • refactor(smoke): use env variables by @anshbansal in #6866
  • fix(lint): pin pydantic version by @anshbansal in #6886
  • refactor(docs): Correctly spell elasticsearch in docs by @jjoyce0510 in #6880
  • fix(ingest): okta undefined variable error by @anshbansal in #6882
  • fix(ci): reduce flakiness in add_users, siblings smoke test by @anshbansal in #6883
  • fix(ingest): fall back to default table comment method for all Trino query errors by @marvin-roesch in #6873
  • test(misc): misc test updates by @david-leifker in #6890
  • deprecate(ingest): bigquery - Removing bigquery-legacy source by @treff7es in #6851
  • chore(ingest): remove inferred args to MCPW, part 1 by @hsheth2 in #6819
  • test(ingest/kafka-connect): make docker setup more reliable by @hsheth2 in #6902
  • fix(ingest): profiling (bigquery) - Address biquery profiling query error due to timestamp vs data mismatch by @treff7es in #6874
  • fix(cli): Make datahub quickstart work with latest docker compose in M1 by @pedro93 in #6891
  • fix(cli): fix delete urn cli bug + stricter type annotations by @hsheth2 in #6903
  • fix(ingest/airflow): reorder imports to avoid cyclical dependencies by @stijndehaes in #6719
  • feat: remove jq requirement + tweak modeldocgen args by @hsheth2 in #6904
  • chore(ingest): loosen pyspark and pydeequ deps by @hsheth2 in #6908
  • docs(ingest/looker): fix typos + update lookml github action example by @hsheth2 in #6910
  • fix(ingest/metabase): use card_id in dashboard to chart lineage by @ccpypy in #6583
  • fix(es-setup): create data stream on non-aws by @szalai1 in #6926
  • Adding missing Platform logos by @maggiehays in #6892
  • feat(ingestion): PowerBI# Improve PowerBI source ingestion by @mohdsiddique in #6549
  • Fix compose context for kafka-setup by @szalai1 in #6923
  • feat(backend): Supporting Embeddable Previews for Dashboards, Charts, Datasets by @jjoyce0510 in #6875
  • chore(deps): bump json5 from 2.2.1 to 2.2.3 in /docs-website by @dependabot in #6930
  • chore(deps): bump json5 from 1.0.1 to 1.0.2 in /datahub-web-react by @dependabot in #6931
  • fix(ci): managed ingestion test fix by @anshbansal in #6946
  • feat(ingest): add include_table_location_lineage flag for SQL common by @hsheth2 in #6934
  • feat(ingest): allow extracting snowflake tags by @frsann in #6500
  • chore(ingest): unpin pydantic dep by @hsheth2 in #6909
  • chore(ingest): partially revert pyspark dep from #6908 by @hsheth2 in #6954
  • fix(ingest): use branch info when cloning git repos by @hsheth2 in #6937
  • chore(ingest): remove inferred args to MCPW, part 2 by @hsheth2 in #6905
  • fix(ingest/unity): simplify MCP generation and reporting by @hsheth2 in #6911
  • chore(ci): parallelise build and test workflow to reduce time by @anshbansal in #6949
  • fix(frontend): sasl.client.callback.handler.class by @szalai1 in #6962
  • chore(react): remove outdated cypress tests and dependency by @anshbansal in #6948
  • fix(ci): restrict GE to fix build issues by @anshbansal in #6967
  • feat(queries): [Experimental] Allow customization of # of queries in Query tab via env var by @gabe-lyons in #6964
  • feat(ingest/postgres): emit lineage for postgres views by @LucasRoesler in #6953
  • feat(ingest/vertica): support projections and lineage in vertica by @vishalkSimplify in #6785
  • fix(ingest): add missing dep for powerbi by @hsheth2 in #6969
  • Docs fixes week of 12 22 by @laulpogan in #6963
  • fix(ingest): unfreeze bigquery/snowflake column dataclass by @mayurinehate in #6921
  • chore(frontend) Remove unused dependencies from package.json by @chriscollins3456 in #6974
  • chore: misc fixes by @anshbansal in #6966
  • feat(ingest/glue): emit s3 lineage for s3a and s3n schemes by @danielli-ziprecruiter in #6788
  • fix(kafka-setup): Make kafka-setup run with multiple threads by @pedro93 in #6970
  • feat(ingest): mark database_alias and env as deprecated by @hsheth2 in #6901
  • fix(docs): Updating Tag, Glossary Term docs to point to correct GraphQL methods by @jjoyce0510 in #6965
  • chore(deps): bump certifi from 2020.12.5 to 2022.12.7 in /metadata-ingestion/src/datahub/ingestion/source/feast_image by @dependabot in #6979
  • fix(ingest): profiling - Fixing issue with the wrong timestamp stored in check by @treff7es in #6978
  • config(quickstart): enable auto-reindex for quickstart by @david-leifker in #6983
  • feat(privileges) - Create a privilege to manage glossary children recursively by @mkamalas in #6731
  • chore(ingest): finish removing feast-legacy by @hsheth2 in #6985
  • feat(ingest): add import descriptions of two or more nested messages by @wngus606 in #6959
  • feat(docs) Add feature guide for Manual Lineage by @chriscollins3456 in #6933
  • docs(rfc): Serialising GMS Updates with Preconditions by @mattmatravers in #5818
  • fix(ingest/kafka-connect) support newer version of debezium by @jaegwonseo in #6943
  • fix(docs): build and broken snowflake docs fix by @anshbansal in #6997
  • fix(ingest): bigquery - views in case more than 1 datasets with views by @anshbansal in #6995
  • fix(docs): Renaming Business Glossary Doc by @jjoyce0510 in #7001
  • fix(ingest/snowflake): fix type annotations + refactor get_connect_args by @hsheth2 in #7004
  • fix(docs): Changing the platform event topic name in kafka custom topic docs by @blankon123 in #7007
  • fix(docs): fix name of privilege referenced in posts doc by @aditya-radhakrishnan in #7002
  • fix(SSO): Correctly redirect to originally requested URL in SSO by @jjoyce0510 in #7011
  • fix(ingest): remove dead code from tests by @hsheth2 in #7005
  • feat(ingestion): Tableau # Embed links by @mohdsiddique in #6994
  • feat(auth) Update auth cookies to have same-site none for chrome extension by @chriscollins3456 in #6976
  • docs(website): DPG WIP by @maggiehays in #6998
  • docs: resize datahub logo by @hsheth2 in #7014
  • fix(kafka-setup): Remove reference to non-existing topic by @pedro93 in #7019
  • fix(ingest): powerbi # use display name field as title for powerbi report page by @looppi in #7017
  • feat(auth) Allow session ttl to be configurable by env variable by @chriscollins3456 in #7022
  • fix(ui): URL Encode all Entity Profile URLs by @jjoyce0510 in #7023
  • fix(ui ingest): Fix test connection when stateful ingest is enabled by @jjoyce0510 in #7013
  • docs(sso) move root user warning to earlier in SSO guides by @maggiehays in #7028
  • fix(ingest/looker): add clarity in chart input parsing logs by @hsheth2 in #7003
  • chore(ingest): remove duplicate data_platform.json file by @hsheth2 in #7026
  • feat(ingestion): PowerBI # Remove corpUserInfo aspect ingestion by @mohdsiddique in #7034
  • fix(metadata-models): remove unnecessary bin folder by @jjoyce0510 in #7035
  • fixing typos by @maggiehays in #7030

New Contributors

  • @marvin-roesch made their first contribution in #6873
  • @stijndehaes made their first contribution in #6719
  • @ccpypy made their first contribution in #6583
  • @LucasRoesler made their first contribution in #6953
  • @vishalkSimplify made their first contribution in #6785
  • @wngus606 made their first contribution in #6959
  • @jaegwonseo made their first contribution in #6943
  • @blankon123 made their first contribution in #7007

Full Changelog: v0.9.5...v0.9.6

v0.9.4

Release Highlights

KNOWN ISSUES

There is a known issue with OIDC which we will address in a fast-follow release. If you use OIDC, please wait for v0.9.5 to upgrade.

User Experience

Manual Lineage is LIVE! You can now add and remove lineage between entities in the Lineage Visualization screen, making it easier than ever to manage the complex relationships between your data resources.

Our new Views feature makes it easy to create curated sets of Entities within DataHub. This is a great way to start to isolate the entities that matter most, and provide your DataHub end-users with a streamlined view of the assets that are relevant to their use cases.

In-App Product Tours are here! When logging into DataHub and/or visiting a new page type for the first time, new users will be prompted with a helpful walkthrough of core functionality to get them familiar with the platform. We’ll continue to add modules as we roll out new features!

Automatically send updates to Slack and/or Microsoft Teams when changes are made within DataHub by leveraging our new Slack and Teams Actions.

Metadata Ingestion

  • We’re continuing to improve the user experience for UI-based ingestion for the following sources:
    • dbt Cloud
    • Databricks Unity Catalog
    • MySQL
    • Trino/Presto
    • MSSQL
    • MariaDB
  • If you’re just getting started with UI-based Ingestion, check out our new BigQuery & Snowflake guides
  • Stateful ingestion is now supported for Iceberg (thanks for the contrib, @cccs-Dustin!) and LDAP (thanks for the contrib, @bda618!)
  • Speaking of Stateful Ingestion, we’re taking some steps to simplify the code behind Sta

What's Changed

  • chore(): Updating default CLI version, update updating-datahub.md by @jjoyce0510 in #6590
  • fix(ingest): profiling - Profiling failed if column cardinality threw an error by @treff7es in #6582
  • fix(actions): add missing datahub-gms-protocol env var by @shirshanka in #6593
  • fix(ingest): restrict snowflake-connector-python dependency by @mayurinehate in #6594
  • feat(ingest/bigquery): avoid creating/deleting tables for profiling by @hsheth2 in #6578
  • fix(ingest): unify emit interface by @hsheth2 in #6592
  • fix(security): security version updates by @david-leifker in #6602
  • docs: remove Kafka Streams from documentation by @maver1ck in #6596
  • refactor(ui): Improving Kafka UI Ingestion Form, Create Domain, Create Secret Modals by @jjoyce0510 in #6588
  • fix(ingest): clarify tableau auth error messages by @hsheth2 in #6600
  • docs(graphql): fix deleteTest "Create"->"Delete" by @nickwu241 in #6574
  • fix(gms/startup): remove set -x from start.sh by @timcosta in #6589
  • feat(sql): Add SQL index on createdon field by @pedro93 in #6522
  • feat(ml model): updating view of ml model feature list by @gabe-lyons in #6576
  • fix(ingest/bigquery): ignore complex types from profiling by @treff7es in #6613
  • feat(ingest): add external url for snowflake objects by @mayurinehate in #6580
  • chore(ingest): bump and pin mypy by @hsheth2 in #6584
  • fix(ingest): only require github_info for lookml and not looker by @hsheth2 in #6608
  • docs(ingest): add airflow docs that use the PythonVirtualenvOperator by @hsheth2 in #6604
  • fix(ui) Fix double scroll in embedded list search sections by @chriscollins3456 in #6618
  • feat(ingest): print detailed GMS error messages by @djordje-mijatovic in #6519
  • Townhall agenda wikimedia by @maggiehays in #6622
  • fix(analytics): skip ListDomains if user cannot manage domains and have only one loading message by @aditya-radhakrishnan in #6624
  • feat(quickstart): add support for passing thru env vars needed by Sla… by @shirshanka in #6591
  • docs(actions): slack, teams by @shirshanka in #6632
  • fix(logging): Remove lombok as source of slf4j-api by @david-leifker in #6616
  • docs: add links from main README to slack, teams actions by @shirshanka in #6633
  • feat(ingest): Support config variable for specifying a direct privat… by @mayurinehate in #6609
  • Add AWS Postgres Iam Auth jar to GMS by @syedzoherer in #6371
  • feat(ingest/snowflake): support filtering by fully qualified schema_pattern by @mayurinehate in #6611
  • feat(ingest/kafka-connect): support MongoSourceConnector by @frsann in #6416
  • feat(graph) Add createdOn, createdActor, updatedOn, updatedActor to graph edges by @chriscollins3456 in #6615
  • refactor(ui): Making improvements to UI ingestion forms, adding MySQL, Trino, Presto, MSSQL, MariaDB forms by @jjoyce0510 in #6607
  • perf(ui-ingestion): cache on creation or deletion of ingestion sources to reduce latency by @aditya-radhakrishnan in #6647
  • feat(ingest): add dummy data source for automated testing by @anshbansal in #6550
  • docs(managed datahub): adding release notes for v0.1.70 by @anshbansal in #6655
  • feat(gms): Pluggable Authentication & Authorization Framework by @mohdsiddique in #6634
  • docs: move rfcs to separate repo by @laulpogan in #6621
  • fix(ingest): fix lingering demo-data source issues by @hsheth2 in #6659
  • feat(ingest): bigquery - Running lineage extraction after metadata extraction by @treff7es in #6653
  • fix(ingest): issue deprecation warning correctly by @hsheth2 in #6623
  • chore(ingest): remove feast-legacy by @hsheth2 in #6661
  • fix(ingest/snowflake): support domains for snowflake schema containers by @hsheth2 in #6662
  • build(deps): bump decode-uri-component from 0.2.0 to 0.2.2 in /datahub-web-react by @dependabot in #6617
  • feat(ingest/dbt): add support for latest DBT version 1.3 by @MatthieuBlais in #6651
  • docs: add languages to code highlighting by @hsheth2 in #5576
  • docs(typo) Correct typo in domains.md by @maggiehays in #6667
  • feat(gms): Enable auth-api publishing to maven by @mohdsiddique in #6671
  • fix(ingest/powerbi-report-server): deprecate unused graphql config by @daha in #6630
  • fix(docker): Fix datahub-frontend dockerfile by @jjoyce0510 in #6670
  • fix(ingest): profiling - Changing profiling defaults by @treff7es in #6640
  • feat(ci): add smoke test for domain mutation by @anshbansal in #6641
  • fix(datahub-protobuf): fix missing httpclient dependency by @shirshanka in #6672
  • feat(ingest): update snowflake docs, add simple validations by @mayurinehate in #6636
  • fix(gms): DataHub Auth API java doc fix by @mohdsiddique in #6674
  • feat(ingest): run profiler in more cardinality cases by @hsheth2 in #6397
  • docs(search) update broken youtube link by @maggiehays in #6678
  • docs(protobuf): update examples for protobuf by @david-leifker in #6681
  • feat(ingest): support knowledge links in business glossary by @mohdsiddique in #6375
  • fix(ingestion/vertica): support columns with timestamp precision by @inancdokurel in #6295
  • feat(ingest): add timestamps for snowflake objects by @mayurinehate in #6570
  • feat(onboarding): adds framework and some steps for onboarding steps UI by @aditya-radhakrishnan in #6462
  • feat(ingest): use entry point for registering transformers by @Masterchen09 in #6628
  • chore(ci): update base ingestion image requirements file by @anshbansal in #6687
  • fix(ci): reduce warnings due to deprecated action by @anshbansal in #6686
  • refactor(ui): Adding caching for users, groups, and roles by @jjoyce0510 in #6673
  • fix(ci): revert confluent kafka in base image by @anshbansal in #6690
  • fix(security): version bump to latest minor python image by @david-leifker in #6694
  • docs(ingest/salesforce): list required permissions by @orlandine in #6610
  • feat(ingest): bigquery - option to set on behalf project by @treff7es in #6660
  • ci: stop commenting test results on PR by @hsheth2 in #6700
  • fix(auth-api): Attempting to fix publish for auth-api by @jjoyce0510 in #6695
  • build(deps): bump qs from 6.5.2 to 6.5.3 in /smoke-test/tests/cypress by @dependabot in #6663
  • build(deps): bump express from 4.17.1 to 4.18.2 in /datahub-web-react by @dependabot in #6665
  • fix(ingest/tableau): support ssl_verify flag properly by @hsheth2 in #6682
  • fix(config): unify the handling of boolean environment variables by @Masterchen09 in #6684
  • fix(ui): fix search on policy builder by @aditya-radhakrishnan in #6703
  • build(deps): bump qs from 6.5.2 to 6.5.3 in /datahub-web-react by @dependabot in #6664
  • fix(ingest): cleanup config extra usage by @hsheth2 in #6699
  • docs(logos): add Great Expectations logo by @maggiehays in #6698
  • fix(security): play framework upgrade by @david-leifker in #6626
  • fix(ingest/sagemaker): handle missing ProcessingInputs field by @hsheth2 in #6697
  • build: add retries to gradle wrapper download in ingestion docker by @hsheth2 in #6704
  • test(quickstart): add debugging to quickstart test by @david-leifker in #6718
  • fix(setup): Bump setup images to alpine 3.14 with arch based on machine OS. by @pedro93 in #6612
  • fix(ingest): fix bug in auto_status_aspect by @hsheth2 in #6705
  • fix(security): commons-text, hadoop-commons versions by @david-leifker in #6723
  • fix(build): rename conflicting module auth-api by @david-leifker in #6728
  • docs(aws): edit markdown link by @jx2lee in #6706
  • fix(ingest): fix mysql ingestion issue with non-lowercase database by @mayurinehate in #6713
  • feat(ingest): redact configs reported in ingestion_run_summary by @hsheth2 in #6696
  • fix(ingest): rectify filter for BigQuery external tables by @janhicken in #6691
  • feat(ingest): add separate config for include_column_lineage in snowf… by @mayurinehate in #6712
  • fix(ci): flakiness due to onboarding tour in add user test by @anshbansal in #6734
  • feat(ui): Support DataBricks Unity Catalog Source in Ui Ingestion by @jjoyce0510 in #6707
  • feat(ingest/iceberg): add stateful ingestion by @cccs-Dustin in #6344
  • doc(restore): document restore indices API endpoint by @anshbansal in #6737
  • feat(): Views Feature Milestone 1 by @jjoyce0510 in #6666
  • feat(ingest): bigquery - external url support and a small profiling filter fix by @treff7es in #6714
  • test(ingest): make hive/trino test more reliable by @hsheth2 in #6741
  • Initial commit for bigquery ingestion guide by @treff7es in #6587
  • fix(ci): remove warnings due to deprecated action by @anshbansal in #6735
  • feat(ingest): add stateful ingestion to the ldap source by @bda618 in #6127
  • fix(ingest): fix codegen from_obj for empty dicts in unions with null by @hsheth2 in #6745
  • feat(ingest): start simplifying stateful ingestion state by @hsheth2 in #6740
  • docs(gms): plugins# auth-api as compileOnly dependency by @mohdsiddique in #6747
  • fix(elasticsearch): build in resilience against IO exceptions on httpclient by @RyanHolstien in #6680
  • ci: fix ingestion gradle retry by @hsheth2 in #6752
  • fix(ingest): support airflow mapped operators by @cccs-seb in #6738
  • fix(actions): fix mistype slack/teams base url by @ssilb4 in #6754
  • fix(smoke-test): fix stateful ingestion test regression by @hsheth2 in #6753
  • fix(auth): Renames metadata-auth archive name to not conflict with other modules. by @pedro93 in #6749
  • fix(ingest/lookml): fix directory handling and a config validation bug by @hsheth2 in #6751
  • refactor(ingest): bigquery-lineage - allow tables and datasets in uppercase by @PatrickfBraz in #6739
  • refactor(ux): Misc UX Improvements (tutorial copy, caching, filters) by @jjoyce0510 in #6743
  • Added build failed yarn error by @jakobhanna in #6757
  • feat(ingest): remove source config from DatahubIngestionCheckpoint by @hsheth2 in #6722
  • fix(python-sdk): DataHubGraph get_aspect should accept empty responses by @shirshanka in #6760
  • fix(datahub-web-react): Properly escape a quote in React by @jjoyce0510 in #6764
  • docs(ingest/airflow): clarify Airflow 1.x docs for airflow plugin by @hsheth2 in #6761
  • feat(ingest): simplify more stateful ingestion state by @hsheth2 in #6762
  • fix(ingest): bigquery - handling custom sql errors as warning in profiling by @treff7es in #6777
  • docs(docker): add section for adding community images by @anshbansal in #6770
  • docs(ingest): fix error in custom tags transformer example by @hsheth2 in #6767
  • feat(ingest): add datahub state inspect command by @hsheth2 in #6763
  • refactor(ui): Caching Ingestion Secrets by @jjoyce0510 in #6772
  • docs(snowflake) Snowflake quick ingestion guide by @maggiehays in #6750
  • Optimize kafka setup by @david-leifker in #6778
  • feat(ingest/lookml): add unreachable views to report by @hsheth2 in #6779
  • feat(ci): adding github security reporting to trivy scans by @shirshanka in #6773
  • fix(smoke-test): remove stateful ingestion config check by @hsheth2 in #6781
  • fix(ingest): correct external url for account identifier with account name by @mayurinehate in #6715
  • fix(tutorial): skip getting steps if there is no user by @aditya-radhakrishnan in #6786
  • fix(kafka-setup): fix return code check by @david-leifker in #6782
  • refactor(ui): Make include_tables and include_views default to True. Improve Tableau default recipe. by @jjoyce0510 in #6790
  • fix(ingest): prevent NullPointerException when non-jdbc SaveIntoDataS… by @danielli-ziprecruiter in #6803
  • docs(architecture): edit documents in architecture section by @jx2lee in #6798
  • fix(ingest/dbt): remove unsupported usage indicator by @hsheth2 in #6805
  • refactor(ui): Adding frontend caching + some misc. refactoring by @jjoyce0510 in #6796
  • fix(ingest): bigquery - sharded table support improvements by @treff7es in #6789
  • chore(ingest): pin black version by @hsheth2 in #6807
  • refactor(ingest/stateful): remove most remaining state classes by @hsheth2 in #6791
  • fix(profile): bigquery-legacy - Fix for TypeError-related failures in legacy plugin by @senapatim in #6806
  • Update Grafana Dashboard by @NavinSharma13 in #6076
  • refactor(ingest/stateful): remove IngestionJobStateProvider by @hsheth2 in #6792
  • chore(ingest): bump python package dependencies to resolve vulns by @cyberay01 in #6384
  • refactor(ingest/stateful): remove get_last_state method by @hsheth2 in #6794
  • fix(ui): URL encode urns for ownership entity links by @aditya-radhakrishnan in #6814
  • fix(posts): add deletePost GraphQL endpoint by @aditya-radhakrishnan in #6813
  • fix(policies): resolve the associated domain for a domain as the domain itself by @aditya-radhakrishnan in #6812
  • feat(lineage) Adds ability to edit lineage manually from the UI by @chriscollins3456 in #6816
  • fix(ui): change caching to happen post server-response when creating a UI ingestion recipe by @aditya-radhakrishnan in #6815
  • feat(ingest/stateful): remove platform_instance_id from state urn by @hsheth2 in #6795
  • feat(ui): Adding DBT Cloud support for UI ingestion by @jjoyce0510 in #6804
  • feat(kafka): expose default kafka producer mechanism by @djordje-mijatovic in #6381

New Contributors

  • @maver1ck made their first contribution in #6596
  • @MatthieuBlais made their first contribution in #6651
  • @inancdokurel made their first contribution in #6295
  • @orlandine made their first contribution in #6610
  • @janhicken made their first contribution in #6691
  • @cccs-Dustin made their first contribution in #6344
  • @cccs-seb made their first contribution in #6738
  • @ssilb4 made their first contribution in #6754
  • @senapatim made their first contribution in #6806
  • @cyberay01 made their first contribution in #6384

Full Changelog: v0.9.3...v0.9.4

V0.9.3

Release Highlights

User Experience

Column Level Lineage Impact Analysis is live! Read more about it here
You can now sort Dataset field names alphabetically - this is super handy for finding columns within wide datasets that may not have an easy-to-follow order by default.
Miscellaneous UX improvements:
“Explore All” button on home page, making it easier to jump into the search experience
“Share” button on entity pages
[Community Contribution] You can now assign the same user as different owner types - thanks for the contrib, @rtekal!

Metadata Ingestion

Snowflake Automated PII Classification is here! We’re eager for feedback on the utility of this feature - check out this guide, take it for a spin, and let us know what you think!
We’ve simplified the configs required to add stateful ingestion to an ingestion source - check out the updated docs here
Speaking of stateful ingestion, it’s now supported with:
Looker & LookML ingestion sources
[Community Contribution] Container-level ingestion – thanks for the contrib, @wangsaisai!
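As a rough illustration of the simplified stateful ingestion setup mentioned above, the sketch below builds a recipe as a plain Python dictionary. It assumes that a stable `pipeline_name` plus `stateful_ingestion.enabled` under the source config is enough to turn the feature on; check the linked docs for the options your version actually supports. The helper `enable_stateful_ingestion`, the pipeline name, and the Looker URL and credential placeholders are all hypothetical and not part of the DataHub SDK.

```python
import copy
from typing import Any, Dict


def enable_stateful_ingestion(recipe: Dict[str, Any], pipeline_name: str) -> Dict[str, Any]:
    """Return a copy of `recipe` with stateful ingestion switched on.

    Hypothetical helper for illustration only; not part of the DataHub SDK.
    """
    updated = copy.deepcopy(recipe)
    # A stable pipeline name lets successive runs locate each other's saved state.
    updated["pipeline_name"] = pipeline_name
    source_config = updated.setdefault("source", {}).setdefault("config", {})
    # Assumption: a single `enabled` flag is all the simplified config requires.
    source_config["stateful_ingestion"] = {"enabled": True}
    return updated


# Placeholder recipe; the URL and credential references are made up.
base_recipe = {
    "source": {
        "type": "looker",
        "config": {
            "base_url": "https://looker.example.com",
            "client_id": "${LOOKER_CLIENT_ID}",
            "client_secret": "${LOOKER_CLIENT_SECRET}",
        },
    },
    "sink": {"type": "datahub-rest", "config": {"server": "http://localhost:8080"}},
}

stateful_recipe = enable_stateful_ingestion(base_recipe, "looker_prod_daily")
```

The resulting dictionary mirrors what would normally be written as a YAML recipe. The helper copies its input, so the base recipe can be reused for other pipelines.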

Developer Experience

NEW! dbt Cloud ingestion is ready for ya - check out the module details here
[Community Contribution] For those of you deploying DataHub with Neo4j, we now support Lineage Impact Analysis via Neo4j multihop functionality. Thanks for the contrib, @djordje-mijatovic!
We’ve loosened our SQLAlchemy dependencies to support Airflow 2.3+

What's Changed

  • fix(spark-lineage): Smoke test fix + smoke test m1 support by @treff7es in #6372
  • feat(ingest): supports MCEs in domain transformer by @hsheth2 in #6364
  • feat(ingest): enable container stateful ingestion by @wangsaisai in #6343
  • build(ingest): pin mypy version by @hsheth2 in #6391
  • build: use acryl's gradle-avro-plugin by @hsheth2 in #6390
  • fix(ingest): unity - add missing date type by @ms32035 in #6385
  • fix(ingest): unity-catalog - Removing unneeded sqlalchemy dependency to fix install by @treff7es in #6379
  • feat(ingest/tableau): re-authenticate if the token expires by @hsheth2 in #6380
  • fix(ingest): use profiler config settings correctly by @hsheth2 in #6354
  • fix(ingest): handle error when query returns no columns in snowflake lineage by @mayurinehate in #6404
  • fix(ingest): fix missing snowflake lineage when table_pattern is set by @mayurinehate in #6410
  • feat(ingest): loosen sqlalchemy dep & support airflow 2.3+ by @hsheth2 in #6204
  • fix(ingest/s3): add status aspect for detected s3 datasets by @mayurinehate in #6402
  • fix(ingest/snowflake): loosen snowflake connector version requirement by @hsheth2 in #6418
  • fix(mysql): fix native data type for mysql set type by @mayurinehate in #6407
  • perf(ui): virtualized schema table rows by @stanbaker in #6287
  • fix(ui) Improve HoverEntityTooltip and truncate parent glossary nodes by @chriscollins3456 in #6417
  • feat(ingest): support incremental lineage to dbt node from external platform by @mayurinehate in #6392
  • fix(ingest): init dataset props if missing in transformer by @hsheth2 in #6429
  • fix(change-event): remove unnecessary dependencies on EntityChangeEventGeneratorRegistryFactory by @aditya-radhakrishnan in #6431
  • build(deps): bump moment-timezone from 0.5.34 to 0.5.35 in /datahub-web-react by @dependabot in #5783
  • feat(frontend): Adding support to show externalUrl and institutionalMemoryFields for MLModels by @lurecas in #6053
  • feat(model): adds properties, ownership, deprecated, institutional memory and tags as aspects for data platform instance entity by @sgomezvillamor in #5728
  • docs(ingest/airflow): clarify docs around 1.x compat by @hsheth2 in #6436
  • feat(recommendations): add last edited entities by @CorentinDuhamel in #6329
  • fix(ingest): correctly compute entity change percentage by @hsheth2 in #6438
  • docs(townhall) Updating Townhall History by @maggiehays in #6336
  • Neo4j multihop support by @djordje-mijatovic in #6104
  • fix(mae-consumer): Set proper variable expansion for JMX_OPTS and JAVA_OPTS in MAE docker by @skrydal in #6378
  • docs(ingest): move prerequisite section before the ingestion recipe example by @mayurinehate in #6341
  • fix(dataset): improve glossary term load performance for datasets by @Reilman79 in #6396
  • feat(lineage) Implement CLL impact analysis for inputFields by @chriscollins3456 in #6426
  • feat(ui) Add upgrade step to enable CLL impact analysis for existing data by @chriscollins3456 in #6427
  • Added functionality to copy fieldpath and urn of each column by @Ankit-Keshari-Vituity in #6398
  • fix(ingestion): add output converters for ODBC unsuported datatype in… by @LavinaVRovine in #6134
  • fix(ui) Fix parentNodes overfetching everywhere it's used by @chriscollins3456 in #6446
  • fix(ingest): snowflake - Fixing top query trimming in snowflake by @treff7es in #6447
  • feat(elasticsearch): Updates to elasticsearch configuration, dao, tests by @david-leifker in #6269
  • chore(ingest): fix mssql lint by @hsheth2 in #6453
  • fix(ingest): add cli info to ingestion reporter by @hsheth2 in #6451
  • fix(ui) Fix glossary side browser width fluctuating by @chriscollins3456 in #6457
  • fix(python): Fix python dependencies for doc generation by @david-leifker in #6460
  • docs(website): add homepage links by @jeffmerrick in #6458
  • build(ingest): loosen jinja2 dependency for superset by @KulykDmytro in #6433
  • fix(ingest): lowercase db name in mssql ingestion by @hsheth2 in #6448
  • fix(ingest): handle missing schema in transformer by @hsheth2 in #6445
  • feat(ingest): allow specific profiler config fields to override profile_table_level_only by @hsheth2 in #6366
  • docs(enrichment) updating enrichment landing page by @maggiehays in #6286
  • fix(home-page): remove redundant getAuthenticatedUser query by @aditya-radhakrishnan in #6464
  • feat(ingest): detect old or missing docker compose by @hsheth2 in #6466
  • feat(ingestion): powerbi # Power BI report support by @mohdsiddique in #6339
  • fix(ingest/dbt): disable incremental lineage by default by @hsheth2 in #6467
  • fix(loggin): print logging timestamp in ISO8601 format instead of jus… by @szalai1 in #6474
  • docs(ingest/trino): add example of http connection by @hsheth2 in #6461
  • refactor(ui): Simplify base glossary page toolbar by @jjoyce0510 in #6469
  • revert: mssql - lowercase db name in mssql ingestion by @hsheth2 in #6481
  • build: remove Jinja2 dependency from superset by @KulykDmytro in #6476
  • fix(roles): allows role service to unassign roles by @aditya-radhakrishnan in #6434
  • fix(docs): update the Okta and Azure AD docs to clarify the point of ingesting users by @aditya-radhakrishnan in #6465
  • Highlighted the description text on search by @Ankit-Keshari-Vituity in #6400
  • Ownership type is deprecated by @jakobhanna in #6477
  • feat(ui): Adding Explore all button on home page search by @jjoyce0510 in #6468
  • fix(ingest): fix athena and GE lint errors by @hsheth2 in #6482
  • refactor(ingest): simplify stateful ingestion config by @hsheth2 in #6454
  • docs(ingest/tableau): required permissions + doc formatting by @hsheth2 in #6484
  • feat(ingest): presto - Adding presto source by @treff7es in #6459
  • fix(ui) Fix lineage graph rendering with duplicate nodes by @chriscollins3456 in #6480
  • docs(cypress): adding local cypress running instructions by @gabe-lyons in #6492
  • fix(managed ingestion): updating snowflake schema pattern placeholder text by @gabe-lyons in #6493
  • feat(ui): Adding External URLs to search preview for Dataset, Container, DataFlow, DataJob by @jjoyce0510 in #6496
  • fix(ingest/tableau): check tableName existence on datasource response by @lustefaniak in #6478
  • fix(build): do not use neo4j for dev by @anshbansal in #6501
  • docs(gms): update search example, do not use deprecated clause by @mayurinehate in #6340
  • feat(ingest): add stateful ingestion support to looker and lookml source by @mayurinehate in #6443
  • feat(ingest): dbt cloud integration by @hsheth2 in #6323
  • fix(tableau): extra defensive error-handling by @hsheth2 in #6503
  • fix(ingest): remove redundant types by @hsheth2 in #6486
  • fix(ingest/snowflake): fix lineage allow/deny pattern typo by @hsheth2 in #6506
  • fix(docs): add missing docs for 0.9.1 by @anshbansal in #6515
  • feat(ui): Introducing Share Button on Entity Pages by @jjoyce0510 in #6450
  • Added I AM auth for Opensearch by @syedzoherer in #6370
  • fix(ingest): correctly handle transformer patch semantics by @hsheth2 in #6505
  • feat(ingest/csv-enrich): handle BOM character by @hsheth2 in #6509
  • feat(airflow): support kafka hook in the airflow plugin by @hsheth2 in #6508
  • fix(patch): cover case where patch is used to create an entity by @RyanHolstien in #6504
  • build(deps): bump loader-utils from 2.0.0 to 2.0.4 in /docs-website by @dependabot in #6452
  • fix(ingest): add alias for bigquery-beta by @hsheth2 in #6521
  • feat(ingest): add config for ingesting delta table without files by @mayurinehate in #6403
  • fix(ingest): fix typo in unique count profiling by @mayurinehate in #6517
  • fix(ui) Fix roles not always displaying on page load by @chriscollins3456 in #6524
  • feat(datahub-upgrade): Added msk IAM auth as a build dependency. by @pghazanfari in #6439
  • feat(kafka-setup): Added support for MSK IAM authentication. by @pghazanfari in #6435
  • Added sorting method to fieldpath column of schema tab by @Ankit-Keshari-Vituity in #6510
  • fix(ingest): make kafka emit callback optional by @hsheth2 in #6525
  • feat(ingest): automated term classification for snowflake by @mayurinehate in #6376
  • fix(ingest): fix typo in urn utilities by @bskim45 in #6520
  • fix(ingest): fix trino properties and tests by @mayurinehate in #6518
  • fix(build): remove warnings in github actions by @anshbansal in #6512
  • fix(security): Bump ranger plugin commons dependency by @pedro93 in #6535
  • fix(ingest): kafka - properly picking doc from union type by @treff7es in #6472
  • feat(ingest): disable stateful_ingestion fail-safe by default by @hsheth2 in #6537
  • fix(ingest/airflow): respect enabled flag in airflow plugin by @hsheth2 in #6528
  • refactor(ui): Adding apollo caching to manage domains page. by @jjoyce0510 in #6494
  • refactor(recommendations): Filtering for specific entity types in recommendations by @jjoyce0510 in #6538
  • fix(ingest): handle groupby custom label case by @phongvu99 in #6456
  • build(ingest): support flake8 6.0.0 by @hsheth2 in #6540
  • fix(ui) Wrap schema field descriptions to allow read more/less always by @chriscollins3456 in #6541
  • fix(ui) Display duplicate nodes in lineage viz by @chriscollins3456 in #6526
  • style(ingest): fix lint checks for superset by @mayurinehate in #6548
  • fix(envs): remove DATASET_ENABLE_SCSI stale env var by @szalai1 in #6546
  • feat(upgrade): Make restore from backup logic generic by @pedro93 in #6536
  • feat(ingest): refractor classification mixin, support new infotypes by @mayurinehate in #6545
  • fix(ingest): bigquery - missing sqlalchemy dep and row count fix by @treff7es in #6553
  • fix(ingest): bigquery - Fixing querying non-date partition columns in profiling by @treff7es in #6554
  • feat(ingest): powerbi # scan all accessible workspaces by @looppi in #6441
  • fix(ingest): bigquery - Setting partition id for profiling data and project_id fix by @treff7es in #6558
  • fix(gms): fix java.lang.NoClassDefFoundError: com/sun/syndication/io/FeedException for apache-ranger authorizer by @mohdsiddique in #6560
  • feat(ui): Add Test Connection Support for BigQuery ingestion source by @jjoyce0510 in #6543
  • fix(contrib): Update base python image for es7-upgrade by @david-leifker in #6562
  • fix(ingest): handle docker-compose version v prefix by @hsheth2 in #6561
  • docs(ingest/kafka): add field descriptions of kafka-related configs to pydantic by @mmmeeedddsss in #6559
  • feat(platform): Support @Searchable + @Relationship Annotations for Timeseries Aspects by @jjoyce0510 in #6455
  • feat(models): Adding 'created', 'lastModified' timestamp to Dataset, Container, Dashboard, Chart by @jjoyce0510 in #6527
  • fix(ingest): set DataProcessInstance created ts to start time by @hsheth2 in #6566
  • feat(docs-site): fast reload command for markdown edits by @hsheth2 in #6539
  • fix(ingest): graceful error handling in snowflake classification by @mayurinehate in #6568
  • ci(label): add smoke test label by @anshbansal in #6571
  • fix(ingest): fix types changes in clickhouse sqlalchemy 0.2.3 by @mayurinehate in #6572
  • fix(tests): Misc updates for tests, auth log level, and quickstart by @david-leifker in #6491
  • feat(ui) Add owner to dataset - allow same owner with a different type by @rtekal in #6463
  • fix(verions): Update opentelemetry and updates from pr-5239 by @david-leifker in #6563
  • refactor(airflow): remove verbose log from airflow plugin by @bskim45 in #6516
  • feat(cli): remove inconsistency check command by @anshbansal in #6569
  • fix(ingest): restrict snowflake's sqlalchemy dep by @hsheth2 in #6579
  • docs(notes): add release notes for v0.1.69 managed DataHub by @anshbansal in #6573
  • fix(test): fix delete smoke test by @david-leifker in #6585

New Contributors

  • @wangsaisai made their first contribution in #6343
  • @stanbaker made their first contribution in #6287
  • @lurecas made their first contribution in #6053
  • @Reilman79 made their first contribution in #6396
  • @LavinaVRovine made their first contribution in #6134
  • @KulykDmytro made their first contribution in #6433
  • @jakobhanna made their first contribution in #6477
  • @lustefaniak made their first contribution in #6478
  • @syedzoherer made their first contribution in #6370
  • @phongvu99 made their first contribution in #6456
  • @looppi made their first contribution in #6441
  • @rtekal made their first contribution in #6463

Full Changelog: v0.9.2...v0.9.3

V0.9.2

Release Highlights

User Experience

Metadata Ingestion

New ingestion source PowerBI Report Server

DataHub Docs Site

What's Changed
