Release Highlights
We’re excited to announce the release of DataHub v0.9.0!
This minor release upgrades DataHub to Java 11 and surfaces Column-Level Lineage within the DataHub UI.
Here are some additional highlights:
User Experience
- Column-Level Lineage is now surfaced within the DataHub UI!
- Advanced Search now supports searching by Column-level details (e.g. name, description, tag), as well as complex AND/OR statements. For example:
  - Show results that match any filters
  - Show results that match all filters
  - Owner is either Shannon or Mark
  - Owner is neither Shannon nor Mark
- Try it in demo here
- You can now invite users and assign them to a default DataHub Role
- Improvements to site performance during the Browse experience
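
To make the Advanced Search filter semantics listed above concrete, here is a minimal sketch in plain Python. It is not DataHub's implementation and does not call any DataHub API; the `Entity` class, the helper functions, and the sample data (including the owner "Priya") are all hypothetical and exist only to illustrate how "match any", "match all", "is either of", and "is neither" combine.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable, List

# Hypothetical stand-in for a search result; not a DataHub type.
@dataclass(frozen=True)
class Entity:
    name: str
    owners: FrozenSet[str]

Filter = Callable[[Entity], bool]

def owner_in(names: Iterable[str]) -> Filter:
    """'Owner is either of ...': at least one listed owner is present."""
    wanted = set(names)
    return lambda e: bool(e.owners & wanted)

def owner_not_in(names: Iterable[str]) -> Filter:
    """'Owner is neither ...': none of the listed owners is present."""
    excluded = set(names)
    return lambda e: not (e.owners & excluded)

def match_any(filters: List[Filter]) -> Filter:
    """OR: keep results that satisfy at least one filter."""
    return lambda e: any(f(e) for f in filters)

def match_all(filters: List[Filter]) -> Filter:
    """AND: keep results that satisfy every filter."""
    return lambda e: all(f(e) for f in filters)

def search(entities: List[Entity], predicate: Filter) -> List[str]:
    """Return the names of entities that pass the predicate, in order."""
    return [e.name for e in entities if predicate(e)]

# Illustrative sample data (assumed, not from the release notes).
entities = [
    Entity("orders", frozenset({"Shannon"})),
    Entity("users", frozenset({"Mark", "Priya"})),
    Entity("events", frozenset({"Priya"})),
]

either = search(entities, owner_in(["Shannon", "Mark"]))
neither = search(entities, owner_not_in(["Shannon", "Mark"]))
both = search(entities, match_all([owner_in(["Mark"]), owner_in(["Priya"])]))
any_of = search(entities, match_any([owner_in(["Shannon"]), owner_in(["Priya"])]))
```

With this sample data, `either` contains the entities owned by Shannon or Mark, `neither` contains only the entity owned by neither of them, `both` requires the two filters to hold at once, and `any_of` accepts an entity as soon as one filter holds.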
Developer Experience
- DataHub has been upgraded to Java 11!
- Improved tracking of GraphQL errors for bug resolution
- CorpUser and CorpGroup are now available via the Python SDK
Metadata Ingestion
- Automatically extract Column-Level Lineage from Snowflake & Looker sources
- dbt Meta Mapping is now supported at the Column Level, which means you can automatically extract Tags and Glossary Terms from your dbt models and surface them in DataHub
What's Changed
- fix(ingest): bigquery-beta - Getting datasets with biquery client by @treff7es in #6039
- feat(roles): add ability to invite users into a role by @aditya-radhakrishnan in #6015
- refactor(java11) - convert most modules to java 11 by @leifker in #5836
- docs(readme): Fixing broken article link by @davrax in #6042
- refactor(ingest): streamline pydantic configs by @hsheth2 in #6011
- docs(ingest): add example of dbt column_meta_mapping by @hsheth2 in #6038
- refactor(ingest): use aspect map in transformers by @hsheth2 in #6040
- feat(ui): Adding placeholder entity for DataPlatform by @jjoyce0510 in #6045
- feat(ingest): implement compression for CheckpointState by @alexey-kravtsov in #6007
- feat(advanced-search): adding select value modal by @gabe-lyons in #6026
- fix(ingest): bigquery-beta - Additional fixes for Bigquery beta by @treff7es in #6051
- feat(advanced search): adding advanced search filter component & prereqs for it by @gabe-lyons in #6055
- docs(ingest): add path spec examples for s3 by @mayurinehate in #6050
- fix(deps): metadata-io - remove parquet dependency by @shirshanka in #6046
- fix(ingestion): Tableau test case execution fix by @mohdsiddique in #6005
- feat(ingest): list referenced env variables in recipe by @hsheth2 in #6043
- fix(ingest): compat with mypy 0.981 by @hsheth2 in #6056
- fix(elasticsearch_index): create datahub_usage_event index where datahub_analytics_enabled set to false by @GyuhoonK in #5974
- docs(approval workflows): adding approval workflow docs by @gabe-lyons in #5896
- feat(retention): disable applying retention on bootstrap by @anshbansal in #6066
- fix(ingest): correct tableau browse paths by @hsheth2 in #6064
- fix(ingest): bigquery-beta - handling complex types properly by @treff7es in #6062
- docs: create SECURITY.md by @laulpogan in #6069
- fix(containers): show soft deleted status of containers by @gabe-lyons in #6072
- docs(ingest): clarify bigquery-beta multiproject setup by @hsheth2 in #6071
- chore(setup): change defaults for partitions by @anshbansal in #6074
- refactor(browse): Improving Browse Feature Performance by @jjoyce0510 in #6073
- feat(ingest): add column-level lineage support for snowflake by @mayurinehate in #6034
- feat(ingest): looker - support for simple column level lineage by @shirshanka in #6084
- fix(elastic-setup) Fixing env var logic by @pedro93 in #6079
- Revert "chore(setup): change defaults for partitions (#6074)" by @pedro93 in #6086
- fix(mae-consumer): fix regression on base64 encoding by @codesorcery in #6061
- fix(elasticsearch) Analytics indices creation on AWS ES by @tomas-kubin in #5502
- docs(ingest): note that Athena doesn't support lineage by @hsheth2 in #6081
- fix(ingest): alias for mssql-odbc source by @hsheth2 in #6080
- fix(ingest): presto-on-hive - Setting display name properly by @treff7es in #6065
- fix(schema filter): fix schema infinite rerender by @gabe-lyons in #6082
- feat(monitoring): track graphql errors in metrics by @szalai1 in #6087
- feat(advanced search): Add component to show all advanced search filters & add new filter by @gabe-lyons in #6058
- fix(ingest): bump lkml version by @hsheth2 in #6091
- fix(ingest): lookml - extract column correctly by @shirshanka in #6093
- feat(retention): change default policy, add API to apply retention by @anshbansal in #6088
- fix(lineage): fix missed casing in lineage registry by @gabe-lyons in #6078
- fix(ingest): bigquery-beta - Lowering a bit memory footprint of bigquery usage by @treff7es in #6095
- feat(ingest): remove hardcoded env variable default for cli version by @shirshanka in #6075
- docs: add information about mapping ports for datahub-gms by @shirshanka in #6092
- chore(deps): upgrade graphql-java deps to 19.0 by @shirshanka in #6099
- chore(deps): upgrade neo4j to 4.4.x by @shirshanka in #6101
- feat(docs): Improve documentation about Search by @szalai1 in #5889
- feat(ingest): add async option to ingest proposal endpoint by @RyanHolstien in #6097
- chore(deps): upgrade opentelemetry dependencies by @shirshanka in #6100
- refactor(recommendations): Bump default max recommendations count for Platforms by @jjoyce0510 in #6113
- feat(ingest): add Sandbox support by @rgudic in #6105
- fix(mae): use JAVA_TOOL_OPTIONS instead of JDK_JAVA_OPTIONS by @szalai1 in #6114
- feat(advanced-search): Complete Advanced Search: backend changes & tying UI together by @gabe-lyons in #6068
- feat(search): improved search snippet FE logic by @gabe-lyons in #6109
- feat(ingest): add CorpUser and CorpGroup to the Python SDK by @ttaubermarshall-stripe in #5930
- fix(ingest): hide deprecated path_spec option from config by @hsheth2 in #5944
- feat(posts): add posts feature to DataHub by @aditya-radhakrishnan in #6110
- fix(ingest): remove unused mysql golden file by @hsheth2 in #6106
- fix(ingestion): fix percent change computation in stale_entity_removal by @rslanka in #6121
- refactor(ingest): use pydantic utilities for NamingPattern by @hsheth2 in #6013
- fix(ingest): presto-on-hive - not failing on Hive type parsing error by @treff7es in #6118
- fix(ingest): ignore usage and operation for snowflake datasets withou… by @mayurinehate in #6112
- refactor(ingest): remove typing workarounds by @hsheth2 in #6108
- Added information about AUTH_OIDC_EXTRACT_GROUPS_ENABLED by @PrashantKhadke in #6120
- feat(lineage): show fully qualified task name in lineage UI by @gabe-lyons in #6126
- docs(tableau): adding a ingestion video by @shirshanka in #6124
- Sending "getting started" direct to quickstart by @laulpogan in #6125
- build: Update JNA for M1 Mac by @david-leifker in #6116
- fix(ingest): bigquery-beta - fix for missing key error if dataset list was empty by @treff7es in #6133
- fix(ingest): file - add configurability for counting all elements bef… by @shirshanka in #6136
- Worked on the feature to update group title by @Ankit-Keshari-Vituity in #6047
- fix(ingest): add trino package max version restriction by @hsheth2 in #6137
- test(KafkaEmitter): Enable ability to run test locally by @david-leifker in #6123
- fix(ingest): add column name quoting for approximate count distinct by @hsheth2 in #6107
- fix(ingestion): add fallback to trino by @IceS2 in #6044
- perf(search): temporarily disable fetching input fields for search results by @gabe-lyons in #6139
- feat(lineage) Add Column-Level to Lineage Visualization by @chriscollins3456 in #6138
- feat(tracking): add telemetry for frontend events by @aditya-radhakrishnan in #6129
- docs(approvals): update approval permission docs by @gabe-lyons in #6143
- fix(ingest): fetch workbook tags in workbooks graphql query by @mayurinehate in #6102
- fix(lineage) Fix possible null pointer exception in UpstreamLineagesMapper by @chriscollins3456 in #6147
- fix(ingest): bigquery-beta - Eliminate the need for data.read permission for table schema by @treff7es in #6146
- fix(lineage) Fix batching to ES for impact analysis by @chriscollins3456 in #6149
- feat(ingest/lookml): add support for local/remote dependencies by @hsheth2 in #6150
- fix(auth): fix login endpoint to respect session expiration env var by @aditya-radhakrishnan in #6151
- fix(impact analysis): fixing filtering on impact analysis + cypress tests by @gabe-lyons in #6152
- docs(favicon): add docs for customizing favicon by @gabe-lyons in #6155
- fix(ingest): bigquery-beta - ensure that status aspect is emitted for… by @shirshanka in #6154
- fix(ingest): bigquery - Fix syntax error in get_all_schema_tables_query by @hieunt-itfoss in #6159
- fix(ingest): allow snowflake profiling to work with geography type by @mayurinehate in #6162
- feat(ingest): support enabled flag for airflow config by @hsheth2 in #6089
- refactor(ingest): Tableau cleanup by @hsheth2 in #6131
- fix(ingest): bigquery-beta - turning sql parsing off in lineage extraction by @treff7es in #6163
- fix(ingest): allow hiding some fields from the schema by @hsheth2 in #6077
New Contributors
- @davrax made their first contribution in #6042
- @laulpogan made their first contribution in #6069
- @tomas-kubin made their first contribution in #5502
- @rgudic made their first contribution in #6105
- @ttaubermarshall-stripe made their first contribution in #5930
- @PrashantKhadke made their first contribution in #6120
- @david-leifker made their first contribution in #6116
- @IceS2 made their first contribution in #6044
Full Changelog: v0.8.45...v0.9.0