github datahub-project/datahub v0.9.0
DataHub v0.9.0

19 months ago

Release Highlights

We’re excited to announce the release of DataHub v0.9.0!

This minor release upgrades DataHub to Java 11 and surfaces Column-Level Lineage within the DataHub UI.

Here are some additional highlights:

User Experience

  • Column-Level Lineage is now surfaced within the DataHub UI!
  • Advanced Search now supports searching by Column-level details (e.g. name, description, tag), as well as complex AND/OR statements. For example:
    • Show results that match any filters
    • Show results that match all filters
    • Owner is either Shannon or Mark
    • Owner is neither Shannon nor Mark
    • Try it in demo here
  • You can now invite users and assign them to a default DataHub Role
  • Improvements to site performance during the Browse experience
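
To make the AND/OR semantics of the Advanced Search examples above concrete, here is a minimal, self-contained Python sketch. It is not DataHub's search API: the `Filter` class, the `search` helper, and the sample entities are all hypothetical and only illustrate how "match any", "match all", "either of", and "neither" could combine.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Set

# Hypothetical in-memory model: entity name -> field -> set of values.
# This only illustrates the filter semantics; it is not DataHub code.
Entity = Dict[str, Set[str]]


@dataclass(frozen=True)
class Filter:
    field: str
    values: FrozenSet[str]
    negated: bool = False  # True models "is neither X nor Y"

    def matches(self, entity: Entity) -> bool:
        # A filter hits when the entity has at least one of the listed values.
        hit = bool(entity.get(self.field, set()) & self.values)
        return not hit if self.negated else hit


def search(entities: Dict[str, Entity], filters: List[Filter], mode: str) -> List[str]:
    """Return sorted entity names matching 'any' (OR) or 'all' (AND) filters."""
    combine = any if mode == "any" else all
    return sorted(
        name
        for name, entity in entities.items()
        if combine(f.matches(entity) for f in filters)
    )


# Made-up sample data for illustration only.
datasets: Dict[str, Entity] = {
    "orders": {"owner": {"Shannon"}, "tag": {"pii"}},
    "clicks": {"owner": {"Mark"}, "tag": set()},
    "events": {"owner": {"Priya"}, "tag": {"pii"}},
}

owner_either = Filter("owner", frozenset({"Shannon", "Mark"}))
owner_neither = Filter("owner", frozenset({"Shannon", "Mark"}), negated=True)
has_pii = Filter("tag", frozenset({"pii"}))

match_all = search(datasets, [owner_either, has_pii], mode="all")
match_any = search(datasets, [owner_either, has_pii], mode="any")
neither = search(datasets, [owner_neither], mode="all")
```

With this sample data, `match_all` contains only `orders`, `match_any` contains all three datasets, and `neither` contains only `events`.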

Developer Experience

  • DataHub has been upgraded to Java 11!
  • Improved tracking of GraphQL errors for bug resolution
  • CorpUser and CorpGroup are now available via the Python SDK

Metadata Ingestion

  • Automatically extract Column-Level Lineage from Snowflake & Looker sources
  • dbt Meta Mapping is now supported at the Column Level. This means you can automatically extract Tags and Glossary Terms from the columns of your dbt models and surface them in DataHub
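
As a rough illustration of the idea behind column-level meta mapping, the sketch below turns the `meta` properties of a single dbt column into tags and glossary terms. It is an assumption-laden toy, not the DataHub ingestion code: the rule format, the `apply_column_meta_mapping` function, and the sample meta keys are hypothetical. Consult the dbt source documentation for the actual recipe options.

```python
import re
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical rules keyed by a dbt column meta property.
# "match" is a regex tested against the property's value; "value" is the
# tag or term to emit (None means "use the matched text itself").
COLUMN_META_MAPPING: Dict[str, Dict[str, Optional[str]]] = {
    "is_sensitive": {"match": "true", "operation": "add_tag", "value": "sensitive"},
    "glossary_term": {"match": ".+", "operation": "add_term", "value": None},
}


def apply_column_meta_mapping(
    column_meta: Dict[str, Any],
    mapping: Dict[str, Dict[str, Optional[str]]],
) -> Tuple[List[str], List[str]]:
    """Return (tags, glossary_terms) derived from one column's meta block."""
    tags: List[str] = []
    terms: List[str] = []
    for key, rule in mapping.items():
        if key not in column_meta:
            continue
        text = str(column_meta[key])
        # The whole value must match the rule's pattern (case-insensitive).
        if not re.fullmatch(rule["match"], text, flags=re.IGNORECASE):
            continue
        emitted = rule["value"] if rule["value"] is not None else text
        if rule["operation"] == "add_tag":
            tags.append(emitted)
        elif rule["operation"] == "add_term":
            terms.append(emitted)
    return tags, terms


# Made-up meta block for a single dbt column.
sample_meta = {"is_sensitive": True, "glossary_term": "Finance.Revenue"}
tags, terms = apply_column_meta_mapping(sample_meta, COLUMN_META_MAPPING)
```

Here `tags` ends up holding the `sensitive` tag and `terms` the `Finance.Revenue` term; a column whose `is_sensitive` value is false yields no tag.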

What's Changed

  • fix(ingest): bigquery-beta - Getting datasets with biquery client by @treff7es in #6039
  • feat(roles): add ability to invite users into a role by @aditya-radhakrishnan in #6015
  • refactor(java11) - convert most modules to java 11 by @leifker in #5836
  • docs(readme): Fixing broken article link by @davrax in #6042
  • refactor(ingest): streamline pydantic configs by @hsheth2 in #6011
  • docs(ingest): add example of dbt column_meta_mapping by @hsheth2 in #6038
  • refactor(ingest): use aspect map in transformers by @hsheth2 in #6040
  • feat(ui): Adding placeholder entity for DataPlatform by @jjoyce0510 in #6045
  • feat(ingest): implement compression for CheckpointState by @alexey-kravtsov in #6007
  • feat(advanced-search): adding select value modal by @gabe-lyons in #6026
  • fix(ingest): bigquery-beta - Additional fixes for Bigquery beta by @treff7es in #6051
  • feat(advanced search): adding advanced search filter component & prereqs for it by @gabe-lyons in #6055
  • docs(ingest): add path spec examples for s3 by @mayurinehate in #6050
  • fix(deps): metadata-io - remove parquet dependency by @shirshanka in #6046
  • fix(ingestion): Tableau test case execution fix by @mohdsiddique in #6005
  • feat(ingest): list referenced env variables in recipe by @hsheth2 in #6043
  • fix(ingest): compat with mypy 0.981 by @hsheth2 in #6056
  • fix(elasticsearch_index): create datahub_usage_event index where datahub_analytics_enabled set to false by @GyuhoonK in #5974
  • docs(approval workflows): adding approval workflow docs by @gabe-lyons in #5896
  • feat(retention): disable applying retention on bootstrap by @anshbansal in #6066
  • fix(ingest): correct tableau browse paths by @hsheth2 in #6064
  • fix(ingest): bigquery-beta - handling complex types properly by @treff7es in #6062
  • docs: create SECURITY.md by @laulpogan in #6069
  • fix(containers): show soft deleted status of containers by @gabe-lyons in #6072
  • docs(ingest): clarify bigquery-beta multiproject setup by @hsheth2 in #6071
  • chore(setup): change defaults for partitions by @anshbansal in #6074
  • refactor(browse): Improving Browse Feature Performance by @jjoyce0510 in #6073
  • feat(ingest): add column-level lineage support for snowflake by @mayurinehate in #6034
  • feat(ingest): looker - support for simple column level lineage by @shirshanka in #6084
  • fix(elastic-setup) Fixing env var logic by @pedro93 in #6079
  • Revert "chore(setup): change defaults for partitions (#6074)" by @pedro93 in #6086
  • fix(mae-consumer): fix regression on base64 encoding by @codesorcery in #6061
  • fix(elasticsearch) Analytics indices creation on AWS ES by @tomas-kubin in #5502
  • docs(ingest): note that Athena doesn't support lineage by @hsheth2 in #6081
  • fix(ingest): alias for mssql-odbc source by @hsheth2 in #6080
  • fix(ingest): presto-on-hive - Setting display name properly by @treff7es in #6065
  • fix(schema filter): fix schema infinite rerender by @gabe-lyons in #6082
  • feat(monitoring): track graphql errors in metrics by @szalai1 in #6087
  • feat(advanced search): Add component to show all advanced search filters & add new filter by @gabe-lyons in #6058
  • fix(ingest): bump lkml version by @hsheth2 in #6091
  • fix(ingest): lookml - extract column correctly by @shirshanka in #6093
  • feat(retention): change default policy, add API to apply retention by @anshbansal in #6088
  • fix(lineage): fix missed casing in lineage registry by @gabe-lyons in #6078
  • fix(ingest): bigquery-beta - Lowering a bit memory footprint of bigquery usage by @treff7es in #6095
  • feat(ingest): remove hardcoded env variable default for cli version by @shirshanka in #6075
  • docs: add information about mapping ports for datahub-gms by @shirshanka in #6092
  • chore(deps): upgrade graphql-java deps to 19.0 by @shirshanka in #6099
  • chore(deps): upgrade neo4j to 4.4.x by @shirshanka in #6101
  • feat(docs): Improve documentation about Search by @szalai1 in #5889
  • feat(ingest): add async option to ingest proposal endpoint by @RyanHolstien in #6097
  • chore(deps): upgrade opentelemetry dependencies by @shirshanka in #6100
  • refactor(recommendations): Bump default max recommendations count for Platforms by @jjoyce0510 in #6113
  • feat(ingest): add Sandbox support by @rgudic in #6105
  • fix(mae): use JAVA_TOOL_OPTIONS instead of JDK_JAVA_OPTIONS by @szalai1 in #6114
  • feat(advanced-search): Complete Advanced Search: backend changes & tying UI together by @gabe-lyons in #6068
  • feat(search): improved search snippet FE logic by @gabe-lyons in #6109
  • feat(ingest): add CorpUser and CorpGroup to the Python SDK by @ttaubermarshall-stripe in #5930
  • fix(ingest): hide deprecated path_spec option from config by @hsheth2 in #5944
  • feat(posts): add posts feature to DataHub by @aditya-radhakrishnan in #6110
  • fix(ingest): remove unused mysql golden file by @hsheth2 in #6106
  • fix(ingestion): fix percent change computation in stale_entity_removal by @rslanka in #6121
  • refactor(ingest): use pydantic utilities for NamingPattern by @hsheth2 in #6013
  • fix(ingest): presto-on-hive - not failing on Hive type parsing error by @treff7es in #6118
  • fix(ingest): ignore usage and operation for snowflake datasets withou… by @mayurinehate in #6112
  • refactor(ingest): remove typing workarounds by @hsheth2 in #6108
  • Added information about AUTH_OIDC_EXTRACT_GROUPS_ENABLED by @PrashantKhadke in #6120
  • feat(lineage): show fully qualified task name in lineage UI by @gabe-lyons in #6126
  • docs(tableau): adding a ingestion video by @shirshanka in #6124
  • Sending "getting started" direct to quickstart by @laulpogan in #6125
  • build: Update JNA for M1 Mac by @david-leifker in #6116
  • fix(ingest): bigquery-beta - fix for missing key error if dataset list was empty by @treff7es in #6133
  • fix(ingest): file - add configurability for counting all elements bef… by @shirshanka in #6136
  • Worked on the feature to update group title by @Ankit-Keshari-Vituity in #6047
  • fix(ingest): add trino package max version restriction by @hsheth2 in #6137
  • test(KafkaEmitter): Enable ability to run test locally by @david-leifker in #6123
  • fix(ingest): add column name quoting for approximate count distinct by @hsheth2 in #6107
  • fix(ingestion): add fallback to trino by @IceS2 in #6044
  • perf(search): temporarily disable fetching input fields for search results by @gabe-lyons in #6139
  • feat(lineage) Add Column-Level to Lineage Visualization by @chriscollins3456 in #6138
  • feat(tracking): add telemetry for frontend events by @aditya-radhakrishnan in #6129
  • docs(approvals): update approval permission docs by @gabe-lyons in #6143
  • fix(ingest): fetch workbook tags in workbooks graphql query by @mayurinehate in #6102
  • fix(lineage) Fix possible null pointer exception in UpstreamLineagesMapper by @chriscollins3456 in #6147
  • fix(ingest): bigquery-beta - Eliminate the need for data.read permission for table schema by @treff7es in #6146
  • fix(lineage) Fix batching to ES for impact analysis by @chriscollins3456 in #6149
  • feat(ingest/lookml): add support for local/remote dependencies by @hsheth2 in #6150
  • fix(auth): fix login endpoint to respect session expiration env var by @aditya-radhakrishnan in #6151
  • fix(impact analysis): fixing filtering on impact analysis + cypress tests by @gabe-lyons in #6152
  • docs(favicon): add docs for customizing favicon by @gabe-lyons in #6155
  • fix(ingest): bigquery-beta - ensure that status aspect is emitted for… by @shirshanka in #6154
  • fix(ingest): bigquery - Fix syntax error in get_all_schema_tables_query by @hieunt-itfoss in #6159
  • fix(ingest): allow snowflake profiling to work with geography type by @mayurinehate in #6162
  • feat(ingest): support enabled flag for airflow config by @hsheth2 in #6089
  • refactor(ingest): Tableau cleanup by @hsheth2 in #6131
  • fix(ingest): bigquery-beta - turning sql parsing off in lineage extraction by @treff7es in #6163
  • fix(ingest): allow hiding some fields from the schema by @hsheth2 in #6077

Full Changelog: v0.8.45...v0.9.0
