Release Highlights
- Important bug fixes: properties for DataJob and DataFlow, and descriptions for Datasets, should now correctly show in the UI
- Search redesign! Single search experience across all entity types with left filter bar
- Added searchAcrossEntities endpoint on both GraphQL and Rest.li that pulls search results for all entity types and mixes them together
- Dataset-level lineage: added support for ingesting dataset-level lineage for BigQuery, and for linking external tables in Redshift to the corresponding tables in the external data catalog.
- Performance optimization: GraphQL now calls the entity service directly to hydrate GraphQL models, instead of calling the entity resource over HTTP.
- The “filter” input model used by the “search” API now supports disjunctive normal form (an OR of ANDs). The previous filter model (the criteria array) should continue to work as expected.
- Added foundations (models) for search insights, that is, highlights shown in the search result previews.
- Improved the "Add Owner" experience: users and groups are now found using full-text search.
- User & Group Management Screens!
  - View all users (and those who have logged in)
  - View all groups
  - Create new groups
  - Add and remove group members
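
To illustrate the disjunctive normal form filter mentioned in the highlights, here is a minimal sketch of how an "OR of ANDs" can be evaluated, and why a flat criteria array can keep working alongside it. The dict shapes, field names (`platform`, `origin`) and function names are assumptions made for this example; they are not the actual DataHub filter schema or API.

```python
from typing import Dict, List

# Hypothetical shapes for illustration only; not the actual DataHub filter schema.
Criterion = Dict[str, str]      # {"field": ..., "value": ...}
Conjunction = List[Criterion]   # every criterion must match (AND)
DnfFilter = List[Conjunction]   # at least one conjunction must match (OR)


def matches_conjunction(doc: Dict[str, str], conjunction: Conjunction) -> bool:
    """True when the document satisfies every criterion in the conjunction."""
    return all(doc.get(c["field"]) == c["value"] for c in conjunction)


def matches_dnf(doc: Dict[str, str], dnf: DnfFilter) -> bool:
    """True when the document satisfies at least one conjunction."""
    return any(matches_conjunction(doc, conj) for conj in dnf)


def legacy_to_dnf(criteria: Conjunction) -> DnfFilter:
    """A flat criteria array (implicit AND) is a DNF filter with a single conjunction."""
    return [criteria]


# (platform == bigquery AND origin == PROD) OR (platform == redshift)
example_filter: DnfFilter = [
    [{"field": "platform", "value": "bigquery"}, {"field": "origin", "value": "PROD"}],
    [{"field": "platform", "value": "redshift"}],
]
```

With this sketch, `matches_dnf({"platform": "redshift", "origin": "DEV"}, example_filter)` is true because the second conjunction matches, while a BigQuery document with `origin` set to `DEV` matches neither. Wrapping an old-style criteria array with `legacy_to_dnf` gives the same result as AND-ing its criteria, which is one way to picture how both models can coexist.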
Breaking Changes
None
What's Changed
- feat(ui): Improve add owner search experience by @jjoyce0510 in #3306
- (fix) Set ebean transaction level to be repeatable read by @xdl in #3285
- fix(fonts): fix manrope styling by @gabe-lyons in #3311
- docs(datahub-frontend): add build instructions for the datahub-frontend docker image by @thebouv in #3314
- feat(ingest): support for primary and foreign key extraction from sql sources by @swaroopjagadish in #3316
- feat(transform): adds replace_existing config to set_dataset_browse_path by @sgomezvillamor in #3313
- feat(redshift): added ability to extract external schema from Redshift spectrum by @varunbharill in #3321
- fix(docs): patch link to Airflow Docker compose file by @kevinhu in #3322
- docs: Fix topic_pattern typo in kafka ingestion docs by @serefacet in #3317
- fix(graphql): add ElasticSearch path prefix configuration by @zhoxie-cisco in #3297
- fix(ingest): more robust error handling in lookml sql parsing by @swaroopjagadish in #3325
- fix(ingest): Fix sasl exception for hive ingestion by @serefacet in #3326
- fix(ingest): no error when there are no partition keys by @aseembansal-gogo in #3328
- fix(docs): fix graphql deprecated comment by @gabe-lyons in #3327
- feat(dbt-ingestion): added tags and owner from dbt by @AndreasTA-AW in #3270
- fix(oidc): Tolerate null emails by @jjoyce0510 in #3330
- feat(Snowflake Lineage Ingestion) by @rslanka in #3331
- feat(ingest): support user group filtering for Azure AD by @vlavorini in #3312
- feat(ingest): Redash add parse_table_names_from_sql feature and multiple refactor by @taufiqibrahim in #3267
- feat(ingest): add support for github and looker links in looker views… by @swaroopjagadish in #3332
- fix(git-ignore): Git ignore generated python and avro artifacts by @dexter-mh-lee in #3320
- fix(ingestion): make dbt tag prefix configurable by @remisalmon in #3334
- feat(ingest): add trino source in metadata-ingestion by @mayurinehate in #3307
- feat(ingestion): support Airflow cluster config by @hsheth2 in #3336
- feat: add support for specialization of models through subtypes with … by @swaroopjagadish in #3338
- feat(search): Redesign search page - left filter pane by @dexter-mh-lee in #3337
- feat(users & groups): User & Groups Management GraphQL APIs + UI by @jjoyce0510 in #3318
- fix(pk + autocomplete): some ui fixes by @gabe-lyons in #3347
- fix(urns): prevent corrupted urns from being created by @gabe-lyons in #3348
- fix(ingestion-docker): Codegen and build again by @dexter-mh-lee in #3342
- docs(ingest): fix trino doc by @mayurinehate in #3339
- fix(docker-quickstart): Fix volume mount paths when using quickstart by @dexter-mh-lee in #3341
- fix(autocomplete): Fix empty autocomplete server error by @jjoyce0510 in #3346
- fix(Add custom elastic field mappings for all timeseries fields) by @rslanka in #3350
- fix(gitignore): Fix gitignore to ignore whole directory by @dexter-mh-lee in #3361
- fix(mce_builder): deleted alias by @vlavorini in #3356
- feat(data-platform): Add science and airflow data platform by @dexter-mh-lee in #3363
- fix(ui): fix url encoding issues by @gabe-lyons in #3359
- fix(gitignore): Update gitignore again - remove metadata-ingestion objects by @dexter-mh-lee in #3365
- fix(ci): add run_id to the task instance constructor for airflow by @swaroopjagadish in #3366
- fix(aws-deploy-docs): Fix documentation for elasticsearch by @dexter-mh-lee in #3360
- fix(bigquery_usage): Gracefully failing while parsing GCP log events. by @varunbharill in #3367
- feat(ingest): allow disabling sample values in profiling by @aseembansal-gogo in #3355
- fix(docs): fix docs for developing on metadata ingestion by @aseembansal-gogo in #3353
- test(CI): Timeout build job by @EnricoMi in #3364
- docs(OIDC): add note that root user is still accessible by @aseembansal-gogo in #3372
- test(metadata-io): Run metadata-io tests in parallel by @EnricoMi in #3358
- test(ElasticSearch): Retry ES requests by @EnricoMi in #3377
- fix(ingest): redshift usage properly count queries by @treff7es in #3370
- feat(subtypes): Support Viz for "view" subtypes by @jjoyce0510 in #3376
- fix(graphql): Correctly return tags and legacy global tags field by @jjoyce0510 in #3378
- fix(ingest): fixing support for kafka key schemas when only key schemas are present by @swaroopjagadish in #3379
- fix(search): Small bug fixes for search redesign by @dexter-mh-lee in #3381
- test(airflow): remove unneeded execution_date parameter from test by @hsheth2 in #3368
- feat(ingest): add mariadb as possible source by @aseembansal-gogo in #3245
- fix(search): fixing user and group links in search results by @gabe-lyons in #3383
- fix(subtypes): Fix subtypes tab visibility by @jjoyce0510 in #3386
- Revert "test(ElasticSearch): Retry ES requests" by @gabe-lyons in #3385
- Revert "Revert "test(ElasticSearch): Retry ES requests"" by @gabe-lyons in #3392
- Adding kafka connect data platform by @jjoyce0510 in #3388
- Replace big query logo with the latest by @jjoyce0510 in #3387
- oidc: Add "name" claim extraction if present by @jjoyce0510 in #3384
- feat(ingest): teaching lookml source that athena has 2 parts in its dataset names by @swaroopjagadish in #3393
- fix(ingest): fix issues with lookml view file resolution on non-view … by @swaroopjagadish in #3397
- feat(search): Search insights foundations by @jjoyce0510 in #3391
- fix(graphQL): Populating deprecated Dataset description field by @jjoyce0510 in #3403
- feat(search): Support Boolean OR Filters in Rest.li APIs by @jjoyce0510 in #3344
- fix(lookml): Fixing lookml integration test. by @varunbharill in #3405
- fix(browse): Add more special character handling by @dexter-mh-lee in #3404
- fix(search): Reduce default batch size by @dexter-mh-lee in #3407
- fix(ui): Extract customProperties map from "properties" OR "info" entity field by @jjoyce0510 in #3410
- fix(gms): Add Rest.li Validation to ingestProposal by @jjoyce0510 in #3409
- fix(ingest): set athena dataset name with 2 parts in redash source by @Rukesh-Kapuluru in #3406
- feat(bigquery): Ingest lineage metadata from Bigquery logs. by @varunbharill in #3389
- docs(ingest): Add required permissions to Azure AD source doc by @jjoyce0510 in #3414
- fix(ingest): switch to avro from deprecated avro-python3 by @hsheth2 in #3412
- feat(spark): add spark logo and dataplatform. by @varunbharill in #3417
- fix(graph service): fix case where certain mcps can incorrectly delete graph edges by @gabe-lyons in #3418
- fix(datahub-upgrade): Update datahub upgrade to use MCL instead of MAE by @dexter-mh-lee in #3411
- feat(ingest): add complex types support in hive and trino source by @mayurinehate in #3375
- fix(docs): Add disk usage req to quickstart doc by @dexter-mh-lee in #3415
- test(modelValidation): Enhance Error Message by @RyanHolstien in #3394
- feat(metadata-service): Introducing EntityClient interface to avoid unnecessary HTTP calls. by @jjoyce0510 in #3421
- fix(deletes): make sure deletion removes lineage by @gabe-lyons in #3423
- feat(react): dynamically hide entity types that haven't been ingested by @gabe-lyons in #3419
- feat(ingest): support profiling tables in parallel by @hsheth2 in #3369
- fix(ingest): allow database alias, remove extra removal from connect_… by @aseembansal-gogo in #3352
- fix(bigquery): Fix error when computing lineage in bigquery is turned off by @varunbharill in #3428
- fix(oidc): Fix the oidc lastModifiedAt bug by @jjoyce0510 in #3429
- fix(dupe edges): Fix datajob duplicate edges in elastic by @jjoyce0510 in #3426
- fix(ingest): resolve click-default-group deprecation warning by @hsheth2 in #3427
- fix(browse): fix browse for entities without default browse logic by @gabe-lyons in #3422
- feat(ingest): add parallelism to looker source and datahub rest sink by @swaroopjagadish in #3431
- docs: Peloton adoption of Datahub by @arunvasudevan in #3433
- Docs branding by @jeffmerrick in #3432
New Contributors
- @xdl made their first contribution in #3285
- @thebouv made their first contribution in #3314
- @varunbharill made their first contribution in #3321
- @serefacet made their first contribution in #3317
- @zhoxie-cisco made their first contribution in #3297
- @AndreasTA-AW made their first contribution in #3270
- @mayurinehate made their first contribution in #3307
- @treff7es made their first contribution in #3370
- @Rukesh-Kapuluru made their first contribution in #3406
- @jeffmerrick made their first contribution in #3432
Full Changelog: v0.8.15...v0.8.16