DataHub v0.8.7

pre-release, 2 years ago

Release Stability

  • Several bugs reported against this release are fixed in 0.8.8. Users are strongly advised to skip this release and upgrade directly to 0.8.8 or later.

Release Highlights

  • Dataset Profiling and support for time-series metadata
  • UI for ML Models, Features; support for AWS SageMaker and Feast
  • CLI: support for rollback operations after ingestion
  • Integration fixes for Looker, dbt, and many more.
  • Demos for all these features are available in our July Townhall video

ChangeLog

  • #3021 @kevinhu feat(ingest): extract dbt versions into custom properties
  • #3020 @gabe-lyons fix(caching): refetch query on update
  • #3019 @kevinhu fix(ingest): don't assume Glue job description always exists
  • #3000 @topwebtek7 fix(react): fix weird 0 rendering possible bugs
  • #3018 @dexter-mh-lee feat(ingest): add kafka emitters for MetadataChangeProposal format
  • #2999 @jjoyce0510 fix(gms): Adding Rest.li Write-Time Model Validation
  • #3009 @jjoyce0510 fix(quickstart): Bumping Default Memory for GMS and Frontend
  • #3007 @jjoyce0510 fix(gms): better logging on failed MCL / MAE
  • #3008 @gabe-lyons fix(blank pages): removing apollo caching
  • #3006 @jjoyce0510 fix(ci): using AspectExtractor instead of removed SnapshotToAspectMap
  • #2998 @gabe-lyons fix(graphql): fetching data platforms using standard procedure
  • #2944 @EnricoMi refactor(test): Refactor GraphService tests
  • #2972 @jameslamb fix(ingest): map all LookML dimension types to corresponding avro types
  • #3005 @dexter-mh-lee fix(ingestion): Safeguard against empty values for profile ingestion
  • #3002 @dexter-mh-lee fix(datahub-upgrade): add config registry to datahub upgrade container
  • #3003 @jjoyce0510 fix(dataset stats): Fix checks for existence of row and column counts
  • #2997 @topwebtek7 feat(react): update dataset documents tab with a merged document column
  • #2991 @topwebtek7 feat(react): update search result has result counts for each entities that has result
  • #2983 @jjoyce0510 Introducing TimeSeries Aspects + Dataset Profile (Stats) Aspect
  • #2984 @dexter-mh-lee fix(browse): Fix browse pagination and multi-browse path issue
  • #2995 @aseembansal-gogo docs(ingest): Add instructions to install required dependency
  • #2960 @gabe-lyons feat(deletes): add run commands (list, show, rollback) to datahub ingest
  • #2994 @chinmay-bhat docs(ingest): fixed Snowflake recipe to escape dollar-sign
  • #2981 @hsheth2 docs: remove a few outdated docs
  • #2988 @jjoyce0510 docs: add docs on extracting container logs
  • #2963 @hsheth2 test(ingestion): run full tests on both python versions
  • #2967 @jameslamb fix(ingest): add more debug logging to LookML metadata ingestion
  • #2966 @jameslamb fix(ingest): ensure that LookML files are always parsed in the same order
  • #2965 @jameslamb fix(ingest): ensure workunits are created for all LookML views
  • #2982 @gabe-lyons fix(tags): fixing tag applied to module for tags w/ colons in the name
  • #2961 @gabe-lyons feat(ml-model): adding ml models and ml model groups
  • #2975 @kevinhu feat(ingest): type stubs for boto3
  • #2979 @jameslamb perf(ingest): remove unused variable in Looker ingestion
  • #2980 @hsheth2 fix(ingest): infer bigquery project identifier
  • #2978 @chinmay-bhat fix(ingest): fix hive ingestion to respect database configuration
  • #2976 @hsheth2 feat(ingest): stricter deserialization for MCE JSONs
  • #2959 @kevinhu feat(docs): tutorial for writing a custom transformer
  • #2977 @hsheth2 fix(ingestion): isolate dependency requirements of airflow hooks
  • #2962 @hsheth2 feat(ingest): add timezone validation to bigquery usage
  • #2974 @dexter-mh-lee fix(elasticsearch-setup): fix elasticsearch setup for aws
  • #2952 @hsheth2 test(ingestion): test multiple python versions in CI
  • #2958 @hsheth2 feat(ingest): add Airflow TaskFlow example
  • #2950 @kevinhu fix(ingest): patch lookml types and refactor ingestion sources layout
  • #2957 @jameslamb fix(ingest): match nested LookML files mentioned in 'include' statements
  • #2956 @gabe-lyons Revert "fix(gql): removing data platform caching in gql (#2947)"
  • #2955 @kevinhu feat(ingest): ingest descriptions from dbt models
  • #2948 @hsheth2 fix(ingestion): add more mypy annotations
  • #2946 @hsheth2 feat(ingestion): test GMS connections before ingestion
  • #2947 @gabe-lyons fix(gql): removing data platform caching in gql
  • #2949 @hsheth2 test(ingestion): fix flaky package discovery test
  • #2951 @kevinhu feat(docs): update videos and integration logos
  • #2953 @hsheth2 fix(ingestion): resolve test bugs for 3.6
  • #2943 @kevinhu feat(ingest): add logo and platform entry for Glue
  • #2940 @hsheth2 fix(ingest): handle quotes in lookml properly
  • #2938 @kevinhu feat(models): remove versions from metrics and hyperparams
  • #2942 @hsheth2 fix(ingestion): make snowflake database names lowercase
  • #2939 @hsheth2 feat(ingest): use urn builders in looker and validate data platforms
  • #2941 @aseembansal-gogo refactor(ingest): make code pythonic
  • #2937 @kevinhu fix(ingest): allow custom Glue scripts
  • #2921 @kafkahw refactor(datahub-web): removing frontend Ember app (i.e. datahub-web folder)
  • #2913 @hsheth2 fix(ingest): refactor + fix recursion in lookml file loading logic
  • #2925 @hsheth2 feat(ingest): improve bigquery-usage robustness and docs
  • #2931 @aseembansal-gogo fix(ingest): fix workunit name to be consistent with other sources
  • #2935 @kevinhu fix(ingest): fix browsepaths and ownership urns
  • #2930 @aseembansal-gogo fix(ingest): glue add support for mapping varchar, decimal types
  • #2929 @kevinhu feat(ingest): refactor mlModel grouping and add browsepaths
  • #2934 @hsheth2 docs(ingest): update looker + docker script docs
  • #2926 @hsheth2 feat(ingest): add make_data_platform_urn method to builder
  • #2932 @topwebtek7 feat(react): surface edited descriptions on search preview for dataset, datajob, dataflow, chart, dashboard
  • #2911 @hsheth2 fix(ingest): add quotes to secured kafka yaml config example
  • #2927 @kevinhu feat(ingest): dbt aliases
  • #2806 @saxo-lalrishav fix(react): enable relation between glossary term and datasets searchable
  • #2910 @kevinhu feat(ingest): extract SageMaker metrics, hyperparameters, and external URLs
  • #2915 @aseembansal-gogo docs: update docs for consistency in naming
  • #2922 @kevinhu feat(ingest): test dbt ingestion with and without schemas
  • #2924 @hsheth2 fix(ingest): note that views are not supported for Athena
  • #2920 @hsheth2 feat(ingestion): support multiple project IDs in bigquery usage stats
  • #2923 @hsheth2 fix(ingest): pin snowflake sqlalchemy connector
  • #2909 @hsheth2 feat(ingest): add support for Oracle spatial types
  • #2917 @kevinhu docs(ingest): update sample recipe and test input for dbt
  • #2887 @topwebtek7 feat(mlFeatureTable): add graphql, ui/ux for mlFeatureTable, mlFeature, mlPrimaryKey entities
  • #2916 @kevinhu fix(ingest): stringify all dbt custom props
  • #2898 @aseembansal-gogo feat(ingest): Add option to change name of database for postgres
  • #2912 @hsheth2 fix(ingest): issue a warning if the column list is empty
  • #2894 @kevinhu feat(ingest): lineage for SageMaker model endpoints and groups
  • #2905 @hsheth2 feat(ingest): add can_add_aspect method for MCEs
  • #2906 @hsheth2 test(ingest): update tox test configurations and test airflow 2.x by default
  • #2904 @jjoyce0510 fix(frontend): Don't use Apollo Cache for IsAnalyticsEnabled query.
  • #2877 @remisalmon feat(ingest): use node comment as description if existing else default to key
  • #2889 @hsheth2 fix(react): avoid displaying "0" for ignored timestamps
  • #2890 @gabe-lyons fix(search): fixing case where someone issues a null query
  • #2893 @hsheth2 fix(ingest): use logger.warning instead of logger.warn
  • #2888 @jameslamb fix(ingest): change LookMLSource._get_upsteam_lineage() to _get_upstream_lineage()
  • #2901 @topwebtek7 feat(react): update schema history visualizing, truncate long type, original desc bug
  • #2891 @hsheth2 fix(ingest): correct globs in lookml model discovery
  • #2902 @kevinhu feat(ingest): add connectivity check for Looker
  • #2597 @wan54 feat(react): configure Cypress + MirageJS + GraphQL mock for functional testing plus a couple of example tests
  • #2903 @shirshanka docs: update docs for July townhall
  • #2900 @kevinhu fix(ingest): string-ify dbt custom props
  • #2899 @jjoyce0510 fix(docs): fixing miscellaneous docs
  • #2788 @saxo-lalrishav fix(glossary): default browse path for glossary term
  • #2868 @kevinhu feat(ingest): extract lineage between SageMaker jobs and models
  • #2884 @dexter-mh-lee fix(search): Fix index builder
  • #2883 @hsheth2 docs: revamp adoption section
  • #2882 @hsheth2 fix(ingest): fix druid misconfiguration bug
  • #2881 @hsheth2 fix(ingest): default to unlimited query log delay in bigquery-usage
  • #2790 @saxo-lalrishav fix(search): enable search on business glossary terms
  • #2872 @hsheth2 build(ingest): reduce dependencies for dev install
  • #2874 @topwebtek7 fix(react): fix bug in description update modal
  • #2866 @hsheth2 build(ingestion): add version prompt to release script
  • #2869 @kevinhu feat(ingest): update golden files only when diff fails
  • #2876 @kevinhu feat(ingest): extract dbt meta fields
  • #2875 @hsheth2 docs(quickstart): add default password to quickstart
  • #2873 @hsheth2 fix(quickstart): update compose spec version
  • #2862 @hsheth2 build(ingest): separate metadata-ingestion build workflow fully
  • #2867 @hsheth2 fix(build): increase retries for dependency fetches
  • #2849 @kevinhu feat(ingest): add browse paths + dataplatform for Feast features
  • #2859 @kevinhu feat(docs): swap Medium and videos sections
  • #2858 @hsheth2 feat(ingest): support dynamic imports for transformer methods
