github vmware/versatile-data-kit v0.11
Versatile Data Kit 0.11

19 months ago

Major features include:

Introduce data quality checks (pre-alpha) (for scd1 template)

Allow quality checks to be run before the data is inserted into the target table.
Currently, the checks done in the processing step do not cover whether the semantics of the data are correct. As a result, bad data could end up in the target table, which may be unwanted behavior.

Example:

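    def sample_check(tmp_table_name):
        # In this example, a table whose name contains "bad" fails the check.
        return "bad" not in tmp_table_name

    # Pass the check function to the template through its arguments.
    template_args["check"] = sample_check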
    job_input.execute_template(
        template_name="load/dimension/scd1",
        template_args=template_args,
    )
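
The idea of checking staged data before it reaches the target table can be sketched roughly as follows. This is only an illustration under stated assumptions, not the scd1 template's actual implementation: `load_with_check`, the in-memory `tables` dictionary, and the choice to leave the target untouched when the check fails are all hypothetical.

```python
from typing import Callable, Dict, List

Tables = Dict[str, List[dict]]


def sample_check(tmp_table_name: str) -> bool:
    # Same toy check as in the example above: names containing "bad" fail.
    return "bad" not in tmp_table_name


def load_with_check(
    tables: Tables,
    staging: str,
    target: str,
    check: Callable[[str], bool],
) -> bool:
    """Hypothetical helper: overwrite `target` with the staged rows
    only if `check(staging)` returns True."""
    if not check(staging):
        # Assumption: a failed check leaves the target table unchanged.
        return False
    tables[target] = list(tables[staging])
    return True


tables: Tables = {
    "tmp_ok": [{"id": 2}],
    "tmp_bad": [{"id": -1}],
    "dim": [{"id": 1}],
}

loaded_bad = load_with_check(tables, "tmp_bad", "dim", sample_check)
rows_after_bad = list(tables["dim"])
loaded_ok = load_with_check(tables, "tmp_ok", "dim", sample_check)
rows_after_ok = list(tables["dim"])
```

Because the check receives only the staging table name, a real check would presumably query that table to validate its contents.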

Jobs Query API (GraphQL) wildcard matching filter for team and job names

When querying information about jobs, users of the Jobs Query API can now use wildcard matching (for example, *search*) in GraphQL filters for job name and team name. Exact matching of search strings continues to work as before.
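
To illustrate the difference between a wildcard filter and an exact one, here is a minimal sketch using Python's standard `fnmatch` module. It only demonstrates the matching semantics of `*`; it is not how the Control Service implements the filter, and the job names and the `matches_filter` helper are made up for the example. Note that `fnmatchcase` also gives special meaning to `?` and `[...]`, which the Jobs Query API may not.

```python
from fnmatch import fnmatchcase


def matches_filter(value: str, pattern: str) -> bool:
    """Hypothetical helper: '*' matches any run of characters;
    a pattern without wildcards behaves like an exact match."""
    return fnmatchcase(value, pattern)


# Made-up job names for illustration.
job_names = ["daily-search-index", "search", "ingest-sales"]

wildcard_hits = [n for n in job_names if matches_filter(n, "*search*")]
exact_hits = [n for n in job_names if matches_filter(n, "search")]
```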

Provide User Agent when using VDK CLI

Users want to be able to determine where requests originated when analyzing and browsing telemetry data about VDK Control Service usage. The user agent can be set with an environment variable:

    export VDK_CONTROL_SERVICE_USER_AGENT=foo

or in config.ini

    [vdk]
    vdk_control_service_user_agent=foo

If not set, it defaults to "vdk-control-cli/{version} ({os.name}; {sys.platform})" + {python version}.
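
As a rough sketch of how such a default could be assembled and overridden, the snippet below builds a string in the format described above. The function names are hypothetical, and the exact form of the Python version suffix (`Python/<version>` here) is an assumption, since the notes only say that the Python version is appended.

```python
import os
import platform
import sys
from typing import Mapping


def default_user_agent(cli_version: str) -> str:
    """Build 'vdk-control-cli/{version} ({os.name}; {sys.platform})'
    followed by the Python version."""
    base = f"vdk-control-cli/{cli_version} ({os.name}; {sys.platform})"
    # Assumption: the suffix format is not specified in the release notes.
    return f"{base} Python/{platform.python_version()}"


def resolve_user_agent(cli_version: str, env: Mapping[str, str] = os.environ) -> str:
    """Prefer the configured value; fall back to the default otherwise."""
    return env.get("VDK_CONTROL_SERVICE_USER_AGENT") or default_user_agent(cli_version)


configured = resolve_user_agent("1.0", {"VDK_CONTROL_SERVICE_USER_AGENT": "foo"})
fallback = resolve_user_agent("1.0", {})
```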

New plugin: vdk-notebook

A new VDK plugin that supports running data jobs that consist of .ipynb files. See the VDK Notebook plugin page for more information.

vdk-ipython

This extension introduces a magic command for Jupyter. The command lets users load job_input for their current data job and use it freely while working in Jupyter.
See the VDK ipython plugin page for more information.

Installation

Check the installation page

What's Changed

New Contributors

Full Changelog: v0.10...v0.11
