New
- Added an example that demonstrates what a complete repository that takes advantage of many Dagster features might look like. It includes usage of IO Managers, modes / resources, unit tests, several cloud service integrations, and more! Check it out at examples/hacker_news!
- retry_number is now available on SolidExecutionContext, allowing you to determine within a solid function how many times the solid has been previously retried.
- Errors that are surfaced during solid execution now have clearer stack traces.
- When using Postgres or MySQL storage, the database mutations that initialize Dagster tables on startup now happen in atomic transactions, rather than individual SQL queries.
- The tags for Dagster-provided images in the Helm chart will now default to the current chart version.
- Removed the PIPELINE_INIT_FAILURE event type. A failure that occurs during pipeline initialization will now produce a PIPELINE_FAILURE event, as with all other pipeline failures.
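As an illustration of the atomic table initialization mentioned in the list above, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for Postgres or MySQL. The table names and the init_tables and table_names helpers are hypothetical and are not Dagster's actual schema or startup code; the sketch only shows why a single transaction leaves no half-initialized database behind.

```python
import sqlite3

# Hypothetical DDL statements for illustration; not Dagster's real schema.
DDL = [
    "CREATE TABLE runs (id INTEGER PRIMARY KEY, status TEXT)",
    "CREATE TABLE event_logs (id INTEGER PRIMARY KEY, run_id INTEGER)",
]


def init_tables(conn, statements):
    """Apply all statements in one transaction: either all take effect or none."""
    conn.execute("BEGIN")
    try:
        for stmt in statements:
            conn.execute(stmt)
        conn.execute("COMMIT")
    except sqlite3.Error:
        # Undo every statement that already ran in this transaction.
        conn.execute("ROLLBACK")
        raise


def table_names(conn):
    """Return the names of all tables currently present, sorted."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    return sorted(name for (name,) in rows)


if __name__ == "__main__":
    # isolation_level=None lets us issue BEGIN / COMMIT / ROLLBACK ourselves.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    init_tables(conn, DDL)
    print(table_names(conn))
```

If any statement in the batch fails, the rollback removes the tables created earlier in the same batch, so a retry starts from a clean state.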
Bugfixes
- When viewing run logs in Dagit, in the stdout/stderr log view, switching the filtered step did not work. This has been fixed. Additionally, the filtered step is now present as a URL query parameter.
- The get_run_status method on the Python GraphQL client now returns a PipelineRunStatus enum instead of the raw string value in order to align with the mypy type annotation. Thanks to Dylan Bienstock for surfacing this bug!
- When a docstring on a solid doesn’t match the reST, Google, or Numpydoc formats, Dagster no longer raises an error.
- Fixed a bug where memoized runs would sometimes fail to execute when specifying a non-default IO manager key.
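To show why returning an enum rather than a raw string matters for the get_run_status fix in the list above, here is a small sketch using only the standard library. RunStatus and parse_status are hypothetical stand-ins, not Dagster's PipelineRunStatus or the client's real implementation; the member names are illustrative.

```python
from enum import Enum


class RunStatus(Enum):
    """Hypothetical stand-in for a run-status enum; members are illustrative."""

    STARTED = "STARTED"
    SUCCESS = "SUCCESS"
    FAILURE = "FAILURE"


def parse_status(raw: str) -> RunStatus:
    """Convert a raw status string into the enum.

    Enum lookup by value raises ValueError for unknown strings, so a
    malformed response is caught here instead of leaking to callers.
    """
    return RunStatus(raw)


if __name__ == "__main__":
    status = parse_status("SUCCESS")
    # Callers compare against enum members, which matches the declared
    # return type and is checkable by mypy.
    if status is RunStatus.SUCCESS:
        print("run succeeded")
```

A plain Enum member does not compare equal to its string value, so code that previously compared the result against a string literal would need to compare against the enum member instead.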
Experimental
- Added the k8s_job_executor, which executes solids in separate Kubernetes jobs. With the addition of this executor, you can now choose at runtime between single pod and multi-pod isolation for solids in your run. Previously this was only configurable for the entire deployment: you could either use the K8sRunLauncher with the default executors (in_process and multiprocess) for low isolation, or you could use the CeleryK8sRunLauncher with the celery_k8s_job_executor for pod-level isolation. Now, your instance can be configured with the K8sRunLauncher and you can choose between the default executors or the k8s_job_executor.
- The DagsterGraphQLClient now allows you to specify whether to use HTTP or HTTPS when connecting to the GraphQL server. In addition, error messages during query execution or connecting to Dagit are now clearer. Thanks to @emily-hawkins for raising this issue!
- Added experimental hook invocation functionality. Invoking a hook will call the underlying decorated function. For example:
from dagster import build_hook_context

# my_hook is assumed to be a hook you have already defined elsewhere
my_hook(build_hook_context(resources={"foo_resource": "foo"}))
- Resources can now be directly invoked as functions. Invoking a resource will call the underlying decorated initialization function.
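from dagster import build_init_resource_context, resource

@resource(config_schema=str)
def my_basic_resource(init_context):
    # Return the config value passed to the resource unchanged
    return init_context.resource_config

# Build a context carrying the config, then invoke the resource directly
context = build_init_resource_context(config="foo")
assert my_basic_resource(context) == "foo"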
- Improved the error message when a pipeline definition is incorrectly invoked as a function.
Documentation
- Added a section on testing custom loggers: https://docs.dagster.io/master/concepts/logging/loggers#testing-custom-loggers