New
- Asset backfills launched from the asset graph now respect partition mappings. For example, if partition N of asset2 depends on partition N-1 of asset1, and both of those partitions are included in a backfill, asset2’s partition N won’t be backfilled until asset1’s partition N-1 has been materialized.
- Asset backfills launched from the asset graph will now only materialize each non-partitioned asset once - after all upstream partitions within the backfill have been materialized.
- Executors can now be configured with a `tag_concurrency_limits` key that allows you to specify limits on the number of ops with certain tags that can be executing at once within a single run. See the docs for more information.
- `ExecuteInProcessResult`, the type returned by `materialize`, `materialize_to_memory`, and `execute_in_process`, now has an `asset_value` method that allows you to fetch output values by asset key.
- `AssetIn`s can now accept `Nothing` for their `dagster_type`, which allows omitting the input from the parameters of the `@asset`- or `@multi_asset`-decorated function. This is useful when you want to specify a partition mapping or metadata for a non-managed input.
- The `start_offset` and `end_offset` arguments of `TimeWindowPartitionMapping` now work across `TimeWindowPartitionsDefinitions` with different start dates and times.
- If `add_output_metadata` is called multiple times within an op, asset, or IO manager `handle_output`, the values will now be merged, instead of later dictionaries overwriting earlier ones.
- `materialize` and `materialize_to_memory` now both accept a `tags` argument.
- Added `SingleDimensionDependencyMapping`, a `PartitionMapping` object that defines a correspondence between an upstream single-dimensional partitions definition and a downstream `MultiPartitionsDefinition`.
- The `RUN_DEQUEUED` event has been removed from the event log, since it was duplicative with the `RUN_STARTING` event.
- When an Exception is raised during the execution of an op or asset, Dagit will now include the original Exception that was raised, even if it was caught and another Exception was raised instead. Previously, Dagit would only show exception chains if the Exception was included using the `raise Exception() from e` syntax.
- [dagit] The Asset Catalog table in Dagit is now a virtualized infinite-scroll table. It is searchable and filterable just as before, and you can now choose assets for bulk materialization without having to select across pages.
- [dagit] Restored some metadata to the Code Locations table, including image, python file, and module name.
- [dagit] Viewing a partition on the asset details page now shows both the latest materialization and also all observations about that materialization.
- [dagit] Improved the loading time of the backfills page.
- [dagit] Improved performance when materializing assets with very large partition sets
- [dagit] Moving around asset and op graphs while selecting nodes is easier - drag gestures no longer clear your selection.
- [dagster-k8s] The Dagster Helm chart now allows you to set an arbitrary Kubernetes config dictionary to be included in the launched job and pod for each run, using the `runK8sConfig` key in the `k8sRunLauncher` section. See the docs for more information.
- [dagster-k8s] `securityContext` can now be set in the `k8sRunLauncher` section of the Dagster Helm chart.
- [dagster-aws] The `EcsRunLauncher` can now be configured with CPU and memory resources for each launched job. Previously, individual jobs needed to be tagged with CPU and memory resources. See the docs for more information.
- [dagster-aws] The `S3ComputeLogManager` now takes an `upload_extra_args` argument, which is passed through as the `ExtraArgs` parameter to the file upload call.
- [dagster-airflow] Added `make_dagster_definitions_from_airflow_dags_path` and `make_dagster_definitions_from_airflow_dag_bag` for creating Dagster definitions from a given Airflow DAG file path or DagBag.
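To illustrate the exception-chain item in the list above: Python can attach a caught exception to a new one in two ways. `raise ... from e` sets `__cause__` explicitly, while raising inside an `except` block without `from` only sets `__context__` implicitly. The sketch below is plain Python with hypothetical function names (it uses no Dagster APIs) and only shows where the original exception lives in each case.

```python
def explicit_chain():
    try:
        int("not a number")
    except ValueError as e:
        # Explicit chaining: the original error is stored on __cause__.
        raise RuntimeError("op failed") from e


def implicit_chain():
    try:
        int("not a number")
    except ValueError:
        # No "from": Python still records the original error on __context__.
        raise RuntimeError("op failed")


def original_error(exc):
    # Prefer the explicit cause, fall back to the implicit context.
    return exc.__cause__ or exc.__context__


def capture(fn):
    # Run fn and return the RuntimeError it raises (or None).
    try:
        fn()
    except RuntimeError as exc:
        return exc
    return None


explicit = capture(explicit_chain)
implicit = capture(implicit_chain)
print(type(original_error(explicit)).__name__)
print(type(original_error(implicit)).__name__)
```

In the implicit case `__cause__` is `None`, which is why a tool that only follows `__cause__` would miss the original error.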
Bugfixes
- Fixed a bug where ad-hoc materializations of assets were not correctly retrieving metadata of upstream assets.
- Fixed a bug that caused `ExperimentalWarning`s related to `LogicalVersions` to appear even when version-based staleness was not in use.
- Fixed a bug in the asset reconciliation sensor that caused multi-assets to be reconciled when some, but not all, of the assets they depended on were reconciled.
- Fixed a bug in the asset reconciliation sensor that caused it to only act on one materialization per asset per tick, even when multiple partitions of an asset were materialized.
- Fixed a bug in the asset reconciliation sensor that caused it to never attempt to rematerialize assets which failed in their last execution. Now, it will launch the next materialization for a given asset at the same time that it would have if the original run had completed successfully.
- The `load_assets_from_modules` and `load_assets_from_package_module` utilities now will also load cacheable assets from the specified modules.
- The `dequeue_num_workers` config setting on `QueuedRunCoordinator` is now respected.
- [dagit] Fixed a bug that caused a “Maximum recursion depth exceeded” error when viewing partitioned assets with self-dependencies.
- [dagit] Fixed a bug where “Definitions loaded” notifications would constantly show up in cases where there were multiple dagit hosts running.
- [dagit] Assets that are partitioned no longer erroneously appear "Stale" in the asset graph.
- [dagit] Assets with a freshness policy no longer appear stale when they are still meeting their freshness policy.
- [dagit] Viewing Dagit in Firefox no longer results in erroneous truncation of labels in the left sidebar.
- [dagit] Timestamps on the asset graph are smaller and have an appropriate click target.
- [dagster-databricks] The `databricks_pyspark_step_launcher` will now cancel the relevant databricks job if the Dagster step execution is interrupted.
- [dagster-databricks] Previously, the `databricks_pyspark_step_launcher` could exit with an unhelpful error after receiving an HTTPError from databricks with an empty message. This has been fixed.
- [dagster-snowflake] Fixed a bug where calling `execute_queries` or `execute_query` on a `snowflake_resource` would raise an error unless the `parameters` argument was explicitly set.
- [dagster-aws] Fixed a bug in the `EcsRunLauncher` when launching many runs in parallel. Previously, each run risked hitting a `ClientError` in AWS for registering too many concurrent changes to the same task definition family. Now, the `EcsRunLauncher` recovers gracefully from this error by retrying it with backoff.
- [dagster-airflow] Added `make_dagster_definitions_from_airflow_dags_path` and `make_dagster_definitions_from_airflow_dag_bag` for creating Dagster definitions from a given airflow Dag file path or DagBag.
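The `EcsRunLauncher` fix above is described as retrying with backoff. A minimal sketch of that general pattern in plain Python follows. The `retry_with_backoff` helper, the `TooManyConcurrentChanges` exception, and the delay values are all hypothetical and are not taken from dagster-aws.

```python
import time


class TooManyConcurrentChanges(Exception):
    """Hypothetical stand-in for the AWS ClientError described above."""


def retry_with_backoff(fn, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fn, retrying on TooManyConcurrentChanges with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except TooManyConcurrentChanges:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error
            # Wait 1x, 2x, 4x, ... the base delay before the next attempt.
            sleep(base_delay * 2 ** attempt)


calls = {"count": 0}


def flaky_register():
    # Fails twice, then succeeds, to simulate transient contention.
    calls["count"] += 1
    if calls["count"] < 3:
        raise TooManyConcurrentChanges()
    return "registered"


delays = []
# Record the requested delays instead of actually sleeping.
result = retry_with_backoff(flaky_register, sleep=delays.append)
print(result, calls["count"], delays)
```

Passing `sleep` as a parameter keeps the sketch fast to run and easy to test; a real launcher would use actual waits and likely add jitter.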
Community Contributions
- Fixed a metadata loading error in `UPathIOManager`, thanks @danielgafni!
- [dagster-aws] `FakeS3Session` now includes additional functions and improvements to align with the boto3 S3 client API, thanks @asharov!
- Typo fix from @vpicavet, thank you!
- Repository license file year and company update, thanks @vwbusguy!
Experimental
- Added experimental `BranchingIOManager` to model the use case where you wish to read upstream assets from production environments and write them into a development environment.
- Added `create_repository_using_definitions_args` to allow for the creation of named repositories.
- Added the ability to use Python 3 typing to define and access op and asset config.
- [dagster-dbt] Added `DbtManifestAssetSelection`, which allows you to define selections of assets loaded from a dbt manifest using dbt selection syntax (e.g. `tag:foo,path:marts/finance`).
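The idea behind the `BranchingIOManager` item above can be sketched without Dagster: reads fall through to production data, while writes only ever touch the development store. The class and method names below are hypothetical and use in-memory dictionaries. The rule that a value already written in development takes precedence on read is an assumption of this sketch, not a statement about the actual implementation.

```python
class BranchingStore:
    """Hypothetical sketch: read from production, write to development."""

    def __init__(self, production, development):
        self.production = production    # treated as read-only
        self.development = development  # receives every write

    def load(self, key):
        # Assumption: prefer a development copy if one was written here.
        if key in self.development:
            return self.development[key]
        return self.production[key]

    def store(self, key, value):
        # Writes never reach the production store.
        self.development[key] = value


prod = {"raw_events": [1, 2, 3]}
dev = {}
store = BranchingStore(prod, dev)

upstream = store.load("raw_events")        # comes from production
store.store("event_count", len(upstream))  # lands in development only
print(dev, prod)
```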
Documentation
- There’s now only one Dagster Cloud Getting Started guide, which includes instructions for both Hybrid and Serverless deployment setups.
- Lots of updates throughout the docs to clean up remaining references to `@repository`, replacing them with `Definitions`.
- Lots of updates to the dagster-airflow documentation, including a tutorial for getting started with Dagster from an Airflow background, a migration guide for moving from Airflow to Dagster, and a terminology/concept map from Airflow onto Dagster.
All Changes
1.1.7...1.1.8
Adjust resources guide to be in a Definitions world
by @schrockn
add thread name prefix to run dequeue workers (#11155)
by @alangenfeld
make schedules produced by build_schedule_from_partitioned_job more p… (#11147)
by @sryza
[k8s launcher] security context (#9788)
by @alangenfeld
[run coordinator] fix threaded tests (#11139)
by @alangenfeld
[docs] - [definitions] Update Dagit + tutorial screenshots (#11089)
by @erinkcochran87
[docs] - [definitions] Update Partitions concept docs (#11030)
by @erinkcochran87
Port asset sensor guide to Definitions
by @schrockn
Use buildkite_deps.txt to declare explicit buildkite deps (#11025)
by @jmsanders
Trigger builds when .ini files change (#11161)
by @jmsanders
Revert "keep track of max timestamps client side for code location up… (#11162)
by @prha
[docs] - Fix links (#11163)
by @erinkcochran87
Add create_repository_using_definitions_args
by @schrockn
Change Graph-backed asset guide code examples to be on Definitions
by @schrockn
Delete repository unit testing using load_all_definitions in testing guide
by @schrockn
1.1.7 Changelog (#11165)
by @jamiedemaria
Do a single pip install in tox suites [OSS] (#11164)
by @gibsondan
Update declarative scheduling guide to include Definitions
by @schrockn
add thread name prefix to grpc server (#11158)
by @alangenfeld
Automation: versioned docs for 1.1.7
by @elementl-devtools
Fixup rename of requirements.txt -> buildkite_deps.txt (#11160)
by @jmsanders
fix precedence ordering when merging dictionaries in container context (#11169)
by @gibsondan
support in and len on PartitionsSubsets (#11172)
by @sryza
Disable breaking azure tests in master (#11183)
by @schrockn
[docs] Fix typo in title (#11156)
by @vpicavet
Fix regression with passing in None to snowflake resource (#11182)
by @gibsondan
[dagit] Add tests for partition health data parsing / accessors (#11114)
by @bengotow
Allow setting resources on EcsContainerContext and EcsRunLauncher (#11170)
by @gibsondan
fix a small bug in UPathIOManager (#11110)
by @danielgafni
[dagster-dbt] in tests, pin dbt rpc < 0.3.0 (#11196)
by @OwenKephart
Move core_tests/storage_tests to storage_tests (#11180)
by @schrockn
skip sqlite env var test on windows (#11195)
by @gibsondan
Move old_sqlalchemy_tests to only run on storage_tests (#11181)
by @schrockn
[dagster-airflow] airflow terminology mapping (#11015)
by @Ramshackle-Jamathon
Move core_tests/definitions_tests to definitions_tests (#11184)
by @schrockn
fix dequeue_num_workers setting (#11198)
by @alangenfeld
Move core_tests/asset_defs_tests to asset_defs_tests (#11186)
by @schrockn
update timeout in sensor run tests (#11193)
by @jamiedemaria
Move core_tests/launcher_tests to launcher_tests (#11187)
by @schrockn
add assets example to AssetSelection apidoc (#11194)
by @sryza
Move various logging tests into logging_tests (#11188)
by @schrockn
feat(dbt-cloud): add Dagster run id to dbt Cloud run (#11005)
by @rexledesma
[docs] snowflake reference page (#10985)
by @jamiedemaria
[declarative-scheduling] Fix bug with declarative scheduling where repeated calls to get_latest_materialization_record could return incorrect results (#11214)
by @OwenKephart
only reconcile multi-assets if all parents are reconciled (#11190)
by @sryza
[dagit] Fix dragging on the DAG clearing your selection / clicking links (#11202)
by @bengotow
[dagster-aws] Extend fake S3 resource (#11105)
by @asharov
[dagit] Optimizations to backfill UI for large partition key sets (#11201)
by @bengotow
do not compute projected logical versions of partitioned assets (#11204)
by @sryza
factor out asset reconciliation graph traversal into util (#11206)
by @sryza
[dagit] Fresh + Stale should not show “Stale” on the Asset Graph (#11234)
by @bengotow
[dagit] Round middle truncate calculations for Firefox (#11203)
by @bengotow
support tags argument on materialize and materialize_to_memory (#11225)
by @sryza
docs: update license to include year and company (#11231)
by @vwbusguy
[docs] - [definitions] Update OSS deployment overview (#11199)
by @erinkcochran87
split backfill table so partition status is fetched lazily (#11205)
by @prha
fix tslint (#11242)
by @alangenfeld
[dagit] webpack-bundle-analyzer (#11230)
by @hellendag
[dagit] Virtualized Asset Catalog (#11168)
by @hellendag
Can remove hardcoded_resource from project_fully_featured because of Definitions (#11243)
by @schrockn
[dagit] Replace moment-timezone (#11197)
by @hellendag
Remove unused backfillStatus, which loads individual run status (#11246)
by @prha
Retry registering task definitions (#11192)
by @jmsanders
Rerun dagster_tests with --snapshot-update (#11208)
by @schrockn
Add test case for binding assets before passing to Definitions (#11216)
by @schrockn
[dagster-airflow] from airflow to dagster guide updates (#11218)
by @Ramshackle-Jamathon
[graphql test] share schema instance (#11236)
by @alangenfeld
Use FromSourceAsset instead of FromRootInputManager when loading assets with input managers (#11233)
by @jamiedemaria
Use bare objects for the hacker news resources (#11249)
by @schrockn
Use bare I/O manager in fully featured (#11250)
by @schrockn
Delete unused fixed_s3_pickle_io_manager (#11251)
by @schrockn
Consolidate _resolve_bound_config and have logger and resource use the same one (#11209)
by @schrockn
Delete op version of _resolve_bound_config and call generic one (#11211)
by @schrockn
Skip race condition tests (#11286)
by @jmsanders
Skip flaky test (#11292)
by @jmsanders
[dagit] Replace remaining moment usage (#11278)
by @hellendag
Update op-retries.mdx - no solid (#11284)
by @yuhan
[docs] airbyte guide repository -> definitions (#11296)
by @yuhan
[docs] fivetran guide repository -> definitions (#11297)
by @yuhan
[docs] dbt guide repository -> definitions (#11298)
by @yuhan
[docs] dbt cloud guide repository -> definitions (#11299)
by @yuhan
[declarative-scheduling] Rework constraint passing (#11229)
by @OwenKephart
[dagster-airflow] basic airflow migration guide (#11012)
by @Ramshackle-Jamathon
[dagstermill] add retries to flaky tests (#11291)
by @jamiedemaria
Product tour component (#11227)
by @salazarm
[dagit] Utility for timezone-aware date/time formatting (#11285)
by @hellendag
[dagit] Restore some metadata on Code Locations page (#11281)
by @hellendag
[docs] - [definitions] Update Loggers Concept docs (#11171)
by @erinkcochran87
[docs] - Update ECS deployment guide (#11289)
by @erinkcochran87
[docs] - [definitions] Update Dagster instance docs (#11241)
by @erinkcochran87
Pass duckdb_path to __init__ rather than relying on context (#11300)
by @schrockn
[structured config] Base structured config implementation (#11268)
by @benpankow
[structured config] Add support for default values (#11272)
by @benpankow
Add gql pin (#11312)
by @gibsondan
[structured config] Add support for class, field descriptions (#11274)
by @benpankow
Move env var injection earlier in step command (#11239)
by @gibsondan
accrete metadata with multiple calls to add_output_metadata (#9518)
by @sryza
[docs] - [definitions] - Update Executors docs (#11247)
by @erinkcochran87
[docs] - [definitions] Update Helm guide (#11320)
by @erinkcochran87
Eliminating unnecessary output_context.resource_config check (#11313)
by @schrockn
[docs] - Consolidate sections in Resources concept doc (#11316)
by @erinkcochran87
[docs] - [definitions] Update SDA Concept docs (#11018)
by @erinkcochran87
[docs] - [definitions] Update Run launchers guide (#11200)
by @erinkcochran87
[docs] - Re-do Cloud Getting Started guides (#10429)
by @erinkcochran87
[docs] - [definitions] Update Docker guide (#11295)
by @erinkcochran87
backfill perf: swap backfill requested for num cancelable (#11304)
by @prha
allow upload config to pass through s3 compute log manager (#11317)
by @prha
[apidoc] define_asset_job repository -> Definitions (#11302)
by @yuhan
Ignore stale timestamps from code location updates (#11173)
by @prha
[structured config] Fix usage with Assets (#11327)
by @benpankow
Include parent exceptions in Dagster errors, even if they weren't explicitly raised (#11306)
by @gibsondan
[docs] - [definitions] Update Dagster daemon docs (#11226)
by @erinkcochran87
Docs release backfill 1.1.7 (#11328)
by @yuhan
Use dark mode logo on README.md (#11326)
by @hellendag
AssetGraph.from_external_assets().get_required_multi_asset_keys() (#11318)
by @sryza
[declarative-scheduling] Update retry logic to attempt to retry failed materializations after some time has passed (#11294)
by @OwenKephart
Add instance property to InputContext (#11331)
by @schrockn
chore: add method to strip error stack trace (#11307)
by @rexledesma
Hoist schema and database to DbIOManager constructor (#11301)
by @schrockn
Rename _resolve_bound_config to resolve_bound_config (#11287)
by @schrockn
Add asset_materialization property to EventLogEntry (#11340)
by @schrockn
store repository on external asset graph (#11332)
by @sryza
Revert "Use dark mode logo on README.md (#11326)" (#11345)
by @hellendag
[dagster-io/eslint-config] v1.0.6 (#11324)
by @hellendag
Add get_implicit_global_asset_job on Definitions and RepositoryDefinition (#11279)
by @schrockn
feat: add retry number and url to integration api call failure (#11308)
by @rexledesma
Fix dagster-graphql circular import (BK broken) (#11339)
by @smackesey
Skipping dbt rpc resource tests (#11357)
by @schrockn
skip test_threaded_ephemeral_instance (#11359)
by @schrockn
Allow specifying resources in Op/Asset params list (#11322)
by @benpankow
change version placeholder 0+dev -> 1!0+dev (#11334)
by @smackesey
[structured config] Structured-config-backed Resources (#11321)
by @benpankow
[structured config] Add traditional resource wrapper base class (#11337)
by @benpankow
[structured config] Fix test importing functools.cached_property on py37 tests (#11361)
by @benpankow
[structured config] Structured-config-backed IO managers (#11343)
by @benpankow
fix some bugs in ExternalAssetGraph (#11350)
by @sryza
more refactors to reconciliation sensor (#11223)
by @sryza
[dagster-io/eslint-config] v1.0.7 (#11347)
by @hellendag
Add Materialize Button hook (#11319)
by @salazarm
Allow setting raw k8s config at the run launcher / container context level (#11333)
by @gibsondan
Make it more likely that we hit our lock (#11290)
by @jmsanders
fix __contains__ of TimeWindowPartitionsSubset (#11380)
by @sryza
[docs] - add note on pandas integration to redirect users interested in pandas w/out validation (#11342)
by @slopp
fix asset reconciliation bug that ignores earlier partitions (#11336)
by @sryza
fix typo on partitions concepts page (#11349)
by @sryza
Clicking on top-level concept sections takes you to a page (#10140)
by @sryza
Add single dimension -> multidimension partition mapping (#10910)
by @clairelin135
[RFC] Update cached_status_data column 1/2 (#10821)
by @clairelin135
Remove stray console.log (#11410)
by @gibsondan
ExecuteInProcessResult.asset_value (#11403)
by @sryza
Elminate PIPELINE_DEQUEUED event (#11393)
by @gibsondan
[dagit] Begin adding new graphql-codegen (#11411)
by @hellendag
Docs for new ECS resource options (#11391)
by @gibsondan
Docs for new k8s configuration options (#11390)
by @gibsondan
Fix malformed dagster_cloud.yaml in code location docs (#11416)
by @gibsondan
[dagit] Refactor mutations to new GraphQL codegen (#11414)
by @hellendag
[dagit] Refactor queries in src/workspace (#11418)
by @hellendag
[dagit] dedupeFragments (#11426)
by @hellendag
relax matching criteria for TimeWindowPartitionMappings w/offsets (#11422)
by @sryza
[dagster-databricks] handle empty responses (#11430)
by @OwenKephart
Export RunRecord in the public API (#11427)
by @gibsondan
[dagit] Refactor queries in src/runs (#11419)
by @hellendag
[dagit] Refactor queries in Assets (#11423)
by @hellendag
[dagit] Refactor queries in instance, instigation, launchpad (#11429)
by @hellendag
Branching I/O Manager (#11315)
by @schrockn
Fix issues with a SerializableErrorInfo being coerced to a GraphenePythonError (#11437)
by @gibsondan
export product tour component (#11314)
by @salazarm
[dagit] Refactor queries in remaining src (#11435)
by @hellendag
Eliminate unused config_or_config_fn arg in copy_for_configured (#11433)
by @schrockn
[dagit] Don’t show assets as stale if projectedLogicalVersion is null (#11224)
by @bengotow
[dagit] Show observations about the latest materialization on Asset > Partitions (#11288)
by @bengotow
asset backfill core logic (#11377)
by @sryza
Allow AssetIn(dagster_type=Nothing) (#11436)
by @sryza
Rename and refactor in structured_config.py for clarity (#11372)
by @schrockn
Consolidate all logic and trickery to support private cached properties in pydantic in a single class (#11373)
by @schrockn
deprecate job-level memoization in docs (#11392)
by @sryza
Add StructuredIOManagerAdapter (#11383)
by @schrockn
[dagit] Fix line height and size of timestamps on the asset graph (#11465)
by @bengotow
Default new-style config mappings to *Source equivalents rather than raw scalars (#11386)
by @schrockn
Unexperimentalize LogicalVersion and quell warning messages whenever assets are materialized (#11407)
by @sryza
Passthrough pydantic.ModelField.required to dagster.Field.is_required (#11388)
by @schrockn
Support pydantic aliases in config schema mapping (#11389)
by @schrockn
Make conversion to *Source types work on direct annotations of the config parameter (#11469)
by @schrockn
[dagster-databricks] handle execution interrupts (#11421)
by @OwenKephart
Skip flaky race condition test (#11471)
by @jmsanders
remove AssetStoreHandle (#11412)
by @sryza
refactor: sequester job backfill code (#11252)
by @sryza
Be more resilient to user code errors when dequeuing runs (#11406)
by @gibsondan
Make all arguments on DagsterInstance.create_run keyword-only (#11446)
by @schrockn
Elimiinate default value for asset_selection param on DagsterInstance.create_run (#11447)
by @schrockn
Eliminate default values for solid_selection, external_pipeline_origin, and pipeline_code_origin on create_run (#11448)
by @schrockn
rename create_run to create_run_for_test on TestQueuedRunCoordinator to increase greppability of create_run (#11449)
by @schrockn
[dagit] Remove old GraphQL Codegen (#11474)
by @hellendag
[dagit] Delete apollo CLI dep (#11478)
by @hellendag
Add tag_concurrency_limits config to executors (#11472)
by @gibsondan
add run filter for updated before (#11481)
by @prha
docs(dagster-dbt): clarify that the integration supports arbitrary dbt profiles (#11351)
by @rexledesma
[structured config] Add support for Permissive fields (#11275)
by @benpankow
handle pure asset backfills in backfill daemon (#11378)
by @sryza
Fix missing callsite of tag_concurrency_limits (#11496)
by @gibsondan
fix(docs): format docs (#11501)
by @rexledesma
[dagster-dbt] add DbtManifestAssetSelection (#11473)
by @OwenKephart
Add parameter invariants around external_pipeline_origin and pipeline_code_origin arguments (#11450)
by @schrockn
supply solid_selection to fix submitting runs from pure asset backfills (#11502)
by @sryza
telemetry: add num_assets_in_repo in repo-level metadata (#11490)
by @yuhan
fix(docs): run mdx-format again (#11504)
by @rexledesma
Tighten solids_to_execute and solid_selection invariant (#11451)
by @schrockn
Bump json5 from 1.0.1 to 1.0.2 in /js_modules/dagit (#11485)
by @dependabot[bot]
Add invariant for asset_selection (#11452)
by @schrockn
Typehint pipeline_name, run_id, and mode (#11453)
by @schrockn
graphql for pure asset backfills (#11379)
by @sryza
docs for new tag_concurrency_limits feature on executor (#11499)
by @gibsondan
Add support for loading cacheable assets from module (#10389)
by @benpankow
enable filtering by asset tag when asset tags table is not present (#11509)
by @sryza
only count materializations within backfill (#11506)
by @sryza
telemetry: add num_assets_in_repo in log_repo_stats metadata (#11513)
by @yuhan
telemetry: add location_name in repo-level metadata (#11514)
by @yuhan
[dagster-airflow] warn on airflow dataset (#11498)
by @Ramshackle-Jamathon
[dagster-airflow] add `make_dagster_definitions_from_airflow_dags_path` and `make_dagster_definitions_from_airflow_dag_bag` apis (#11441)
by @Ramshackle-Jamathon
Changelog 1.1.8 (#11526)
by @clairelin135
1.1.8
by @elementl-devtools