github dagster-io/dagster 0.11.8

3 years ago

New

  • The @solid decorator can now wrap a function without a context argument, if no context information is required. For example, you can now do:
from dagster import solid

@solid
def basic_solid():
    return 5

@solid
def solid_with_inputs(x, y):
    return x + y

However, if your solid requires config or resources, you will receive an error at definition time.

  • It is now simpler to provide structured metadata on events. Events that take a metadata_entries argument now also accept a metadata argument, which offers a more convenient API. The metadata argument takes a dictionary with string labels as keys and EventMetadata values. Some base types (str, int, float, and JSON-serializable lists/dicts) are also accepted as values and will be automatically coerced to the appropriate EventMetadata value. For example:
from dagster import AssetMaterialization, EventMetadata, EventMetadataEntry, solid

# Before: each entry is built explicitly with EventMetadataEntry.
@solid
def old_metadata_entries_solid(df):
    yield AssetMaterialization(
        "my_asset",
        metadata_entries=[
            EventMetadataEntry.text("users_table", "table name"),
            EventMetadataEntry.int(len(df), "row count"),
            EventMetadataEntry.url("http://mysite/users_table", "data url"),
        ],
    )

# After: a dict of label -> value; str and int values are coerced automatically.
@solid
def new_metadata_solid(df):
    yield AssetMaterialization(
        "my_asset",
        metadata={
            "table name": "users_table",
            "row count": len(df),
            "data url": EventMetadata.url("http://mysite/users_table"),
        },
    )
  • The dagster-daemon process now has a --heartbeat-tolerance argument that allows you to configure how long the process can run before shutting itself down due to a hanging thread. This parameter can be used to troubleshoot failures with the daemon process.
  • When creating a schedule from a partition set using PartitionSetDefinition.create_schedule_definition, the partition_selector function that determines which partition to use for a given schedule tick can now return either a single partition or a list of partitions. This allows you to create schedules that launch multiple runs for each schedule tick.
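
The coercion rule described for the metadata argument can be illustrated with a small standalone sketch. It does not use Dagster and is not Dagster's implementation: the Tagged class, the coerce and normalize_metadata functions, and the kind names ("text", "int", "float", "json", "url") are all hypothetical stand-ins chosen for illustration.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class Tagged:
    """Hypothetical stand-in for an EventMetadata value (not a Dagster class)."""
    kind: str
    value: Any


def coerce(value: Any) -> Tagged:
    """Map a raw value to a tagged value, mirroring the base types named in the notes."""
    if isinstance(value, Tagged):
        return value  # already explicit (for example a URL), so pass it through
    if isinstance(value, str):
        return Tagged("text", value)
    if isinstance(value, int):
        return Tagged("int", value)
    if isinstance(value, float):
        return Tagged("float", value)
    if isinstance(value, (list, dict)):
        json.dumps(value)  # raises TypeError if the contents are not JSON-serializable
        return Tagged("json", value)
    raise TypeError(f"unsupported metadata value: {value!r}")


def normalize_metadata(metadata: Dict[str, Any]) -> Dict[str, Tagged]:
    """Coerce every value in a label -> value dictionary."""
    return {label: coerce(value) for label, value in metadata.items()}


normalized = normalize_metadata({
    "table name": "users_table",
    "row count": 42,
    "data url": Tagged("url", "http://mysite/users_table"),
})
```

Values that need a specific type, such as a URL, are passed explicitly, while plain strings and numbers rely on the automatic mapping.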

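The partition_selector change in the last item of the list above amounts to accepting either one partition or a list of them and launching one run per selected partition. A minimal standalone sketch of that normalization follows. It does not use Dagster: the partitions_for_tick helper is hypothetical, partitions are represented as plain strings, and the one-argument selector signature is a simplifying assumption rather than Dagster's actual partition_selector signature.

```python
from typing import Callable, List, Sequence, Union

Partition = str  # assumption: a partition is represented by its name
SelectorResult = Union[Partition, List[Partition]]


def partitions_for_tick(
    selector: Callable[[Sequence[Partition]], SelectorResult],
    partitions: Sequence[Partition],
) -> List[Partition]:
    """Call the selector and always return a list, one entry per run to launch."""
    selected = selector(partitions)
    if isinstance(selected, list):
        return selected
    return [selected]  # wrap a single partition so callers handle one shape


def last_partition(partitions: Sequence[Partition]) -> Partition:
    # Single-partition selector: one run per tick.
    return partitions[-1]


def last_two_partitions(partitions: Sequence[Partition]) -> List[Partition]:
    # List-returning selector: two runs per tick.
    return list(partitions[-2:])


all_partitions = ["2021-05-01", "2021-05-02", "2021-05-03"]
single = partitions_for_tick(last_partition, all_partitions)
multiple = partitions_for_tick(last_two_partitions, all_partitions)
```

Normalizing to a list in one place keeps the run-launching code identical for both kinds of selector.
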
Bugfixes

  • Runs submitted via backfills can now correctly resolve the source run id when loading inputs from previous runs instead of encountering an unexpected KeyError.
  • Using nested Dict and Set types for solid inputs/outputs now works as expected. Previously, a structure like Dict[str, Dict[str, Dict[str, SomeClass]]] could result in confusing errors.
  • Dagstermill now correctly loads the config for aliased solids. Previously, it loaded the config from the wrong place, which resulted in an empty solid_config.
  • Error messages when incomplete run config is supplied are now more accurate and precise.
  • An issue that would cause map and collect steps downstream of other map and collect steps to mysteriously not execute when using multiprocess executors has been resolved.

Documentation

  • Typo fixes and improvements (thanks @elsenorbw!)
  • Improved documentation for scheduling partitions
