Metaflow 2.0.4 Release Notes
- Improvements
  - Expose retry_count in Current
  - Mute superfluous ThrottleExceptions in AWS Batch job logs
- Bug Fixes
  - Set proper thresholds for retrying DescribeJobs API for AWS Batch
  - Explicitly override PYTHONNOUSERSITE for conda environments
  - Preempt AWS Batch job log collection when the job fails to get into a RUNNING state
The Metaflow 2.0.4 release is a minor patch release.
Improvements
Expose retry_count in Current
You can now use the current singleton to access the retry_count of your task. The first attempt of a task has a retry_count of 0, and each subsequent retry increments it. As an example:
```python
@retry
@step
def my_step(self):
    from metaflow import current
    print("retry_count: %s" % current.retry_count)
    self.next(self.a)
```

Mute superfluous ThrottleExceptions in AWS Batch job logs
The AWS Logs API for get_log_events has a global hard limit of 10 requests per second. While we have retry logic in place to respect this limit, some of the ThrottleExceptions would still end up in the job logs, confusing the end-user. This release addresses this issue (also documented in #184).
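For context, the pattern involved looks roughly like the sketch below: paging through get_log_events while pacing requests and retrying quietly on throttles instead of surfacing them. This is an illustration using boto3 with hypothetical names (fetch_log_events, min_interval), not Metaflow's actual implementation:

```python
import time

import boto3
from botocore.exceptions import ClientError

logs = boto3.client("logs")

def fetch_log_events(log_group, log_stream, min_interval=0.1):
    """Page through get_log_events, pacing calls to stay under the
    10 requests/sec limit and retrying quietly when throttled."""
    token = None
    while True:
        kwargs = {
            "logGroupName": log_group,
            "logStreamName": log_stream,
            "startFromHead": True,
        }
        if token:
            kwargs["nextToken"] = token
        try:
            resp = logs.get_log_events(**kwargs)
        except ClientError as e:
            # Back off without echoing the throttle into user-visible logs.
            if e.response["Error"]["Code"] == "ThrottlingException":
                time.sleep(min_interval)
                continue
            raise
        for event in resp["events"]:
            yield event["message"]
        next_token = resp.get("nextForwardToken")
        if next_token == token:
            return  # no new events; the caller may poll again later
        token = next_token
        time.sleep(min_interval)  # pace subsequent requests
```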
Bug Fixes
Set proper thresholds for retrying DescribeJobs API for AWS Batch
The AWS Batch API for describe_jobs throws ThrottleExceptions when managing a flow with a very wide for-each step. This release adds retry behavior with backoffs for proper resiliency (addresses #138).
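As a sketch of the pattern (again illustrative, not Metaflow's code), retrying describe_jobs with exponential backoff and jitter might look like the following. The hypothetical helper also chunks job ids, since describe_jobs accepts at most 100 per call:

```python
import random
import time

import boto3
from botocore.exceptions import ClientError

batch = boto3.client("batch")

def describe_jobs_with_backoff(job_ids, max_attempts=8):
    """Describe AWS Batch jobs in chunks of 100, retrying throttled
    calls with exponential backoff plus jitter."""
    jobs = []
    for i in range(0, len(job_ids), 100):  # describe_jobs caps at 100 ids
        chunk = job_ids[i:i + 100]
        for attempt in range(max_attempts):
            try:
                jobs.extend(batch.describe_jobs(jobs=chunk)["jobs"])
                break
            except ClientError as e:
                code = e.response["Error"]["Code"]
                if code not in ("ThrottlingException", "TooManyRequestsException"):
                    raise
                # Backoff schedule: ~1s, 2s, 4s, ... plus random jitter.
                time.sleep(2 ** attempt + random.random())
        else:
            raise RuntimeError("describe_jobs kept throttling after retries")
    return jobs
```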
Explicitly override PYTHONNOUSERSITE for conda environments
In certain user environments, properly isolating conda environments requires explicitly overriding PYTHONNOUSERSITE rather than simply relying on python -s (addresses #178).
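For illustration, a minimal sketch of the technique: setting PYTHONNOUSERSITE in the environment of a launched interpreter excludes the user site-packages directory from sys.path, matching the effect of python -s, so packages installed with pip install --user cannot leak into the conda environment:

```python
import os
import subprocess

# PYTHONNOUSERSITE, when set, disables the user site-packages directory,
# just like invoking the interpreter with `python -s`.
env = dict(os.environ)
env["PYTHONNOUSERSITE"] = "1"

# The child interpreter reports user-site as disabled (prints False).
subprocess.run(
    ["python", "-c", "import site; print(site.ENABLE_USER_SITE)"],
    env=env,
    check=True,
)
```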
Preempt AWS Batch job log collection when the job fails to get into a RUNNING state
Fixes a bug where, if the AWS Batch job crashed before entering the RUNNING state (often due to incorrect IAM permissions), the previous log collection behavior would fail to print the correct error message, making the issue harder to debug (addresses #185). A sketch of such a guard follows.
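The sketch below uses a hypothetical wait_until_running helper; the job states it checks are AWS Batch's real lifecycle states (SUBMITTED, PENDING, RUNNABLE, STARTING, RUNNING, SUCCEEDED, FAILED). It is an illustration of the idea, not Metaflow's implementation:

```python
import time

import boto3

batch = boto3.client("batch")

def wait_until_running(job_id, poll_interval=5):
    """Poll a Batch job until it reaches RUNNING; if it fails first
    (e.g. due to missing IAM permissions), surface statusReason
    instead of trying to tail logs that were never written."""
    while True:
        job = batch.describe_jobs(jobs=[job_id])["jobs"][0]
        status = job["status"]
        if status in ("RUNNING", "SUCCEEDED"):
            return  # the job started; log collection can proceed
        if status == "FAILED":
            raise RuntimeError(
                "Job %s failed before RUNNING: %s"
                % (job_id, job.get("statusReason", "unknown reason"))
            )
        # Still SUBMITTED / PENDING / RUNNABLE / STARTING; keep waiting.
        time.sleep(poll_interval)
```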