github sorentwo/oban v2.11.0

Oban v2.11 Upgrade Guide

⚠️📓 Oban v2.11 requires a v11 migration, Elixir v1.11+ and Postgres v10.0+

Oban v2.11 focused on reducing database load, bolstering telemetry-powered introspection, and improving the production experience for all users. To that end, we've extracted functionality from Oban Pro and switched to a new global coordination model.

Leadership

Coordination between nodes running Oban is crucial to how many plugins operate. Staging jobs once a second from multiple nodes is wasteful, as is pruning, rescuing, or scheduling cron jobs. Prior Oban versions used transactional advisory locks to prevent plugins from running concurrently, but there were some issues:

  • Plugins don't know if they'll take the advisory lock, so they still need to run a query periodically.

  • Nodes don't usually start simultaneously, and time drifts between machines. There's no guarantee that the top of the minute for one node is the same as another's—chances are, they don't match.

Oban 2.11 introduces a table-based leadership mechanism that guarantees only one node in a cluster, where "cluster" means a bunch of nodes connected to the same Postgres database, will run plugins. Leadership is transparent and designed for resiliency with minimum chatter between nodes.

See the Upgrade Guide for instructions on how to create the peers table and get started with leadership. If you're curious about the implementation details or want to use leadership in your application, take a look at the docs for Oban.Peer.
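The peers table comes from the v11 migration mentioned at the top of these notes. A minimal sketch of that migration, assuming an Ecto repo; the module name `MyApp.Repo.Migrations.UpgradeObanToV11` is a placeholder:

```elixir
defmodule MyApp.Repo.Migrations.UpgradeObanToV11 do
  use Ecto.Migration

  # Bring the Oban schema up to v11, which includes the table used for leadership.
  def up, do: Oban.Migrations.up(version: 11)

  # Revert the v11 changes.
  def down, do: Oban.Migrations.down(version: 11)
end
```

Once the migration has run, application code can ask whether the current node leads with `Oban.Peer.leader?/1`; check the Oban.Peer docs for the exact options in your version.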

Alternative PG (Process Groups) Notifier

Oban relies heavily on PubSub, and until now it only provided a Postgres adapter. Postgres is amazing, and has a highly performant PubSub option, but it doesn't work in every environment (we're looking at you, PgBouncer).

Fortunately, many Elixir applications run in a cluster connected by distributed Erlang. That means Process Groups, aka PG, is available for many applications.

So, we pulled Oban Pro's PG notifier into Oban to make it available for everyone! If your app runs in a proper cluster, you can switch over to the PG notifier:

config :my_app, Oban,
  notifier: Oban.Notifiers.PG,
  ...

Now there are two notifiers to choose from, each with their own strengths and weaknesses:

  • Oban.Notifiers.Postgres. Pros: doesn't require distributed Erlang, publishes insert events to trigger queues. Cons: doesn't work with PgBouncer in transaction mode, doesn't work in tests because of the sandbox.

  • Oban.Notifiers.PG. Pros: works with PgBouncer in transaction mode, works in tests. Cons: requires distributed Erlang, doesn't publish insert events.
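Because the right notifier depends on the deployment, one option is to choose it at runtime. This is only a sketch: `OBAN_NOTIFIER` is a hypothetical environment variable, not something Oban reads itself, and the fallback to the Postgres notifier is an assumption made for the example.

```elixir
# config/runtime.exs
import Config

# OBAN_NOTIFIER is a made-up variable for this sketch; name it however you like.
notifier =
  case System.get_env("OBAN_NOTIFIER", "postgres") do
    "pg" -> Oban.Notifiers.PG
    _other -> Oban.Notifiers.Postgres
  end

# Merged with the rest of the Oban options defined in config/config.exs.
config :my_app, Oban, notifier: notifier
```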

Basic Lifeline Plugin

When a queue's producer crashes or a node shuts down before a job finishes executing, the job may be left in an executing state. The worst part is that these jobs—which we call "orphans"—are completely invisible until you go searching through the jobs table.

Oban Pro has always had a "Lifeline" plugin for just this occasion, and now we've brought a basic Lifeline plugin to Oban.

To automatically rescue orphaned jobs that are still executing, include Oban.Plugins.Lifeline in your configuration:

config :my_app, Oban,
  plugins: [Oban.Plugins.Lifeline],
  ...

Now the plugin will search and rescue orphans after they've lingered for 60 minutes.
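If 60 minutes is too long or too short for your workload, the window can be adjusted. A sketch assuming the plugin's `:rescue_after` option (a duration in milliseconds); the 30 minute value is purely illustrative:

```elixir
config :my_app, Oban,
  plugins: [
    # Rescue orphans after 30 minutes rather than the default 60.
    {Oban.Plugins.Lifeline, rescue_after: :timer.minutes(30)}
  ]
```

Pick a value comfortably longer than your longest running job, since anything still executing past the window is treated as an orphan.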

🌟 Note: The Lifeline plugin may transition jobs that are genuinely executing and cause duplicate execution. For more accurate rescuing, or to rescue jobs that have exhausted retry attempts, see the DynamicLifeline plugin in Oban Pro.

Reindexer Plugin

Over time various Oban indexes (heck, any indexes) may grow without VACUUM cleaning them up properly. When this happens, rebuilding the indexes will release bloat and free up space in your Postgres instance.

The new Reindexer plugin makes index maintenance painless and automatic by periodically rebuilding all of your Oban indexes concurrently, without locks that block normal reads and writes.

By default, reindexing happens once a day at midnight UTC, but it's configurable with a standard cron expression (and timezone).

config :my_app, Oban,
  plugins: [Oban.Plugins.Reindexer],
  ...

See Oban.Plugins.Reindexer for complete options and implementation details.
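To move reindexing away from the default, pass a cron expression through the plugin's `:schedule` option. The schedule below is only an illustration; a non-UTC `:timezone` would additionally need a timezone database configured in your application.

```elixir
config :my_app, Oban,
  plugins: [
    # Illustrative schedule: 03:00 UTC every Sunday instead of midnight daily.
    {Oban.Plugins.Reindexer, schedule: "0 3 * * 0"}
  ]
```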

Improved Telemetry and Logging

The default telemetry-backed logger now includes more job fields and metadata about execution, most notably the execution state and formatted error reports when jobs fail.

Here's an example of the default output for a successful job:

{
  "args":{"action":"OK","ref":1},
  "attempt":1,
  "duration":4327295,
  "event":"job:stop",
  "id":123,
  "max_attempts":20,
  "meta":{},
  "queue":"alpha",
  "queue_time":3127905,
  "source":"oban",
  "state":"success",
  "tags":[],
  "worker":"Oban.Integration.Worker"
}

Now, here's a sample where the job has encountered an error:

{
  "attempt": 1,
  "duration": 5432,
  "error": "** (Oban.PerformError) Oban.Integration.Worker failed with {:error, \"ERROR\"}",
  "event": "job:exception",
  "state": "failure",
  "worker": "Oban.Integration.Worker"
}
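Output like the samples above is only produced once the default logger is attached. A minimal sketch of doing that at application start; `MyApp.Application`, `MyApp.Repo`, and `MyApp.Supervisor` are placeholder names:

```elixir
defmodule MyApp.Application do
  use Application

  @impl Application
  def start(_type, _args) do
    # Attach Oban's default telemetry logger before any jobs can run.
    Oban.Telemetry.attach_default_logger()

    children = [
      MyApp.Repo,
      {Oban, Application.fetch_env!(:my_app, Oban)}
    ]

    Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
  end
end
```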

2.11.0 — 2022-02-13

Enhancements

  • [Migration] Change the order of fields in the base index used for the primary Oban queries.

    The new order is much faster for frequent queries such as scheduled job staging. Check the v2.11 upgrade guide for instructions on swapping the index in existing applications.

  • [Worker] Avoid spawning a separate task for workers that use timeouts.

  • [Engine] Add insert_job, insert_all_jobs, retry_job, and retry_all_jobs as required callbacks for all engines.

  • [Oban] Raise more informative error messages for missing or malformed plugins.

    Now missing plugins have a different error from invalid plugins or invalid options.

  • [Telemetry] Normalize telemetry metadata for all engine operations:

    • Include changeset for insert
    • Include changesets for insert_all
    • Include job for complete_job, discard_job, etc.

  • [Repo] Include [oban_conf: conf] in telemetry_options for all Repo operations.

    With this change it's possible to differentiate between database calls made by Oban versus the rest of your application.
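As a rough illustration of the `[oban_conf: conf]` telemetry option, an Ecto telemetry handler can check for that key to tell Oban's queries apart. This is a sketch under assumptions: `MyApp.QuerySource` is a hypothetical module, `[:my_app, :repo, :query]` assumes the default telemetry prefix for a repo named `MyApp.Repo`, and the handler relies on Ecto exposing `telemetry_options` under the `:options` metadata key.

```elixir
defmodule MyApp.QuerySource do
  @moduledoc false
  require Logger

  # Attach to the repo's query event (event name assumes the default prefix).
  def attach do
    :telemetry.attach(
      "my-app-query-source",
      [:my_app, :repo, :query],
      &__MODULE__.handle_event/4,
      nil
    )
  end

  # Log each query along with whether Oban or the application issued it.
  def handle_event(_event, measurements, metadata, _config) do
    Logger.debug(
      "query source=#{source(metadata)} total_time=#{inspect(measurements[:total_time])}"
    )
  end

  # Queries issued by Oban carry :oban_conf in their telemetry options.
  def source(%{options: options}) when is_list(options) do
    if Keyword.has_key?(options, :oban_conf), do: :oban, else: :app
  end

  def source(_metadata), do: :app
end
```

Calling `MyApp.QuerySource.attach()` once at startup is enough; the `source/1` helper is kept separate so it can be reused for metrics tags.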

Bug Fixes

  • [Telemetry] Emit discard rather than error events when a job exhausts all retries.

    Previously discard_job was only called for manual discards, i.e., when a job returned :discard or {:discard, reason}. Discarding for exhausted attempts was done within error_job in error cases.

  • [Cron] Respect the current timezone for @reboot jobs. Previously, @reboot expressions were evaluated on boot without the timezone applied. In that case the expression may not match the calculated time and jobs wouldn't trigger.

  • [Cron] Delay cron evaluation until the next minute after initialization. Now all cron scheduling occurs reliably at the top of the minute.

  • [Drainer] Introduce discard accumulator for draining results. Now exhausted jobs along with manual discards count as a discard rather than a failure or success.

  • [Oban] Expand changeset wrapper within multi function.

    Previously, Oban.insert_all could handle a list of changesets, a wrapper map with a :changesets key, or a function. However, the function had to return a list of changesets rather than a changeset wrapper. This was unexpected and made some multis awkward.

  • [Testing] Preserve attempted_at/scheduled_at in perform_job/3 rather than overwriting them with the current time.

  • [Oban] Include false as a viable queue or plugin option in typespecs.
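To illustrate the "Expand changeset wrapper within multi function" fix, here is a sketch of a multi whose function returns a wrapper map instead of a bare list. `MyApp.Repo`, `MyApp.WelcomeWorker`, and `user_changeset` are hypothetical names used only for this example:

```elixir
Ecto.Multi.new()
|> Ecto.Multi.insert(:user, user_changeset)
|> Oban.insert_all(:jobs, fn %{user: user} ->
  # Returning a wrapper map with a :changesets key is now accepted here,
  # not only a plain list of changesets.
  %{changesets: [MyApp.WelcomeWorker.new(%{user_id: user.id})]}
end)
|> MyApp.Repo.transaction()
```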

Deprecations

  • [Telemetry] Hard deprecate Telemetry.span/3, which was previously soft-deprecated.

Removals

  • [Telemetry] Remove circuit breaker event documentation because :circuit events aren't emitted anymore.
