Release Notes (What's New)

Version 2.3.0 (October 14, 2025)

For a list of all changes in this release, see the full changelog. Below are the changes we think users may wish to be aware of.

Features

Added a new metric:
- Percent within X: scores.continuous.percent_within_x. See PR #865.
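As a rough illustration of the idea behind this metric, the sketch below computes the percentage of forecasts whose absolute error is no larger than a threshold X. It is plain Python, not the library's implementation: the function name `percent_within` and the inclusive boundary (`<=`) are assumptions made for this example, and `scores.continuous.percent_within_x` may treat boundaries, dimensions and missing data differently.

```python
def percent_within(forecast, observed, threshold):
    """Hypothetical helper: percentage of forecasts within `threshold` of the observation.

    Assumes "within" means absolute error <= threshold (inclusive).
    """
    if len(forecast) != len(observed):
        raise ValueError("forecast and observed must have the same length")
    if not forecast:
        raise ValueError("inputs must not be empty")
    # Count the forecast/observation pairs whose absolute error is within the threshold.
    hits = sum(1 for f, o in zip(forecast, observed) if abs(f - o) <= threshold)
    return 100.0 * hits / len(forecast)


forecast = [10.0, 12.0, 15.0, 20.0]
observed = [11.0, 12.0, 18.0, 19.5]
# Absolute errors are 1.0, 0.0, 3.0 and 0.5, so three of the four pairs are within 1.0.
result = percent_within(forecast, observed, threshold=1.0)
print(result)
```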
Added one new metric and two supporting functions. Following the publication of Taggart & Wilke (2025), these have been moved from scores.emerging to scores.categorical:
- Risk matrix score: scores.categorical.risk_matrix_score.
- Risk matrix score - matrix weights to array: scores.categorical.matrix_weights_to_array.
- Risk matrix score - warning scaling to weight array: scores.categorical.weights_from_warning_scaling.
  Note: while removing the functions from scores.emerging is technically a breaking change, breaking changes that only impact the "emerging" section of the API do not trigger major releases. This is because the "emerging" section of the API is designed to hold metrics while they are undergoing peer review and it is expected they will be moved out of "emerging" once peer review has concluded.
  See PR #904.
Updated the weighting method used by all scores functions that allow the user to supply weights. The updated method normalises the user-supplied weights rather than applying them directly. While both approaches can be valid, the revised approach is more in keeping with general expectations and is consistent with the default approach taken by other libraries. As part of this change, users can no longer supply weights that contain NaNs (zeroes may be used instead where appropriate). The "Introduction to weighting and masking" tutorial has been updated and substantially expanded to explain what the weighting does mathematically. See PR #899.
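The difference between applying weights directly and normalising them first can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's code: it assumes "normalising" means dividing by the sum of the weights (a conventional weighted mean), and the function names and the zero-sum check are hypothetical. The tutorial referenced above is the authoritative description of the mathematics.

```python
import math


def mean_with_direct_weights(values, weights):
    """Multiply each value by its weight, then take a plain (unweighted) mean."""
    return sum(v * w for v, w in zip(values, weights)) / len(values)


def mean_with_normalised_weights(values, weights):
    """Weighted mean: weights are effectively rescaled to sum to 1 before use."""
    if any(math.isnan(w) for w in weights):
        # Mirrors the release note: NaN weights are rejected; use 0 to exclude a point.
        raise ValueError("weights must not contain NaN; use 0 instead")
    total = sum(weights)
    if total == 0:
        # Assumption for this sketch: all-zero weights are treated as an error.
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total


errors = [1.0, 2.0, 3.0]
weights = [2.0, 2.0, 0.0]  # the zero weight excludes the third point

direct = mean_with_direct_weights(errors, weights)
normalised = mean_with_normalised_weights(errors, weights)
print(direct, normalised)
```

With normalisation, multiplying every weight by the same constant leaves the result unchanged, whereas the directly weighted mean scales with that constant.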
Added optional automatic generation of thresholds for the receiver (relative) operating characteristic (ROC) curve (scores.probability.roc_curve_data). See PR #882.

Bug Fixes

Updated scores.continuous.quantile_interval_score so that it now recognises preserve_dims='all'. Previously, it did not recognise this special case and raised an error unless a list of dimensions was supplied. (Note: the score calculations themselves were not incorrect; the only problem was that preserve_dims='all' was not recognised.) See PR #893.

Documentation

Added "Percent Within X" tutorial. See PR #865.
Substantially updated and expanded the "Introduction to weighting and masking" tutorial, following changes to the weighting method used by all scores functions that allow the user to supply weights. The updated and expanded tutorial explains what the weighting does mathematically. See PR #899.
Updated the "Quantile-Quantile (Q-Q) Plots for Comparing Forecasts and Observations" tutorial so that the plots render in Read the Docs. See PR #883.
Updated the description of the second figure in the "Threshold Weighted Continuous Ranked Probability Score (twCRPS) for ensembles" tutorial. See PR #897.
Updated multiple sections of the documentation following the risk matrix score moving from scores.emerging to scores.categorical, including:
- updating docstrings and docs/included.md,
- updating the tutorial with the new categorical methods, and
- updating references in several sections of the documentation, following the publication of Taggart & Wilke (2025).
  See PR #904.
Updated several tutorials to subtract the LEAD_TIME Timedelta from the base times in the forecast data to make the forecast and observation data line up correctly. See PR #920.
In the README, "Detailed Installation Guide" and "Contributing Guide", updated pip install commands to use quotation marks where square brackets are used to specify optional dependencies. This is to ensure compatibility with zsh (the default on macOS) while still working as expected on bash. See PR #917.
Added thumbnail images to multiple entries in the tutorial gallery. See PR #874, PR #875, PR #877, PR #879, PR #880, PR #881 and PR #884.

Internal Changes

In multiple tutorials, added the keyword argument decode_timedelta=True to xarray.open_dataset for the downloaded files forecast_grid.nc and analysis_grid.nc. See PR #894.
Moved input checking earlier in various function calls to improve efficiency, so that errors are raised before any expensive computation is carried out. See PR #905.
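The general pattern of validating inputs before doing expensive work can be sketched as follows. This is a generic plain-Python illustration, not code from the library: the function name and the specific checks are hypothetical.

```python
def mean_squared_error(forecast, observed):
    """Hypothetical example: cheap input checks run before the costly computation."""
    # Cheap checks first, so bad input fails immediately.
    if len(forecast) != len(observed):
        raise ValueError("forecast and observed must have the same length")
    if not forecast:
        raise ValueError("inputs must not be empty")

    # The (potentially expensive) computation only runs once the inputs are known to be valid.
    squared_errors = [(f - o) ** 2 for f, o in zip(forecast, observed)]
    return sum(squared_errors) / len(squared_errors)


print(mean_squared_error([1.0, 3.0], [1.0, 1.0]))
```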

Contributors to this Release

Thomas C. Pagano* (@thomaspagano), Paul R. Smith* (@prs247au), J. Smallwood* (@jdgsmallwood), Tennessee Leeuwenburg (@tennlee), Nicholas Loveday (@nicholasloveday), Nikeeth Ramanathan (@nikeethr), Stephanie Chong (@Steph-Chong), Robert J. Taggart (@rob-taggart) and Mohammadreza Khanarmuei (@reza-armuei).

* indicates that this release contains their first contribution to scores.

scores 2.3.0 on PyPI