github facebookresearch/balance 0.21.0
0.21.0 (2026-06-02)

New Features

  • balance.stats_and_plots.love_plot.love_plot, BalanceDFCovars.love_plot(),
    and BalanceDFCovars.plot(dist_type="love_plot"): a visual
    covariate-imbalance diagnostic in the spirit of R's cobalt::love.plot.
    Supports interactive Plotly figures (the new default), static
    seaborn/matplotlib axes (library="seaborn"), and LLM-friendly ASCII
    (library="balance"). Given both a before and an after series, it draws the
    canonical before-vs-after scatter; given only the before series, it draws a
    single-series scatter with an optional threshold reference line. The ASCII
    backend renders both series on a shared axis with o/* markers, a wider
    default bar_width=50, and direction legends. New options include line=
    (toggle connectors) and
    order_by={"diff", "before", "after", "alphabetical", "none"}; the default
    order_by="diff" surfaces regressions at the top.
    BalanceDFCovars.love_plot(metric=...) dispatches on metric, one of
    {"asmd" (default), "kld", "emd", "cvmd", "ks"}; the threshold default
    resolves to the cobalt 0.1 cutoff for ASMD only.
    .plot(dist_type="love_plot", library=...) (or the "love" alias) routes
    covariate views to the same diagnostic.
  • Rake now supports fit-time metadata persistence and predict_weights()
    reconstruction.
    • rake(..., store_fit_metadata=True) stores contingency-table artifacts
      and fit-time metadata required to rebuild weights later.
    • BalanceFrame.fit(method="rake") enables store_fit_metadata=True by
      default so fitted rake models can be reused with
      BalanceFrame.predict_weights() without refitting.
    • In-place replay (predict_weights() with no data=) works with any
      transformations, including the rake default transformations="default".
    • Transfer scoring (predict_weights(data=...)) requires deterministic
      transformations: raises ValueError for models fitted with
      transformations="default" and for explicit dicts that directly
      reference data-dependent helpers (quantize, fct_lump). Pass
      deterministic transformations at fit time or re-fit on the scoring data.
  • Poststratify now supports transfer scoring with predict_weights(data=...).
    BalanceFrame.fit(method="poststratify", store_fit_metadata=True) stores
    the transformation origin needed to safely replay fitted cell ratios on a
    new sample/target pair: predict_weights(data=holdout_bf) applies stored
    ratios to the holdout sample's design weights and rescales to the holdout
    target's total weight. Same restriction as rake transfer scoring: rejects
    transformations="default" and direct quantize / fct_lump references.
    Pre-0.21.0 pickles lack transformations_origin and must be re-fit for
    transfer scoring; in-place predict_weights() continues to work.
  • balance.interop.diff_diff: a thin adapter to diff-diff (>=3.3.0,<4) for
    survey-weighted Difference-in-Differences. Provides to_survey_design(),
    to_panel_for_did(), fit_did(), and as_balance_diagnostic(). Install
    via pip install "balance[did]". The submodule is lazy-imported, so
    import balance still works cleanly when diff-diff isn't installed — the
    import guard rewrites the ImportError to point users at the balance[did]
    extra. Shared adapter helpers (active_weight_column,
    drop_history_columns, validate_row_count, attach_balance_provenance)
    live in balance/interop/_common.py and column-name conventions in
    balance/interop/conventions.py so a future balance.interop.svy adapter
    can reuse them.
  • balance.stats_and_plots.weights_stats.kish_deff_stats — bundled
    Kish-design-effect diagnostic returning a KishStats(deff, ess, essp)
    namedtuple. Computes design_effect once and derives ESS and ESSP from it,
    avoiding three separate Deff computations when all three are needed.
    kish_ess(w) and kish_essp(w) are also exposed as ergonomic singletons
    over the existing design_effect. BalanceFrame._design_effect_diagnostics
    now routes through kish_deff_stats so the canonical Kish identities live
    in one place.
  • BalanceFrame.adjustment_history records compound adjustment steps.
    Sequential adjust() / set_fitted_model() workflows now keep a
    chronological, best-effort read-only copy of each adjustment step while
    preserving model as the latest fitted model for backwards compatibility.
    Baseline resets such as set_as_pre_adjust() clear the history together
    with the current model.
  • CLI --formula now accepts JSON lists for model-matrix formula lists.
    In addition to a single formula string, CLI users can pass values such as
    --formula='["age", "gender"]', which are parsed and forwarded as
    list[str] to IPW/CBPS model-matrix construction. Malformed, empty, or
    non-string JSON lists now fail during argument parsing.
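
As a rough illustration of the Kish identities that kish_deff_stats bundles, the sketch below computes the design effect once and derives ESS and ESSP from it. It is not the library's implementation: kish_stats_sketch and the local KishStats tuple are hypothetical stand-ins, and the formulas assumed here are Kish's approximation Deff = n * sum(w^2) / sum(w)^2, ESS = n / Deff, and ESSP = ESS / n.

```python
from collections import namedtuple

import numpy as np

# Illustrative stand-in; the field names mirror the KishStats(deff, ess, essp)
# tuple described in the notes, but this is not the balance implementation.
KishStats = namedtuple("KishStats", ["deff", "ess", "essp"])


def kish_stats_sketch(weights):
    """Compute Kish's design effect once and derive ESS and ESSP from it."""
    w = np.asarray(weights, dtype=float)
    n = w.size
    if n == 0 or np.any(w < 0) or w.sum() <= 0:
        raise ValueError("weights must be non-empty, non-negative, and not all zero")
    # Kish's approximation: Deff = n * sum(w^2) / (sum(w))^2
    deff = n * np.sum(w**2) / np.sum(w) ** 2
    ess = n / deff   # effective sample size
    essp = ess / n   # effective sample size proportion, equal to 1 / Deff
    return KishStats(deff=float(deff), ess=float(ess), essp=float(essp))


stats = kish_stats_sketch([1.0, 3.0])
uniform = kish_stats_sketch([2.0, 2.0, 2.0, 2.0])
print(stats)
print(uniform)
```

For weights [1, 3] this gives Deff = 1.25, ESS = 1.6, and ESSP = 0.8; uniform weights give Deff = 1, so ESS equals the sample size.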

Bug Fixes

  • rake() now correctly incorporates per-row design weights in final
    weights. Previously, every unit in the same raking cell received the same
    weight m_fit[c] / m_sample[c], ignoring its own design weight. The correct
    formula w_final_i = w_design_i × m_fit[c] / m_sample[c] is now applied,
    matching poststratify semantics and ensuring weighted marginals recover
    the target distribution when design weights are non-uniform. No-op when
    design weights are uniform (the common case).
  • rake() now gracefully handles single-variable adjustments. When
    rake(...) resolves to exactly one adjustment variable, it logs a warning
    and delegates to poststratify(...) instead of raising an assertion. This
    preserves passthrough behaviour for transformations, NA handling, trimming
    controls, and fit-metadata persistence while making
    BalanceFrame.fit(method="rake") more robust for one-variable inputs. In
    this delegated path, model metadata records method='poststratify' while
    returned weights keep the canonical rake_weight name.
  • CLI --num_lambdas now parses as a positive integer. Fractional,
    zero, negative, and non-numeric values fail fast during argument parsing
    instead of being accepted after coercion/truncation or failing later
    during IPW adjustment.
  • Validation-path cleanup in asmd, poststratify, and rake removes
    redundant/unreachable branches with no behavior loss:
    • asmd(...) uses a single authoritative invalid-std_type error path
      (Unknown std_type ...).
    • poststratify(..., store_fit_metadata=True) drops an unreachable
      "missing stored training weights" guard, and predict-time ratio-column
      collisions now use deterministic suffix-based naming (_cell_ratio,
      _cell_ratio_tmp, _cell_ratio_tmp2, ...).
    • rake._predict_weights_from_model(...) uses already-validated fit-time
      target weights directly for non-transfer replay.
  • Security: ws updated from 8.20.0 to 8.20.1 in website dependencies.
    Fixes CVE-2026-45736 (GHSA-58qx-3vcg-4xpx): uninitialized memory disclosure
    in websocket.close() when a TypedArray is passed as the reason argument.
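
The design-weight fix to rake() described above can be illustrated with a toy pandas example. This is a sketch under stated assumptions, not the library's code: the data frame, the column names (cell, design_weight, w_old, w_final), and the fitted cell totals m_fit are all hypothetical, and m_sample[c] is taken to be the design-weighted total of cell c in the sample.

```python
import pandas as pd

# Hypothetical toy sample: two raking cells with non-uniform design weights.
sample = pd.DataFrame(
    {
        "cell": ["a", "a", "b"],
        "design_weight": [1.0, 3.0, 2.0],
    }
)
# Hypothetical fitted cell totals m_fit[c], standing in for raking output.
m_fit = pd.Series({"a": 8.0, "b": 2.0})

# Assumed definition of m_sample[c]: design-weighted total of each cell.
m_sample = sample.groupby("cell")["design_weight"].sum()

# Per-row ratio m_fit[c] / m_sample[c], looked up by each row's cell.
ratio = sample["cell"].map(m_fit / m_sample)

# Previous behaviour: every unit in a cell receives the same weight.
sample["w_old"] = ratio
# Corrected formula: w_final_i = w_design_i * m_fit[c] / m_sample[c].
sample["w_final"] = sample["design_weight"] * ratio

old_totals = sample.groupby("cell")["w_old"].sum()
new_totals = sample.groupby("cell")["w_final"].sum()
print(sample)
print(new_totals)
```

With the corrected formula the weighted cell totals match m_fit (8 for cell a, 2 for cell b), whereas the per-cell constant weight gives 4 for cell a. When design weights are uniform within every cell, the two versions differ only by a constant factor per cell, which is consistent with the no-op noted above.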

Documentation

  • README cross-link to diff-diff. New "Design-based inference" parent
    section in README.md introduces the diff-diff integration above the API
    tour with a canonical Sample.from_frame → set_target → adjust → fit_did
    snippet. The
    Docusaurus tutorials index and the website landing page
    (HomepageFeatures.js) gain matching cross-references;
    .github/copilot-instructions.md gets a new review-checklist bullet for
    changes that touch balance/interop/diff_diff.py.
  • Survey-weighted DiD tutorial. New
    tutorials/balance_diff_diff_brfss.ipynb walks through a BRFSS-style
    staggered-adoption smoking-ban DiD use case end-to-end: load synthetic
    survey microdata via dd.generate_survey_did_data, reweight to ACS
    demographic marginals via balance.ipw, aggregate to a state-quarter panel
    via to_panel_for_did, fit Callaway-Sant'Anna doubly-robust DiD via
    fit_did, run HonestDiD sensitivity, build a combined diagnostic via
    as_balance_diagnostic, and contrast with the unweighted estimate.
    Self-contained and deterministic — CI re-executes via nbconvert. Committed
    with cleared outputs; deploy-website.yml re-runs and bakes outputs into
    the rendered Docusaurus pages.

Code Quality & Refactoring

  • All-zero weight inputs to _check_weights_series_are_valid now emit a
    UserWarning (when require_positive=False, the default). Previously,
    weighted statistics over an all-zero weight vector silently produced NaN /
    inf (sum(w*x)/sum(w) = 0/0). Callers that already passed
    require_positive=True (e.g. design_effect, nonparametric_skew,
    prop_above_and_below, weighted_median_breakdown_point) keep their
    ValueError behaviour. This affects internal callers like
    descriptive_stats → asmd, which previously masked the failure mode.
  • Removed the scheduled migration FutureWarnings from SampleFrame.weight_column,
    SampleFrame.id_column, and BalanceFrame.id_column; the accessors continue to return
    column names, while weight_series and id_series return data.
  • Diagnostics construction now wires adjustment_failure metadata from model
    outputs when available (instead of hardcoding success), and supports an
    optional adjustment_failure_reason diagnostics row for richer failure
    reporting in downstream tooling.
  • Plotly distribution plotting now gracefully handles missing notebook mime
    rendering dependencies (nbformat) by skipping the interactive plot with a
    warning, avoiding runtime crashes in non-notebook/test environments.
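
The all-zero-weights failure mode mentioned above (sum(w*x)/sum(w) = 0/0) can be reproduced with a small standalone sketch. The helper weighted_mean_sketch is hypothetical and is not part of balance; it only mimics the described split between warning by default and raising when require_positive=True.

```python
import warnings

import numpy as np


def weighted_mean_sketch(x, w, require_positive=False):
    """Hypothetical helper (not the balance API): weighted mean with an all-zero guard."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    if w.sum() == 0:
        if require_positive:
            raise ValueError("weights must have a positive sum")
        # Without this guard, sum(w * x) / sum(w) would evaluate 0/0.
        warnings.warn("all weights are zero; the weighted mean is NaN", UserWarning)
        return float("nan")
    return float(np.sum(w * x) / np.sum(w))


# Default path: a UserWarning is emitted and NaN is returned.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = weighted_mean_sketch([1.0, 2.0], [0.0, 0.0])

# Strict path: the same input raises ValueError.
try:
    weighted_mean_sketch([1.0, 2.0], [0.0, 0.0], require_positive=True)
    strict_raised = False
except ValueError:
    strict_raised = True

normal = weighted_mean_sketch([1.0, 2.0], [1.0, 3.0])
print(result, strict_raised, normal)
```

The point of the warning is that the NaN no longer appears silently: callers that tolerate all-zero weights see it flagged, while callers that cannot tolerate it keep the hard error.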

Tests

  • Expanded targeted test coverage for predict-time metadata validation,
    replay/transfer edge cases, and error/warning paths in
    weighted_comparisons_stats, poststratify, and rake.
  • CI matrix entry for diff-diff integration. The Build and Test workflow
    exercises tests/test_interop_diff_diff.py on Python 3.12 against both the
    minimum pin (==3.3.0) and the resolved-latest within >=3.3.0,<4, via
    new extras and diff-diff-pin matrix axes restricted to ubuntu-latest.
    The bare-import-balance matrix (no [did] extra) continues to cover Python
    3.9–3.14 across ubuntu-latest / macos-latest / windows-latest. A new
    notebook-ci.yml workflow nbconvert-executes the BRFSS tutorial on every
    PR touching tutorials/**.ipynb or balance/interop/**.py. A new
    diff-diff-canary.yml workflow runs weekly against the latest diff-diff
    release from PyPI and opens (or updates) a GitHub issue tagged
    diff-diff-incompatibility on failure. The coverage.yml workflow installs
    with the [did] extra so the new tests run under coverage. Internal Buck
    test-matrix targets (:balance_tests_pss2, :balance_tests_pss3) gain a
    direct fbsource//third-party/pypi/diff-diff:diff-diff dep.

Contributors

@talgalili, @neuralsorcerer

Full Changelog

0.20.0...0.21.0
