facebookresearch/balance 0.21.0 on GitHub

New Features

balance.stats_and_plots.love_plot.love_plot, BalanceDFCovars.love_plot(),
and BalanceDFCovars.plot(dist_type="love_plot") — visual
covariate-imbalance diagnostic in the spirit of R's cobalt::love.plot.
Supports interactive Plotly figures (the new default), static
seaborn/matplotlib axes (library="seaborn"), and LLM-friendly ASCII
(library="balance"). When both the before and after series are available,
it draws the canonical before-vs-after scatter; with only the before
series, it draws a single-series scatter with an optional threshold
reference line. The ASCII backend renders both series on a shared axis with
o/* markers, a wider default bar_width=50, and direction legends. New
options include line= (toggle connectors) and
order_by={"diff", "before", "after", "alphabetical", "none"}; the default
order_by="diff" surfaces regressions at the top.
BalanceDFCovars.love_plot(metric=...) dispatches across metric ∈
{"asmd" (default), "kld", "emd", "cvmd", "ks"}, with the
threshold default resolving to the cobalt 0.1 cutoff for ASMD only.
.plot(dist_type="love_plot", library=...) (or the "love" alias) routes
covariate views to the same diagnostic.
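The ordering and threshold behaviour described above can be illustrated with a small standalone sketch. It does not call balance; the ASMD values are made up, and the reading of order_by="diff" as "after minus before, largest first" is an assumption about how regressions end up at the top.

```python
# Hypothetical per-covariate ASMD values; not real output from balance.
before = {"age": 0.30, "gender": 0.05, "income": 0.12}
after = {"age": 0.04, "gender": 0.09, "income": 0.02}

THRESHOLD = 0.1  # the cobalt-style ASMD cutoff mentioned in the notes


def order_rows(before, after, order_by="diff"):
    """Return covariate names in display order (top row first)."""
    names = list(before)
    if order_by == "diff":
        # Assumption: diff = after - before, largest first, so covariates
        # that got worse after adjustment are listed at the top.
        return sorted(names, key=lambda k: after[k] - before[k], reverse=True)
    if order_by == "before":
        return sorted(names, key=lambda k: before[k], reverse=True)
    if order_by == "after":
        return sorted(names, key=lambda k: after[k], reverse=True)
    if order_by == "alphabetical":
        return sorted(names)
    if order_by == "none":
        return names
    raise ValueError(f"Unknown order_by: {order_by!r}")


rows = order_rows(before, after)
# Covariates whose adjusted ASMD still exceeds the reference line.
still_imbalanced = [name for name in rows if after[name] > THRESHOLD]
```

Here gender is listed first because it is the only covariate whose imbalance grew, even though it stays below the 0.1 reference line.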
Rake now supports fit-time metadata persistence and predict_weights()
reconstruction.
- rake(..., store_fit_metadata=True) stores contingency-table artifacts
  and fit-time metadata required to rebuild weights later.
- BalanceFrame.fit(method="rake") enables store_fit_metadata=True by
  default so fitted rake models can be reused with
  BalanceFrame.predict_weights() without refitting.
- In-place replay (predict_weights() with no data=) works with any
  transformations, including the rake default transformations="default".
- Transfer scoring (predict_weights(data=...)) requires deterministic
  transformations: raises ValueError for models fitted with
  transformations="default" and for explicit dicts that directly
  reference data-dependent helpers (quantize, fct_lump). Pass
  deterministic transformations at fit time or re-fit on the scoring data.
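The transfer-scoring restriction can be pictured as a guard like the one below. This is a hedged sketch, not the library's code: check_transfer_safe is a hypothetical name, the local quantize and fct_lump functions are stand-ins for the real helpers, and detecting them by __name__ is an assumption about how a "direct reference" might be recognised.

```python
def quantize(values):
    """Stand-in for a data-dependent helper (bins depend on the data)."""
    return values


def fct_lump(values):
    """Stand-in for a data-dependent helper (levels depend on the data)."""
    return values


DATA_DEPENDENT = {"quantize", "fct_lump"}


def check_transfer_safe(transformations):
    """Raise ValueError if the fit-time transformations cannot be replayed
    on a different dataset (hypothetical guard)."""
    if transformations == "default":
        raise ValueError(
            "transformations='default' is data-dependent; pass deterministic "
            "transformations at fit time or re-fit on the scoring data"
        )
    if isinstance(transformations, dict):
        for column, func in transformations.items():
            # Only direct references are caught; a lambda wrapping
            # quantize would slip through this check.
            if getattr(func, "__name__", "") in DATA_DEPENDENT:
                raise ValueError(
                    f"transformation for {column!r} references a "
                    "data-dependent helper"
                )
```

A deterministic mapping such as {"age": str} passes this check, while "default" or {"age": quantize} is rejected before any weights are computed.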
Poststratify now supports transfer scoring with predict_weights(data=...).
BalanceFrame.fit(method="poststratify", store_fit_metadata=True) stores
the transformation origin needed to safely replay fitted cell ratios on a
new sample/target pair: predict_weights(data=holdout_bf) applies stored
ratios to the holdout sample's design weights and rescales to the holdout
target's total weight. Same restriction as rake transfer scoring: rejects
transformations="default" and direct quantize / fct_lump references.
Pre-0.21.0 pickles lack transformations_origin and must be re-fit for
transfer scoring; in-place predict_weights() continues to work.
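As a rough illustration of the transfer step, the sketch below applies stored per-cell ratios to holdout design weights and then rescales to a target total. The function name, the cell labels, and all numbers are hypothetical; the real implementation works on BalanceFrame objects and may handle unseen cells differently.

```python
def transfer_weights(cells, design_weights, cell_ratio, target_total):
    """Apply fitted cell ratios to holdout rows, then rescale so the
    weights sum to the holdout target's total weight."""
    # A cell missing from cell_ratio raises KeyError in this sketch.
    raw = [w * cell_ratio[c] for c, w in zip(cells, design_weights)]
    scale = target_total / sum(raw)
    return [r * scale for r in raw]


# Ratios learned at fit time (made-up values).
cell_ratio = {"a": 2.0, "b": 0.5}

# Holdout sample: cell membership and design weight per row.
holdout_cells = ["a", "a", "b"]
holdout_design = [1.0, 1.0, 2.0]

weights = transfer_weights(
    holdout_cells, holdout_design, cell_ratio, target_total=100.0
)
```

The raw products are 2, 2, and 1, so rescaling to a total of 100 gives 40, 40, and 20.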
balance.interop.diff_diff: thin adapter to diff-diff (>=3.3.0,<4) for
survey-weighted Difference-in-Differences. Provides to_survey_design(),
to_panel_for_did(), fit_did(), and as_balance_diagnostic(). Install
via pip install "balance[did]". The submodule is lazy-imported, so
import balance still works cleanly when diff-diff isn't installed — the
import guard rewrites the ImportError to point users at the balance[did]
extra. Shared adapter helpers (active_weight_column,
drop_history_columns, validate_row_count, attach_balance_provenance)
live in balance/interop/_common.py and column-name conventions in
balance/interop/conventions.py so a future balance.interop.svy adapter
can reuse them.
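The import-guard pattern described above can be sketched generically with importlib. The helper name load_optional is hypothetical and the wording of the message is an assumption; only the idea of re-raising ImportError with a pointer to the extra comes from the notes.

```python
import importlib


def load_optional(module_name, extra):
    """Import an optional dependency on first use, rewriting ImportError
    so it names the pip extra that provides it."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name} is required for this feature. "
            f'Install it with: pip install "balance[{extra}]"'
        ) from exc


# A module that exists is returned unchanged.
json_module = load_optional("json", "did")

# A missing module produces the rewritten message.
try:
    load_optional("module_that_is_not_installed_xyz", "did")
except ImportError as err:
    message = str(err)
```

Because the import happens inside the function rather than at module top level, importing the parent package stays cheap and does not fail when the optional dependency is absent.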
balance.stats_and_plots.weights_stats.kish_deff_stats — bundled
Kish-design-effect diagnostic returning a KishStats(deff, ess, essp)
namedtuple. Computes design_effect once and derives ESS and ESSP from it,
avoiding three separate Deff computations when all three are needed.
kish_ess(w) and kish_essp(w) are also exposed as ergonomic singletons
over the existing design_effect. BalanceFrame._design_effect_diagnostics
now routes through kish_deff_stats so the canonical Kish identities live
in one place.
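The Kish identities referred to here are standard and can be written in a few lines. The sketch below is not the library function: kish_stats_sketch is a hypothetical name, and treating essp as a proportion (ESS divided by n) rather than a percentage is an assumption.

```python
from collections import namedtuple

KishStats = namedtuple("KishStats", ["deff", "ess", "essp"])


def kish_stats_sketch(weights):
    """Compute Kish's design effect once and derive ESS and ESSP from it."""
    n = len(weights)
    total = sum(weights)
    sum_sq = sum(w * w for w in weights)
    deff = n * sum_sq / total**2  # Kish: n * sum(w^2) / (sum w)^2
    ess = n / deff                # effective sample size
    essp = ess / n                # ESS as a proportion of n (assumed)
    return KishStats(deff, ess, essp)


stats = kish_stats_sketch([1.0, 1.0, 2.0, 4.0])
uniform = kish_stats_sketch([3.0, 3.0, 3.0])
```

For the weights 1, 1, 2, 4 the design effect is 4 × 22 / 64 = 1.375, so the four rows behave like roughly 2.9 equally weighted rows; uniform weights give a design effect of exactly 1.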
BalanceFrame.adjustment_history records compound adjustment steps.
Sequential adjust() / set_fitted_model() workflows now keep a
chronological, best-effort read-only copy of each adjustment step while
preserving model as the latest fitted model for backwards compatibility.
Baseline resets such as set_as_pre_adjust() clear the history together
with the current model.
CLI --formula now accepts JSON lists for model-matrix formula lists.
In addition to a single formula string, CLI users can pass values such as
--formula='["age", "gender"]', which are parsed and forwarded as
list[str] to IPW/CBPS model-matrix construction. Malformed, empty, or
non-string JSON lists now fail during argument parsing.
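One way to get this parse-time behaviour with argparse is a custom type function, sketched below. The function name parse_formula and the rule "a value starting with [ is treated as JSON" are assumptions for illustration, not the CLI's actual implementation.

```python
import argparse
import json


def parse_formula(value):
    """Return a formula string unchanged, or a JSON list as list[str]."""
    text = value.strip()
    if not text.startswith("["):
        return value  # plain single-formula string
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError as exc:
        raise argparse.ArgumentTypeError(f"malformed JSON list: {exc}") from exc
    if not isinstance(parsed, list) or not parsed:
        raise argparse.ArgumentTypeError("formula list must be non-empty")
    if not all(isinstance(item, str) for item in parsed):
        raise argparse.ArgumentTypeError("formula list must contain only strings")
    return parsed


parser = argparse.ArgumentParser()
parser.add_argument("--formula", type=parse_formula)

as_list = parser.parse_args(['--formula=["age", "gender"]']).formula
as_string = parser.parse_args(["--formula=age + gender"]).formula
```

When the type function raises ArgumentTypeError, argparse reports the message and exits during parsing, so invalid values never reach model-matrix construction.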

Bug Fixes

rake() now correctly incorporates per-row design weights in final
weights. Previously, every unit in the same raking cell received the same
weight m_fit[c] / m_sample[c], ignoring its own design weight. The correct
formula w_final_i = w_design_i × m_fit[c] / m_sample[c] is now applied,
matching poststratify semantics and ensuring weighted marginals recover
the target distribution when design weights are non-uniform. No-op when
design weights are uniform (the common case).
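A small numeric sketch of the corrected formula follows. It assumes that m_sample[c] is the design-weighted total of cell c and that m_fit[c] is the fitted cell mass; the cells and numbers are made up.

```python
from collections import defaultdict

# One row per unit: raking cell and design weight (hypothetical data).
cells = ["a", "a", "b"]
w_design = [1.0, 3.0, 2.0]

# Fitted cell masses from the raking step (hypothetical values).
m_fit = {"a": 6.0, "b": 4.0}

# Assumption: m_sample[c] is the sum of design weights in cell c.
m_sample = defaultdict(float)
for c, w in zip(cells, w_design):
    m_sample[c] += w

# Corrected rule: w_final_i = w_design_i * m_fit[c] / m_sample[c]
w_final = [w * m_fit[c] / m_sample[c] for c, w in zip(cells, w_design)]

# Weighted cell totals now reproduce the fitted masses.
cell_totals = defaultdict(float)
for c, w in zip(cells, w_final):
    cell_totals[c] += w
```

The two units in cell a keep their 1:3 ratio (1.5 and 4.5) while the cell total still equals m_fit["a"] = 6. Giving both units the same weight would discard that ratio.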
rake() now gracefully handles single-variable adjustments. When
rake(...) resolves to exactly one adjustment variable, it logs a warning
and delegates to poststratify(...) instead of raising an assertion. This
preserves passthrough behaviour for transformations, NA handling, trimming
controls, and fit-metadata persistence while making
BalanceFrame.fit(method="rake") more robust for one-variable inputs. In
this delegated path, model metadata records method='poststratify' while
returned weights keep the canonical rake_weight name.
CLI --num_lambdas now parses as a positive integer. Fractional,
zero, negative, and non-numeric values fail fast during argument parsing
instead of being accepted after coercion/truncation or failing later
during IPW adjustment.
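Fail-fast validation of this kind is commonly done with an argparse type function. The sketch below shows the general technique; positive_int is a hypothetical helper, not necessarily what the balance CLI uses.

```python
import argparse


def positive_int(value):
    """Parse a strictly positive integer, rejecting everything else."""
    try:
        number = int(value)  # int("2.5") and int("abc") raise ValueError
    except ValueError:
        raise argparse.ArgumentTypeError(
            f"expected a positive integer, got {value!r}"
        ) from None
    if number <= 0:
        raise argparse.ArgumentTypeError(
            f"expected a positive integer, got {value!r}"
        )
    return number


parser = argparse.ArgumentParser()
parser.add_argument("--num_lambdas", type=positive_int)

num_lambdas = parser.parse_args(["--num_lambdas", "250"]).num_lambdas
```

Parsing the string with int() directly, rather than going through float first, is what prevents a value such as 2.5 from being silently truncated to 2.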
Validation-path cleanup in asmd, poststratify, and rake removes
redundant/unreachable branches without changing behaviour:
- asmd(...) uses a single authoritative invalid-std_type error path
  (Unknown std_type ...).
- poststratify(..., store_fit_metadata=True) drops an unreachable
  "missing stored training weights" guard, and predict-time ratio-column
  collisions now use deterministic suffix-based naming (_cell_ratio,
  _cell_ratio_tmp, _cell_ratio_tmp2, ...).
- rake._predict_weights_from_model(...) uses already-validated fit-time
  target weights directly for non-transfer replay.
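The suffix-based naming for the temporary ratio column can be sketched as a small helper. The function name is hypothetical, and the continuation past _cell_ratio_tmp2 (to _cell_ratio_tmp3 and so on) is an assumption about what the elided part of the sequence looks like.

```python
def pick_ratio_column(existing_columns, base="_cell_ratio"):
    """Return the first name in the sequence base, base_tmp, base_tmp2, ...
    that does not collide with an existing column."""
    existing = set(existing_columns)
    if base not in existing:
        return base
    candidate = f"{base}_tmp"
    counter = 2
    while candidate in existing:
        candidate = f"{base}_tmp{counter}"
        counter += 1
    return candidate


no_clash = pick_ratio_column(["age", "gender"])
one_clash = pick_ratio_column(["age", "_cell_ratio"])
two_clashes = pick_ratio_column(["_cell_ratio", "_cell_ratio_tmp"])
```

Because the result depends only on the set of existing column names, repeated calls on the same frame always pick the same temporary name.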
Security: ws updated from 8.20.0 to 8.20.1 in website dependencies.
Fixes CVE-2026-45736 (GHSA-58qx-3vcg-4xpx): uninitialized memory disclosure
in websocket.close() when a TypedArray is passed as the reason argument.

Documentation

README cross-link to diff-diff. New "Design-based inference" parent
section in README.md introduces the diff-diff integration above the API
tour with a canonical Sample.from_frame → set_target → adjust → fit_did
snippet. The
Docusaurus tutorials index and the website landing page
(HomepageFeatures.js) gain matching cross-references;
.github/copilot-instructions.md gets a new review-checklist bullet for
changes that touch balance/interop/diff_diff.py.
Survey-weighted DiD tutorial. New
tutorials/balance_diff_diff_brfss.ipynb walks through a BRFSS-style
staggered-adoption smoking-ban DiD use case end-to-end: load synthetic
survey microdata via dd.generate_survey_did_data, reweight to ACS
demographic marginals via balance.ipw, aggregate to a state-quarter panel
via to_panel_for_did, fit Callaway-Sant'Anna doubly-robust DiD via
fit_did, run HonestDiD sensitivity, build a combined diagnostic via
as_balance_diagnostic, and contrast with the unweighted estimate.
Self-contained and deterministic — CI re-executes via nbconvert. Committed
with cleared outputs; deploy-website.yml re-runs and bakes outputs into
the rendered Docusaurus pages.

Code Quality & Refactoring

All-zero weight inputs to _check_weights_series_are_valid now emit a
UserWarning (when require_positive=False, the default). Previously,
weighted statistics over an all-zero weight vector silently produced NaN /
inf (sum(w*x)/sum(w) = 0/0). Callers that already passed
require_positive=True (e.g. design_effect, nonparametric_skew,
prop_above_and_below, weighted_median_breakdown_point) keep their
ValueError behaviour. This affects internal callers like
descriptive_stats → asmd, which previously masked the failure mode.
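The warn-versus-raise split can be illustrated with a minimal stand-in. This is not the body of _check_weights_series_are_valid; the function name and the message text are assumptions, and the real check works on pandas Series and validates more than this.

```python
import warnings


def check_weights_sketch(weights, require_positive=False):
    """Warn (or raise, if require_positive) when no weight is positive."""
    if not any(w > 0 for w in weights):
        if require_positive:
            raise ValueError("weights must include at least one positive value")
        warnings.warn(
            "all weights are zero; weighted statistics will be undefined (0/0)",
            UserWarning,
        )


# Default path: the all-zero vector triggers a UserWarning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    check_weights_sketch([0.0, 0.0, 0.0])

# Strict path: the same input raises instead.
try:
    check_weights_sketch([0.0, 0.0, 0.0], require_positive=True)
    strict_raised = False
except ValueError:
    strict_raised = True
```

Surfacing the warning at validation time makes it clear that a downstream NaN came from the weights rather than from the covariate data.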
Removed the scheduled migration FutureWarnings from SampleFrame.weight_column,
SampleFrame.id_column, and BalanceFrame.id_column; the accessors continue to return
column names, while weight_series and id_series return data.
Diagnostics construction now wires adjustment_failure metadata from model
outputs when available (instead of hardcoding success), and supports an
optional adjustment_failure_reason diagnostics row for richer failure
reporting in downstream tooling.
Plotly distribution plotting now gracefully handles missing notebook mime
rendering dependencies (nbformat) by skipping the interactive plot with a
warning, avoiding runtime crashes in non-notebook/test environments.

Tests

Expanded targeted test coverage for predict-time metadata validation,
replay/transfer edge cases, and error/warning paths in
weighted_comparisons_stats, poststratify, and rake.
CI matrix entry for diff-diff integration. The Build and Test workflow
exercises tests/test_interop_diff_diff.py on Python 3.12 against both the
minimum pin (==3.3.0) and the resolved-latest within >=3.3.0,<4, via
new extras and diff-diff-pin matrix axes restricted to ubuntu-latest.
The bare-import-balance matrix (no [did] extra) continues to cover Python
3.9–3.14 across ubuntu-latest / macos-latest / windows-latest. A new
notebook-ci.yml workflow nbconvert-executes the BRFSS tutorial on every
PR touching tutorials/**.ipynb or balance/interop/**.py. A new
diff-diff-canary.yml workflow runs weekly against the latest diff-diff
release from PyPI and opens (or updates) a GitHub issue tagged
diff-diff-incompatibility on failure. The coverage.yml workflow installs
with the [did] extra so the new tests run under coverage. Internal Buck
test-matrix targets (:balance_tests_pss2, :balance_tests_pss3) gain a
direct fbsource//third-party/pypi/diff-diff:diff-diff dep.

Contributors

@talgalili, @neuralsorcerer

Full Changelog

0.20.0...0.21.0

facebookresearch/balance 0.21.0 (2026-06-02) on GitHub
