New Features
- balance.stats_and_plots.love_plot.love_plot, BalanceDFCovars.love_plot(),
  and BalanceDFCovars.plot(dist_type="love_plot"): a visual
  covariate-imbalance diagnostic in the spirit of R's cobalt::love.plot.
  Supports interactive Plotly figures (the new default), static
  seaborn/matplotlib axes (library="seaborn"), and LLM-friendly ASCII
  (library="balance"). With both series it draws the canonical
  before-vs-after scatter; with only before, a single-series scatter with an
  optional threshold reference line. The ASCII backend renders both series on
  a shared axis with o/* markers, a wider default bar_width=50, and
  direction legends. New options include line= (toggle connectors) and
  order_by={"diff", "before", "after", "alphabetical", "none"}; the default
  order_by="diff" surfaces regressions at the top.
  - BalanceDFCovars.love_plot(metric=...) dispatches across metric ∈
    {"asmd" (default), "kld", "emd", "cvmd", "ks"}, with the
    threshold default resolving to the cobalt 0.1 cutoff for ASMD only.
  - .plot(dist_type="love_plot", library=...) (or the "love" alias) routes
    covariate views to the same diagnostic.
- Rake now supports fit-time metadata persistence and predict_weights()
  reconstruction.
  - rake(..., store_fit_metadata=True) stores contingency-table artifacts
    and fit-time metadata required to rebuild weights later.
  - BalanceFrame.fit(method="rake") enables store_fit_metadata=True by
    default so fitted rake models can be reused with
    BalanceFrame.predict_weights() without refitting.
  - In-place replay (predict_weights() with no data=) works with any
    transformations, including the rake default transformations="default".
  - Transfer scoring (predict_weights(data=...)) requires deterministic
    transformations: it raises ValueError for models fitted with
    transformations="default" and for explicit dicts that directly
    reference data-dependent helpers (quantize, fct_lump). Pass
    deterministic transformations at fit time or re-fit on the scoring data.
- Poststratify now supports transfer scoring with
  predict_weights(data=...).
  BalanceFrame.fit(method="poststratify", store_fit_metadata=True) stores
  the transformation origin needed to safely replay fitted cell ratios on a
  new sample/target pair: predict_weights(data=holdout_bf) applies stored
  ratios to the holdout sample's design weights and rescales to the holdout
  target's total weight. Same restriction as rake transfer scoring: rejects
  transformations="default" and direct quantize/fct_lump references.
  Pre-0.21.0 pickles lack transformations_origin and must be re-fit for
  transfer scoring; in-place predict_weights() continues to work.
- balance.interop.diff_diff: thin adapter to
  diff-diff (>=3.3.0,<4) for
  survey-weighted Difference-in-Differences. Provides to_survey_design(),
  to_panel_for_did(), fit_did(), and as_balance_diagnostic(). Install
  via pip install "balance[did]". The submodule is lazy-imported, so
  import balance still works cleanly when diff-diff isn't installed: the
  import guard rewrites the ImportError to point users at the balance[did]
  extra. Shared adapter helpers (active_weight_column,
  drop_history_columns, validate_row_count, attach_balance_provenance)
  live in balance/interop/_common.py and column-name conventions in
  balance/interop/conventions.py so a future balance.interop.svy adapter
  can reuse them.
- balance.stats_and_plots.weights_stats.kish_deff_stats: bundled
  Kish-design-effect diagnostic returning a KishStats(deff, ess, essp)
  namedtuple. Computes design_effect once and derives ESS and ESSP from it,
  avoiding three separate Deff computations when all three are needed.
  kish_ess(w) and kish_essp(w) are also exposed as ergonomic singletons
  over the existing design_effect. BalanceFrame._design_effect_diagnostics
  now routes through kish_deff_stats so the canonical Kish identities live
  in one place.
- BalanceFrame.adjustment_history records compound adjustment steps.
  Sequential adjust()/set_fitted_model() workflows now keep a
  chronological, best-effort read-only copy of each adjustment step while
  preserving model as the latest fitted model for backwards compatibility.
  Baseline resets such as set_as_pre_adjust() clear the history together
  with the current model.
- CLI --formula now accepts JSON lists for model-matrix formula lists.
  In addition to a single formula string, CLI users can pass values such as
  --formula='["age", "gender"]', which are parsed and forwarded as
  list[str] to IPW/CBPS model-matrix construction. Malformed, empty, or
  non-string JSON lists now fail during argument parsing.
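
The Kish identities behind the kish_deff_stats entry above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's implementation: the helper name kish_stats_sketch is hypothetical, and it assumes the standard Kish definitions Deff = n·Σw²/(Σw)², ESS = n/Deff, and ESSP = ESS/n. Only the KishStats(deff, ess, essp) field names come from the entry itself.

```python
from collections import namedtuple

# Field names follow the changelog entry; everything else is illustrative.
KishStats = namedtuple("KishStats", ["deff", "ess", "essp"])


def kish_stats_sketch(weights):
    """Hypothetical helper: compute Deff once, then derive ESS and ESSP."""
    w = [float(x) for x in weights]
    n = len(w)
    total = sum(w)
    if n == 0 or total <= 0:
        raise ValueError("weights must be non-empty with a positive sum")
    # Kish design effect: n * sum(w^2) / (sum(w))^2
    deff = n * sum(x * x for x in w) / (total * total)
    ess = n / deff   # effective sample size
    essp = ess / n   # effective sample size as a proportion of n
    return KishStats(deff=deff, ess=ess, essp=essp)


uniform = kish_stats_sketch([1, 1, 1, 1])
skewed = kish_stats_sketch([1, 3])
```

With uniform weights the design effect is 1 and the effective sample size equals n; unequal weights push Deff above 1 and ESS below n.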
Bug Fixes
- rake() now correctly incorporates per-row design weights in final
  weights. Previously, every unit in the same raking cell received the same
  weight m_fit[c] / m_sample[c], ignoring its own design weight. The correct
  formula w_final_i = w_design_i × m_fit[c] / m_sample[c] is now applied,
  matching poststratify semantics and ensuring weighted marginals recover
  the target distribution when design weights are non-uniform. No-op when
  design weights are uniform (the common case).
- rake() now gracefully handles single-variable adjustments. When
  rake(...) resolves to exactly one adjustment variable, it logs a warning
  and delegates to poststratify(...) instead of raising an assertion. This
  preserves passthrough behaviour for transformations, NA handling, trimming
  controls, and fit-metadata persistence while making
  BalanceFrame.fit(method="rake") more robust for one-variable inputs. In
  this delegated path, model metadata records method='poststratify' while
  returned weights keep the canonical rake_weight name.
- CLI --num_lambdas now parses as a positive integer. Fractional,
  zero, negative, and non-numeric values fail fast during argument parsing
  instead of being accepted after coercion/truncation or failing later
  during IPW adjustment.
- Validation-path cleanup in
  asmd, poststratify, and rake removes
  redundant/unreachable branches with no behavior loss:
  - asmd(...) uses a single authoritative invalid-std_type error path
    (Unknown std_type ...).
  - poststratify(..., store_fit_metadata=True) drops an unreachable
    "missing stored training weights" guard, and predict-time ratio-column
    collisions now use deterministic suffix-based naming (_cell_ratio,
    _cell_ratio_tmp, _cell_ratio_tmp2, ...).
  - rake._predict_weights_from_model(...) uses already-validated fit-time
    target weights directly for non-transfer replay.
- Security: ws updated from 8.20.0 to 8.20.1 in website dependencies.
  Fixes CVE-2026-45736 (GHSA-58qx-3vcg-4xpx): uninitialized memory disclosure
  in websocket.close() when a TypedArray is passed as the reason argument.
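
The corrected rake() weighting described in the first entry of this section can be illustrated with a small pure-Python sketch. It assumes that m_sample[c] is the design-weighted sample total of cell c and that m_fit[c] is the fitted mass of that cell; the function name and the toy data are hypothetical, and this is not the library's code.

```python
from collections import defaultdict


def final_weights_sketch(design_weights, cells, m_fit):
    """Apply w_final_i = w_design_i * m_fit[c] / m_sample[c] row by row."""
    # Assumption: m_sample[c] is the sum of design weights in cell c.
    m_sample = defaultdict(float)
    for w, c in zip(design_weights, cells):
        m_sample[c] += w
    return [w * m_fit[c] / m_sample[c] for w, c in zip(design_weights, cells)]


design_weights = [1.0, 3.0, 2.0, 2.0]   # non-uniform within cell "a"
cells = ["a", "a", "b", "b"]
m_fit = {"a": 8.0, "b": 2.0}            # toy fitted cell masses

final = final_weights_sketch(design_weights, cells, m_fit)

# Weighted cell totals after adjustment
totals = defaultdict(float)
for w, c in zip(final, cells):
    totals[c] += w
```

In this toy case each cell's final weights sum to its fitted mass, and units within a cell keep the relative sizes of their design weights (1:3 in cell "a").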
Documentation
- README cross-link to diff-diff. New "Design-based inference" parent
  section in README.md
  introduces the diff-diff integration above the API tour with a canonical
  Sample.from_frame → set_target → adjust → fit_did snippet. The
  Docusaurus tutorials index and the website landing page
  (HomepageFeatures.js) gain matching cross-references;
  .github/copilot-instructions.md gets a new review-checklist bullet for
  changes that touch balance/interop/diff_diff.py.
- Survey-weighted DiD tutorial. New
  tutorials/balance_diff_diff_brfss.ipynb walks through a BRFSS-style
  staggered-adoption smoking-ban DiD use case end-to-end: load synthetic
  survey microdata via dd.generate_survey_did_data, reweight to ACS
  demographic marginals via balance.ipw, aggregate to a state-quarter panel
  via to_panel_for_did, fit Callaway-Sant'Anna doubly-robust DiD via
  fit_did, run HonestDiD sensitivity, build a combined diagnostic via
  as_balance_diagnostic, and contrast with the unweighted estimate.
  Self-contained and deterministic; CI re-executes it via nbconvert. Committed
  with cleared outputs; deploy-website.yml re-runs and bakes outputs into
  the rendered Docusaurus pages.
Code Quality & Refactoring
- All-zero weight inputs to _check_weights_series_are_valid now emit a
  UserWarning (when require_positive=False, the default). Previously,
  weighted statistics over an all-zero weight vector silently produced NaN/
  inf (sum(w*x)/sum(w) = 0/0). Callers that already passed
  require_positive=True (e.g. design_effect, nonparametric_skew,
  prop_above_and_below, weighted_median_breakdown_point) keep their
  ValueError behaviour. This affects internal callers like
  descriptive_stats → asmd, which previously masked the failure mode.
- Removed the scheduled migration
  FutureWarnings from SampleFrame.weight_column,
  SampleFrame.id_column, and BalanceFrame.id_column; the accessors continue to return
  column names, while weight_series and id_series return data.
- Diagnostics construction now wires
  adjustment_failure metadata from model
  outputs when available (instead of hardcoding success), and supports an
  optional adjustment_failure_reason diagnostics row for richer failure
  reporting in downstream tooling.
- Plotly distribution plotting now gracefully handles missing notebook mime
rendering dependencies (nbformat) by skipping the interactive plot with a
warning, avoiding runtime crashes in non-notebook/test environments.
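
The all-zero weight behaviour described in the first entry of this section can be sketched as follows. This is an illustration under stated assumptions, not the body of _check_weights_series_are_valid: the function name check_weights_sketch, the message texts, and the exact condition used for require_positive=True are hypothetical.

```python
import warnings


def check_weights_sketch(weights, require_positive=False):
    """Hypothetical validator mirroring the behaviour described above."""
    w = [float(x) for x in weights]
    if sum(w) == 0:
        if require_positive:
            # Assumed strict mode: callers that need sum(w) > 0 get an error.
            raise ValueError("weights must have a positive sum")
        # Default mode: warn, because sum(w*x)/sum(w) would be 0/0.
        warnings.warn(
            "all weights are zero; weighted statistics will be NaN/inf",
            UserWarning,
        )
    return w


# Default mode records a warning instead of failing silently.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    check_weights_sketch([0, 0, 0])

# Strict mode keeps raising.
try:
    check_weights_sketch([0, 0, 0], require_positive=True)
    strict_error = None
except ValueError as exc:
    strict_error = exc
```

The point of the sketch is the split between the two modes: the default path surfaces the 0/0 hazard as a warning, while strict callers still see an exception.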
Tests
- Expanded targeted test coverage for predict-time metadata validation,
  replay/transfer edge cases, and error/warning paths in
  weighted_comparisons_stats, poststratify, and rake.
- CI matrix entry for diff-diff integration. The
  Build and Test workflow
  exercises tests/test_interop_diff_diff.py on Python 3.12 against both the
  minimum pin (==3.3.0) and the resolved-latest within >=3.3.0,<4, via
  new extras and diff-diff-pin matrix axes restricted to ubuntu-latest.
  The bare-import-balance matrix (no [did] extra) continues to cover Python
  3.9–3.14 across ubuntu-latest/macos-latest/windows-latest. A new
  notebook-ci.yml workflow nbconvert-executes the BRFSS tutorial on every
  PR touching tutorials/**.ipynb or balance/interop/**.py. A new
  diff-diff-canary.yml workflow runs weekly against the latest diff-diff
  release from PyPI and opens (or updates) a GitHub issue tagged
  diff-diff-incompatibility on failure. The coverage.yml workflow installs
  with the [did] extra so the new tests run under coverage. Internal Buck
  test-matrix targets (:balance_tests_pss2, :balance_tests_pss3) gain a
  direct fbsource//third-party/pypi/diff-diff:diff-diff dep.