github facebookresearch/balance 0.20.0
0.20.0 (2026-04-26)

Breaking Changes

  • id_column now returns the column name (str) on SampleFrame,
    BalanceFrame, and Sample. Previously it returned ID data
    (pd.Series), which was inconsistent with weight_column (which returned
    a name after 0.19.0). Use id_series for ID data, weight_series for
    weight data. The accessor naming convention is now consistent:

    • *_column → column name (str): id_column, weight_column
    • *_series → column data (pd.Series): id_series, weight_series
    • df_* → DataFrame: df_covars, df_weights, df_outcomes

    Both id_column and weight_column emit FutureWarnings pointing at the
    new data-returning accessors; warnings will be removed after 2026-06-01.
    The BalanceDFSource protocol was updated accordingly (id_column →
    id_series); custom implementations must rename this property.

  • Unknown kwargs to poststratify(...) now raise TypeError instead of
    being silently ignored, to catch typos. store_fit_metadata must be a
    boolean.
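
The strict keyword handling described in the last item above can be sketched as follows. This is a minimal illustration using a hypothetical poststratify_sketch function, not the real poststratify(...), whose full signature is not shown in these notes. Raising TypeError for a non-boolean store_fit_metadata is an assumption; the notes only state that it must be a boolean.

```python
def poststratify_sketch(variables=None, *, store_fit_metadata=True, **kwargs):
    """Hypothetical stand-in showing strict keyword validation."""
    # Any leftover keyword is treated as a probable typo and rejected.
    if kwargs:
        unknown = ", ".join(sorted(kwargs))
        raise TypeError(f"unexpected keyword argument(s): {unknown}")
    # Assumption: a non-boolean flag is reported as a TypeError.
    if not isinstance(store_fit_metadata, bool):
        raise TypeError("store_fit_metadata must be a boolean")
    return {
        "variables": list(variables or []),
        "store_fit_metadata": store_fit_metadata,
    }


# A correctly spelled call passes validation.
result = poststratify_sketch(variables=["age", "gender"])

# A misspelled keyword is reported instead of being silently dropped.
try:
    poststratify_sketch(variabels=["age", "gender"])
except TypeError as err:
    typo_message = str(err)
```

Under the old silent-ignore behaviour, the misspelled call would have run with no stratification variables at all, which is the kind of mistake the new TypeError is meant to surface.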

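The accessor naming convention from the first breaking change can be illustrated with a toy class. ToyFrame below is a hypothetical stand-in, not a balance class: it stores columns as plain lists rather than pd.Series, and the warning text is invented for illustration.

```python
import warnings


class ToyFrame:
    """Hypothetical stand-in that mimics the *_column / *_series split."""

    def __init__(self, columns, id_name="id"):
        self._columns = columns  # mapping of column name -> list of values
        self._id_name = id_name

    @property
    def id_column(self) -> str:
        # *_column returns the column NAME and points at the data accessor.
        warnings.warn(
            "id_column returns the column name; use id_series for ID data.",
            FutureWarning,
            stacklevel=2,
        )
        return self._id_name

    @property
    def id_series(self) -> list:
        # *_series returns the column DATA (a list here, pd.Series in balance).
        return self._columns[self._id_name]


frame = ToyFrame({"id": [101, 102, 103], "weight": [1.0, 2.0, 0.5]})

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    name = frame.id_column  # "id", plus one FutureWarning

ids = frame.id_series  # [101, 102, 103]
```

Code that previously treated id_column as data would need to switch to id_series, which is the same rename that custom BalanceDFSource implementations have to make.
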
New Features

  • Added sklearn-style fit / predict_weights workflow on BalanceFrame

    • New entry points BalanceFrame.fit(...), design_matrix(on=..., data=...),
      predict_proba(on=..., output=..., data=...), and predict_weights(data=...)
      enable training a weighting model on one BalanceFrame and applying it to new
      data via data=... for one-liner holdout scoring
      (fitted.predict_weights(data=holdout_bf)).
    • Supports IPW, CBPS, and poststratify methods. Each stores the fit-time
      metadata needed to reconstruct weights (training design weights, trimming
      options, CBPS coefficients, poststratify cell-ratio tables, NA handling).
    • BalanceFrame.set_fitted_model(fitted) applies a fitted model from one
      BalanceFrame to another for holdout scoring workflows.
  • Added formula support in poststratify

    • poststratify(...) and BalanceFrame.adjust(method="poststratify", ...)
      now accept formula= (string or list) as an alternative to variables=.
    • Only interaction-style operators are supported: : (interaction),
      . (all common columns), - (exclude), and optional leading ~.
      Additive + and * are explicitly rejected because post-stratification
      defines cells by the joint distribution — a + b, a * b, and a:b
      would all produce identical cells, and rejecting +/* prevents users
      from silently writing what looks like a main-effects model.
    • Strict formula validation: empty/non-string entries, unknown variables,
      transformed terms, and passing both variables and formula all raise
      explicit ValueError. Note: raking operates on marginals, so +/*
      will be meaningful when raking gains its own formula= argument.
  • Added BalanceFrame.set_as_pre_adjust() to lock in the current
    responder state as the new pre-adjust baseline. Supports inplace=False
    (default, immutable) and inplace=True. Clears the adjustment model and
    unadjusted link since the object is no longer considered adjusted.
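
The formula rules for poststratify listed above can be sketched with a small parser. This is a toy written for illustration, not the parser used by balance: the function name parse_cells_formula and its error messages are hypothetical, it handles a single formula string only, and it does not support column names that contain a hyphen.

```python
def parse_cells_formula(formula, columns):
    """Toy parser: return the variables whose joint cells define the strata."""
    if not isinstance(formula, str) or not formula.strip():
        raise ValueError("formula must be a non-empty string")
    text = formula.strip()
    if text.startswith("~"):  # optional leading ~
        text = text[1:]
    for op in "+*":  # additive operators are rejected outright
        if op in text:
            raise ValueError(f"operator {op!r} is not supported; use ':'")

    include_part, *exclude_parts = text.split("-")
    included = []
    for term in include_part.split(":"):
        term = term.strip()
        names = list(columns) if term == "." else [term]  # '.' = all columns
        for name in names:
            # Unknown names, empty terms, and transformed terms such as
            # log(a) all fail this membership check.
            if name not in columns:
                raise ValueError(f"unknown variable: {name!r}")
            if name not in included:
                included.append(name)

    excluded = set()
    for part in exclude_parts:
        name = part.strip()
        if name not in columns:
            raise ValueError(f"unknown variable: {name!r}")
        excluded.add(name)

    result = [name for name in included if name not in excluded]
    if not result:
        raise ValueError("formula selects no variables")
    return result


cols = ["age", "gender", "income"]
interaction = parse_cells_formula("~ age:gender", cols)  # ['age', 'gender']
all_but_income = parse_cells_formula(". - income", cols)  # ['age', 'gender']
```

Because every accepted formula reduces to a flat list of variables, the sketch makes the design choice visible: there is only one kind of cell definition (the joint distribution), so operators that suggest anything else are refused.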

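The fit / predict_weights idea (train on one dataset, apply the stored fit-time metadata to new data) can be sketched for the poststratify case, where that metadata is a cell-ratio table. The functions below are hypothetical stand-ins written with the standard library only; they assume unit design weights and do not reproduce the BalanceFrame API.

```python
from collections import Counter


def fit_cell_ratios(train_cells, target_cells):
    """Learn target/sample count ratios per cell (unit design weights assumed)."""
    train_counts = Counter(train_cells)
    target_counts = Counter(target_cells)
    return {cell: target_counts[cell] / n for cell, n in train_counts.items()}


def apply_cell_ratios(cell_ratios, cells):
    """Score rows using only the stored table, without refitting."""
    unseen = sorted(set(cells) - set(cell_ratios))
    if unseen:
        raise ValueError(f"cells not seen at fit time: {unseen}")
    return [cell_ratios[cell] for cell in cells]


train = ["young", "young", "old"]
target = ["young", "old", "old", "old"]
holdout = ["old", "young"]

ratios = fit_cell_ratios(train, target)  # {'young': 0.5, 'old': 3.0}
train_weights = apply_cell_ratios(ratios, train)  # [0.5, 0.5, 3.0]
holdout_weights = apply_cell_ratios(ratios, holdout)  # [3.0, 0.5]
```

On the training rows the weights sum to the target size (4.0 here), while the holdout rows are weighted purely from the stored table, which mirrors the holdout-scoring workflow described for predict_weights(data=...).
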
Code Quality & Refactoring

  • Safer target replacement on adjusted objects: BalanceFrame.set_target(...)
    now warns when replacing the target on an adjusted object in-place, since
    this resets responder weights to pre-adjust values and drops the current
    adjustment result.

  • Modernized weight dtype checks for pandas 3.0 compatibility:
    SampleFrame.set_weights() paths now use explicit exact-float64 checks
    instead of deprecated float-dtype helpers, and always coerce assigned
    weights to exact float64.

  • Cleaned up legacy Python 2 __future__ compatibility imports in
    weighting methods: replaced obsolete imports with
    from __future__ import annotations in adjust_null, cbps,
    poststratify, and rake. Note: with this future import enabled,
    runtime annotation introspection changes (the values in
    __annotations__ become postponed, i.e. stringized).
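
The annotation-introspection note above can be checked with the standard library alone. The sketch below compiles the same hypothetical function twice, once without and once with the future import, and compares what __annotations__ holds; the function name scale is invented for illustration and the source is kept in a string only so both variants can sit side by side.

```python
import typing

SOURCE = '''
def scale(weight: float, factor: int = 2) -> float:
    return weight * factor
'''


def load_scale(source):
    """Compile the source in a fresh namespace and return its scale function."""
    namespace = {}
    # dont_inherit=True keeps this file's own future flags out of the result.
    code = compile(source, "<sketch>", "exec", dont_inherit=True)
    exec(code, namespace)
    return namespace["scale"]


plain = load_scale(SOURCE)
postponed = load_scale("from __future__ import annotations\n" + SOURCE)

plain_annotations = plain.__annotations__  # values are the classes themselves
postponed_annotations = postponed.__annotations__  # values are strings

# typing.get_type_hints resolves the strings back into real types.
resolved = typing.get_type_hints(postponed)
```

Code that reads __annotations__ directly from the affected modules would therefore see strings such as 'float'; typing.get_type_hints is the usual way to recover the actual types.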

Tests

  • Added coverage for the new fit/predict_weights workflow (IPW, CBPS,
    poststratify), pickle/deepcopy roundtrips of fitted BalanceFrames,
    raw-covariate fit-matrix persistence (use_model_matrix=False),
    near-separation stability, empty-input validation, formula parsing edge
    cases (interaction syntax, dot expansion, explicit ~ formulas,
    validation failures), set_as_pre_adjust() behavior (copy / in-place /
    inherited-view sync), target-replacement warning emission, weight-column
    casting when active weight dtype is non-float/float32, and chained
    IPW→poststratify adjustment behavior.

Contributors

@talgalili, @sahil350, @neuralsorcerer

Full Changelog

0.19.0...0.20.0
