github facebookresearch/balance 0.13.0
0.13.0 (2025-12-02)

New Features

  • Propensity modeling beyond static logistic regression
    • ipw() now accepts any sklearn classifier via the model argument,
      enabling the use of models like random forests and gradient boosting while
      preserving all existing trimming and diagnostic features. Dense-only
      estimators and models without linear coefficients are fully supported.
      Propensity probabilities are stabilized to avoid numerical issues.
    • Logistic regression can now be customized by passing a configured
      sklearn.linear_model.LogisticRegression instance through the model
      argument. The CLI also accepts --ipw_logistic_regression_kwargs JSON to
      build that estimator directly for command-line workflows.
  • Covariate diagnostics
    • Added KL divergence calculations for covariate comparisons (numeric and
      one-hot categorical), exposed via BalanceDF.kld() alongside linked-sample
      aggregation support.
  • Weighting Methods
    • rake() and poststratify() now honour weight_trimming_mean_ratio and
      weight_trimming_percentile, trimming and renormalising weights through the
      enhanced trim_weights(..., target_sum_weights=...) API so the documented
      parameters work as expected (#147).
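
To make the "trim, then renormalise" behaviour above concrete, here is a minimal pure-Python sketch. It is not balance's trim_weights() implementation: the function name trim_by_mean_ratio and its parameters are hypothetical, and it assumes that mean-ratio trimming caps each weight at mean_ratio times the mean weight and then rescales the result to a target total.

```python
def trim_by_mean_ratio(weights, mean_ratio, target_sum=None):
    """Cap weights at mean_ratio * mean(weights), then rescale to target_sum.

    Hypothetical helper for illustration only; not part of the balance API.
    """
    if not weights:
        return []
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")

    mean = sum(weights) / len(weights)
    cap = mean * mean_ratio
    clipped = [min(w, cap) for w in weights]

    # Default (an assumption of this sketch): preserve the original total.
    if target_sum is None:
        target_sum = sum(weights)

    total = sum(clipped)
    if total == 0:
        return clipped  # nothing to rescale
    scale = target_sum / total
    return [w * scale for w in clipped]


weights = [1.0, 1.0, 1.0, 1.0, 16.0]
trimmed = trim_by_mean_ratio(weights, mean_ratio=2.0)
```

In this example the extreme weight is capped at 8 times the smallest weight and the total stays at 20. Rescaling after clipping means the final largest weight can exceed the cap in absolute terms; only the ratios between weights are bounded.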

Documentation

  • Added comprehensive post-stratification tutorial notebook
    (balance_quickstart_poststratify.ipynb) (#141, #142, #143).
  • Expanded poststratify docstring with clear examples and improved statistical
    methods documentation
    (#141).
  • Added project badges to README for build status, Python version support, and
    release tracking
    (#145).
  • Added IPW quickstart tutorial (balance_quickstart.ipynb) showcasing default
    logistic regression and custom sklearn classifier usage.
  • Shortened the welcome message shown when importing the package.

Code Quality & Refactoring

  • Raking algorithm refactor

    • Removed ipfn dependency and replaced with a vectorized NumPy
      implementation (_run_ipf_numpy) for iterative proportional fitting,
      resulting in significant performance improvements and eliminating external
      dependency (#135).
  • IPW method refactoring

    • Reduced Cyclomatic Complexity Number (CCN) by extracting repeated code
      patterns into reusable helper functions: _compute_deviance(),
      _compute_proportion_deviance(), _convert_to_dense_array().
    • Replaced the manual ASMD improvement calculation with the existing
      compute_asmd_improvement() from weighted_comparisons_stats.py.
  • Type safety improvements

    • Migrated 32 Python files from # pyre-unsafe to # pyre-strict mode,
      covering core modules, statistics, weighting methods, datasets, and test
      files
    • Modernized type hints to PEP 604 syntax (X | Y instead of Union[X, Y])
      across 11 files for improved readability and Python 3.10+ alignment
    • Type alias definitions in typing.py retain Union syntax for Python 3.9
      compatibility
    • Enhanced plotting function type safety with TypedDict definitions and
      proper type narrowing
    • Replaced assert-based type narrowing with _verify_value_type() helper for
      better error messages and pyre-strict compliance
  • Renamed Balance*DF classes to BalanceDF*

    • BalanceCovarsDF to BalanceDFCovars
    • BalanceOutcomesDF to BalanceDFOutcomes
    • BalanceWeightsDF to BalanceDFWeights
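
For readers unfamiliar with the iterative proportional fitting (IPF) behind the raking refactor above, the following is a minimal two-dimensional sketch in pure Python. It is not the library's _run_ipf_numpy (which the notes describe as a vectorized NumPy implementation); the function name ipf_2d, its parameters, and the stopping rule are assumptions made for illustration.

```python
def ipf_2d(table, row_targets, col_targets, max_iter=100, tol=1e-9):
    """Rescale a 2-D table of non-negative cells to match target margins.

    Illustrative sketch only. Assumes sum(row_targets) == sum(col_targets).
    """
    fitted = [list(row) for row in table]  # copy so the input is untouched
    for _ in range(max_iter):
        # Row step: scale each row so it sums to its target.
        for i, target in enumerate(row_targets):
            row_sum = sum(fitted[i])
            if row_sum > 0:
                factor = target / row_sum
                fitted[i] = [cell * factor for cell in fitted[i]]

        # Column step: scale each column so it sums to its target.
        for j, target in enumerate(col_targets):
            col_sum = sum(row[j] for row in fitted)
            if col_sum > 0:
                factor = target / col_sum
                for row in fitted:
                    row[j] *= factor

        # Columns now match; stop once the rows match as well.
        max_row_error = max(
            abs(sum(fitted[i]) - target) for i, target in enumerate(row_targets)
        )
        if max_row_error < tol:
            break
    return fitted


sample_counts = [[40.0, 10.0], [20.0, 30.0]]
fitted = ipf_2d(sample_counts, row_targets=[50.0, 50.0], col_targets=[30.0, 70.0])
```

Because each step only multiplies whole rows or whole columns by a constant, the association structure of the original table (its odds ratio, in the 2x2 case) is preserved while the margins move to the targets.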

Bug Fixes

  • Utility Functions
    • Fixed quantize() to preserve column ordering and use proper TypeError
      exceptions (#133)
  • Statistical Functions
    • Fixed division by zero in asmd_improvement() when asmd_mean_before is
      zero; it now returns 0.0 (0% improvement).
  • CLI & Infrastructure
    • Replaced deprecated argparse FileType with pathlib.Path
      (#134)
  • Weight Trimming
    • Fixed trim_weights() to consistently return pd.Series with
      dtype=np.float64 and preserve original index across both trimming methods
    • Fixed percentile-based winsorization edge case: _validate_limit() now
      automatically adjusts limits to prevent floating-point precision issues
      (#144)
    • Enhanced documentation for trim_weights() and _validate_limit() with
      clearer examples and explanations
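
The division-by-zero fix listed under Statistical Functions can be pictured with a small guard like the one below. This is a sketch rather than the library's code: the name relative_improvement is hypothetical, and the formula (before - after) / before is an assumption about how the improvement is defined.

```python
def relative_improvement(asmd_mean_before, asmd_mean_after):
    """Return the fractional reduction in mean ASMD, guarding against zero.

    Hypothetical helper; the formula is an assumption of this sketch.
    """
    if asmd_mean_before == 0:
        # No imbalance to begin with, so report 0% improvement
        # instead of dividing by zero.
        return 0.0
    return (asmd_mean_before - asmd_mean_after) / asmd_mean_before


improved = relative_improvement(0.4, 0.1)
degenerate = relative_improvement(0.0, 0.2)
```

Returning 0.0 in the degenerate case keeps downstream summaries numeric instead of propagating an exception or a NaN.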

Tests

  • Enhanced test coverage for weight trimming with
    test_trim_weights_return_type_consistency and 11 comprehensive tests for
    _validate_limit() covering edge cases, error conditions, and boundary
    conditions

Contributors

@neuralsorcerer, @talgalili, @wesleytlee

Full Changelog: 0.12.1...0.13.0
