New Features
- Implemented r_indicator() with validated sample-variance formula
  - Added a public r_indicator(sample_p, target_p) implementation in weighted_comparisons_stats using the documented Eq. 2.2.2 formulation over concatenated propensity vectors and explicit input-size validation.
  - Added validation for non-finite and out-of-range propensity values, and expanded unit coverage for formula correctness and edge cases.
  - Added BalanceDFWeights.r_indicator() as a convenience wrapper, so sample.weights().r_indicator() computes the r-indicator directly.
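As a rough illustration of the r-indicator entry above, the sketch below assumes the commonly used definition R = 1 - 2 * S(p), where S is the sample standard deviation (n - 1 denominator) of the pooled sample and target propensities. The function name r_indicator_sketch and its exact validation rules are hypothetical; the release notes do not reproduce Eq. 2.2.2, so this may differ in detail from the library's implementation.

```python
import math
import statistics


def r_indicator_sketch(sample_p, target_p):
    """Hypothetical sketch: 1 - 2 * sample SD of the pooled propensities."""
    # Concatenate the two propensity vectors into one pooled list.
    pooled = [float(p) for p in sample_p] + [float(p) for p in target_p]

    # Input-size validation: a sample variance needs at least two values.
    if len(pooled) < 2:
        raise ValueError("need at least two propensity values in total")
    # Reject NaN and infinite values.
    if not all(math.isfinite(p) for p in pooled):
        raise ValueError("propensities must be finite")
    # Reject values outside the [0, 1] range.
    if any(p < 0.0 or p > 1.0 for p in pooled):
        raise ValueError("propensities must lie in [0, 1]")

    # statistics.stdev uses the n - 1 (sample) denominator.
    return 1.0 - 2.0 * statistics.stdev(pooled)


# Identical propensities have zero spread, so the indicator is at its maximum.
print(r_indicator_sketch([0.3, 0.3], [0.3, 0.3]))
```

Under this definition, values close to 1 mean the propensities vary little across units, and larger spread pushes the value down.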
Deprecations
- Sample.design_effect() is deprecated; use sample.weights().design_effect() instead. The method already exists on BalanceDFWeights; the Sample method now emits a DeprecationWarning and delegates. Will be removed in balance 0.19.0.
- Sample.design_effect_prop() is deprecated; use sample.weights().design_effect_prop() instead. New method added to BalanceDFWeights. Will be removed in balance 0.19.0.
- Sample.plot_weight_density() is deprecated; use sample.weights().plot() instead. Will be removed in balance 0.19.0.
- Sample.covar_means() is deprecated; use sample.covars().mean() instead (with .rename(index={'self': 'adjusted'}).reindex([...]).T for the same format). Will be removed in balance 0.19.0.
- Sample.outcome_sd_prop() is deprecated; use sample.outcomes().outcome_sd_prop() instead. New method added to BalanceDFOutcomes. Will be removed in balance 0.19.0.
- Sample.outcome_variance_ratio() is deprecated; use sample.outcomes().outcome_variance_ratio() instead. New method added to BalanceDFOutcomes. Will be removed in balance 0.19.0.
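The "emits a DeprecationWarning and delegates" pattern described above can be sketched as follows. SampleLike and WeightsView are hypothetical stand-ins, not the library's classes, and the stand-in computes Kish's design effect only so that there is something to delegate to.

```python
import warnings


class WeightsView:
    """Hypothetical stand-in for the object returned by sample.weights()."""

    def __init__(self, weights):
        self._weights = list(weights)

    def design_effect(self):
        # Kish's design effect: n * sum(w^2) / (sum(w))^2.
        n = len(self._weights)
        total = sum(self._weights)
        return n * sum(w * w for w in self._weights) / (total * total)


class SampleLike:
    """Hypothetical stand-in for a Sample with a deprecated shortcut method."""

    def __init__(self, weights):
        self._weights = list(weights)

    def weights(self):
        return WeightsView(self._weights)

    def design_effect(self):
        # Warn the caller, then forward to the new location of the method.
        warnings.warn(
            "design_effect() is deprecated; use weights().design_effect() instead.",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        return self.weights().design_effect()


sample = SampleLike([1.0, 1.0, 1.0, 1.0])
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_value = sample.design_effect()           # deprecated path, warns
new_value = sample.weights().design_effect()     # preferred path, silent
print(old_value == new_value, len(caught))
```

Because the old method only forwards, both call paths return the same value, so callers can migrate one call site at a time before the removal.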
LLM/GenAI
- Added CLAUDE.md project context files for Claude Code users, covering architecture, build/test instructions (Meta and open-source), code conventions, and pre-submit checklist.
- Updated .github/copilot-instructions.md review checklist to reduce duplication with CLAUDE.md and add missing conventions (MIT license header, from __future__ import annotations, factory pattern, seed fixing, deprecation style).
Bug Fixes
- prepare_marginal_dist_for_raking / _realize_dicts_of_proportions: fixed memory explosion from LCM expansion
  - When proportions had high decimal precision or many covariates were passed, the LCM of the individual per-variable array lengths could reach tens of millions (or more), causing OOM crashes.
  - Both functions now accept a max_length parameter (default 10000). When the natural LCM exceeds max_length, the output is capped at max_length rows and counts are allocated via the Hare-Niemeyer (largest remainder) method, which guarantees the total stays exactly max_length with minimal rounding error per category.
  - A warning is logged whenever the cap is applied.
  - A new internal helper _hare_niemeyer_allocation implements the allocation logic.
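For readers unfamiliar with the Hare-Niemeyer (largest remainder) method mentioned above, here is a minimal sketch. The function largest_remainder_allocation and its signature are hypothetical and are not the library's _hare_niemeyer_allocation helper; the sketch only shows how integer counts can be made to sum exactly to a fixed total.

```python
import math


def largest_remainder_allocation(proportions, total):
    """Hypothetical sketch: split `total` integer slots across categories."""
    weight_sum = sum(proportions.values())
    # Ideal (fractional) share of the total for each category.
    quotas = {k: total * p / weight_sum for k, p in proportions.items()}
    # Every category first receives the integer part of its quota.
    counts = {k: math.floor(q) for k, q in quotas.items()}
    # Slots still unassigned after flooring.
    leftover = total - sum(counts.values())
    # Hand the leftover slots to the largest fractional remainders first.
    by_remainder = sorted(quotas, key=lambda k: quotas[k] - counts[k], reverse=True)
    for k in by_remainder[:leftover]:
        counts[k] += 1
    return counts


# Quotas are roughly 3.5, 2.1 and 1.4; flooring assigns 6 of the 7 slots,
# and the remaining slot goes to "a", which has the largest remainder.
allocation = largest_remainder_allocation({"a": 0.5, "b": 0.3, "c": 0.2}, 7)
print(allocation, sum(allocation.values()))
```

Each count ends up within one slot of its ideal share, which is why capping the output at a fixed length distorts the proportions only slightly. A production version would also need to guard against floating-point edge cases in the flooring step.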
Contributors
Full Changelog: 0.17.0...0.18.0