VowpalWabbit/vowpal_wabbit 9.7.0 on GitHub

Eigen Memory Tree

An Eigen Memory Tree (EMT) is a memory-based learning reduction. An EMT remembers previous training examples and uses this memory to assign labels to future prediction requests. For more information, see the EMT wiki page.
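
To make the "remember, then look up" idea concrete, here is a minimal sketch of memory-based prediction. It is not the EMT reduction: it assumes a flat list of memories and a plain Euclidean nearest-neighbour lookup, and the class name `FlatMemory` is hypothetical. It only illustrates how stored training examples can be used to label a new query.

```python
import math


class FlatMemory:
    """Toy memory: stores (features, label) pairs, predicts by nearest neighbour.

    Assumption for illustration only; the EMT reduction itself is not
    implemented here.
    """

    def __init__(self):
        self.memories = []  # list of (feature_tuple, label)

    def learn(self, features, label):
        # Remember the training example as-is.
        self.memories.append((tuple(features), label))

    def predict(self, features):
        # With an empty memory there is nothing to recall.
        if not self.memories:
            return None
        # Return the label of the closest stored example.
        nearest = min(self.memories, key=lambda m: math.dist(m[0], features))
        return nearest[1]


memory = FlatMemory()
memory.learn([0.0, 0.0], "a")
memory.learn([1.0, 1.0], "b")
prediction = memory.predict([0.9, 0.8])  # closest stored example is [1.0, 1.0]
```

A flat scan like this costs time linear in the number of memories per query, which is the usual motivation for organizing memories in a tree.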

Robust confidence sequence estimator

Added in #4297.

Cubic config oracle in automl

Automl can now search over cubic interactions on top of quadratic interactions. See the Automl documentation.
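
As a rough picture of how the search space grows, the sketch below enumerates candidate quadratic and cubic interactions for a set of namespaces. It assumes single-letter namespace names and leaves out self-interactions; the function `candidate_interactions` is hypothetical and does not reproduce the automl config oracle. The resulting strings have the same shape as the arguments accepted by `-q` and `--cubic`.

```python
from itertools import combinations


def candidate_interactions(namespaces):
    """Return (quadratics, cubics) as lists of concatenated namespace letters.

    Hypothetical helper: assumes single-letter namespaces, no self-interactions.
    """
    ordered = sorted(namespaces)
    quadratics = ["".join(c) for c in combinations(ordered, 2)]
    cubics = ["".join(c) for c in combinations(ordered, 3)]
    return quadratics, cubics


quadratics, cubics = candidate_interactions(["a", "b", "c", "d"])
```

With four namespaces this gives six quadratic and four cubic candidates, so adding cubics widens the space a config oracle has to explore.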

Vector CPU instructions

Added vector (SIMD) CPU instruction support for faster computation in the CB with Large Action Space (LAS) reduction. See the LAS documentation.

Predict only models

Added the ability to save predict-only models from some reductions (automl, epsilon-decay). Saving this way removes those reductions from the reduction stack and allows older versions of VW to predict with the saved model.

Enforce minimum probability for SquareCB

[SquareCB](https://github.com/VowpalWabbit/vowpal_wabbit/wiki/Contextual-Bandit-Exploration-with-SquareCB)
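
One common way to guarantee a minimum probability per action is to mix the distribution with the uniform distribution. The sketch below shows that idea only; the function name `mix_with_uniform` and the exact formula are assumptions for illustration and are not taken from the VW implementation.

```python
def mix_with_uniform(probs, min_prob):
    """Mix `probs` with the uniform distribution so every entry is >= min_prob.

    Hypothetical helper, not VW's code. Requires 0 <= min_prob <= 1/len(probs),
    otherwise the floor cannot be met while still summing to 1.
    """
    k = len(probs)
    if k == 0:
        raise ValueError("probs must not be empty")
    if not 0.0 <= min_prob <= 1.0 / k:
        raise ValueError("min_prob must be in [0, 1/len(probs)]")
    # Give total weight k * min_prob to the uniform distribution; each action
    # then receives exactly min_prob from the uniform part.
    uniform_weight = k * min_prob
    return [(1.0 - uniform_weight) * p + min_prob for p in probs]


mixed = mix_with_uniform([0.98, 0.02, 0.0], 0.05)
```

The result still sums to 1, and an action that previously had probability 0 now has probability `min_prob`, which keeps importance weights computed from logged probabilities bounded.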

Support for probabilities for PLT

Added support for probability output (the --probabilities option) in the PLT (probabilistic label tree) reduction and fixed its compatibility with VW 9+.

Target rate added to explore eval

Explore eval evaluates different exploration algorithms using data collected by a logged policy. This release adds a target rate to it. See the Explore Eval documentation.
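
A generic way to reuse logged data for a different exploration policy is rejection sampling: keep a logged example with probability proportional to how likely the new policy was to pick the logged action. The sketch below illustrates that general technique under stated assumptions; the function `replay_with_rejection`, its `multiplier` parameter, and the tuple layout of the log are hypothetical, and it is not a description of how VW implements explore eval or its target rate.

```python
import random


def replay_with_rejection(logged, new_policy_prob, multiplier=1.0, seed=0):
    """Return the logged examples accepted for the new policy.

    logged: iterable of (context, action, logged_prob, reward) tuples.
    new_policy_prob: function (context, action) -> probability under the
        policy being evaluated.
    All names here are illustrative assumptions.
    """
    rng = random.Random(seed)
    accepted = []
    for context, action, logged_prob, reward in logged:
        # Ratio of new-policy probability to logging-policy probability.
        ratio = new_policy_prob(context, action) / logged_prob
        accept_prob = min(1.0, multiplier * ratio)
        if rng.random() < accept_prob:
            accepted.append((context, action, reward))
    return accepted


logged = [("ctx1", 0, 0.5, 1.0), ("ctx2", 1, 0.5, 0.0)]
accepted = replay_with_rejection(logged, lambda context, action: 0.5)
update_rate = len(accepted) / len(logged)
```

The fraction of accepted examples (`update_rate` above) depends on the multiplier, so a smaller multiplier discards more of the log.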

VW refactors

Improved finish_example in all reductions
Parsers for different formats moved into their own libraries
Library namespacing fixed: everything now lives under the VW namespace

What's Changed

Features

feat: explore eval example target rate by @olgavrou in #4277
feat: [gd] invert_hash readable model with hexfloat by @lalo in #3999
feat: explore eval target rate by @olgavrou in #4285
feat: Add explicit simd implementation for one pass svd in large action spaces. by @zwd-ms in #4261
feat: Add avx2 implementation for one pass svd in large action spaces. by @zwd-ms in #4281
feat: Handle ignore_linear in las simd and throw on unsupported interactions. by @zwd-ms in #4282
feat: spin off automl predict_only_model to standard cb model by @bassmang in #4279
feat: add mix with uniform impl by @jackgerrits in #4301
feat: Enforce minimum probability for squarecb and update impl by @jackgerrits in #4298
feat: add unique_ptr support to model_utils by @jackgerrits in #4341
feat: use strong type for no pred by @jackgerrits in #4343
feat: use strong type for no label by @jackgerrits in #4342
feat: Adding EMT reduction. by @mrucker in #4264
feat: [automl] trace to csv files by @lalo in #4355
feat: robust confidence sequence estimator by @bassmang in #4297
feat: [automl] config oracle cubic on top of quadratic by @lalo in #4351
feat: update for probabilistic label tree reduction (#2766) - support for --probabilities option and fixed compatibility with VW 9+ version by @mwydmuch in #4138
feat: constexpr uniform_hash and type fixes by @jackgerrits in #4415
feat: Enable learner type checks at build. by @zwd-ms in #4411
feat: stabilize unique_ptr based initialize function by @jackgerrits in #4438
feat: Added new CCB predict benchmark by @rajan-chari in #4421
feat: [CB_GF] CB with graph feedback text input by @olgavrou in #4392
feat: [epsilon_decay] predict_only_model by @bassmang in #4458

Fixes

fix!: resolve csoaa_ldf prediction return correctness by @jackgerrits in #4395
fix!: [LAS] las + squarecb to re-use squarecb gamma by @olgavrou in #4479
fix!: [py] use full word for namespace and add test by @lalo in #4485
fix: [Explore_eval] fix threshold for adaptive multiplier by @marco-rossi29 in #4168
fix: Add pragma once to merge.h by @byronxu99 in #4284
fix: [epsilon_decay] process models in descending order when shifting by @bassmang in #4286
fix: [CI] check for missing args consistently in forwards/backwards compat by @olgavrou in #4289
fix: [CI] backwards compat don't fail if model file is missing by @olgavrou in #4291
fix: silence unused warning when las simd not enabled by @jackgerrits in #4299
fix: Build las simd on x86 only and rename command line flag. by @zwd-ms in #4300
fix: [automl] update champ score when it matches labelled_action by @lalo in #4326
fix: fix get_features function returning dangling pointer by @jackgerrits in #4328
fix: [automl] config oracle edge cases by @lalo in #4327
fix: remove type numpy aliases as they are now removed upstream by @jackgerrits in #4363
fix: fix loop binding to temporary by @jackgerrits in #4379
fix: [automl] update print logic for new oracle by @lalo in #4384
fix: exception safety in learner builder by @jackgerrits in #4429
fix: remove cerr from cs_robust by @bassmang in #4441
fix: [automl/epsilon_decay] brentq optimization by @bassmang in #4449
fix: pydocs formatting by @bassmang in #4464
fix: invert_hash for coin/ftrl by @bassmang in #4465
fix: Account for | in make_valid_name() by @darwinyip in #4468
fix: [LAS] LAS not a cb adf common reduction, fixes metrics with LAS bug by @olgavrou in #4476
fix: [automl] allow multiple models underneath automl by @bassmang in #4463
fix: include t, min and max label in model merging by @jackgerrits in #4483

Other Changes

ci: use shared caches for vcpkg job by @jackgerrits in #4270
build: add missing include by @jackgerrits in #4275
refactor: use model utils instead of macro in recall tree by @jackgerrits in #4248
refactor: [automl] remove lb_trick by @bassmang in #4283
docs: Update off_policy_evaluation.md by @olgavrou in #4280
ci: compatibility CI checks to not fail on newly added arguments by @olgavrou in #4287
test: remove flaky test (win) by @lalo in #4290
build: Use nix to manage dev tooling starting with clang-tidy by @jackgerrits in #4292
build: remove regex from clang-tidy-diff command as it wasnt working by @jackgerrits in #4294
chore: use clang-format-14 for formatting by @jackgerrits in #4302
build: consume string-view-lite as a sys dep for vcpkg by @jackgerrits in #4303
refactor: implement scaffold for finish_example split and POC migrations by @jackgerrits in #4296
refactor: move csv parser into csv namespace by @jackgerrits in #4304
refactor: split apart output and progressive log by @jackgerrits in #4308
refactor: move accumulate funcs into details namespace by @jackgerrits in #4305
refactor: migrate mwt finish_example by @jackgerrits in #4311
build: reduce header dependencies in important headers by @jackgerrits in #4306
refactor: split cache parser into separate lib by @jackgerrits in #4309
refactor: fix conversion warnings in v_array and removed deprecated usages by @jackgerrits in #4310
refactor: allow reduction to control print frequency by @jackgerrits in #4315
ci: Python sdist/docs - use 3.10 as that is now the default on 22.04 by @jackgerrits in #4317
docs: dont execute epsilon decay notebook by @jackgerrits in #4318
ci: used shared cache for asan builds by @jackgerrits in #4313
chore: Move cats paper code to demo directory. by @zwd-ms in #4320
refactor: migrate nn finish_example by @jackgerrits in #4314
chore: don't try to format vcpkg_installed files by @jackgerrits in #4323
refactor: [automl] small clean-up by @lalo in #4325
refactor: migrate OAA finish func by @jackgerrits in #4316
refactor: migrate stagewise_poly finish_example by @jackgerrits in #4322
perf: arm64 performance optimizations by @rami-lv in #4288
refactor: deprecate alloc/dealloc example by @jackgerrits in #4329
refactor: deduplicate random_seed state by @jackgerrits in #4331
refactor: remove unused field in parser by @jackgerrits in #4332
refactor: move some fields out of workspace by @jackgerrits in #4333
refactor: small namespace cleanup by @jackgerrits in #4334
refactor: move shared_data into VW namespace by @jackgerrits in #4338
refactor: cleanup unique_sort.h by @jackgerrits in #4336
refactor: remove unused stable_unique by @jackgerrits in #4337
refactor: mark WRITEIT and WRITEITVAR as deprecated by @jackgerrits in #4335
refactor: migrate finish and sender no longer holds on to examples by @jackgerrits in #4321
refactor: Migrate plt finish_example. by @zwd-ms in #4339
refactor: Migrate multilabel_oaa finish_example. by @zwd-ms in #4340
refactor: migrate topk finish_example by @jackgerrits in #4324
refactor: namespacing in parser.h by @jackgerrits in #4345
refactor: remove scoped_calloc_or_throw in favor of make_unique by @jackgerrits in #4346
refactor: Migrate oja newton finish and modernize memory management by @jackgerrits in #4350
refactor: fix namespacing of feature_group.h header by @jackgerrits in #4352
refactor: remove finish example for count_label by @jackgerrits in #4349
refactor: migrate confidence finish_example by @jackgerrits in #4348
refactor: migrate cbzo finish_example by @jackgerrits in #4347
refactor: use unique_ptr for parser, fix new/free mismatch by @jackgerrits in #4357
refactor: v_array resize_but_with_stl_behavior -> resize rename by @jackgerrits in #4361
refactor: cb_explore finish function by @peterychang in #4360
chore: clarify daemon support on MacOS by @jackgerrits in #4367
chore: add active_interactor deprecation notice by @jackgerrits in #4359
refactor: move text parser into its own lib by @olgavrou in #4356
refactor: deprecate some legacy functions by @jackgerrits in #4369
refactor: update baseline_cb_test usage of initialize to scoped by @jackgerrits in #4370
refactor: move label operations to members for findability by @jackgerrits in #4374
refactor: migrate bs finish_example by @jackgerrits in #4366
test: migrate some tests from boost to gtest by @jackgerrits in #4376
refactor: migrate cats_pdf, cats finish_example by @jackgerrits in #4372
refactor: migrate audit_regressor finish_example by @jackgerrits in #4358
refactor: migrate boosting finish_example by @jackgerrits in #4362
test: move more tests and add matcher impl by @jackgerrits in #4380
refactor: move confidence_seq code to impl by @jackgerrits in #4373
refactor: Migrate cb_explore_adf reductions finish functions by @peterychang in #4330
test: move more tests to gtest from boost by @jackgerrits in #4383
chore: remove some LAS tests that are not needed by @olgavrou in #4386
refactor: move json parser into its own lib by @olgavrou in #4381
refactor: reduce global namespace pollution by @jackgerrits in #4385
test: move more tests from boost to gtest by @jackgerrits in #4387
refactor: rename read_line_json_s -> read_line_json by @olgavrou in #4388
ci: csharp benchmarks to run on master by @olgavrou in #4389
build: automate test main.cc file generation by @jackgerrits in #4390
refactor: [automl/epsilon_decay] integrate robust confidence sequences by @bassmang in #4377
refactor: migrate cb_adf finish_example by @jackgerrits in #4397
build: rely on gtest_main target instead of custom by @jackgerrits in #4396
refactor: migrate cs_active finish function by @jackgerrits in #4394
refactor: cb_to_cb_adf finish function by @peterychang in #4398
test: move last of tests to gtest, remove all boost test infra by @jackgerrits in #4399
refactor: fix namespacing of parse_primitives.h by @jackgerrits in #4405
refactor: fix namespacing of io_buf.h by @jackgerrits in #4403
refactor: fix namespacing of parameters by @jackgerrits in #4404
refactor: move vw.h functions to be defined in vw.cc by @jackgerrits in #4410
refactor: Add metrics collector by @bassmang in #4407
refactor: standardize googletest naming by @bassmang in #4408
refactor: migrate log_multi finish_example by @jackgerrits in #4414
refactor: migrate memory_tree finish_example by @jackgerrits in #4412
test: minor testing fixes by @bassmang in #4417
refactor: migrate automl finish_example by @bassmang in #4419
refactor: migrate search finish_example by @jackgerrits in #4400
refactor: migrate recall_tree finish_example by @jackgerrits in #4402
docs: reproducible doxygen docs using nix by @jackgerrits in #4425
refactor: remove learner print_example by @jackgerrits in #4423
refactor: cb_algs finish functions by @peterychang in #4409
refactor: migrate ect finish_example by @bassmang in #4424
refactor: cleanup cb.h header by @jackgerrits in #4427
refactor: remove learn and label references from ect predict by @bassmang in #4426
refactor: cleanup more global namespace pollution by @jackgerrits in #4428
docs: Fixed typo from steep cost function to step cost function by @bkowshik in #4393
revert: previous wrong correction in docs by @lalo in #4433
refactor: cleanup CB related namespaces by @jackgerrits in #4431
refactor: deprecate is_from_pool by @jackgerrits in #4432
ci: output valgrind logs on unit test failure by @jackgerrits in #4430
test: add tests for uniform_hash by @jackgerrits in #4436
refactor: migrate exploration namespace to VW::explore by @jackgerrits in #4435
docs: minor tutorial cleanups by @ataymano in #4437
refactor: migrate INTERACTIONS namespace by @jackgerrits in #4434
refactor: unify locked and unlocked pools in one impl by @jackgerrits in #4439
refactor: migrate cbify finish example by @jackgerrits in #4440
refactor: migrate MULTILABEL namespace by @jackgerrits in #4443
refactor: migrate GD namespace by @jackgerrits in #4442
refactor: migrate explore_eval finish function by @olgavrou in #4448
refactor: migrate warm_cb finish_example by @jackgerrits in #4447
build: dont expose eigen in public headers by @jackgerrits in #4445
refactor: rename finalize_driver to finish by @jackgerrits in #4450
refactor: migrate csoaa finish_example by @jackgerrits in #4446
refactor: consolidate random into common by @jackgerrits in #4453
refactor: migrate lda finish_example by @jackgerrits in #4413
refactor: migrate csoaa_ldf finish_example by @jackgerrits in #4452
refactor: remove internal usage functions from workspace api by @olgavrou in #4456
build: do not override externally set VW_CXX_STANDARD by @jackgerrits in #4455
refactor: migrate to new initialize function by @jackgerrits in #4444
build: enable consumption of system sse2neon by @jackgerrits in #4457
test: add test for interactive active workload by @jackgerrits in #4454
refactor: refactor finish_example for active by @rajan-chari in #4353
docs: Fix typo in "Contextual Bandit Content Personalization" tutorial by @toldervoll in #4466
refactor: brentq optimizations by @bassmang in #4462
refactor: estimator dir by @lalo in #4470
refactor: [automl/epsilon decay] bisection method by @bassmang in #4469
refactor: estimators ns in cressieread by @lalo in #4472
refactor: [automl] add tol_x and opt_func flags by @bassmang in #4475
refactor: python lint (new black version) by @bassmang in #4480
refactor: make workspace const in add_constant_feature by @jackgerrits in #4481
refactor: implement delta add/subtract by actually adding and subtracting weights by @byronxu99 in #4486

New Contributors

@rami-lv made their first contribution in #4288
@bkowshik made their first contribution in #4393
@toldervoll made their first contribution in #4466
@darwinyip made their first contribution in #4468

Full Changelog: 9.6.0...9.7.0