Eigen Memory Tree
An Eigen Memory Tree (EMT) is a memory-based learning reduction. An EMT remembers previous training examples and uses this memory to assign labels to future prediction requests. For more information, see the EMT wiki page.
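To make the idea of memory-based prediction concrete, here is a minimal sketch in Python. It is not VW's EMT implementation: it keeps a flat list of memories and answers with the label of the nearest stored example, with no tree routing and no learned scoring. The class name `ToyMemory` and its methods are hypothetical and exist only for this illustration.

```python
import math


class ToyMemory:
    """Flat memory (illustrative only): store examples, answer with the nearest label."""

    def __init__(self):
        self._memories = []  # list of (feature_vector, label) pairs

    def learn(self, features, label):
        # "Remember" the training example as-is.
        self._memories.append((tuple(features), label))

    def predict(self, features):
        # With nothing remembered there is nothing to retrieve.
        if not self._memories:
            return None
        # Retrieve the stored example closest in Euclidean distance.
        nearest = min(self._memories, key=lambda m: math.dist(m[0], features))
        return nearest[1]


memory = ToyMemory()
memory.learn([0.0, 0.0], "cold")
memory.learn([1.0, 1.0], "hot")
prediction = memory.predict([0.9, 0.8])  # closest memory is the "hot" one
```

A linear scan like this costs time proportional to the number of memories for every query, which is the kind of cost a tree-structured memory is meant to avoid.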
Robust confidence sequence estimator
Cubic config oracle in automl
Automl can now search over cubic interactions on top of quadratic interactions. See also: Automl.
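As a rough picture of how the candidate space grows when cubic interactions are added, the sketch below enumerates namespace pairs and triples in Python. It is an assumption-laden illustration, not the automl config oracle: the helper `candidate_interactions` is hypothetical, and it simply lists every combination (allowing a namespace to interact with itself) rather than exploring configurations the way the oracle does.

```python
from itertools import combinations_with_replacement


def candidate_interactions(namespaces, order):
    """List all interactions of the given order over the namespaces (illustrative only)."""
    return ["".join(combo) for combo in combinations_with_replacement(sorted(namespaces), order)]


namespaces = ["a", "b", "c"]
quadratics = candidate_interactions(namespaces, 2)  # pairs such as "ab" or "aa"
cubics = candidate_interactions(namespaces, 3)      # triples such as "abc" or "aab"
```

With three namespaces this gives 6 quadratic and 10 cubic candidates, so the cubic space grows noticeably faster than the quadratic one as namespaces are added.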
Vector CPU instructions
Vector (SIMD) CPU instructions are now used for faster computation in the CB with Large Action Space (LAS) reduction. See also: LAS.
Predict-only models
Predict-only models can now be saved from some reductions (automl, epsilon-decay). Saving a predict-only model removes these reductions from the reduction stack, which allows older versions of VW to use the model for prediction.
Enforce minimum probability for SquareCB
[SquareCB](https://github.com/VowpalWabbit/vowpal_wabbit/wiki/Contextual-Bandit-Exploration-with-SquareCB)
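One common way to guarantee a minimum probability per action is to mix the exploration distribution with a uniform one. The Python sketch below shows that idea on top of the standard SquareCB formula. It is an illustration under stated assumptions, not VW's code: the function names `squarecb_probabilities` and `mix_with_uniform` and the parameter `min_prob` are hypothetical, and VW's actual implementation may differ in detail.

```python
def squarecb_probabilities(costs, gamma):
    """SquareCB-style distribution: worse predicted cost means lower probability."""
    k = len(costs)
    best = min(range(k), key=lambda a: costs[a])
    probs = [0.0] * k
    for a in range(k):
        if a != best:
            probs[a] = 1.0 / (k + gamma * (costs[a] - costs[best]))
    # The best action receives whatever probability mass is left.
    probs[best] = 1.0 - sum(probs)
    return probs


def mix_with_uniform(probs, min_prob):
    """Rescale and shift so every action has probability >= min_prob and the sum stays 1."""
    k = len(probs)
    if min_prob * k > 1.0:
        raise ValueError("min_prob is too large for this many actions")
    return [(1.0 - k * min_prob) * p + min_prob for p in probs]


raw = squarecb_probabilities([0.0, 0.5, 1.0], gamma=100.0)
mixed = mix_with_uniform(raw, min_prob=0.05)
```

With a large `gamma`, the raw distribution gives the worst action a probability below 0.01. After mixing, every action has at least 0.05 while the ranking of the actions is preserved.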
Support for probabilities for PLT
Added support for probabilities output (the --probabilities option) in the PLT reduction and fixed its compatibility with VW version 9+.
Target rate added to explore eval
The goal of explore eval is to evaluate different exploration algorithms using data from a logged policy. See also: Explore Eval.
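Evaluating a new exploration policy on logged data is often done with rejection sampling: a logged example is kept only with a probability proportional to how likely the new policy was to pick the logged action. The Python sketch below shows that general technique. It is not VW's explore eval implementation and does not model the new target rate option. The function `rejection_sample`, its `multiplier` parameter, and the toy policy are assumptions made for illustration.

```python
import random


def rejection_sample(logged, target_prob, multiplier=1.0, rng=None):
    """Keep each logged (context, action, logged_prob) with probability
    min(1, multiplier * target_prob(context, action) / logged_prob)."""
    rng = rng or random.Random(0)
    accepted = []
    for context, action, logged_prob in logged:
        ratio = multiplier * target_prob(context, action) / logged_prob
        if rng.random() < min(1.0, ratio):
            accepted.append((context, action))
    return accepted


# Logged policy: picks action 0 or 1 uniformly at random.
logged = [("ctx", 0, 0.5), ("ctx", 1, 0.5)] * 50


# Policy under evaluation (hypothetical): always picks action 0.
def always_zero(context, action):
    return 1.0 if action == 0 else 0.0


# multiplier=0.5 keeps the acceptance ratio at or below 1 for this data.
accepted = rejection_sample(logged, always_zero, multiplier=0.5)
```

Only the examples where the logged action agrees with the evaluated policy survive, so the accepted stream looks as if that policy had collected it. The price is a smaller usable dataset.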
VW refactors
- Improved finish_example in all reductions
- Parsers for different formats moved into their own libraries
- Library namespacing fixed: everything now lives under the VW namespace
What's Changed
Features
- feat: explore eval example target rate by @olgavrou in #4277
- feat: [gd] invert_hash readeable model with hexfloat by @lalo in #3999
- feat: explore eval target rate by @olgavrou in #4285
- feat: Add explicit simd implementation for one pass svd in large action spaces. by @zwd-ms in #4261
- feat: Add avx2 implementation for one pass svd in large action spaces. by @zwd-ms in #4281
- feat: Handle ignore_linear in las simd and throw on unsupported interactions. by @zwd-ms in #4282
- feat: spin off automl predict_only_model to standard cb model by @bassmang in #4279
- feat: add mix with uniform impl by @jackgerrits in #4301
- feat: Enforce minimum probability for squarecb and update impl by @jackgerrits in #4298
- feat: add unique_ptr support to model_utils by @jackgerrits in #4341
- feat: use strong type for no pred by @jackgerrits in #4343
- feat: use strong type for no label by @jackgerrits in #4342
- feat: Adding EMT reduction. by @mrucker in #4264
- feat: [automl] trace to csv files by @lalo in #4355
- feat: robust confidence sequence estimator by @bassmang in #4297
- feat: [automl] config oracle cubic on top of quadratic by @lalo in #4351
- feat: update for probabilistic label tree reduction (#2766) - support for --probabilities option and fixed compatibility with VW 9+ version by @mwydmuch in #4138
- feat: constexpr uniform_hash and type fixes by @jackgerrits in #4415
- feat: Enable learner type checks at build. by @zwd-ms in #4411
- feat: stabilize unique_ptr based initialize function by @jackgerrits in #4438
- feat: Added new CCB predict benchmark by @rajan-chari in #4421
- feat: [CB_GF] CB with graph feedback text input by @olgavrou in #4392
- feat: [epsilon_decay] predict_only_model by @bassmang in #4458
Fixes
- fix!: resolve csoaa_ldf prediction return correctness by @jackgerrits in #4395
- fix!: [LAS] las + squarecb to re-use squarecb gamma by @olgavrou in #4479
- fix!: [py] use full word for namespace and add test by @lalo in #4485
- fix: [Explore_eval] fix threshold for adaptive multiplier by @marco-rossi29 in #4168
- fix: Add pragma once to merge.h by @byronxu99 in #4284
- fix: [epsilon_decay] process models in descending order when shifting by @bassmang in #4286
- fix: [CI] check for missing args consistently in forwards/backwards compat by @olgavrou in #4289
- fix: [CI] backwards compat don't fail if model file is missing by @olgavrou in #4291
- fix: silence unused warning when las simd not enabled by @jackgerrits in #4299
- fix: Build las simd on x86 only and rename command line flag. by @zwd-ms in #4300
- fix: [automl] update champ score when it matches labelled_action by @lalo in #4326
- fix: fix get_features function returning dangling pointer by @jackgerrits in #4328
- fix: [automl] config oracle edge cases by @lalo in #4327
- fix: remove type numpy aliases as they are now removed upstream by @jackgerrits in #4363
- fix: fix loop binding to temporary by @jackgerrits in #4379
- fix: [automl] update print logic for new oracle by @lalo in #4384
- fix: exception safety in learner builder by @jackgerrits in #4429
- fix: remove cerr from cs_robust by @bassmang in #4441
- fix: [automl/epsilon_decay] brentq optimization by @bassmang in #4449
- fix: pydocs formatting by @bassmang in #4464
- fix: invert_hash for coin/ftrl by @bassmang in #4465
- fix: Account for | in make_valid_name() by @darwinyip in #4468
- fix: [LAS] LAS not a cb adf common reduction, fixes metrics with LAS bug by @olgavrou in #4476
- fix: [automl] allow multiple models underneath automl by @bassmang in #4463
- fix: include t, min and max label in model merging by @jackgerrits in #4483
Other Changes
- ci: use shared caches for vcpkg job by @jackgerrits in #4270
- build: add missing include by @jackgerrits in #4275
- refactor: use model utils instead of macro in recall tree by @jackgerrits in #4248
- refactor: [automl] remove lb_trick by @bassmang in #4283
- docs: Update off_policy_evaluation.md by @olgavrou in #4280
- ci: compatibility CI checks to not fail on newly added arguments by @olgavrou in #4287
- test: remove flaky test (win) by @lalo in #4290
- build: Use nix to manage dev tooling starting with clang-tidy by @jackgerrits in #4292
- build: remove regex from clang-tidy-diff command as it wasnt working by @jackgerrits in #4294
- chore: use clang-format-14 for formatting by @jackgerrits in #4302
- build: consume string-view-lite as a sys dep for vcpkg by @jackgerrits in #4303
- refactor: implement scaffold for finish_example split and POC migrations by @jackgerrits in #4296
- refactor: move csv parser into csv namespace by @jackgerrits in #4304
- refactor: split apart output and progressive log by @jackgerrits in #4308
- refactor: move accumulate funcs into details namespace by @jackgerrits in #4305
- refactor: migrate mwt finish_example by @jackgerrits in #4311
- build: reduce header dependencies in important headers by @jackgerrits in #4306
- refactor: split cache parser into separate lib by @jackgerrits in #4309
- refactor: fix conversion warnings in v_array and removed deprecated usages by @jackgerrits in #4310
- refactor: allow reduction to control print frequency by @jackgerrits in #4315
- ci: Python sdist/docs - use 3.10 as that is now the default on 22.04 by @jackgerrits in #4317
- docs: dont execute epsilon decay notebook by @jackgerrits in #4318
- ci: used shared cache for asan builds by @jackgerrits in #4313
- chore: Move cats paper code to demo directory. by @zwd-ms in #4320
- refactor: migrate nn finish_example by @jackgerrits in #4314
- chore: don't try to format vcpkg_installed files by @jackgerrits in #4323
- refactor: [automl] small clean-up by @lalo in #4325
- refactor: migate OAA finish func by @jackgerrits in #4316
- refactor: migrate stagewise_poly finish_example by @jackgerrits in #4322
- perf: arm64 performance optimizations by @rami-lv in #4288
- refactor: deprecate alloc/dealloc example by @jackgerrits in #4329
- refactor: deduplicate random_seed state by @jackgerrits in #4331
- refactor: remove unused field in parser by @jackgerrits in #4332
- refactor: move some fields out of workspace by @jackgerrits in #4333
- refactor: small namespace cleanup by @jackgerrits in #4334
- refactor: move shared_data into VW namespace by @jackgerrits in #4338
- refactor: cleanup unique_sort.h by @jackgerrits in #4336
- refactor: remove unused stable_unique by @jackgerrits in #4337
- refactor: mark WRITEIT and WRITEITVAR as deprecated by @jackgerrits in #4335
- refactor: migrate finish and sender no longer holds on to examples by @jackgerrits in #4321
- refactor: Migrate plt finish_example. by @zwd-ms in #4339
- refactor: Migrate multilabel_oaa finish_example. by @zwd-ms in #4340
- refactor: migrate topk finish_example by @jackgerrits in #4324
- refactor: namespacing in parser.h by @jackgerrits in #4345
- refactor: remove scoped_calloc_or_throw in favor of make_unique by @jackgerrits in #4346
- refactor: Migrate oja newton finish and modernize memory management by @jackgerrits in #4350
- refactor: fix namespacing of feature_group.h header by @jackgerrits in #4352
- refactor: remove finish example for count_label by @jackgerrits in #4349
- refactor: migrate confidence finish_example by @jackgerrits in #4348
- refactor: migrate cbzo finish_example by @jackgerrits in #4347
- refactor: use unique_ptr for parser, fix new/free mismatch by @jackgerrits in #4357
- refactor: v_array resize_but_with_stl_behavior -> resize rename by @jackgerrits in #4361
- refactor: cb_explore finish function by @peterychang in #4360
- chore: clarify daemon support on MacOS by @jackgerrits in #4367
- chore: add active_interactor deprecation notice by @jackgerrits in #4359
- refactor: move text parser into its own lib by @olgavrou in #4356
- refactor: deprecate some legacy functions by @jackgerrits in #4369
- refactor: update baseline_cb_test usage of initialize to scoped by @jackgerrits in #4370
- refactor: move label operations to members for findability by @jackgerrits in #4374
- refactor: migrate bs finish_example by @jackgerrits in #4366
- test: migrate some tests from boost to gtest by @jackgerrits in #4376
- refactor: migrate cats_pdf, cats finish_example by @jackgerrits in #4372
- refactor: migrate audit_regressor finish_example by @jackgerrits in #4358
- refactor: migrate boosting finish_example by @jackgerrits in #4362
- test: move more tests and add matcher impl by @jackgerrits in #4380
- refactor: move confidence_seq code to impl by @jackgerrits in #4373
- refactor: Migrate cb_explore_adf reductions finish functions by @peterychang in #4330
- test: move more tests to gtest from boost by @jackgerrits in #4383
- chore: remove some LAS tests that are not needed by @olgavrou in #4386
- refactor: move json parser into its own lib by @olgavrou in #4381
- refactor: reduce global namespace pollution by @jackgerrits in #4385
- test: move more tests from boost to gtest by @jackgerrits in #4387
- refactor: rename read_line_json_s -> read_line_json by @olgavrou in #4388
- ci: csharp benchmarks to run on master by @olgavrou in #4389
- build: automate test main.cc file generation by @jackgerrits in #4390
- refactor: [automl/epsilon_decay] integrate robust confidence sequences by @bassmang in #4377
- refactor: migrate cb_adf finish_example by @jackgerrits in #4397
- build: rely on gtest_main target instead of custom by @jackgerrits in #4396
- refactor: migrate cs_active finish function by @jackgerrits in #4394
- refactor: cb_to_cb_adf finish function by @peterychang in #4398
- test: move last of tests to gtest, remove all boost test infra by @jackgerrits in #4399
- refactor: fix namespacing of parse_primitives.h by @jackgerrits in #4405
- refactor: fix namespacing of io_buf.h by @jackgerrits in #4403
- refactor: fix namespacing of parameters by @jackgerrits in #4404
- refactor: move vw.h functions to be defined in vw.cc by @jackgerrits in #4410
- refactor: Add metrics collector by @bassmang in #4407
- refactor: standardize googletest naming by @bassmang in #4408
- refactor: migrate log_multi finish_example by @jackgerrits in #4414
- refactor: migrate memory_tree finish_example by @jackgerrits in #4412
- test: minor testing fixes by @bassmang in #4417
- refactor: migrate automl finish_example by @bassmang in #4419
- refactor: migrate search finish_example by @jackgerrits in #4400
- refactor: migrate recall_tree finish_example by @jackgerrits in #4402
- docs: reproducible doxygen docs using nix by @jackgerrits in #4425
- refactor: remove learner print_example by @jackgerrits in #4423
- refactor: cb_algs finish functions by @peterychang in #4409
- refactor: migrate ect finish_example by @bassmang in #4424
- refactor: cleanup cb.h header by @jackgerrits in #4427
- refactor: remove learn and label references from ect predict by @bassmang in #4426
- refactor: cleanup more global namespace pollution by @jackgerrits in #4428
- docs: Fixed typo from steep cost function to step cost function by @bkowshik in #4393
- revert: previous wrong correction in docs by @lalo in #4433
- refactor: cleanup CB related namespaces by @jackgerrits in #4431
- refactor: deprecate is_from_pool by @jackgerrits in #4432
- ci: output valgrind logs on unit test failure by @jackgerrits in #4430
- test: add tests for uniform_hash by @jackgerrits in #4436
- refactor: migrate exploration namespace to VW::explore by @jackgerrits in #4435
- docs: minor tutorial cleanups by @ataymano in #4437
- refactor: migrate INTERACTIONS namespace by @jackgerrits in #4434
- refactor: unify locked and unlocked pools in one impl by @jackgerrits in #4439
- refactor: migrate cbify finish example by @jackgerrits in #4440
- refactor: migrate MULTILABEL namespace by @jackgerrits in #4443
- refactor: migrate GD namespace by @jackgerrits in #4442
- refactor: migrate explore_eval finish function by @olgavrou in #4448
- refactor: migrate warm_cb finish_example by @jackgerrits in #4447
- build: dont expose eigen in public headers by @jackgerrits in #4445
- refactor: rename finalize_driver to finish by @jackgerrits in #4450
- refactor: migrate csoaa finish_example by @jackgerrits in #4446
- refactor: consolidate random into common by @jackgerrits in #4453
- refactor: migrate lda finish_example by @jackgerrits in #4413
- refactor: migrate csoaa_ldf finish_example by @jackgerrits in #4452
- refactor: remove internal usage functions from workspace api by @olgavrou in #4456
- build: do not override externally set VW_CXX_STANDARD by @jackgerrits in #4455
- refactor: migrate to new initialize function by @jackgerrits in #4444
- build: enable consumption of system sse2neon by @jackgerrits in #4457
- test: add test for interactive active workload by @jackgerrits in #4454
- refactor: refactor finish_example for active by @rajan-chari in #4353
- docs: Fix typo in "Contextual Bandit Content Personalization" tutorial by @toldervoll in #4466
- refactor: brentq optimizations by @bassmang in #4462
- refactor: estimator dir by @lalo in #4470
- refactor: [automl/epsilon decay] bisection method by @bassmang in #4469
- refactor: estimators ns in cressieread by @lalo in #4472
- refactor: [automl] add tol_x and opt_func flags by @bassmang in #4475
- refactor: python lint (new black version) by @bassmang in #4480
- refactor: make workspace const in add_constant_feature by @jackgerrits in #4481
- refactor: implement delta add/subtract by actually adding and subtracting weights by @byronxu99 in #4486
New Contributors
- @rami-lv made their first contribution in #4288
- @bkowshik made their first contribution in #4393
- @toldervoll made their first contribution in #4466
- @darwinyip made their first contribution in #4468
Full Changelog: 9.6.0...9.7.0