This release adds several new reductions, including contextual bandits with graph feedback and a completely new interaction grounded learning reduction. WASM bindings for Vowpal Wabbit are now available as well.
Contextual Bandits with Graph Feedback
Contextual Bandits (CB) with graph feedback can be used for scenarios where some actions, when taken, reveal information about other actions that were not taken, or reveal no information at all. If this relationship between actions is known in advance, that knowledge can be used to make exploration and learning more efficient.
See here for more details.
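To make the idea of a feedback graph concrete, here is a minimal sketch in plain Python. It does not use the Vowpal Wabbit API; the names `observed_rewards`, `FEEDBACK_GRAPH`, and `REWARDS`, along with the three-action setup and its reward values, are hypothetical and chosen only for illustration. The graph is assumed to be an adjacency matrix in which entry `[i][j]` equal to 1 means that taking action `i` also reveals the reward of action `j`.

```python
from typing import Dict, List


def observed_rewards(
    graph: List[List[int]], chosen: int, rewards: List[float]
) -> Dict[int, float]:
    """Return the rewards revealed by taking `chosen` under the feedback graph."""
    return {j: rewards[j] for j, edge in enumerate(graph[chosen]) if edge}


# Hypothetical 3-action problem (illustrative values, not from the release).
FEEDBACK_GRAPH = [
    [1, 1, 0],  # action 0 reveals its own reward and the reward of action 1
    [0, 1, 0],  # action 1 reveals only its own reward (ordinary bandit feedback)
    [0, 0, 0],  # action 2 reveals no information at all
]
REWARDS = [0.2, 0.9, 0.5]

if __name__ == "__main__":
    for action in range(len(FEEDBACK_GRAPH)):
        print(action, observed_rewards(FEEDBACK_GRAPH, action, REWARDS))
```

In this sketch, taking action 0 yields two observations for the price of one, which is the kind of prior structure a learner can exploit to explore more efficiently.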
Contextual Bandits with Interaction Grounded Learning
Interaction grounded learning (IGL) can be used for scenarios where the user doesn't have a reward function. It automatically learns a personalized reward function from the user's feedback and optimizes directly for the latent user satisfaction.
See here for more details.
Vowpal Wabbit package in npm
Now that WASM bindings are available, Vowpal Wabbit can be used in JavaScript and TypeScript applications via the npm package. For more details, see here.
What's Changed
New Contributors
Full Changelog: 9.8.0...9.9.0