github promptfoo/promptfoo 0.119.0

2 days ago

What's Changed

Features

  • feat(webui): filter eval results by metric values with numeric operators (EQ, GT, LTE, etc.) by @will-holley in #6011
  • feat(providers): 10-100x performance improvement for Python providers with persistent worker pools by @mldangelo in #5968
  • feat(providers): add OpenAI Agents SDK integration with support for agents, tools, and handoffs by @mldangelo in #6009
  • feat(providers): add function calling/tool support for Ollama by @mldangelo in #5977
  • feat(providers): add support for Claude Haiku 4.5 by @jameshiester in #5937
  • feat(redteam): add jailbreak:meta strategy with intelligent attack taxonomy learning by @MrFlounder in #6021
  • feat(redteam): add COPPA plugin by @typpo in #5997
  • feat(redteam): add GDPR preset mappings by @typpo in #5986
  • feat(redteam): add modifiers support to iterative strategies by @MrFlounder in #5972
  • feat(redteam): add authoritative markup injection strategy by @typpo in #5961
  • feat(redteam): add wordplay plugin by @typpo in #5889
  • feat(redteam): add Simba red team agent strategy by @sklein12 in #5795
  • feat(redteam): add subcategory filtering to BeaverTails plugin by @typpo (a70372f)
  • feat(redteam): include pluginId, strategyId, and sessionId in CSV exports by @sklein12 in #6016
  • feat(webui): persist custom policy names by @will-holley in #5990
  • feat(webui): show target responses for red team test cases by @will-holley in #5869
  • feat(cli): log errors to file with console messages by @sklein12 in #5992
  • feat(cli): show errors in eval progress bar by @sklein12 in #5942
  • feat(cache): display latency measurements for cached responses by @mldangelo in #5978

Fixes

  • fix(providers): restore runtime variable substitution in templates by @mldangelo (5423f80)
  • fix(providers): improve Python provider reliability with automatic python3/python detection and better error handling by @mldangelo in #6034
  • fix(providers): simulated-user and mischievous-user now respect system prompts in multi-turn conversations by @mldangelo in #6020
  • fix(providers): improve MCP tool schema compatibility with OpenAI by @mldangelo in #5965
  • fix(providers): properly store sessionId in metadata by @sklein12 in #6016
  • fix(redteam): skip session management tests for stateless targets by @faizanminhas in #5989
  • fix(redteam): improve Crescendo strategy accuracy by @jameshiester in #5964
  • fix(redteam): reduce duplicate error messages for invalid strategy and plugin ids by @typpo in #5954
  • fix(fetch): improve retry counter messages and error details by @LizzHale in #6017, #6019
  • fix(webui): pass extensions config when running evals by @theLucasAntunes in #6006
  • fix(webui): fix visibility of reset config button in red team setup by @will-holley in #5896
  • fix(webui): sync selected plugins to global config by @will-holley in #5991
  • fix(webui): fix HTTP test agent by @faizanminhas in #6033
  • fix(webui): reset strategy config dialog when switching strategies by @sklein12 in #6035
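The python3/python detection fix can be pictured with a short sketch. This is not promptfoo's implementation, only an illustration of the general idea under stated assumptions: the helper name `find_python` and the candidate order (`python3` before `python`) are hypothetical.

```python
import shutil


def find_python(candidates=("python3", "python"), which=shutil.which):
    """Return the path of the first candidate found on PATH, or None.

    `which` is injectable so the lookup can be tested without touching
    the real PATH; by default it is the standard-library shutil.which.
    """
    for name in candidates:
        path = which(name)
        if path:
            return path
    return None
```

A caller could raise a descriptive error when `find_python()` returns `None`, rather than failing later when the interpreter is launched.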

Full Changelog: 0.118.17...0.119.0
