1.9.0 (2025-07-02)
Features Added
- Added support for Azure Open AI evaluation via the AzureOpenAIScoreModelGrader class, which serves as a wrapper around Azure Open AI score model configurations. This new grader object can be supplied to the main evaluate method as if it were a normal callable evaluator.
- Added new experimental risk categories ProtectedMaterial and CodeVulnerability for redteam agent scan.
Bugs Fixed
- Significant improvements to IntentResolution evaluator. New version has less variance, is nearly 2x faster and consumes fewer tokens.
- Fixed MeteorScoreEvaluator and other threshold-based evaluators returning incorrect binary results due to integer conversion of decimal scores. Previously, decimal scores like 0.9375 were incorrectly converted to integers (0) before threshold comparison, causing them to fail even when above the threshold. #41415
- Added a new enum ADVERSARIAL_QA_DOCUMENTS which moves all the "file_content" type prompts away from ADVERSARIAL_QA to the new enum.
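
The threshold fix for MeteorScoreEvaluator noted above (#41415) can be illustrated with a minimal sketch. This is not the library's actual code: the function names `passes_buggy` and `passes_fixed` and the threshold value 0.5 are hypothetical, chosen only to show how truncating a decimal score to an integer before comparison produces a wrong binary result.

```python
def passes_buggy(score: float, threshold: float) -> bool:
    # Hypothetical reproduction of the bug: int() truncates toward zero,
    # so a score of 0.9375 becomes 0 before the comparison.
    return int(score) >= threshold


def passes_fixed(score: float, threshold: float) -> bool:
    # Hypothetical fix: compare the decimal score directly.
    return float(score) >= threshold


score, threshold = 0.9375, 0.5  # threshold is an assumed example value
print(passes_buggy(score, threshold))  # False: 0 >= 0.5 fails
print(passes_fixed(score, threshold))  # True: 0.9375 >= 0.5 passes
```

Any score strictly between 0 and 1 is truncated to 0 by the integer conversion, so under the old behavior it would fail every positive threshold regardless of its actual value.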