1.13.0 (2025-10-30)
Features Added
- Updated IndirectAttack risk category for RedTeam to IndirectJailbreak to better reflect its purpose. This change allows users to apply cross-domain prompt injection (XPIA) attack strategies across all risk categories, enabling more comprehensive security testing of AI systems against indirect prompt injection attacks during red teaming.
- Added TaskAdherence, SensitiveDataLeakage, and ProhibitedActions as cloud-only agent safety risk categories for red teaming.
- Updated all evaluators' output to be of the following schema:
  - gpt_{evaluator_name}, {evaluator_name}: float score
  - {evaluator_name}_result: pass/fail based on threshold
  - {evaluator_name}_reason, {evaluator_name}_threshold
  - {evaluator_name}_prompt_tokens, {evaluator_name}_completion_tokens, {evaluator_name}_total_tokens, {evaluator_name}_finish_reason
  - {evaluator_name}_model: model used for evaluation
  - {evaluator_name}_sample_input, {evaluator_name}_sample_output: input and output used for evaluation
This change standardizes the output format across all evaluators and follows OpenTelemetry (OTel) conventions.
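As a rough illustration of the standardized schema, the sketch below builds the key names listed above for one evaluator and checks an output row against them. It does not call the SDK. The helper names (schema_keys, missing_keys), the evaluator name "coherence", and every value in sample_row are assumptions made for illustration only; the pass/fail strings and value types in particular are guesses, not documented behavior.

```python
from typing import Any, Dict, List

# Suffixes copied from the schema listed in this entry.
# The empty suffix stands for the bare {evaluator_name} score key.
SCHEMA_SUFFIXES = (
    "",
    "_result",
    "_reason",
    "_threshold",
    "_prompt_tokens",
    "_completion_tokens",
    "_total_tokens",
    "_finish_reason",
    "_model",
    "_sample_input",
    "_sample_output",
)


def schema_keys(evaluator_name: str) -> List[str]:
    """Return every key the schema lists for one evaluator (hypothetical helper)."""
    return [f"gpt_{evaluator_name}"] + [
        f"{evaluator_name}{suffix}" for suffix in SCHEMA_SUFFIXES
    ]


def missing_keys(row: Dict[str, Any], evaluator_name: str) -> List[str]:
    """List the schema keys that are absent from an output row (hypothetical helper)."""
    return [key for key in schema_keys(evaluator_name) if key not in row]


# Hypothetical output row for an evaluator named "coherence".
# All values are invented for this example and are not real SDK output.
sample_row: Dict[str, Any] = {
    "gpt_coherence": 4.0,
    "coherence": 4.0,
    "coherence_result": "pass",
    "coherence_reason": "The answer stays on topic.",
    "coherence_threshold": 3,
    "coherence_prompt_tokens": 120,
    "coherence_completion_tokens": 30,
    "coherence_total_tokens": 150,
    "coherence_finish_reason": "stop",
    "coherence_model": "example-model",
    "coherence_sample_input": "example input",
    "coherence_sample_output": "example output",
}

# Report any schema keys the row does not contain.
print(missing_keys(sample_row, "coherence"))
```

A check like this can help when migrating code that reads evaluator output, since every evaluator is described as sharing the same key pattern.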
Bugs Fixed
- image_tag parameter in AzureOpenAIPythonGrader is now optional.