## 1.2.0 (2025-01-27)

### Features Added
- CSV files are now supported as data file inputs with the `evaluate()` API. The CSV file should have a header row with column names that match the `data` and `target` fields in the `evaluate()` method, and the filename should be passed as the `data` parameter. The column name `Conversation` in a CSV file is not fully supported yet.
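As an illustration, here is a minimal sketch of evaluating a CSV file; the filename `data.csv`, its column names, and the choice of `F1ScoreEvaluator` are assumptions for the example, not part of this release.

```python
from azure.ai.evaluation import F1ScoreEvaluator, evaluate

# data.csv (hypothetical) has a header row whose column names match the
# evaluator's inputs, e.g.:
#   response,ground_truth
#   "Paris","Paris"

result = evaluate(
    data="data.csv",  # the CSV filename goes in the `data` parameter
    evaluators={"f1_score": F1ScoreEvaluator()},
)
print(result["metrics"])
```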
### Breaking Changes
- `ViolenceMultimodalEvaluator`, `SexualMultimodalEvaluator`, `SelfHarmMultimodalEvaluator`, `HateUnfairnessMultimodalEvaluator`, and `ProtectedMaterialMultimodalEvaluator` will be removed in the next release.
### Bugs Fixed
- Removed the `[remote]` extra. This is no longer needed when tracking results in Azure AI Studio.
- Fixed `AttributeError: 'NoneType' object has no attribute 'get'` while running the simulator with 1000+ results.
- Fixed the non-adversarial simulator to run in task-free mode.
- Content safety evaluators (violence, self-harm, sexual, hate/unfairness) now return the maximum result as the main score when aggregating per-turn evaluations from a conversation into an overall evaluation score. Other conversation-capable evaluators still default to the mean for aggregation (see the sketch after this list).
- Fixed a bug in the non-adversarial simulator sample where `tasks` was undefined.
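A plain-Python sketch of the two aggregation behaviors described above; the per-turn scores are made-up illustration values, and the severity scale is an assumption, not something this changelog specifies.

```python
# Hypothetical per-turn scores for one multi-turn conversation.
violence_per_turn = [0, 5, 2]

# Content safety evaluators aggregate with max: the worst turn becomes
# the overall score.
overall_violence = max(violence_per_turn)  # -> 5

# Other conversation-capable evaluators still default to the mean.
relevance_per_turn = [4.0, 3.0, 5.0]
overall_relevance = sum(relevance_per_turn) / len(relevance_per_turn)  # -> 4.0
```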
### Other Changes
- Changed the minimum required Python version for this package from 3.8 to 3.9.
- Stopped depending on the local promptflow service. No promptflow service is started automatically when running an evaluation.
- Evaluators internally allow for custom aggregation. However, this causes serialization failures if evaluation runs while the environment variable `AI_EVALS_BATCH_USE_ASYNC` is set to false.
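A minimal sketch of avoiding that interaction; that the variable is read as the strings `"true"`/`"false"` is an assumption based on the wording above.

```python
import os

# Setting AI_EVALS_BATCH_USE_ASYNC to "false" can cause serialization
# failures for evaluators that use custom aggregation; keep it unset
# (the async default) or explicitly "true" for those evaluators.
os.environ["AI_EVALS_BATCH_USE_ASYNC"] = "true"
```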