We are excited to announce that TorchMetrics v0.7 is now publicly available. This is a significant release: it includes several new metrics (mainly for NLP), naming and import changes, general improvements to the API, and some other great features. TorchMetrics now has more than 60 metrics, and the package is more user-friendly than ever.
NLP metrics - Text package
The Text package has been part of TorchMetrics since v0.5. As language generation models grow more capable, the need for reliable evaluation metrics grows with them. With several added metrics and a unified API, TorchMetrics makes these metrics even easier to use. TorchMetrics v0.7 adds several machine translation metrics: chrF, chrF++, Translation Edit Rate, and Extended Edit Distance. It also adds Match Error Rate, Word Information Lost, Word Information Preserved, and the SQuAD evaluation metrics. Last but not least, the ROUGE score can now be evaluated against multiple references.
Argument unification
Importantly, all text metrics now take their inputs in the order preds, target, and accept them under these explicit keyword names. Any different argument naming used before v0.7 is deprecated and will be removed completely in v0.8.
Import and naming changes
TorchMetrics v0.7 brings both major and minor changes to how metrics should be imported. The import changes take effect directly in v0.7, meaning that you will most likely need to change the import statement for some specific metrics. All naming changes follow our standard deprecation process, meaning that in v0.7, any metric that is renamed will still work but will emit a warning asking you to use the new metric name. From v0.8, the old metric names will no longer be available.
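To illustrate the general idea of such a deprecation period, here is a minimal sketch using only the Python standard library. The classes are hypothetical stand-ins, not the TorchMetrics implementation: the old name stays usable for one release and tells the user to switch to the new one.

```python
import warnings


class WordErrorRate:
    """Stand-in for a metric under its new name (hypothetical, not TorchMetrics code)."""

    def __init__(self) -> None:
        self.errors = 0
        self.total = 0


class WER(WordErrorRate):
    """Old name kept for one release: it still works but warns on construction."""

    def __init__(self) -> None:
        warnings.warn(
            "`WER` was renamed to `WordErrorRate` in v0.7 and will be removed in v0.8.",
            DeprecationWarning,
            stacklevel=2,
        )
        super().__init__()


# Record warnings so the notice can be inspected instead of just printed.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    metric = WER()  # the old name still constructs a working object

print(isinstance(metric, WordErrorRate))
print(caught[0].category.__name__)
```

In practice, migrating means replacing the old name with the new one at the import site, after which the notice disappears.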
[0.7.0] - 2022-01-17
Added
- Added NLP metrics:
- Added MultiScaleSSIM into image metrics (#679)
- Added Signal to Distortion Ratio (SDR) to audio package (#565)
- Added MinMaxMetric to wrappers (#556)
- Added ignore_index to retrieval metrics (#676)
- Added support for multi references in ROUGEScore (#680)
- Added a default VSCode devcontainer configuration (#621)
Changed
- Scalar metrics will now consistently have additional dimensions squeezed (#622)
- Metrics having third party dependencies removed from global import (#463)
- BLEUScore now takes untokenized input, consistent with all the other text metrics (#640)
- Arguments reordered for TER, BLEUScore, SacreBLEUScore, CHRFScore; the expected input order is now predictions first and target second (#696)
- Changed dtype of metric state from torch.float to torch.long in ConfusionMatrix to accommodate larger values (#715)
- Unified preds, target input argument naming across all text metrics (#723, #727)
  - bert, bleu, chrf, sacre_bleu, wip, wil, cer, ter, wer, mer, rouge, squad
Deprecated
- Renamed IoU -> Jaccard Index (#662)
- Renamed text WER metric: (#714)
  - functional.wer -> functional.word_error_rate
  - WER -> WordErrorRate
- Renamed correlation coefficient classes: (#710)
  - MatthewsCorrcoef -> MatthewsCorrCoef
  - PearsonCorrcoef -> PearsonCorrCoef
  - SpearmanCorrcoef -> SpearmanCorrCoef
- Renamed audio STOI metric: (#753, #758)
  - audio.STOI -> audio.ShortTimeObjectiveIntelligibility
  - functional.audio.stoi -> functional.audio.short_time_objective_intelligibility
- Renamed audio PESQ metrics: (#751)
  - functional.audio.pesq -> functional.audio.perceptual_evaluation_speech_quality
  - audio.PESQ -> audio.PerceptualEvaluationSpeechQuality
- Renamed audio SDR metrics: (#711)
  - functional.sdr -> functional.signal_distortion_ratio
  - functional.si_sdr -> functional.scale_invariant_signal_distortion_ratio
  - SDR -> SignalDistortionRatio
  - SI_SDR -> ScaleInvariantSignalDistortionRatio
- Renamed audio SNR metrics: (#712)
  - functional.snr -> functional.signal_noise_ratio
  - functional.si_snr -> functional.scale_invariant_signal_noise_ratio
  - SNR -> SignalNoiseRatio
  - SI_SNR -> ScaleInvariantSignalNoiseRatio
- Renamed F-score metrics: (#731, #740)
  - functional.f1 -> functional.f1_score
  - F1 -> F1Score
  - functional.fbeta -> functional.fbeta_score
  - FBeta -> FBetaScore
- Renamed Hinge metric: (#734)
  - functional.hinge -> functional.hinge_loss
  - Hinge -> HingeLoss
- Renamed image PSNR metrics: (#732)
  - functional.psnr -> functional.peak_signal_noise_ratio
  - PSNR -> PeakSignalNoiseRatio
- Renamed audio PIT metric: (#737)
  - functional.pit -> functional.permutation_invariant_training
  - PIT -> PermutationInvariantTraining
- Renamed image SSIM metric: (#747)
  - functional.ssim -> functional.structural_similarity_index_measure
  - SSIM -> StructuralSimilarityIndexMeasure
- Renamed detection MAP to MeanAveragePrecision metric (#754)
- Renamed Fidelity & LPIPS image metric: (#752)
  - image.FID -> image.FrechetInceptionDistance
  - image.KID -> image.KernelInceptionDistance
  - image.LPIPS -> image.LearnedPerceptualImagePatchSimilarity
Removed
- Removed embedding_similarity metric (#638)
- Removed argument concatenate_texts from wer metric (#638)
- Removed arguments newline_sep and decimal_places from rouge metric (#638)
Fixed
- Fixed MetricCollection kwargs filtering when no kwargs are present in update signature (#707)
Contributors
@ashutoshml, @Borda, @cuent, @Fariborzzz, @getgaurav2, @janhenriklambrechts, @justusschock, @karthikrangasai, @lucadiliello, @mahinlma, @mathemusician, @mona0809, @mrleu, @puhuk, @quancs, @SkafteNicki, @stancld, @twsl
If we forgot someone due to not matching commit email with GitHub account, let us know :]