- Added support for new `gpt-4-turbo-2024-04-09` and `gpt-4-turbo` models.
  - Benchmarked at 61.7% on the Exercism benchmark, comparable to `gpt-4-0613` and worse than the `gpt-4-preview-XXXX` models. See recent Exercism benchmark results.
  - Benchmarked at 34.1% on the refactoring/laziness benchmark, significantly worse than the `gpt-4-preview-XXXX` models. See recent refactor benchmark results.
  - Aider continues to default to `gpt-4-1106-preview`, as it performs best on both benchmarks and significantly better on the refactoring/laziness benchmark.
- Benchmarked at 61.7% on Exercism benchmark, comparable to