v1.2.2.dev0 — More reliable LLM requests

Release Date: 2026-06-25
Changes: v1.2.2.dev0 → increase-llm-retry

Summary

This release makes LLM (large language model) requests more reliable by increasing and improving retry behavior for transient failures. You’ll see fewer failed operations caused by temporary network, rate-limit, or service hiccups, and clearer diagnostic messages when retries occur.

Highlights

New, smarter retry policy for LLM requests: transient failures (such as rate limits or short network outages) are retried automatically instead of failing the operation.
Configurable retry settings so teams can tune retry attempts and backoff behavior for their environment.
Improved logging and error messages to make it easier to understand when retries happen and why.
No breaking changes — the new behavior is backward-compatible and works out-of-the-box.

Breaking Changes

None. The new retry behavior is backward-compatible and enabled by default. Existing integrations should continue to work without modification.

New Features

Configurable LLM retry policy: a new setting controls how many times, and with what backoff, the system retries requests to external language models. Requests that fail for transient reasons (for example, temporary network errors or rate limits) are retried with exponential backoff, so fewer operations fail because of short-lived problems and workflows complete without manual re-runs.
Per-request retry override: individual calls can now request more or fewer retries than the global default. This lets you retry more aggressively for critical operations, or more conservatively to keep latency down on low-value calls.
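
As a rough illustration of the two features above, the sketch below shows one way an exponential-backoff policy with a per-call override could be structured. It is not Cognee's implementation: `RetryPolicy`, `call_with_retry`, `TransientLLMError`, and every default value are hypothetical names and numbers chosen for the example.

```python
import random
import time
from dataclasses import dataclass
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class TransientLLMError(Exception):
    """Hypothetical stand-in for a retriable failure (rate limit, timeout)."""


@dataclass(frozen=True)
class RetryPolicy:
    # All field names and defaults are illustrative, not Cognee's settings.
    max_attempts: int = 5        # total tries, including the first one
    base_delay: float = 0.5      # seconds to wait after the first failure
    backoff_factor: float = 2.0  # multiplier applied after each failure
    max_delay: float = 30.0      # upper bound on any single wait

    def delay_for(self, failed_attempts: int) -> float:
        """Wait time after `failed_attempts` consecutive failures (>= 1)."""
        raw = self.base_delay * self.backoff_factor ** (failed_attempts - 1)
        return min(raw, self.max_delay)


DEFAULT_POLICY = RetryPolicy()


def call_with_retry(
    request: Callable[[], T],
    policy: Optional[RetryPolicy] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `request`, retrying transient failures with exponential backoff.

    Passing `policy` overrides the global default for this one call.
    """
    active = policy if policy is not None else DEFAULT_POLICY
    for attempt in range(1, active.max_attempts + 1):
        try:
            return request()
        except TransientLLMError:
            if attempt == active.max_attempts:
                raise  # retries exhausted: surface the last error
            delay = active.delay_for(attempt)
            # Up to 10% jitter spreads out retries from many clients.
            sleep(delay + random.uniform(0, delay * 0.1))
    raise ValueError("max_attempts must be at least 1")
```

Under these assumptions, a critical call could pass `policy=RetryPolicy(max_attempts=8)`, while a latency-sensitive one could pass `policy=RetryPolicy(max_attempts=2)`; calls that pass nothing use `DEFAULT_POLICY`.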

Improvements

Smarter detection of retriable errors: the retry logic better recognizes which failures are worth retrying (for example, transient network errors and rate-limit responses) and does not retry permanent errors. Success rates improve without unnecessary retries and delays.
Better diagnostic logging: when a request is retried, the logs state the cause and the retry attempts more clearly, which shortens the time spent diagnosing intermittent LLM-related failures.
Default behavior tuned for reliability: the out-of-the-box retry settings are more resilient, so most users benefit without any configuration and see fewer surprise failures.
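
To make the first two improvements concrete, here is a minimal sketch of how a client might separate transient from permanent failures and log each retry. The status-code set, the `llm.retry` logger name, and the functions `is_retriable` and `log_retry` are assumptions for illustration; the release notes do not specify which errors Cognee treats as retriable or what its log lines look like.

```python
import logging
from typing import Optional

logger = logging.getLogger("llm.retry")  # hypothetical logger name

# Status codes commonly treated as transient: request timeout, rate limit,
# and server-side failures. Illustrative, not Cognee's actual list.
RETRIABLE_STATUS = frozenset({408, 429, 500, 502, 503, 504})


def is_retriable(error: Exception, status_code: Optional[int] = None) -> bool:
    """Return True when retrying the request could plausibly succeed."""
    if status_code is not None:
        return status_code in RETRIABLE_STATUS
    # Without an HTTP status, fall back to the exception type.
    return isinstance(error, (TimeoutError, ConnectionError))


def log_retry(error: Exception, attempt: int, max_attempts: int, delay: float) -> None:
    """Emit one warning per retry with the cause and the retry budget."""
    logger.warning(
        "LLM request attempt %d of %d failed (%s: %s); retrying in %.1fs",
        attempt, max_attempts, type(error).__name__, error, delay,
    )
```

With this classification, a 429 rate-limit response or a dropped connection would be retried, while a 401 authentication error or a `ValueError` from a malformed request would fail immediately.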

Performance

Higher overall success rate for operations that depend on LLMs: fewer tasks fail because of transient LLM errors, which raises end-to-end throughput and reduces manual reruns and interruptions.
Slightly higher latency for individual calls that need retries: a call that hits a transient error takes longer because it is retried, but backoff delays are added only in that case. This is a small trade-off for a higher overall success rate and better stability.
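
The size of that latency trade-off depends on the configured policy. The sketch below adds up the waits for a call that fails on every attempt, using hypothetical settings (5 attempts, 0.5 s base delay, factor 2, 30 s cap) that are not taken from this release; it ignores jitter and the duration of the requests themselves.

```python
def worst_case_backoff(
    max_attempts: int, base_delay: float, factor: float, max_delay: float
) -> float:
    """Total seconds spent waiting if every attempt fails.

    There is one wait between consecutive attempts, so a policy with
    `max_attempts` tries sleeps `max_attempts - 1` times.
    """
    waits = [
        min(base_delay * factor ** i, max_delay)
        for i in range(max_attempts - 1)
    ]
    return sum(waits)


# Waits of 0.5 + 1.0 + 2.0 + 4.0 seconds
print(worst_case_backoff(5, 0.5, 2.0, 30.0))  # prints 7.5
```

A call that succeeds on its first attempt pays none of this cost, and one that succeeds on its second attempt pays only the first wait.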

Security

No security changes in this release. The update focuses on operational reliability and diagnostics.

Bug Fixes

Fixed intermittent failures where short-lived LLM service issues caused operations to abort immediately. Operations that previously failed without a retry on a transient error now go through the retry policy, which prevents avoidable task failures and manual reprocessing.

Technical Changes

Refactored LLM client retry code: the internal client code was reorganized to centralize retry and backoff logic, which simplifies maintenance and makes future tuning and bug fixes safer.
Added unit and integration tests around retry scenarios: broader test coverage for transient error handling reduces the chance of regressions and increases confidence in release stability.

Compatibility

Component	Supported / Required
Python	`>=3.10,<3.15`
pydantic	`>=2.10.5`
litellm	`>=1.83.7`
fastapi	`>=0.116.2,<1.0.0`
sqlalchemy	`>=2.0.39,<3.0.0`
lancedb	`>=0.24.3,<1.0.0`
ladybug	`>=0.16.0,<0.18`
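
As a small convenience, the Python range in the table can be checked at startup. This is a sketch only: the function name `python_supported` is hypothetical, and the bounds are copied from the table row above.

```python
import sys
from typing import Tuple

MIN_PYTHON = (3, 10)  # inclusive lower bound from the table
MAX_PYTHON = (3, 15)  # exclusive upper bound from the table


def python_supported(version: Tuple[int, int]) -> bool:
    """Return True if (major, minor) satisfies >=3.10,<3.15."""
    return MIN_PYTHON <= version < MAX_PYTHON


if not python_supported(sys.version_info[:2]):
    print("Warning: this Python version is outside the supported range")
```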

— The Cognee Team · 2026-06-25

topoteretes/cognee v1.2.2.dev0 on GitHub