This release marks Koog's transition toward a stable 1.0 API. The library is now split into "stable" and "beta" modules, so production code can pin to APIs that won't break unexpectedly while experimental features continue to evolve. Alongside that, this release lands a redesigned Java interop layer, decouples HTTP transport from Ktor, brings OpenTelemetry to Kotlin Multiplatform, and adds Anthropic prompt caching.
Major Features
Stable / Beta module split
- Versioning by stability: Modules now ship under two streams — stable (`1.0.0-preview`) and beta (`1.0.0-preview-beta`) — so production code can pin to APIs that won't break without a deprecation cycle (#2011, #2000).
Java interop, redesigned
- Uniform blocking API: All Java-facing entry points now follow one pattern — `xxxBlocking` in Kotlin, plain `xxx` from Java. Explicit `ExecutorService`/`Executor` parameters are gone; the agent's configured dispatcher is used instead (#2005).
- Deadlock-free reentrant calls: Kotlin → Java → Kotlin call chains on a single-threaded executor no longer deadlock — the reentrant dispatch is detected and skipped (#1945, KG-750).
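The naming convention can be sketched in plain Kotlin. This is a hypothetical illustration, not a Koog API: `Greeter` and its methods are invented here to show how a suspend function and a blocking wrapper can share one JVM-visible name via `@JvmName`.

```kotlin
import kotlinx.coroutines.runBlocking

// Hypothetical sketch of the xxxBlocking convention; Greeter is not a Koog type.
class Greeter {
    suspend fun greet(name: String): String = "Hello, $name"

    // Kotlin callers use greetBlocking(); Java callers see plain greet(),
    // because @JvmName renames it on the JVM. There is no clash with the
    // suspend version, which compiles to a different JVM signature
    // (an extra Continuation parameter).
    @JvmName("greet")
    fun greetBlocking(name: String): String = runBlocking { greet(name) }
}
```

From Java, the call site is simply `new Greeter().greet("Koog")`, which is why most Java sources need no changes after the rename.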
HTTP transport, decoupled from Ktor
- Pluggable HTTP factory: LLM clients now take a `KoogHttpClient.Factory` instead of a Ktor `HttpClient`. A Ktor-backed default is auto-discovered on JVM/Android; users can plug in Java's HTTP client, OkHttp, or Spring's `RestClient` without touching Koog internals (#2006, #1948, KG-821, KG-818).
- Ollama on `KoogHttpClient`: Ollama now routes streaming, headers, and endpoint config through the same abstraction as every other provider (#1993, KG-833).
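The decoupling pattern can be sketched as follows. All names here (`HttpTransport`, `HttpTransportFactory`, `LlmClient`) are invented stand-ins, not Koog's actual `KoogHttpClient` interface; the point is only the shape of the abstraction.

```kotlin
// Hypothetical sketch of the decoupling pattern; these types are stand-ins,
// not Koog's real KoogHttpClient API.
interface HttpTransport {
    fun post(path: String, body: String): String
}

fun interface HttpTransportFactory {
    fun create(baseUrl: String): HttpTransport
}

// The client depends only on the factory abstraction, so any backend
// (Ktor, OkHttp, java.net.http) can be supplied without touching the client.
class LlmClient(baseUrl: String, factory: HttpTransportFactory) {
    private val transport = factory.create(baseUrl)
    fun complete(prompt: String): String = transport.post("/v1/complete", prompt)
}

// A trivial echo transport standing in for a real HTTP library.
val echoFactory = HttpTransportFactory { base ->
    object : HttpTransport {
        override fun post(path: String, body: String) = "$base$path:$body"
    }
}
```

Because the factory is a SAM interface in this sketch, swapping backends is a one-argument change at the client's construction site.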
OpenTelemetry on every target
- Multiplatform OpenTelemetry: Langfuse, Weave, and DataDog now run on every Koog target via a Ktor-based OTLP/JSON exporter, not just JVM (#1942, KG-785).
- Built-in metrics: Agents emit the standard `gen_ai.client.token.usage` and `gen_ai.client.operation.duration` metrics plus a custom `gen_ai.client.tool.count` metric — plug straight into existing Prometheus/Grafana stacks (#1381, KG-136).
Anthropic prompt caching
- Automatic and explicit cache control: End-to-end caching support — automatic on requests, explicit breakpoints on messages, cache tokens in usage metrics. Cuts cost and latency for agents that re-send long system prompts (#1812, KG-707).
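A minimal sketch of the explicit-breakpoint idea, assuming a simplified message model: the field names here (`cacheControl`, `type = "ephemeral"`) echo Anthropic's wire format but are not Koog's actual message-builder API.

```kotlin
// Hypothetical sketch; not Koog's prompt DSL. CacheControl/Message are
// simplified stand-ins mirroring the Anthropic cache_control concept.
data class CacheControl(val type: String = "ephemeral")
data class Message(
    val role: String,
    val content: String,
    val cacheControl: CacheControl? = null,
)

// Mark the long, stable system prompt as a cache breakpoint: everything up
// to and including that message is eligible for provider-side caching, so
// only the short user turn is reprocessed on repeat requests.
fun buildPrompt(systemPrompt: String, userInput: String): List<Message> = listOf(
    Message("system", systemPrompt, cacheControl = CacheControl()),
    Message("user", userInput),
)
```

The savings come from placing the breakpoint after the largest stable prefix; content after the breakpoint changes freely without invalidating the cache.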
Memory and persistence
- `AIAgentStorage` in checkpoints: Custom key-value data is now saved and restored alongside agent checkpoints; a new `runFromCheckpoint` API restores execution without requiring the Persistence feature (#1998, #1828).
- Persistence for planner agents: Planner-based agents now support checkpoint/restore (#1786, KG-673).
- Amazon Bedrock AgentCore as `LongTermMemory`: Managed vector-memory backend on Bedrock (#1855, KG-603).
- `LongTermMemory` reliability: Storage errors are no longer silently swallowed — a new `FailurePolicy`, plus a fix for double-ingestion during active sessions. Feature promoted from experimental (#1963).
New providers
- LiteRT LLM client: New client for running Google's LiteRT models locally (#1980).
- Oracle Database `ChatHistoryProvider` for Oracle-standardized deployments (#1851, KG-772).
Improvements
- New models: Anthropic Opus 4.7 (#1861), OpenAI GPT-5.5 and GPT-5.5 Pro (#1913), DeepSeek V4 Flash and Pro (#1914), additional Bedrock models — Kimi K 2.5, MiniMax 2.5, Gemma 3, GPT OSS (#1902), and Ollama gpt-oss / qwen3.5 (#1292).
- `ToolCallMetadata` side channel: Tools can now receive per-call context (trace IDs, correlation IDs, feature flags, the live `AIAgentContext`) without polluting their LLM-visible argument schema. Features can contribute metadata via `AIAgentPipeline.provideToolCallMetadata`, with caller-supplied values winning on key collision (#1886, #1777).
- Planners moved to a dedicated module: `GOAPPlanner` and `SimpleLLMPlanner` now live in a separate `agents:agents-planners` module, and a simpler `AIAgentPlannerStrategy.create(name, planner)` factory replaces the old builder. Agents that don't use planning no longer pay the dependency cost (#1997).
- MCP SDK upgrade with Streamable HTTP: MCP kotlin-sdk upgraded from 0.8.1 to 0.11.1; Streamable HTTP is now the primary transport for both MCP client and server (#1870, KG-792, KG-756, KG-755, KG-49).
- `RetrieveFactsFromHistory` extracted from AgentMemory: This `HistoryCompressionStrategy` now lives outside the AgentMemory feature so it can be used independently. The old `AgentMemory` feature is removed in favor of the more capable `LongTermMemory` (#1927).
- OpenTelemetry GenAI semantic conventions update: Aligned with the latest spec — content is carried via `gen_ai.input.messages`/`gen_ai.output.messages` attributes instead of deprecated per-message events; moderation results moved to a Koog custom attribute `koog.moderation.result` (#1967, KG-826).
- `KoogClock` migration: Internal time APIs now use a `KoogClock` abstraction instead of `kotlin.time.Clock`, enabling virtual-time testing and consistent clock behavior across platforms (#1925).
- Ollama `think` parameter from prompt params: The `think` flag is now sourced from `prompt.params` instead of being hard-coded, so callers can control reasoning behavior per prompt (#1615, #1877, KG-736).
- `SearchRequest` interface in `LongTermMemory`: Replaces the concrete `SimilaritySearchRequest` so storages can implement keyword, hybrid, or other search types (#1864).
- Minimum Java version raised to 17: Aligns the runtime requirement with documentation and modern toolchain expectations (#1931).
- Factory functions replace `invoke` constructors: `AIAgent`, `AIAgentService`, `ToolRegistry`, `RollbackToolRegistry`, and `AIAgentPlannerStrategy` now use top-level factory functions instead of companion-object `invoke` operators. Usage syntax (`A(...)`) is unchanged for normal callers; only unusual forms like `A.Companion.invoke()` are affected (#1882).
- Agent pipeline cleanup: Pipeline event contexts now expose the `AIAgent` instance directly instead of separate `agentId`/`config` fields, parameter order is harmonized, and KDoc style is unified across pipeline interfaces (#1991, KG-807).
- Locks and exception utilities consolidated: Duplicate `RWLock` code moved into a dedicated `agents-utils` module (#1893, KG-812).
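The metadata side channel described above can be sketched as follows. Only the shape mirrors Koog's `ToolCallMetadata` idea; the names and signatures here (`CallMetadata`, `searchTool`) are invented for illustration.

```kotlin
// Hypothetical sketch of a per-call metadata side channel; not Koog's API.
class CallMetadata(private val entries: Map<String, String>) {
    operator fun get(key: String): String? = entries[key]
}

// The tool's LLM-visible schema is just `query`; the trace ID arrives out
// of band through the metadata object, so it never appears in the schema
// the model sees or in the arguments the model must produce.
fun searchTool(query: String, metadata: CallMetadata): String {
    val traceId = metadata["traceId"] ?: "no-trace"
    return "results for '$query' [trace=$traceId]"
}
```

Keeping observability fields out of the argument schema also avoids the model hallucinating or mutating them, since it never sees those keys at all.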
Bug Fixes
- Spring Boot: Anthropic API key masked in autoconfiguration logs: Previously the key was emitted in plaintext during application startup. Security fix (#1965).
- Ollama streaming: "Flow invariant is violated" error: `buildStreamFrameFlow` now uses `channelFlow` so emission works across the dispatched contexts Ktor's streaming HTTP introduces. Most visibly fixes Ollama streaming (#1844, #1775).
- Ollama: `text/plain` responses parsed as JSON: Ollama sometimes returns valid JSON with `Content-Type: text/plain`. The client now registers JSON decoding for that content type too (#1887, #1237).
- Ollama: tool calls returned before assistant text: When the model emits both a tool call and a text message, the tool call now comes first — matching OpenAI behavior — so built-in strategies don't terminate prematurely (#1888, KG-811).
- Ollama batch embeddings: Implementation now handles both current and legacy Ollama API response formats (#1885, #1874).
- Ollama embeddings implementation: Aligned with the official Ollama embeddings API (#1854).
- OpenAI: `additional_properties` no longer leaks into requests: `AdditionalPropertiesFlatteningSerializer` now recognizes both camelCase and snake_case forms, so the `additionalProperties` map is correctly stripped under `JsonNamingStrategy.SnakeCase` and no longer trips OpenAI's 400 `unknown_parameter` error (#1884, #1878).
- OpenAI: response decoding exceptions wrapped: `AbstractOpenAILLMClient` now wraps decode failures in `LLMClientException` instead of letting arbitrary exception types escape (#2012, #1978).
- OpenAI / Google: keepalive and reasoning handling in streaming responses: OpenAI streaming now honors keepalive events; Google streaming now correctly distinguishes reasoning content from plain text (#1868, #1865, #1866).
- OpenRouter: streaming with tool calls no longer errors (#1369, KG-626).
- Streaming: blank tool call IDs processed correctly (#1915, #1900).
- Streaming: empty text complete frames filtered out (#1924).
- Reflective tool failures preserve the original exception message: Previously the `InvocationTargetException` wrapper hid the real error so agents received "Unknown error" — now the underlying cause is surfaced (#1548, KG-704).
- `@Tool(customName = ...)` honored in `ToolSet.asTools()`: Custom tool names declared via the `@Tool` annotation are now respected when registering via `tools(ToolSet)` (#1883, #1881).
- `AIAgentError` carries a `type` parameter: Adds the exception type to the error data class so downstream consumers can branch on it (#1917, KG-814).
- Tool/agent event contexts carry the original `Throwable`: Pipeline failure hooks (`onToolValidationFailed`, `onToolCallFailed`, etc.) now receive a real `Throwable` instead of a stringified `AIAgentError`, preserving full exception details and fixing a latent `error.type` mislabel in OpenTelemetry spans (#1918, KG-815).
- OpenTelemetry: failed LLM requests no longer crash the feature: Failures are signalled via span ERROR status and the `error.type` attribute instead of the non-spec `finish_reasons=[error]` (#1435, KG-675).
- OpenTelemetry: span adapter hooks run on fully populated spans: `SpanAdapter.onBeforeSpanFinished` now fires after all attributes are set, so Langfuse and Weave adapters see the same data the SDK exports (#1969, KG-808).
- OpenTelemetry: Langfuse trace attributes set on every span: Previously only `invoke_agent` carried trace-level attributes, so settings like `langfuse.environment` were ignored on most spans (#1547, KG-703).
- OpenTelemetry: Java API for the feature works correctly (#1992, KG-835).
- OpenTelemetry: configurable shutdown hook: Removed the hardcoded JVM shutdown hook that caused `IllegalStateException` during graceful drain windows. A new `setShutdownOnAgentClose` opt-in (default `false`) replaces the previous always-on behavior (#1856, #1850).
- `subgraphWithTask`/`subtask`: missing tool results when the finish tool is called alongside other tools: When the model requests other tools together with the finish tool, results from those tools are now appended to the prompt so the model doesn't see orphan tool calls (#1971).
- `RetryingLLMClient` JSON schema generators: The wrapper no longer drops the underlying client's `JsonSchemaGenerator` implementation on retry (#1781).
- Tool raw result preserved: Added `resultObject` to `ReceivedToolResult` so tools producing structured intermediate results expose them to downstream code (#2004).
- `withPrompt` uses a write lock: Previously used a read lock and could race with concurrent prompt mutations (#1871).
- LiteRT iOS stub + `FactRetrieval` varargs constructor restored (#2008).
- Gemini 2.0 Flash and Flash-Lite advertise `fullCapabilities`: These models support structured output and now declare it (#1191).
- Llama 3 model IDs on OpenRouter: Corrected the provider prefix from `meta` to `meta-llama` so the models actually resolve (#1346).
- `LLAMA4_SCOUT` model definition: Fixed `LLAMA4_SCOUT`, which was incorrectly pointing to base `LLAMA4` (#1155).
Breaking Changes
This is the 1.0 preview — breaking changes are intentional and grouped here so migration is straightforward.
- LLM client constructors: The `(apiKey, settings, baseClient: HttpClient, ...)` constructor is removed from all 8 HTTP-based LLM clients. Use the factory-based constructor or, on JVM, the convenience top-level function. The `baseClient: HttpClient` parameter is also removed from `PromptExecutor.builder().{openAI, anthropic, google, deepseek, mistral, ollama, openRouter, dashscope}(...)`; pass an optional `httpClientFactory: KoogHttpClient.Factory` instead, or omit it for the default. The deprecated `KoogHttpClient.Companion.fromKtorClient(... baseUrl ...)` overload is also removed (#2006).
- `prompt-executor-llms-all` consumers: Ktor types no longer leak onto the compile classpath transitively — add an explicit `http-client-ktor` dependency if you need `KtorKoogHttpClient`/`KtorKoogHttpClient.Factory` directly. The Java synthetic class `SimplePromptExecutorsKt` is renamed to `SimplePromptExecutors` (update `import static` lines) (#2006).
- Java blocking API rename: `javaNonSuspendRun` → `runBlocking` on `AIAgent`; `javaNonSuspendInitialize`/`javaNonSuspendOnMessage` → `initializeBlocking`/`onMessageBlocking` on `FeatureMessageProcessor`; all `createAgent*`, `removeAgent*`, `agentById` Java overloads on `AIAgentService` and all blocking overloads on `PromptExecutor`/`LLMClient` renamed to `*Blocking`; `NonSuspendAIAgentStrategy` → `AIAgentStrategyBlocking` (abstract `executeStrategy` → `executeBlocking`); `NonSuspendAIAgentFunctionalStrategy` → `AIAgentFunctionalStrategyBlocking`. Java callers see the original names via `@JvmName`, so most Java source code requires no changes (#2005).
- `ExecutorService`/`Executor` parameters removed from blocking wrappers: Each agent's own configured dispatcher is used instead. Also removed: `SubtaskBuilder.withExecutorService()` and the `executorService` property (#2005).
- Planners moved to a new module: GOAP and LLM-based planner usage now requires an explicit `agents:agents-planners` dependency. `AIAgentPlanner` and `JavaAIAgentPlanner` gain two abstract methods (`initializeState`, `provideOutput`); `AIAgentPlannerStrategy.builder()` and `AIAgentPlannerStrategy.goap()` are removed in favor of `AIAgentPlannerStrategy(name, planner)`/`AIAgentPlannerStrategy.create(name, planner)` (#1997).
- `AgentMemory` feature removed: Use `LongTermMemory` instead. `RetrieveFactsFromHistory` moved out of `agents-features-memory` (#1927).
- `LongTermMemory` API renames: `IngestionTiming` removed; `QueryExtractor` → `SearchQueryProvider`; `ExtractionStrategy` → `DocumentExtractor` (#1963).
- `AIAgentStorage` API changes: `AIAgentStorageKey` equality is now name-based rather than referential; the no-arg `AIAgentStorage()` constructor is replaced by `AIAgentStorage(serializer)`; `AIAgentStorageAPI.toMap()` removed (use `toSerializedMap()` or `putAll(other)`); `runFromCheckpoint`'s `agentInput` parameter renamed to `input` (#1998).
- OpenTelemetry Multiplatform migration: `addSpanExporter`, `addMetricExporter`, and `addMetricFilter` are now JVM-only extensions on `OpenTelemetryConfigJvm` — JVM users must update imports. The `addResourceAttributes` signature changes from `io.opentelemetry.api.common.Attributes` to `Map<String, Any>`. The `SpanEndStatus` wrapper is removed — use `StatusData` directly (#1942, KG-785).
- OpenTelemetry deprecated events removed: The `ai.koog.agents.features.opentelemetry.event` package and all event APIs on `GenAIAgentSpan` (`events`, `addEvent`, `addEvents`, `removeEvent`) are removed. The `gen_ai.system` attribute and `moderation.result` span event are no longer emitted (#1967, KG-826).
- `KoogClock` replaces `kotlin.time.Clock`: All APIs that previously took a `Clock` parameter are affected (#1925).
- `KoogHttpClient` implementations: Must now implement the new `lines()` method for non-SSE line streaming, and methods accept per-request `headers` parameters (#1993, KG-833).
- `OllamaClient.baseUrl` property removed — endpoint configuration is delegated to the supplied `KoogHttpClient` (#1993).
- Anthropic prompt caching: Removed deprecated `user` message builders to make room for the `cacheControl` variant (#1812).
- Companion `invoke` constructors removed: For `AIAgent`, `AIAgentService`, `ToolRegistry`, `RollbackToolRegistry`, and `AIAgentPlannerStrategy`. Normal `A(...)` syntax is unchanged; only unusual forms like `A.Companion.invoke()` or `A.invoke()` no longer compile (#1882).
- Minimum Java version: 17 (#1931).
- `AgentCheckpointData` shape: Fields `nodePath`, `lastInput`, and `lastOutput` are removed — they now live in the existing `properties: JSONObject` (#1786).
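The companion-`invoke` removal can be illustrated with a generic Kotlin sketch; `Agent` here is a hypothetical class, not Koog's `AIAgent`, and the parameters are invented.

```kotlin
// Hypothetical sketch of the factory-function pattern; not a Koog class.
// A real library might restrict the constructor; `internal` stands in for that.
class Agent internal constructor(val id: String, val temperature: Double)

// Top-level factory function: call sites still read Agent("assistant"),
// but there is no companion-object invoke operator, so unusual forms like
// Agent.Companion.invoke("assistant") no longer compile.
fun Agent(id: String): Agent = Agent(id, temperature = 0.7)
```

Overload resolution stays unambiguous because the factory and the constructor have different parameter lists; ordinary `Agent(...)` call sites are therefore unaffected by the migration.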
Deprecations
- `AIAgentConfig` JVM constructors / methods taking `ExecutorService` — use the more general `Executor` variants instead (#1945).
- `startSseMcpServer(factory, port, host, tools)` and `startSseMcpServer(factory, host, tools)` — use `startMcpServer(factory, tools, port, host)`/`startMcpServer(factory, tools, host)` (#1870).