Better OpenAI prompt caching + fixed token reporting
This release has two changes for OpenAI:
Better prompt caching
This release adds proper support for the OpenAI Responses API's store: true and previous_response_id fields. With these fields, you can ask OpenAI to store the conversation server-side as it happens, and continue from it by passing the previous_response_id in the next request. This means a faster agent and lower token usage.
You can read more about it here:
- https://developers.openai.com/cookbook/examples/prompt_caching_201
- https://developers.openai.com/api/docs/guides/migrate-to-responses
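To make the flow concrete, here's a minimal sketch of what the request bodies look like. The store and previous_response_id field names come from the Responses API; the model name, prompt strings, and the newResponsesBody helper are illustrative, not part of this release.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// newResponsesBody builds a JSON request body for the OpenAI Responses API.
// store: true asks OpenAI to persist the response server-side; on follow-up
// turns, previous_response_id continues the stored conversation instead of
// resending the full history.
func newResponsesBody(input, prevID string) map[string]any {
	body := map[string]any{
		"model": "gpt-4o", // placeholder model name
		"input": input,
		"store": true, // persist this turn server-side
	}
	if prevID != "" {
		// Continue from the stored conversation.
		body["previous_response_id"] = prevID
	}
	return body
}

func main() {
	// First turn: nothing to continue from yet.
	first, _ := json.Marshal(newResponsesBody("Hello!", ""))
	fmt.Println(string(first))

	// Later turns: pass the response.id returned by the previous call.
	next, _ := json.Marshal(newResponsesBody("And now?", "resp_123"))
	fmt.Println(string(next))
}
```

Because the prior turns live on OpenAI's side, each follow-up request carries only the new input plus an ID, which is where the speed and token savings come from.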
Fixed OpenAI token reporting
We fixed the usage.InputTokens field, which was higher than expected: cached tokens were being double counted. From this release on, the cached tokens are subtracted from that value, so the reported number is correct.
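The fix itself is simple arithmetic; a sketch of the idea (the function name and sample numbers here are illustrative, not the actual code from the release):

```go
package main

import "fmt"

// correctedInputTokens subtracts cached tokens from the raw input token
// count so cached tokens are not counted twice. In the OpenAI usage object,
// the cached count is reported separately from the total input tokens.
func correctedInputTokens(inputTokens, cachedTokens int64) int64 {
	return inputTokens - cachedTokens
}

func main() {
	// e.g. 1200 input tokens reported, 800 of them served from the cache
	fmt.Println(correctedInputTokens(1200, 800)) // prints 400
}
```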
Keep fantasizing 👻
Charm
Changelog
New!
- 0c8663f: feat(openai): add Responses API store, previous_response_id, and response.id support (#175) (@ibetitsmike)
Fixed
- 22c3e9a: fix(openai): subtract cached tokens from input tokens to avoid double counting (#176) (@andreynering)
Thoughts? Questions? We love hearing from you. Feel free to reach out on X, Discord, Slack, the Fediverse, or Bluesky.