What's New
Streaming for AgentChat agents and teams
- Introduce ModelClientStreamingChunkEvent for streaming model output and update handling in agents and console by @ekzhu in #5208
To enable streaming from an `AssistantAgent`, set `model_client_stream=True` when creating it. The token stream will be available when you run the agent directly, or as part of a team when you call `run_stream`.

If you want to see tokens streaming in your console application, you can use `Console` directly.
```python
import asyncio

from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.ui import Console
from autogen_ext.models.openai import OpenAIChatCompletionClient


async def main() -> None:
    agent = AssistantAgent("assistant", OpenAIChatCompletionClient(model="gpt-4o"), model_client_stream=True)
    await Console(agent.run_stream(task="Write a short story with a surprising ending."))


asyncio.run(main())
```
If you are handling the messages yourself and streaming to the frontend, you can handle the `autogen_agentchat.messages.ModelClientStreamingChunkEvent` messages.
```python
import asyncio

from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient


async def main() -> None:
    agent = AssistantAgent("assistant", OpenAIChatCompletionClient(model="gpt-4o"), model_client_stream=True)
    async for message in agent.run_stream(task="Write 3 line poem."):
        print(message)


asyncio.run(main())
```
```
source='user' models_usage=None content='Write 3 line poem.' type='TextMessage'
source='assistant' models_usage=None content='Silent' type='ModelClientStreamingChunkEvent'
source='assistant' models_usage=None content=' whispers' type='ModelClientStreamingChunkEvent'
source='assistant' models_usage=None content=' glide' type='ModelClientStreamingChunkEvent'
source='assistant' models_usage=None content=',' type='ModelClientStreamingChunkEvent'
source='assistant' models_usage=None content=' \n' type='ModelClientStreamingChunkEvent'
source='assistant' models_usage=None content='Moon' type='ModelClientStreamingChunkEvent'
source='assistant' models_usage=None content='lit' type='ModelClientStreamingChunkEvent'
source='assistant' models_usage=None content=' dreams' type='ModelClientStreamingChunkEvent'
source='assistant' models_usage=None content=' dance' type='ModelClientStreamingChunkEvent'
source='assistant' models_usage=None content=' through' type='ModelClientStreamingChunkEvent'
source='assistant' models_usage=None content=' the' type='ModelClientStreamingChunkEvent'
source='assistant' models_usage=None content=' night' type='ModelClientStreamingChunkEvent'
source='assistant' models_usage=None content=',' type='ModelClientStreamingChunkEvent'
source='assistant' models_usage=None content=' \n' type='ModelClientStreamingChunkEvent'
source='assistant' models_usage=None content='Stars' type='ModelClientStreamingChunkEvent'
source='assistant' models_usage=None content=' watch' type='ModelClientStreamingChunkEvent'
source='assistant' models_usage=None content=' from' type='ModelClientStreamingChunkEvent'
source='assistant' models_usage=None content=' above' type='ModelClientStreamingChunkEvent'
source='assistant' models_usage=None content='.' type='ModelClientStreamingChunkEvent'
source='assistant' models_usage=RequestUsage(prompt_tokens=0, completion_tokens=0) content='Silent whispers glide, \nMoonlit dreams dance through the night, \nStars watch from above.' type='TextMessage'
TaskResult(messages=[TextMessage(source='user', models_usage=None, content='Write 3 line poem.', type='TextMessage'), TextMessage(source='assistant', models_usage=RequestUsage(prompt_tokens=0, completion_tokens=0), content='Silent whispers glide, \nMoonlit dreams dance through the night, \nStars watch from above.', type='TextMessage')], stop_reason=None)
```
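When streaming to a frontend, you will typically want to separate the token chunks from the final `TaskResult`. A minimal sketch of that filtering (the task prompt here is just a placeholder):

```python
import asyncio

from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.base import TaskResult
from autogen_agentchat.messages import ModelClientStreamingChunkEvent
from autogen_ext.models.openai import OpenAIChatCompletionClient


async def main() -> None:
    agent = AssistantAgent("assistant", OpenAIChatCompletionClient(model="gpt-4o"), model_client_stream=True)
    async for message in agent.run_stream(task="Write 3 line poem."):
        if isinstance(message, ModelClientStreamingChunkEvent):
            # Forward each token chunk to the frontend as it arrives.
            print(message.content, end="", flush=True)
        elif isinstance(message, TaskResult):
            # The final TaskResult carries the full message history.
            print("\n[done]")


asyncio.run(main())
```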
Read more here: https://microsoft.github.io/autogen/stable/user-guide/agentchat-user-guide/tutorial/agents.html#streaming-tokens
Also, see the sample showing how to stream a team's messages to ChainLit frontend: https://github.com/microsoft/autogen/tree/python-v0.4.5/python/samples/agentchat_chainlit
R1-style reasoning output
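Model clients can now surface R1-style reasoning ("thinking") output separately from the final answer: when the model family is set to `ModelFamily.R1`, the reasoning content is returned in the `CreateResult.thought` field.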
```python
import asyncio

from autogen_core.models import UserMessage, ModelFamily
from autogen_ext.models.openai import OpenAIChatCompletionClient


async def main() -> None:
    model_client = OpenAIChatCompletionClient(
        model="deepseek-r1:1.5b",
        api_key="placeholder",
        base_url="http://localhost:11434/v1",
        model_info={
            "function_calling": False,
            "json_output": False,
            "vision": False,
            "family": ModelFamily.R1,
        },
    )

    # Test basic completion with the Ollama deepseek-r1:1.5b model.
    create_result = await model_client.create(
        messages=[
            UserMessage(
                content="Taking two balls from a bag of 10 green balls and 20 red balls, "
                "what is the probability of getting a green and a red ball?",
                source="user",
            ),
        ]
    )

    # CreateResult.thought field contains the thinking content.
    print(create_result.thought)
    print(create_result.content)


asyncio.run(main())
```
Streaming is also supported with R1-style reasoning output.
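For example, a minimal sketch (assuming the same local Ollama setup as above): `create_stream` yields string chunks as they arrive, followed by a final `CreateResult` whose `thought` field carries the reasoning.

```python
import asyncio

from autogen_core.models import CreateResult, ModelFamily, UserMessage
from autogen_ext.models.openai import OpenAIChatCompletionClient


async def main() -> None:
    model_client = OpenAIChatCompletionClient(
        model="deepseek-r1:1.5b",
        api_key="placeholder",
        base_url="http://localhost:11434/v1",
        model_info={
            "function_calling": False,
            "json_output": False,
            "vision": False,
            "family": ModelFamily.R1,
        },
    )
    async for chunk in model_client.create_stream(
        messages=[UserMessage(content="What is 1 + 1?", source="user")]
    ):
        if isinstance(chunk, CreateResult):
            # The final result separates the reasoning from the answer.
            print("\nThought:", chunk.thought)
            print("Answer:", chunk.content)
        else:
            # Intermediate string chunks as they are produced.
            print(chunk, end="", flush=True)


asyncio.run(main())
```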
See the sample showing R1 playing chess: https://github.com/microsoft/autogen/tree/python-v0.4.5/python/samples/agentchat_chess_game
FunctionTool for partial functions
- FunctionTool partial support by @nour-bouzid in #5183
Now you can define function tools from partial functions, where some parameters have been set beforehand.
```python
import json
from functools import partial

from autogen_core.tools import FunctionTool


def get_weather(country: str, city: str) -> str:
    return f"The temperature in {city}, {country} is 75°"


partial_function = partial(get_weather, "Germany")
tool = FunctionTool(partial_function, description="Partial function tool.")
print(json.dumps(tool.schema, indent=2))
```
```json
{
  "name": "get_weather",
  "description": "Partial function tool.",
  "parameters": {
    "type": "object",
    "properties": {
      "city": {
        "description": "city",
        "title": "City",
        "type": "string"
      }
    },
    "required": [
      "city"
    ]
  }
}
```
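The resulting tool can then be invoked with just the remaining parameters. A minimal sketch, continuing from the example above (`"Berlin"` is an arbitrary placeholder):

```python
import asyncio

from autogen_core import CancellationToken

# Only the unbound "city" parameter is supplied at call time;
# "country" was already fixed to "Germany" by the partial function.
result = asyncio.run(tool.run_json({"city": "Berlin"}, CancellationToken()))
print(tool.return_value_as_string(result))  # The temperature in Berlin, Germany is 75°
```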
CodeExecutorAgent update
New Samples
- Streamlit + AgentChat sample by @husseinkorly in #5306
- ChainLit + AgentChat sample with streaming by @ekzhu in #5304
- Chess sample showing R1-Style reasoning for planning and strategizing by @ekzhu in #5285
Documentation update
- Add Semantic Kernel Adapter documentation and usage examples in user guides by @ekzhu in #5256
- Update human-in-the-loop tutorial with better system message to signal termination condition by @ekzhu in #5253
Moves
Bug Fixes
- fix: handle non-string function arguments in tool calls and add corresponding warnings by @ekzhu in #5260
- Add default_header support by @nour-bouzid in #5249
- feat: update OpenAIAssistantAgent to support AsyncAzureOpenAI client by @ekzhu in #5312
All Other Python Related Changes
- Update website for v0.4.4 by @ekzhu in #5246
- update dependencies to work with protobuf 5 by @MohMaz in #5195
- Adjusted M1 agent system prompt to remove TERMINATE by @afourney in #5263
- chore: update package versions to 0.4.5 and remove deprecated requirements by @ekzhu in #5280
- Update Distributed Agent Runtime Cross-platform Sample by @linznin in #5164
- fix: windows check ci failure by @bassmang in #5287
- fix: type issues in streamlit sample and add streamlit to dev dependencies by @ekzhu in #5309
- chore: add asyncio_atexit dependency to docker requirements by @ekzhu in #5307
- feat: add o3 to model info; update chess example by @ekzhu in #5311
New Contributors
- @nour-bouzid made their first contribution in #5183
- @linznin made their first contribution in #5164
- @husseinkorly made their first contribution in #5306
Full Changelog: v0.4.4...python-v0.4.5