
[Feature Request]: Stream output like chatgpt? #511

Open · universea opened this issue Jan 16, 2025 · 2 comments
Labels: enhancement (New feature or request)

Comments

@universea commented Jan 16, 2025

Is your feature request related to a problem? Please describe.

I want to build a chat UI on top of agent chat, like ChatGPT, where the reply is displayed token by token. With the OpenAI SDK this works via streaming, but the agent's reply is returned as a plain string rather than as ChatCompletionChunk objects. How can I implement this?

from openai import OpenAI

# The client was missing in the original snippet; DeepSeek exposes an
# OpenAI-compatible endpoint.
client = OpenAI(api_key="xxx", base_url="https://api.deepseek.com")

# With stream=True the SDK yields ChatCompletionChunk objects.
response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": "Hello"},
    ],
    max_tokens=1024,
    temperature=0.7,
    stream=True,
)

for chunk in response:
    print(chunk)
    print(chunk.choices[0].delta)
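
For a ChatGPT-style UI only the text deltas matter; a minimal sketch of accumulating them (note that delta.content is None on some chunks, e.g. the final one):

text = ""
for chunk in response:
    piece = chunk.choices[0].delta.content
    if piece:  # delta.content is None on role/finish chunks
        text += piece
        print(piece, end="", flush=True)  # render token by token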
import os
from autogen import AssistantAgent, UserProxyAgent, ConversableAgent

# llm_config = {"model": "gpt-4", "api_key": os.environ["OPENAI_API_KEY"]}

llm_config = {
    "model": "deepseek-chat",
    "api_type": "deepseek",
    "api_key": "xxx",
    "base_url": "https://api.deepseek.com",
    "price": [0.00014, 0.00028],
    "stream": True,
}

assistant = ConversableAgent("assistant", llm_config=llm_config, human_input_mode="NEVER")
user_proxy = UserProxyAgent("user_proxy", code_execution_config=False)

message = [{"content": "Tell me a joke about NVDA and TESLA stock prices.", "role": "user"}]

# Each of these variants returns the completed reply, never a stream of chunks.
# The a_* coroutines must be awaited inside an async function, and
# generate_oai_reply returns a (final, reply) tuple:
# response = assistant.generate_oai_reply(messages=message)
# response = await assistant.a_generate_oai_reply(messages=message)
# response = await assistant.a_generate_reply(messages=message)
response = assistant.generate_reply(messages=message)
print(response)

# Iterating over the reply just walks the string character by character,
# not ChatCompletionChunk objects:
for chunk in response:
    print('_new_chunk_')
    print(chunk)
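
As a possible workaround today (an assumption on my part; the autogen.io API has changed between versions, so check your release), autogen routes everything the agents print, including streamed tokens when "stream": True is set, through an IOStream. A custom stream can forward those tokens to a UI instead of stdout; a minimal sketch:

from autogen.io import IOStream

class UIForwardingStream:
    """Sketch: forward agent output (including streamed tokens when
    "stream": True is set in llm_config) to a UI callback."""

    def __init__(self, on_text):
        self.on_text = on_text

    def print(self, *objects, sep=" ", end="\n", flush=False):
        self.on_text(sep.join(map(str, objects)) + end)

    def input(self, prompt="", *, password=False):
        return ""  # this sketch never asks for human input

# Route all agent output through the custom stream for this chat.
with IOStream.set_default(UIForwardingStream(on_text=print)):
    user_proxy.initiate_chat(assistant, message="Tell me a joke.", max_turns=1)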

Describe the solution you'd like

All agents should be able to return either a chat completion object, or a streamed sequence of chat completion chunk objects when the request is streamed.
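
For illustration, the requested behaviour with a hypothetical method (a_generate_reply_stream is an invented name here, not an existing API):

# Hypothetical: agents expose an async generator of ChatCompletionChunk objects.
async def ui_loop():
    async for chunk in assistant.a_generate_reply_stream(messages=message):
        print(chunk.choices[0].delta.content or "", end="", flush=True)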

Additional context

No response

@universea universea added the enhancement New feature or request label Jan 16, 2025
@davorrunje davorrunje self-assigned this Jan 16, 2025
@davorrunje (Collaborator)

We could add support for something like this:

import asyncer
from autogen import ConversableAgent, UserProxyAgent, a_create_message_iterator

async def chat(message, llm_config):
    assistant = ConversableAgent("assistant", llm_config=llm_config, human_input_mode="NEVER")
    user_proxy = UserProxyAgent("user_proxy", human_input_mode="NEVER")

    async with asyncer.create_task_group() as task_group:
        async with a_create_message_iterator() as it:
            # run the chat in the background while we consume its messages
            task_group.soonify(user_proxy.a_initiate_chat)(assistant, message=message, max_turns=5)
            async for m in it:
                # do what you need to
                print(m.model_dump_json())
I can probably simplify it further so you don't see all those low-level details.
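
If that helper lands, driving it from an event loop could look like this (hypothetical usage of the proposed API):

import asyncio

asyncio.run(chat("Tell me a joke about NVDA and TESLA stock prices.", llm_config))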

@davorrunje davorrunje added this to ag2 Jan 16, 2025
@davorrunje davorrunje moved this to Todo in ag2 Jan 16, 2025
@universea (Author)

Thank you so much!
