
[Bug]: facing issues with MultimodalConversableAgent, as TextMessage in agent_message is not allowing image objects to be passed #558

Open
shriyanshagnihotri opened this issue Jan 20, 2025 · 0 comments
Labels
bug Something isn't working

Comments

@shriyanshagnihotri
Collaborator

Describe the bug

MultimodalConversableAgent fails because TextMessage in agent_messages.py does not allow image objects to be passed in the message content.

Messages like

`Compare these two images in detail:
1. Reference Image:

2. Current Screenshot:

        Please analyze and describe:
        1. Visual similarities and differences
        2. Layout differences
        3. Any missing or extra elements
        4. Color or styling differences
        5. Whether they can be considered visually equivalent

        Be specific and detailed in your comparison.`

fail with a validation error, because the image objects in the content cannot be parsed into a TextMessage.
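To make the failure concrete, here is a minimal sketch of the kind of content list such a message turns into. The text parts match the `{'type': 'text', 'text': ...}` shape visible in the error log; the `image_url` parts, the example URLs, and the helper `non_string_values` are assumptions added for illustration and are not taken from the AG2 code.

```python
from typing import Any

# Hypothetical content list for a two-image comparison prompt.
# The nested {"url": ...} dicts are assumed, following the OpenAI-style
# image part format; the URLs are placeholders.
content: list[dict[str, Any]] = [
    {"type": "text", "text": "Compare these two images in detail:\n1. Reference Image:"},
    {"type": "image_url", "image_url": {"url": "https://example.com/reference.png"}},
    {"type": "text", "text": "2. Current Screenshot:"},
    {"type": "image_url", "image_url": {"url": "https://example.com/current.png"}},
    {"type": "text", "text": "Be specific and detailed in your comparison."},
]


def non_string_values(parts: list[dict[str, Any]]) -> list[tuple[int, str, str]]:
    """Return (index, key, type name) for every value that is not a str."""
    return [
        (index, key, type(value).__name__)
        for index, part in enumerate(parts)
        for key, value in part.items()
        if not isinstance(value, str)
    ]


# These entries are the ones a list[dict[str, str]] annotation cannot accept.
offenders = non_string_values(content)
print(offenders)  # [(1, 'image_url', 'dict'), (3, 'image_url', 'dict')]
```

Only the image parts are flagged, which is consistent with text-only messages working while multimodal ones are rejected.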

b4b1327#diff-8533b1c74059a6eca88070c00a1a71bcec5d422cb54662803caae0ed4ba64120R192

The annotation `content: Optional[Union[str, int, float, bool, list[dict[str, str]]]] = None`

should allow `dict[str, Any]` entries, because a value can itself be an image: MultimodalConversableAgent is used for image analysis as well as text.
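The difference between the two annotations can be sketched with two small pydantic models. `StrictContent`, `LooseContent`, `validates`, and the example URL are hypothetical names and values used only for this illustration; the loosened annotation mirrors the one described in the fix commit, not necessarily the exact code that was merged.

```python
from typing import Any, Optional, Union

from pydantic import BaseModel, ValidationError


class StrictContent(BaseModel):
    # Annotation quoted in this issue: dict values must be strings.
    content: Optional[Union[str, int, float, bool, list[dict[str, str]]]] = None


class LooseContent(BaseModel):
    # Loosened annotation: a dict value may also be a nested dict,
    # for example an image URL payload.
    content: Optional[
        Union[str, int, float, bool, list[dict[str, Union[str, dict[str, Any]]]]]
    ] = None


# Hypothetical multimodal content with one text part and one image part.
multimodal = [
    {"type": "text", "text": "Compare these two images in detail:"},
    {"type": "image_url", "image_url": {"url": "https://example.com/reference.png"}},
]


def validates(model_cls: type[BaseModel], content: Any) -> bool:
    """Return True if model_cls accepts the given content, False otherwise."""
    try:
        model_cls(content=content)
    except ValidationError:
        return False
    return True


print(validates(StrictContent, multimodal))  # False
print(validates(LooseContent, multimodal))   # True
```

Text-only content lists still validate under both models, so loosening the annotation in this way should not affect existing text messages.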

Steps to reproduce

Run the notebook at:
https://docs.ag2.ai/notebooks/agentchat_lmm_gpt-4v

Model Used

gpt-4o

Expected Behavior

The pydantic validation should be looser for this case, because content values can be more than text.

Screenshots and logs

content.str
  Input should be a valid string [type=string_type, input_value=[{'type': 'text', 'text':...d in your comparison.'}], input_type=list]
    For further information visit https://errors.pydantic.dev/2.10/v/string_type
content.int
  Input should be a valid integer [type=int_type, input_value=[{'type': 'text', 'text':...d in your comparison.'}], input_type=list]
    For further information visit https://errors.pydantic.dev/2.10/v/int_type
content.float
  Input should be a valid number [type=float_type, input_value=[{'type': 'text', 'text':...d in your comparison.'}], input_type=list]
    For further information visit https://errors.pydantic.dev/2.10/v/float_type
content.bool
  Input should be a valid boolean [type=bool_type, input_value=[{'type': 'text', 'text':...d in your comparison.'}], input_type=list]
    For further information visit https://errors.pydantic.dev/2.10/v/bool_type
Traceback (most recent call last):
  File "/Users/shriyanshagnihotri/workspace/testzeus-hercules/testzeus_hercules/core/extra_tools/visual_skill.py", line 130, in compare_visual_screenshot
    chat_response = await asyncio.to_thread(image_ex_user_proxy.initiate_chat, image_agent, message=message)
  File "/opt/homebrew/Cellar/python@3.11/3.11.10/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/threads.py", line 25, in to_thread
    return await loop.run_in_executor(None, func_call)
  File "/opt/homebrew/Cellar/python@3.11/3.11.10/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/futures.py", line 287, in __await__
    yield self # This tells Task to wait for completion.
  File "/opt/homebrew/Cellar/python@3.11/3.11.10/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/tasks.py", line 349, in __wakeup
    future.result()
  File "/opt/homebrew/Cellar/python@3.11/3.11.10/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/futures.py", line 203, in result
    raise self._exception.with_traceback(self._exception_tb)
  File "/opt/homebrew/Cellar/python@3.11/3.11.10/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/Users/shriyanshagnihotri/workspace/testzeus-hercules/env/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 1117, in initiate_chat
    self.send(msg2send, recipient, silent=silent)
  File "/Users/shriyanshagnihotri/workspace/testzeus-hercules/env/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 807, in send
    recipient.receive(message, self, request_reply, silent)
  File "/Users/shriyanshagnihotri/workspace/testzeus-hercules/env/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 914, in receive
    self._process_received_message(message, sender, silent)
  File "/Users/shriyanshagnihotri/workspace/testzeus-hercules/env/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 882, in _process_received_message
    self._print_received_message(message, sender)
  File "/Users/shriyanshagnihotri/workspace/testzeus-hercules/env/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 865, in _print_received_message
    message_model = create_received_message_model(message=message, sender=sender, recipient=self)
  File "/Users/shriyanshagnihotri/workspace/testzeus-hercules/env/lib/python3.11/site-packages/autogen/messages/agent_messages.py", line 246, in create_received_message_model
    return TextMessage(
  File "/Users/shriyanshagnihotri/workspace/testzeus-hercules/env/lib/python3.11/site-packages/autogen/messages/base_message.py", line 67, in __init__
    super().__init__(*args, content=message_cls(*args, **data, content=content), **data)
  File "/Users/shriyanshagnihotri/workspace/testzeus-hercules/env/lib/python3.11/site-packages/autogen/messages/base_message.py", line 22, in __init__
    super().__init__(uuid=uuid, **kwargs)
  File "/Users/shriyanshagnihotri/workspace/testzeus-hercules/env/lib/python3.11/site-packages/pydantic/main.py", line 214, in __init__
    validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
pydantic_core._pydantic_core.ValidationError: 4 validation errors for TextMessage
content.str
  Input should be a valid string [type=string_type, input_value=[{'type': 'text', 'text':...d in your comparison.'}], input_type=list]
    For further information visit https://errors.pydantic.dev/2.10/v/string_type
content.int
  Input should be a valid integer [type=int_type, input_value=[{'type': 'text', 'text':...d in your comparison.'}], input_type=list]
    For further information visit https://errors.pydantic.dev/2.10/v/int_type
content.float
  Input should be a valid number [type=float_type, input_value=[{'type': 'text', 'text':...d in your comparison.'}], input_type=list]
    For further information visit https://errors.pydantic.dev/2.10/v/float_type
content.bool
  Input should be a valid boolean [type=bool_type, input_value=[{'type': 'text', 'text':...d in your comparison.'}], input_type=list]
    For further information visit https://errors.pydantic.dev/2.10/v/bool_type

Additional Information

This should be fixed in the next release, hopefully.

@shriyanshagnihotri shriyanshagnihotri added the bug Something isn't working label Jan 20, 2025
shriyanshagnihotri added a commit to test-zeus-ai/ag2 that referenced this issue Jan 20, 2025
… another nested dict representing image url, hence both str and dict should be allowed issue ag2ai#558
shriyanshagnihotri added a commit to test-zeus-ai/ag2 that referenced this issue Jan 20, 2025
…lues can be another nested dict representing image url, hence both str and dict should be allowed issue ag2ai#558
github-merge-queue bot pushed a commit that referenced this issue Jan 20, 2025
…lue because MultimodalConversableAgent was breaking while processing images. (#560)

* making TextMessage take a list[dict[str, str | dict[str, Any]]] as values can be another nested dict representing image url, hence both str and dict should be allowed issue #558

* using Union instead of | to make the mypy test pass

* Consistency for List and Dict to list and dict

---------

Co-authored-by: Mark Sze <[email protected]>