Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modify serialization strategy for tool results #231

Merged
merged 2 commits into from
Dec 18, 2024
Merged

Conversation

jcheng5
Copy link
Collaborator

@jcheng5 jcheng5 commented Dec 18, 2024

Closes #230

Update 2024-12-18 14:25 PST

This went from a very targeted "Add I() to a tool result to prevent it from being double JSON encoded" to changing the default serialization strategy for tool call responses.

The previous behavior only really made sense if you returned a single-element character vector or the output of jsonlite::toJSON(); these work the same way in the new code. But the new code also makes sense for multi-element character vectors (collapses them with "\n") and for JSON-serializable objects (they are automatically serialized), in addition to I().

@hadley
Copy link
Member

hadley commented Dec 18, 2024

This seems reasonable to me, even if we do eventually end up adding a specific image content type. Will just need some docs.

@jcheng5 jcheng5 force-pushed the verbatim-tool-results branch from 8dd679f to 5dedadb Compare December 18, 2024 19:19
@jcheng5
Copy link
Collaborator Author

jcheng5 commented Dec 18, 2024

I'm a little surprised that tool_string calls toString(x@value) instead of jsonlite::toJSON(x@value, auto_unbox = TRUE).

For example, if you return a data frame:

> toString(cars)
[1] "c(4, 4, 7, 7, 8, 9, 10, 10, 10, 11, 11, 12, 12, 12, 12, 13, 13, 13, 13, 14, 14, 14, 14, 15, 15, 15, 16, 16, 17, 17, 17, 18, 18, 18, 18, 19, 19, 19, 20, 20, 20, 20, 20, 22, 23, 24, 24, 24, 24, 25), c(2, 10, 4, 22, 16, 10, 18, 26, 34, 17, 28, 14, 20, 24, 28, 26, 34, 34, 46, 26, 36, 60, 80, 20, 26, 54, 32, 40, 32, 40, 50, 42, 56, 76, 84, 36, 46, 68, 32, 48, 52, 56, 64, 66, 54, 70, 92, 93, 120, 85)"
> jsonlite::toJSON(cars, auto_unbox = TRUE)
[{"speed":4,"dist":2},{"speed":4,"dist":10},{"speed":7,"dist":4},{"speed":7,"dist":22},{"speed":8,"dist":16},{"speed":9,"dist":10},{"speed":10,"dist":18},{"speed":10,"dist":26},{"speed":10,"dist":34},{"speed":11,"dist":17},{"speed":11,"dist":28},{"speed":12,"dist":14},{"speed":12,"dist":20},{"speed":12,"dist":24},{"speed":12,"dist":28},{"speed":13,"dist":26},{"speed":13,"dist":34},{"speed":13,"dist":34},{"speed":13,"dist":46},{"speed":14,"dist":26},{"speed":14,"dist":36},{"speed":14,"dist":60},{"speed":14,"dist":80},{"speed":15,"dist":20},{"speed":15,"dist":26},{"speed":15,"dist":54},{"speed":16,"dist":32},{"speed":16,"dist":40},{"speed":17,"dist":32},{"speed":17,"dist":40},{"speed":17,"dist":50},{"speed":18,"dist":42},{"speed":18,"dist":56},{"speed":18,"dist":76},{"speed":18,"dist":84},{"speed":19,"dist":36},{"speed":19,"dist":46},{"speed":19,"dist":68},{"speed":20,"dist":32},{"speed":20,"dist":48},{"speed":20,"dist":52},{"speed":20,"dist":56},{"speed":20,"dist":64},{"speed":22,"dist":66},{"speed":23,"dist":54},{"speed":24,"dist":70},{"speed":24,"dist":92},{"speed":24,"dist":93},{"speed":24,"dist":120},{"speed":25,"dist":85}] 

@jcheng5
Copy link
Collaborator Author

jcheng5 commented Dec 18, 2024

This could potentially be a parameter on tool(): serialize = c("json", "string", "as-is")

@hadley
Copy link
Member

hadley commented Dec 18, 2024

You should just change to that if you think it makes sense (and it doesn't make the tests break). I don't think I really thought through that decision at all.

Instead of toString, use toJSON. But don't
over-encode strings or already-converted JSON.
@jcheng5 jcheng5 force-pushed the verbatim-tool-results branch from 5dedadb to 9e46fb4 Compare December 18, 2024 22:14
@jcheng5
Copy link
Collaborator Author

jcheng5 commented Dec 18, 2024

I pushed a change so the logic in tool_string is now:

  1. If you return a string, that's passed as-is to the model. (E.g., "The time is 1:20pm" becomes "content": "The time is 1:20pm")
  2. If you return JSON (inherits(x@value, "json")), then that becomes the content, with one more level of stringifying: jsonlite::toJSON(list(h=13, m=20), auto_unbox = TRUE) becomes "content": "{\"h\":13,\"m\":20}". This is the common case.
  3. If you return AsIs, we attach it verbatim to the message and it only gets serialized with the rest of the message. So I(list(h=13, m=20)) would become "content": {"h": 13, "m": 20}.

@jcheng5 jcheng5 changed the title Allow tool results to use I() to provide verbatim JSONifiable objects Modify serialization strategy for tool results Dec 18, 2024
@jcheng5 jcheng5 marked this pull request as ready for review December 18, 2024 22:38
@jcheng5 jcheng5 requested a review from hadley December 18, 2024 22:38
@hadley hadley merged commit 76185fb into main Dec 18, 2024
11 checks passed
@hadley hadley deleted the verbatim-tool-results branch December 18, 2024 23:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Ability to return images from Anthropic tool calls
2 participants