Proper SSE parsing #7

Open
johnd0e opened this issue Apr 16, 2024 · 2 comments · May be fixed by #8

Comments

@johnd0e
Contributor

johnd0e commented Apr 16, 2024

Specs: https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events#event_stream_format

As you can see, the specifications are rigidly defined, eliminating the necessity to use LPEG and iterative trial-and-error JSON decoding.

In addition, the current implementation cannot parse some valid streams, for instance when the response contains comments (lines starting with a colon). Such comments do occur in practice.

Specifically, below is a sample response from https://openrouter.ai/api/v1/chat/completions:

: OPENROUTER PROCESSING

data: {"id":"gen-Hh4bFjJS0U0YgPav7S9tMpfXrNHA","model":"mistralai/mistral-7b-instruct:free","object":"chat.completion.chunk","created":1713306744,"choices":[{"index":0,"delta":{"role":"assistant","content":"The"},"finish_reason":null}]}

data: {"id":"gen-Hh4bFjJS0U0YgPav7S9tMpfXrNHA","model":"mistralai/mistral-7b-instruct:free","object":"chat.completion.chunk","created":1713306744,"choices":[{"index":0,"delta":{"role":"assistant","content":" next"},"finish_reason":null}]}

data: {"id":"gen-Hh4bFjJS0U0YgPav7S9tMpfXrNHA","model":"mistralai/mistral-7b-instruct:free","object":"chat.completion.chunk","created":1713306744,"choices":[{"index":0,"delta":{"role":"assistant","content":" day"},"finish_reason":null}]}

data: {"id":"gen-Hh4bFjJS0U0YgPav7S9tMpfXrNHA","model":"mistralai/mistral-7b-instruct:free","object":"chat.completion.chunk","created":1713306744,"choices":[{"index":0,"delta":{"role":"assistant","content":","},"finish_reason":null}]}

data: {"id":"gen-Hh4bFjJS0U0YgPav7S9tMpfXrNHA","model":"mistralai/mistral-7b-instruct:free","object":"chat.completion.chunk","created":1713306744,"choices":[{"index":0,"delta":{"role":"assistant","content":" I"},"finish_reason":null}]}

data: {"id":"gen-Hh4bFjJS0U0YgPav7S9tMpfXrNHA","model":"mistralai/mistral-7b-instruct:free","object":"chat.completion.chunk","created":1713306744,"choices":[{"index":0,"delta":{"role":"assistant","content":" wand"},"finish_reason":null}]}

data: {"id":"gen-Hh4bFjJS0U0YgPav7S9tMpfXrNHA","model":"mistralai/mistral-7b-instruct:free","object":"chat.completion.chunk","created":1713306744,"choices":[{"index":0,"delta":{"role":"assistant","content":"ered"},"finish_reason":null}]}

data: {"id":"gen-Hh4bFjJS0U0YgPav7S9tMpfXrNHA","model":"mistralai/mistral-7b-instruct:free","object":"chat.completion.chunk","created":1713306744,"choices":[{"index":0,"delta":{"role":"assistant","content":" through"},"finish_reason":null}]}

data: {"id":"gen-Hh4bFjJS0U0YgPav7S9tMpfXrNHA","model":"mistralai/mistral-7b-instruct:free","object":"chat.completion.chunk","created":1713306744,"choices":[{"index":0,"delta":{"role":"assistant","content":" the"},"finish_reason":null}]}

data: {"id":"gen-Hh4bFjJS0U0YgPav7S9tMpfXrNHA","model":"mistralai/mistral-7b-instruct:free","object":"chat.completion.chunk","created":1713306744,"choices":[{"index":0,"delta":{"role":"assistant","content":" streets"},"finish_reason":null}]}

data: {"id":"gen-Hh4bFjJS0U0YgPav7S9tMpfXrNHA","model":"mistralai/mistral-7b-instruct:free","object":"chat.completion.chunk","created":1713306744,"choices":[{"index":0,"delta":{"role":"assistant","content":" of"},"finish_reason":null}]}

data: {"id":"gen-Hh4bFjJS0U0YgPav7S9tMpfXrNHA","model":"mistralai/mistral-7b-instruct:free","object":"chat.completion.chunk","created":1713306744,"choices":[{"index":0,"delta":{"role":"assistant","content":" the"},"finish_reason":null}]}

data: {"id":"gen-Hh4bFjJS0U0YgPav7S9tMpfXrNHA","model":"mistralai/mistral-7b-instruct:free","object":"chat.completion.chunk","created":1713306744,"choices":[{"index":0,"delta":{"role":"assistant","content":" city"},"finish_reason":null}]}

data: {"id":"gen-Hh4bFjJS0U0YgPav7S9tMpfXrNHA","model":"mistralai/mistral-7b-instruct:free","object":"chat.completion.chunk","created":1713306744,"choices":[{"index":0,"delta":{"role":"assistant","content":","},"finish_reason":null}]}

data: {"id":"gen-Hh4bFjJS0U0YgPav7S9tMpfXrNHA","model":"mistralai/mistral-7b-instruct:free","object":"chat.completion.chunk","created":1713306744,"choices":[{"index":0,"delta":{"role":"assistant","content":" feeling"},"finish_reason":null}]}

data: {"id":"gen-Hh4bFjJS0U0YgPav7S9tMpfXrNHA","model":"mistralai/mistral-7b-instruct:free","object":"chat.completion.chunk","created":1713306744,"choices":[{"index":0,"delta":{"role":"assistant","content":" lost"},"finish_reason":null}]}

data: {"id":"gen-Hh4bFjJS0U0YgPav7S9tMpfXrNHA","model":"mistralai/mistral-7b-instruct:free","object":"chat.completion.chunk","created":1713306744,"choices":[{"index":0,"delta":{"role":"assistant","content":" and"},"finish_reason":null}]}

data: {"id":"gen-Hh4bFjJS0U0YgPav7S9tMpfXrNHA","model":"mistralai/mistral-7b-instruct:free","object":"chat.completion.chunk","created":1713306744,"choices":[{"index":0,"delta":{"role":"assistant","content":" alone"},"finish_reason":null}]}

data: {"id":"gen-Hh4bFjJS0U0YgPav7S9tMpfXrNHA","model":"mistralai/mistral-7b-instruct:free","object":"chat.completion.chunk","created":1713306744,"choices":[{"index":0,"delta":{"role":"assistant","content":"."},"finish_reason":null}]}

data: {"id":"gen-Hh4bFjJS0U0YgPav7S9tMpfXrNHA","model":"mistralai/mistral-7b-instruct:free","object":"chat.completion.chunk","created":1713306744,"choices":[{"index":0,"delta":{"role":"assistant","content":" I"},"finish_reason":null}]}

data: {"id":"gen-Hh4bFjJS0U0YgPav7S9tMpfXrNHA","model":"mistralai/mistral-7b-instruct:free","object":"chat.completion.chunk","created":1713306744,"choices":[{"index":0,"delta":{"role":"assistant","content":" had"},"finish_reason":null}]}

data: {"id":"gen-Hh4bFjJS0U0YgPav7S9tMpfXrNHA","model":"mistralai/mistral-7b-instruct:free","object":"chat.completion.chunk","created":1713306744,"choices":[{"index":0,"delta":{"role":"assistant","content":" no"},"finish_reason":null}]}

data: {"id":"gen-Hh4bFjJS0U0YgPav7S9tMpfXrNHA","model":"mistralai/mistral-7b-instruct:free","object":"chat.completion.chunk","created":1713306744,"choices":[{"index":0,"delta":{"role":"assistant","content":" idea"},"finish_reason":null}]}

data: {"id":"gen-Hh4bFjJS0U0YgPav7S9tMpfXrNHA","model":"mistralai/mistral-7b-instruct:free","object":"chat.completion.chunk","created":1713306744,"choices":[{"index":0,"delta":{"role":"assistant","content":" where"},"finish_reason":null}]}

data: {"id":"gen-Hh4bFjJS0U0YgPav7S9tMpfXrNHA","model":"mistralai/mistral-7b-instruct:free","object":"chat.completion.chunk","created":1713306744,"choices":[{"index":0,"delta":{"role":"assistant","content":" I"},"finish_reason":null}]}

data: {"id":"gen-Hh4bFjJS0U0YgPav7S9tMpfXrNHA","model":"mistralai/mistral-7b-instruct:free","object":"chat.completion.chunk","created":1713306744,"choices":[{"index":0,"delta":{"role":"assistant","content":" was"},"finish_reason":null}]}

: the rest is trimmed, to keep the issue message length in limits (maximum is 65536 characters).

data: [DONE]
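
For illustration, the field rules from the linked event stream format can be sketched in a few lines. This is a minimal Python sketch, not the library's code: the function name `parse_sse_event` is hypothetical, and it assumes the event block has already been split off and uses `\n` line endings only.

```python
def parse_sse_event(block: str) -> dict:
    """Parse one event block (the text between blank lines) into its fields."""
    fields = {}
    data_lines = []
    for line in block.split("\n"):
        if not line or line.startswith(":"):
            # Blank line or comment, e.g. ": OPENROUTER PROCESSING".
            continue
        name, _, value = line.partition(":")
        if value.startswith(" "):
            # The format drops a single leading space after the colon.
            value = value[1:]
        if name == "data":
            data_lines.append(value)
        else:
            fields[name] = value
    if data_lines:
        # Several data lines in one event are joined with newlines.
        fields["data"] = "\n".join(data_lines)
    return fields
```

With this, a comment-only block yields an empty dict and can simply be skipped, while `data: [DONE]` yields `{"data": "[DONE]"}` for the caller to treat as the end marker.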

@leafo
Owner

leafo commented Apr 17, 2024

> As you can see, the specifications are rigidly defined, eliminating the necessity to use LPEG and iterative trial-and-error JSON decoding.

The specification has nothing to do with why the parsing is done this way. Output is being streamed from the http client, there is no guarantee that a full message will be put into the buffer as output is written from the server. cjson is not a streaming parser, so parsing is repeated to emulate one.

If you're concerned about performance, a potential optimization would be to start from the full length of the string and shrink by 1, instead of starting from length 1 and increasing to the end, since the correct position is more likely to be near the end of the string than the beginning.
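
A rough sketch of that shrink-from-the-end idea, written in Python with the standard `json` module standing in for cjson. The function name `longest_json_prefix` is hypothetical, and this is only an illustration of the approach, not the library's actual code.

```python
import json


def longest_json_prefix(buffer: str):
    """Return (decoded_value, remainder), or (None, buffer) if nothing parses.

    Candidate prefixes are tried from the full buffer length downwards,
    on the assumption that a complete message usually ends near the end.
    """
    for end in range(len(buffer), 0, -1):
        try:
            value = json.loads(buffer[:end])
        except ValueError:
            # json.JSONDecodeError is a subclass of ValueError.
            continue
        return value, buffer[end:]
    return None, buffer
```

When the buffer holds exactly one complete object, the first attempt succeeds; when it ends in a partial message, the loop steps back until it reaches the end of the last complete object.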

> Specifically, below is a sample response from https://openrouter.ai/api/v1/chat/completions:

I've only tested this with the OpenAI API, but I have no issue with having comments be ignored if other "compatible" APIs output them.

@johnd0e
Contributor Author

johnd0e commented Apr 17, 2024

> Output is being streamed from the http client, there is no guarantee that a full message will be put into the buffer as output is written from the server.

Yes, but if we adhere to the specification, we can still rely on each message being terminated by a blank line, that is, two consecutive newlines. Instead of trying to simulate streaming JSON parsing, we can simply accumulate the response until '\n\n' is hit, and then JSON-parse the message in one go. This is the method employed in certain other libraries.
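
To make the proposal concrete, here is a minimal Python sketch of the accumulate-then-parse approach. The class name `SSEAccumulator` and its `feed` method are hypothetical; the sketch assumes `\n` line endings, JSON payloads in `data` fields, and the `[DONE]` end marker seen in the sample response.

```python
import json


class SSEAccumulator:
    """Buffers raw chunks and decodes only complete events."""

    def __init__(self):
        self._buffer = ""

    def feed(self, chunk: str) -> list:
        """Add a chunk; return the JSON messages completed by it."""
        self._buffer += chunk
        messages = []
        # A blank line ("\n\n") marks the end of one event.
        while "\n\n" in self._buffer:
            block, self._buffer = self._buffer.split("\n\n", 1)
            data_lines = []
            for line in block.split("\n"):
                if line.startswith("data:"):
                    value = line[len("data:"):]
                    if value.startswith(" "):
                        value = value[1:]
                    data_lines.append(value)
                # Comments (":...") and other fields are ignored here.
            data = "\n".join(data_lines)
            if not data or data == "[DONE]":
                continue
            messages.append(json.loads(data))
        return messages
```

A chunk that ends in the middle of a message simply stays in the buffer until the rest arrives, so each payload is decoded exactly once:

```python
acc = SSEAccumulator()
first = acc.feed(': OPENROUTER PROCESSING\n\ndata: {"cont')
second = acc.feed('ent": "The"}\n\ndata: [DONE]\n\n')
```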

> If you're concerned about performance.

Not so much, but since we need to handle the comment case anyway, why not implement everything in the most efficient way?
I could take on the implementation.

@johnd0e johnd0e linked a pull request Apr 17, 2024 that will close this issue