Balance received message handling between connections #99

gavin-norman-sociomantic opened this issue Aug 1, 2017 · 8 comments

@gavin-norman-sociomantic

The current connection (read) handling works as follows:

  1. All connections waiting to read are registered with epoll.
  2. epoll_wait reports that one or more connections have received something.
  3. For each connection that has received something:
    a. Check if at least one full message has been received. If not, re-register with epoll.
    b. If one or more full messages have been received, iterate over them, passing them to the appropriate request handlers.

The problem is at 3b. Given the various input buffers that are in use, it's possible for hundreds (even thousands -- in experiments we have seen this) of messages on a connection to have been received and now be ready for handling. So, step 3b could actually take quite some time, leaving other connections and requests starved in the meantime.

This is not ideal asynchronous behaviour -- the assumption is that handling an event is quick, allowing other epoll clients to get their share of attention.
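A minimal sketch of the situation, as a standalone C toy rather than the actual D code: each connection reported by epoll_wait has its entire input buffer drained before the next connection is looked at, so a connection with a large backlog monopolises the cycle. The connection and message counts are invented for illustration; no real sockets or epoll calls are involved.

```c
#include <stdio.h>

enum { NUM_CONNS = 3 };

/* number of complete messages waiting in each connection's input buffer */
static int pending[NUM_CONNS] = { 1000, 2, 2 };

static void handle_message(int conn)
{
    (void)conn;  /* stand-in for passing one message to its request handler */
}

int main(void)
{
    /* one epoll cycle: connection 0 monopolises the handling loop */
    for (int conn = 0; conn < NUM_CONNS; conn++) {
        int handled = 0;
        while (pending[conn] > 0) {  /* step 3b: drain the whole buffer */
            handle_message(conn);
            pending[conn]--;
            handled++;
        }
        printf("connection %d: handled %d messages in one go\n", conn, handled);
    }
    return 0;
}
```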

One idea to fix this problem would be to do the following:

  1. All connections waiting to read are registered with epoll.
  2. epoll_wait reports that one or more connections have received something.
  3. For each connection that has received something:
    a. Check if at least one full message has been received. If not, re-register with epoll.
    b. If one or more full messages have been received, set a flag indicating that they are waiting to be handled.
  4. Iterate over all connections which are flagged as having messages ready to be handled, handling one message per connection until all have been handled. In this way, the handling of messages is balanced between connections.
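A minimal C sketch of step 4, again as a standalone toy with invented connection and message counts: flagged connections are drained round-robin, one message per connection per pass, so the backlog on one connection no longer delays the others.

```c
#include <stdio.h>
#include <stdbool.h>

enum { NUM_CONNS = 3 };

/* number of complete messages waiting in each connection's input buffer */
static int pending[NUM_CONNS] = { 1000, 2, 2 };

/* step 3b of the proposal: connections flagged as having messages to handle */
static bool flagged[NUM_CONNS];

static void handle_message(int conn)
{
    (void)conn;  /* stand-in for passing one message to its request handler */
}

int main(void)
{
    for (int c = 0; c < NUM_CONNS; c++)
        flagged[c] = pending[c] > 0;

    /* step 4: handle one message per flagged connection per pass */
    int passes = 0;
    bool handled_any;
    do {
        handled_any = false;
        for (int c = 0; c < NUM_CONNS; c++) {
            if (flagged[c]) {
                handle_message(c);
                if (--pending[c] == 0)
                    flagged[c] = false;
                handled_any = true;
            }
        }
        if (handled_any)
            passes++;
    } while (handled_any);

    printf("all messages handled after %d passes\n", passes);
    return 0;
}
```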
@gavin-norman-sociomantic
Author

Iterate over all connections which are flagged as having messages ready to be handled, handling one message per connection until all have been handled.

The exact finishing condition here is open to debate. It may, for example, make sense to also limit the number of messages handled (across all connections) per epoll cycle, so as not to overwhelm other epoll clients.
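For illustration, a variation of the round-robin sketch above with a hypothetical per-cycle cap (MAX_PER_CYCLE is an invented constant, as are the message counts): once the cap is reached, the remaining messages are left for the next epoll cycle.

```c
#include <stdio.h>
#include <stdbool.h>

enum { NUM_CONNS = 3, MAX_PER_CYCLE = 256 };  /* hypothetical cap per epoll cycle */

static int pending[NUM_CONNS] = { 1000, 2, 2 };

static void handle_message(int conn)
{
    (void)conn;  /* stand-in for passing one message to its request handler */
}

/* Handle at most MAX_PER_CYCLE messages, round-robin across connections.
 * Returns true if messages are left over for the next epoll cycle. */
static bool handle_one_cycle(void)
{
    int budget = MAX_PER_CYCLE;
    bool handled_any;
    do {
        handled_any = false;
        for (int c = 0; c < NUM_CONNS && budget > 0; c++) {
            if (pending[c] > 0) {
                handle_message(c);
                pending[c]--;
                budget--;
                handled_any = true;
            }
        }
    } while (handled_any && budget > 0);

    for (int c = 0; c < NUM_CONNS; c++)
        if (pending[c] > 0)
            return true;
    return false;
}

int main(void)
{
    int cycles = 1;
    while (handle_one_cycle())
        cycles++;
    printf("backlog drained over %d epoll cycles\n", cycles);
    return 0;
}
```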

@gavin-norman-sociomantic
Author

Iterate over all connections which are flagged as having messages ready to be handled, handling one message per connection until all have been handled. In this way, the handling of messages is balanced between connections.

This will require some means of performing an action after all epoll clients have been handled, before the next call to epoll_wait. To do this directly, we'll need to enhance EpollSelectDispatcher to add a more flexible system for specifying hooks, something like this.

Another way would be to rely on an event fd which is triggered every epoll cycle. It wouldn't be guaranteed to be handled at the end of the cycle, but simply processing pending connection messages once per cycle should be sufficient.
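A minimal standalone sketch of the eventfd idea in C, independent of EpollSelectDispatcher: a single eventfd is registered with epoll and re-triggered before every cycle, and its handler is the place where the pending connection messages would be processed.

```c
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>

int main(void)
{
    int epfd = epoll_create1(0);
    int efd  = eventfd(0, 0);
    if (epfd < 0 || efd < 0) {
        perror("setup");
        return 1;
    }

    struct epoll_event ev = { .events = EPOLLIN, .data.fd = efd };
    epoll_ctl(epfd, EPOLL_CTL_ADD, efd, &ev);

    for (int cycle = 0; cycle < 3; cycle++) {
        /* trigger the eventfd so that it fires in the upcoming epoll_wait */
        uint64_t one = 1;
        if (write(efd, &one, sizeof one) != (ssize_t)sizeof one)
            break;

        struct epoll_event events[8];
        int n = epoll_wait(epfd, events, 8, -1);

        for (int i = 0; i < n; i++) {
            if (events[i].data.fd == efd) {
                uint64_t count;
                if (read(efd, &count, sizeof count) != (ssize_t)sizeof count)
                    break;
                /* here: process the pending messages of all flagged connections */
                printf("cycle %d: handling pending connection messages\n", cycle);
            }
        }
    }

    close(efd);
    close(epfd);
    return 0;
}
```

Because the eventfd is written before each epoll_wait, it is always among the reported events, which matches the "triggered every epoll cycle" behaviour described above; as noted, nothing guarantees it is handled last in the cycle.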

@nemanja-boric-sociomantic
Contributor

A small disadvantage of using an always-ready eventfd is doubling the number of open file descriptors. However, it also solves the problem of starving other select clients, since we would go back to epoll_wait frequently.

@gavin-norman-sociomantic
Author

Why doubling? One event fd for the whole system, not one per connection.

@nemanja-boric-sociomantic
Contributor

Ah, it wouldn't actually double it, yes, I was just going to write that.

@david-eckardt-sociomantic
Contributor

It's possible for hundreds (even thousands -- in experiments we have seen this) of messages on a connection to have been received and now be ready for handling. So, step 3b could actually take quite some time, leaving other connections and requests starved in the meantime.

The processing time is not the only problem. Without a way of suspending the read loop fiber while iterating over hundreds of received messages, a client application can be flooded with received messages without a chance of processing them or throttling the input stream.
The situation is that the read loop fiber is suspended only after all messages in the MessageReceiver's buffer have been delivered to the requests. Suspending the read loop fiber may be needed:

  • to send a message, such as when the client application wants to suspend a streaming request,
  • for the client application to process the received messages.

Many client applications need an event loop cycle to process received records, for example because they schedule a task to send another record. But because the next event loop cycle happens only after all of the hundreds of received messages have been delivered to the request handler, a large number of scheduled tasks or outgoing messages can stack up in the application. There can be so many that they exhaust resources such as the capacity of the task or message queues.
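A toy illustration of this point (capacities and counts are invented): if every received message schedules one task but the task queue is only drained between event loop cycles, a burst delivered within a single cycle can overflow a bounded queue.

```c
#include <stdio.h>
#include <stdbool.h>

enum { TASK_QUEUE_CAPACITY = 256, MESSAGES_IN_BUFFER = 1000 };  /* invented numbers */

static int queued_tasks;

/* stand-in for the application scheduling a task per received record */
static bool schedule_task(void)
{
    if (queued_tasks == TASK_QUEUE_CAPACITY)
        return false;  /* queue full -- this is the failure mode in question */
    queued_tasks++;
    return true;
}

int main(void)
{
    int dropped = 0;

    /* all buffered messages are delivered before the next event loop cycle */
    for (int msg = 0; msg < MESSAGES_IN_BUFFER; msg++)
        if (!schedule_task())
            dropped++;

    /* only now would an event loop cycle run and drain the task queue */
    printf("%d of %d scheduled tasks did not fit into the queue\n",
           dropped, MESSAGES_IN_BUFFER);
    return 0;
}
```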

@gavin-norman-sociomantic
Author

But because the next event loop cycle happens only after all of the hundreds of received messages have been delivered to the request handler, a large number of scheduled tasks or outgoing messages can stack up in the application. There can be so many that they exhaust resources such as the capacity of the task or message queues.

Right, this ties into what I wrote above:

The exact finishing condition here is open to debate. It may, for example, make sense to also limit the number of messages handled (across all connections) per epoll cycle, so as not to overwhelm other epoll clients.

If we implement this at some point, we'll need to experiment with different approaches to see what works best.

@gavin-norman-sociomantic
Author

I wonder if this is necessary any more. The idea originated in our troubles with stream requests, which have now been (or are being) re-implemented as batch requests.

@david-eckardt-sociomantic @nemanja-boric-sociomantic what do you think?
