Replies: 1 comment 16 replies
-
IMHO It is a shame to say but ~6k per second is my local cap. But it goes neither from To exclude while True:
message = pool_queue.get()
continue And the cap is still ~6k! I did not investigate it deeper but invite you to start from here :) My points to check:
P.S. We should definitely try UPD. I found something looking good to be a candidate for the new WebSocket client https://github.com/cirospaciari/socketify.py (I hope it has clients) |
Beta Was this translation helpful? Give feedback.
-
Since the recent wave of Bluesky signups, the firehose of posts from the network is topping out at around 1300 ops/second. This is a lot of commits! And a lot of infra is already struggling to keep up. I wanted to start a discussion about how we could future-proof firehose consumer code in case the network gets even larger.
As I understand it, the maximum amount of ops/second that the
FirehoseSubscribeReposClient
can keep up with is about 6000, after which it becomes CPU bound based on the benchmark from way back in v0.0.26 whenlibipld
was added. Today, I moved the astronomy feeds to new hosting, and after some benchmarking, I can get a peak of 5100 ops/second in processing speed (which is close to that original benchmark.) Again, it seems theFirehoseSubscribeReposClient
gets CPU bound around that point.However, this begs the question of what happens if Bluesky continues to grow and gets ~10x larger. Currently, the library can only manage about a ~4x-greater size increase, which isn't so much. I know the Bluesky devs have talked about plans for things like filtered firehoses, but that doesn't help if someone is running a feed that uses all types of record.
Thought 1: I wonder whether or not the decoding steps of
FirehoseSubscribeReposClient
could be multithreaded, as I believe that's the main bottleneck right now.Thought 2: Python's base multiprocessing library looks like it's also a bit of a limiting factor. When running at peak speed (5000 ops/second), about 20% of the compute time of the firehose subscription process in my feed is spent just sending things to the worker process with a
multiprocessing.Pipe
(line 63 in this file)*, due to Python's lack of a way to handle shared memory nicely. If the FirehoseSubscribeReposClient was faster, then this would become an even larger limit. Is there a way to multithread firehose code in a way that avoids copies? For instance, maybe ifFirehoseSubscribeReposClient
was multithreaded, maybe each thread would callon_message_handler
itself, instead of putting it all into a single Pipe/Queue at the end and causing a different bottleneck instead.Thanks for your help as ever, @MarshalX!
multiprocessing.Queue
doesn't work on it. However, I thinkmultiprocessing.Queue
still usesmultiprocessing.Pipe
s internally to communicate with workers, so I (think?) it will have the same bottleneck as my single-Pipe solution has.Beta Was this translation helpful? Give feedback.
All reactions