[core] Fix FlowWindowSize management when sender drops packets #2815
Conversation
When the sender drops packets, it generates a fake ACK. In some cases, it could end up in a situation in which its ACK seqno was always higher than the one from the receiver. Because of that, ACKs were ignored and the FlowWindowSize was never updated. Without this fix, when doing a 50 Mbps transmission at 1200 ms latency over a network with 21% loss and a 200 ms RTT, we would end up with 50% unrecovered packets. With this fix, this goes down to below 1%, which is in line with other cases in which the problem didn't appear.
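The fix described above can be illustrated with a minimal sketch. All names here are hypothetical simplifications (the real SRT logic lives in `core.cpp` and uses `CSeqNo::seqcmp` over a wrapping sequence space); the point is only to show the change in behavior: the ACK position advances only on genuinely newer ACKs, but the flow window carried by the ACK is applied regardless.

```cpp
#include <cassert>
#include <cstdint>

// Simplified sequence-number compare in the spirit of SRT's
// CSeqNo::seqcmp; assumes the distance never exceeds half the space.
static int seqcmp(int32_t a, int32_t b) {
    return static_cast<int>(a - b);
}

// Hypothetical, minimal sender state.
struct SenderState {
    int32_t lastAckSeqNo   = 0;    // highest ACK'ed seqno so far, incl. fake ACKs
    int     flowWindowSize = 8192; // peer's advertised receiver window
};

// Before the fix (conceptually): when a locally generated fake ACK had
// pushed lastAckSeqNo past the receiver's ACK seqno, the whole incoming
// ACK was ignored, so flowWindowSize was never refreshed.
// After the fix (conceptually): the window size is refreshed even when
// the ACK's seqno does not advance the sender's ACK position.
void onAck(SenderState &s, int32_t ackSeqNo, int flowWindow) {
    if (seqcmp(ackSeqNo, s.lastAckSeqNo) > 0)
        s.lastAckSeqNo = ackSeqNo; // advance only on genuinely newer ACKs
    s.flowWindowSize = flowWindow; // but always apply the peer's window
}
```

With this shape, an ACK arriving "behind" a fake ACK still keeps the flight-window accounting alive instead of freezing it at a stale value.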
Here are some test results. With the fix, default buffer size, 100 ms RTT network and 25% packet loss:
With the fix, 1000000000 recv buffer size, 100 ms RTT network and 25% packet loss:
With the fix, 200 ms RTT network and 25% packet loss. Increasing the receive buffer improves the results the same way it did with the 100 ms RTT network:
For reference, here are the results I had without the fix, in a network with a 200 ms RTT and 21% loss (I didn't go up to 25%, but the results still speak for themselves). With latency greater than 1200 ms, the results were the same as at 1200 ms (>=50% unrecovered packets).
Results with the fix are collected in a 100 ms RTT network with 25% packet loss and a 200 ms RTT network with 25% packet loss. Comparing the results without the fix (200 ms RTT, 21% packet loss) and with the fix (200 ms RTT, 25% packet loss), there are a lot more dropped packets with the fix at 1 s latency (1075 vs 349), but a lot fewer at 1.2 s latency (366 vs 30980). Although the results do not provide a solid base for comparison, it looks like the fix uses the available space of the receiver buffer more efficiently.
What I can't see is how dropping from the sender buffer can virtually increase the free space in the peer's receiver buffer. I understand that it's just a "statement" that these same packets that are dropped on the sender side will also be dropped (independently) on the receiver side when the time to play a packet following the loss comes. The problem is, we don't exactly know when this is going to happen. The perfect solution would be to define the exact time, which would be "peer's play time minus RTT" of the packet that closes the earliest gap, as the time up to which the packets on the sender side should be dismissed. This could at least replicate the phenomenon happening on the receiver side at this moment, or very close to it. Don't forget that people have been complaining here that they have many more drops if sender dropping is enabled. I added the option to configure the sender dropping time mainly for development needs, to see whether by adjusting this setting you can have fewer drops on the same link.
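The "perfect solution" timing rule described above ("peer's play time minus RTT" of the packet closing the earliest gap) can be sketched as a one-liner. The function name and parameters are assumptions for illustration only, not the actual SRT API:

```cpp
#include <cassert>
#include <chrono>

using namespace std::chrono;

// Hypothetical helper: given the scheduled play time (on the peer) of
// the packet that closes the earliest loss gap, the sender should not
// dismiss packets from its buffer before that play time minus the RTT.
// This makes the sender-side drop mirror, as closely as the sender can
// estimate, the moment the corresponding cells are freed on the receiver.
steady_clock::time_point senderDropThreshold(
        steady_clock::time_point gapCoverPlayTime,
        microseconds rtt) {
    return gapCoverPlayTime - rtt;
}
```

The idea being that any earlier sender-side drop is a prediction about receiver-buffer space that has not yet come true.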
I'll add some test results with the fix at 21% loss.
The assumption is that the TL packet drop on the sender side happens later than the corresponding TL packet drop on the receiver, given that some later packet has been received. Even assuming no later packets have been received, still sending the following packets would eventually trigger a TL packet drop on the receiver anyway, freeing up space for further packets.
Ran tests to determine the RCV buffer size leading to a 0% packet drop. 50 Mbps, 100 ms RTT, 25% packet loss, SRT latency 1.3 seconds. The target RCV buffer size recommended by the Configuration Guidelines is … The value to recommend based on the results below is …
SRT master
SRT with the FIX (PR 2815)
50 Mbps, 100 ms RTT, 25% packet loss, SRT latency 1.0 seconds. The target RCV buffer size recommended by the Configuration Guidelines is … The value to recommend based on the results below is …
SRT master
SRT with the fix
SRT with the fix and
My point is that on the receiver side the dropping is triggered at a very specific packet - the packet that follows the loss. Dropped are only the lost packets still unrecovered at the moment of the drop check. So on the sender side, at the moment of receiving the loss report, you know only one thing for sure - that these cells in the receiver buffer will have been freed exactly at the moment when the "loss coverer" packet is ready to play (actually when retrieved by the application, but we believe that it will be extracted immediately). Whether any packets were recovered up to this moment is unimportant, because we believe that every packet preceding the play-ready packet will be removed from the buffer at play time anyway.

The problem is: packets removed as "too late" from the sender buffer still have their "corresponding cells" active in the peer's receiver buffer after dropping. When these same cells will be removed from the peer's receiver buffer is another story. This simply means that the sender's prediction that, by removing some range of packets from the sender buffer, it gets more free space in the peer's receiver buffer at the moment when this dropping happens only increases the risk that the newest packets will be dropped on reception when the buffer is almost full and the sender has "tricked itself" into believing that the peer will have the space to accommodate the packet it is about to send. The problem isn't whether the cells in the peer's receiver buffer will be freed "eventually", but whether they will really be freed at the very moment when the new packet is being sent and the free tokens in the flight window are checked for a green light.
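The receiver-side rule described above can be condensed into a small sketch. The structures and names here are hypothetical (the real receiver buffer in SRT is considerably more involved); it only shows the rule that when the "loss coverer" packet reaches its play time, every preceding cell is freed, and only the never-received ones count as drops:

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical toy receiver buffer: seqno -> whether the packet arrived.
struct RcvBuffer {
    std::map<int32_t, bool> cells;

    // Called when packet `coverSeq` (the packet following the loss)
    // becomes ready to play: all cells preceding it are freed, and the
    // ones that were never received are counted as dropped.
    int dropUpTo(int32_t coverSeq) {
        int dropped = 0;
        for (auto it = cells.begin();
             it != cells.end() && it->first < coverSeq; ) {
            if (!it->second)
                ++dropped;            // unrecovered loss, dropped now
            it = cells.erase(it);     // cell freed either way
        }
        return dropped;
    }
};
```

Until `dropUpTo` actually runs at the coverer's play time, those cells remain occupied - which is exactly why a sender-side drop cannot, on its own, be treated as free space already existing on the peer.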
This is happening for a completely different reason, although tricking the flight window size does increase the probability of it happening. If there's a drop-on-reception once, the only way it can be avoided is to quickly make space in the receiver buffer. BTW, if there was at least one problem that has been solved by increasing the flight window in this tricky and temporary way (only up to the next incoming ACK), wouldn't increasing the receiver buffer size do exactly the same thing, just without the dangerous tricking?
The reality is that the sender does not know whether those cells are free or not, that's true.