(D)CUBIC, NewReno and Prague refactoring #1818
base: master
Conversation
…atellite dcubic test, unused variables fix,
# Conflicts: # picoquictest/edge_cases.c
…imited, cc_slow_start_increase pass through
Wow! That's a lot of work. I will review later, but this is impressive. Regarding recovery: yes, that's a bug. We should make sure that recovery is properly entered for cubic, etc. But I share your concern about trying to change too much in a single PR. It is probably better to first complete the code cleanup that you are doing while keeping test results unchanged, then do a series of smaller PRs, each fixing an individual issue. That way we can separate the refactoring, which should not change test results, from the individual improvements, for which we can verify that the test results change in the expected ways.
I left a series of comments, mostly asking for a few comments in the code in places where further action will be needed, plus a suggested name change for one of the cc functions.
I see that all tests are passing, with just a few requiring minor adjustment of the completion time target. For me, that's the key. If you address the comments below, we can probably merge this now. I would not try to solve all problems in this one large PR, because I am concerned about potential merge conflicts if we wait much longer.
We should then open specialized PRs. First, one PR to add support for recovery. Probably another to verify the "app limited" detection and reaction. And another for DCubic, with additional tests to simulate competition on the same link between Cubic (plain) and DCubic, and to investigate potential issues.
 *
 * Isn't win_cubic always larger than cubic_state->W_reno?
 * If clause is unnecessary here.
 */
/* Pick the largest */
Let's review that closely. The Cubic specification assumes a set of "dimensions", such as times in seconds and windows expressed in packets. The picoquic code expects CWIN to be expressed in bytes, with a typical target being "Cwin = bandwidth * RTT", with bandwidth in bytes per second and RTT in seconds; in practice "Cwin = bandwidth * RTT_us / 1000000", since times are measured in microseconds. I think that w_reno and w_cubic are both computed in bytes, and that the test here just follows the spec. I would expect w_reno to be larger than w_cubic when the RTT is small.
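For concreteness, a minimal sketch of that dimensioning; the function name and parameters are illustrative, not picoquic's actual API:

```c
#include <stdint.h>

/* Sketch only: target window computed from bandwidth and RTT.
 * CWIN is kept in bytes; times are in microseconds, hence the
 * division by 1000000 to convert the RTT back to seconds. */
static uint64_t cc_target_cwin(uint64_t bandwidth_bytes_per_sec, uint64_t rtt_us)
{
    return (bandwidth_bytes_per_sec * rtt_us) / 1000000;
}
```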
Ok, I wrote this comment early in December. I can't clearly remember anymore and will have to check that again.
This is a bit silly, but it gets to the point. Cubic computes a variable "K", the time from the start of an era until the cubic curve reaches W = W_MAX (per the formula):
K = cubic_root(W_max*(1-beta_cubic)/C)
The "Cubic" formula is (kinda) independent of the RTT. That's by design. The authors wanted to ensure fairness between connections of different RTT that share the same bottleneck. Otherwise, the connections that have a smaller RTT will get faster feedback, will track the bottleneck faster, etc.
In the example above, I set W_MAX to 5 -- that's the value of the BDP for a 6Mbps link with 10ms delay and 1500 bytes packets. Cubic sets K to 3.75 seconds. That's 375 RTT! By the time we get to T=K, Reno has increased CWND by 375.
That effect persists even if we consider the same RTT and higher bandwidth. The curve above is for a 1.2 Gbps link with the same 10ms RTT. Cubic computes K = 8.55 seconds. Reno reaches W_MAX before that, at T = 5 sec, and it keeps increasing the window after that, being more aggressive than Cubic for the remainder of the concave period. So no, win_cubic is not always larger than cubic_state->W_reno, especially not when the RTT is small.
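A minimal sketch of that K computation, assuming the RFC 8312 defaults C = 0.4 and beta_cubic = 0.7 (the exact K values quoted above depend on the constants actually in use):

```c
#include <math.h>
#include <stdio.h>

/* Sketch only: K = cubic_root(W_max * (1 - beta_cubic) / C),
 * with W_max in packets. The constants below are the RFC 8312
 * defaults, assumed here rather than taken from picoquic. */
static double cubic_k(double w_max)
{
    const double c = 0.4;          /* assumed aggressiveness constant */
    const double beta_cubic = 0.7; /* assumed multiplicative decrease factor */
    return cbrt(w_max * (1.0 - beta_cubic) / c);
}

int main(void)
{
    /* 6 Mbps, 10 ms, 1500-byte packets: BDP of about 5 packets */
    printf("K(W_max = 5)    = %.2f s\n", cubic_k(5.0));
    /* 1.2 Gbps, 10 ms, 1500-byte packets: BDP of about 1000 packets */
    printf("K(W_max = 1000) = %.2f s\n", cubic_k(1000.0));
    return 0;
}
```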
This reverts commit fec4304.
…on), removed unused lines in edge_cases.c
WORK IN PROGRESS
Refactor of the CC code. I moved many common parts to cc_common, but I want the code, and what the CC algorithm is doing, to remain readable and understandable just by looking at the algorithm's own code. So some parts stay in the CC code file even if they duplicate code from another CC algorithm (avoiding generic picoquic_cc_do_congestion_avoidance(cc_algo, cc_param) functions).
Additionally, I try to change CWIN (and other path variables) only inside the CC code file; most of the common CC functions return new values, or the amounts by which to increase them (see the sketch below).
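To illustrate the intent (hypothetical names, not the actual cc_common API):

```c
#include <stdint.h>

/* Sketch only: a common helper is a pure function that returns the new
 * window value; it never writes the path variables itself. */
static uint64_t cc_common_slow_start_increase(uint64_t cwin, uint64_t bytes_acked)
{
    return cwin + bytes_acked;
}

/* The per-algorithm file is then the only place that updates the path:
 *     path_x->cwin = cc_common_slow_start_increase(path_x->cwin, nb_bytes_acked);
 */
```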
While doing some coverage runs, I noticed that CUBIC will never enter the recovery phase anyway (alg_state isn't set to picoquic_cubic_alg_recovery at any time). Even if we call enter_recovery(), we immediately transition to SS or CA. Of course, recovery_sequence is set. Is this intended, or can we remove the recovery state from the enum? Or should alg_state be set to recovery while recovery_sequence is active, as sketched below?
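A hypothetical sketch of that last option; the enum and struct mirror the identifiers above but are simplified stand-ins, not picoquic's actual definitions:

```c
#include <stdint.h>

/* Simplified stand-ins for the cubic state (assumption, not picoquic's). */
typedef enum {
    cubic_alg_slow_start,
    cubic_alg_recovery,
    cubic_alg_congestion_avoidance
} cubic_alg_state_t;

typedef struct {
    cubic_alg_state_t alg_state;
    uint64_t recovery_sequence;
} cubic_state_t;

/* Stay in recovery while acks still cover packets sent before
 * recovery_sequence; fall back to CA once a later packet is acked. */
static void cubic_on_ack(cubic_state_t* cubic_state, uint64_t acked_number)
{
    if (cubic_state->alg_state == cubic_alg_recovery &&
        acked_number >= cubic_state->recovery_sequence) {
        cubic_state->alg_state = cubic_alg_congestion_avoidance;
    }
}
```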
Changes:
TODO:
- loss filter in NewReno
- app limited for NewReno
Any feedback, changes, and help would be appreciated.