feat: Stop the PMTUD search at the interface MTU #2135
Conversation
WIP: Should we optimistically *start* the search at the interface MTU, and only start from 1280 when that fails?
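For illustration, here is a minimal Rust sketch of the idea in this PR: cap the upward PMTUD search at the local interface MTU, since nothing larger can leave the host anyway. The search-table values and the `probe_candidates` helper are made up for this sketch and are not neqo's actual API.

```rust
// Sketch of the core idea: clamp the PMTUD search space to the local
// interface MTU instead of probing all the way up to the spec maximum.
// `SEARCH_TABLE` and `probe_candidates` are illustrative, not neqo's API.

const SEARCH_TABLE: &[usize] = &[1280, 1380, 1420, 1472, 1500, 2047, 4095, 8191, 16383, 32767, 65535];

/// The probe sizes worth trying, given the local interface MTU.
fn probe_candidates(interface_mtu: usize) -> impl Iterator<Item = usize> {
    // There is no point probing beyond what the local interface can emit:
    // any packet larger than the interface MTU is dropped or fragmented
    // before it even reaches the first hop.
    SEARCH_TABLE.iter().copied().filter(move |&s| s <= interface_mtu)
}

fn main() {
    // With a plain Ethernet interface (MTU 1500), the search stops at 1500
    // instead of walking up through the jumbo-frame sizes.
    let sizes: Vec<_> = probe_candidates(1500).collect();
    println!("{sizes:?}"); // [1280, 1380, 1420, 1472, 1500]
}
```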
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files (Coverage Diff, main vs. #2135):

|          | main   | #2135  | +/- |
|----------|--------|--------|-----|
| Coverage | 93.29% | 93.30% |     |
| Files    | 113    | 113    |     |
| Lines    | 36709  | 36727  | +18 |
| Branches | 36709  | 36727  | +18 |
| Hits     | 34248  | 34268  | +20 |
| Misses   | 1678   | 1676   | -2  |
| Partials | 783    | 783    |     |
Are there other projects using this optimistic approach? If I understand RFC 8899 correctly, the local interface MTU is the end value, not the start value.
Failed Interop Tests: QUIC Interop Runner, client vs. server, differences relative to b070663 (neqo-latest as client and as server).
Succeeded Interop Tests: QUIC Interop Runner, client vs. server (neqo-latest as client and as server).
Unsupported Interop Tests: QUIC Interop Runner, client vs. server (neqo-latest as client and as server).
All true, but in practice, the local interface is most often the limiting hop.
Benchmark results: performance differences relative to ad8f390.

- decode 4096 bytes, mask ff: 💚 Performance has improved. time: [11.198 µs 11.242 µs 11.290 µs], change: [-5.9617% -5.5346% -5.0685%] (p = 0.00 < 0.05)
- decode 1048576 bytes, mask ff: 💚 Performance has improved. time: [2.8827 ms 2.8902 ms 2.8995 ms], change: [-7.4445% -7.0117% -6.5851%] (p = 0.00 < 0.05)
- decode 4096 bytes, mask 7f: 💚 Performance has improved. time: [19.516 µs 19.548 µs 19.588 µs], change: [-1.9813% -1.4895% -1.1080%] (p = 0.00 < 0.05)
- decode 1048576 bytes, mask 7f: 💚 Performance has improved. time: [5.0715 ms 5.0842 ms 5.0987 ms], change: [-3.4917% -2.3386% -1.5961%] (p = 0.00 < 0.05)
- decode 4096 bytes, mask 3f: 💚 Performance has improved. time: [5.5312 µs 5.5475 µs 5.5699 µs], change: [-21.078% -19.727% -18.519%] (p = 0.00 < 0.05)
- decode 1048576 bytes, mask 3f: 💚 Performance has improved. time: [1.4168 ms 1.4268 ms 1.4396 ms], change: [-19.583% -18.994% -18.313%] (p = 0.00 < 0.05)
- coalesce_acked_from_zero 1+1 entries: No change in performance detected. time: [98.506 ns 98.819 ns 99.144 ns], change: [-0.4338% +0.0330% +0.4766%] (p = 0.89 > 0.05)
- coalesce_acked_from_zero 3+1 entries: No change in performance detected. time: [116.82 ns 117.14 ns 117.49 ns], change: [-0.1888% +0.1344% +0.4678%] (p = 0.43 > 0.05)
- coalesce_acked_from_zero 10+1 entries: No change in performance detected. time: [116.63 ns 117.20 ns 117.86 ns], change: [-0.0128% +0.6115% +1.2425%] (p = 0.06 > 0.05)
- coalesce_acked_from_zero 1000+1 entries: No change in performance detected. time: [97.408 ns 97.540 ns 97.683 ns], change: [+0.0501% +0.7347% +1.5543%] (p = 0.05 > 0.05)
- RxStreamOrderer::inbound_frame(): Change within noise threshold. time: [111.13 ms 111.27 ms 111.49 ms], change: [-1.0423% -0.7981% -0.5448%] (p = 0.00 < 0.05)
- SentPackets::take_ranges: No change in performance detected. time: [5.5276 µs 5.6914 µs 5.8570 µs], change: [-15.302% -0.4349% +17.872%] (p = 0.94 > 0.05)
- transfer/pacing-false/varying-seeds: 💔 Performance has regressed. time: [42.866 ms 42.945 ms 43.024 ms], change: [+42.829% +45.519% +48.464%] (p = 0.00 < 0.05)
- transfer/pacing-true/varying-seeds: 💔 Performance has regressed. time: [43.121 ms 43.195 ms 43.268 ms], change: [+36.398% +38.723% +41.287%] (p = 0.00 < 0.05)
- transfer/pacing-false/same-seed: 💔 Performance has regressed. time: [42.670 ms 42.734 ms 42.806 ms], change: [+68.397% +72.050% +75.985%] (p = 0.00 < 0.05)
- transfer/pacing-true/same-seed: 💔 Performance has regressed. time: [42.933 ms 42.986 ms 43.039 ms], change: [+46.924% +51.065% +55.444%] (p = 0.00 < 0.05)
- 1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client: No change in performance detected. time: [869.17 ms 878.56 ms 888.07 ms], thrpt: [112.60 MiB/s 113.82 MiB/s 115.05 MiB/s]; change: time: [-0.2526% +1.3795% +2.9237%] (p = 0.09 > 0.05), thrpt: [-2.8406% -1.3607% +0.2533%]
- 1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client: 💚 Performance has improved. time: [297.26 ms 299.20 ms 301.15 ms], thrpt: [33.207 Kelem/s 33.423 Kelem/s 33.640 Kelem/s]; change: time: [-7.0953% -5.9407% -4.8692%] (p = 0.00 < 0.05), thrpt: [+5.1184% +6.3159% +7.6372%]
- 1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client: 💔 Performance has regressed. time: [34.226 ms 34.430 ms 34.655 ms], thrpt: [28.856 elem/s 29.044 elem/s 29.218 elem/s]; change: time: [+1.4019% +2.2722% +3.1384%] (p = 0.00 < 0.05), thrpt: [-3.0429% -2.2218% -1.3825%]
- 1-conn/1-100mb-resp/mtu-1504 (aka. Upload)/client: No change in performance detected. time: [1.6597 s 1.6757 s 1.6916 s], thrpt: [59.115 MiB/s 59.676 MiB/s 60.250 MiB/s]; change: time: [-0.6191% +0.6644% +2.0114%] (p = 0.34 > 0.05), thrpt: [-1.9717% -0.6600% +0.6230%]

Client/server transfer results: transfer of 33554432 bytes over loopback.
This PR exposed a bug in …
Let me make sure I understand the implications here correctly. Sorry for any potential mistakes. We only start probing once the connection is confirmed (neqo/neqo-transport/src/connection/mod.rs, lines 2794 to 2802 at 55e3a93).

Say that a client's path MTU is smaller than their local interface MTU. Given that probing only starts once the connection is confirmed, i.e. after the client receives HANDSHAKE_DONE, a probe at the interface MTU would be lost and only declared lost after a PTO. Thus this optimization, and really all of PMTUD probing, assumes that the potential delay of one subsequent flight of HTTP requests by up to one PTO is worth the trade-off of potentially increasing the overall connection throughput. Is that correct?
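For context, the PTO mentioned above is defined in RFC 9002 §6.2.1. A small sketch of the arithmetic follows; the numbers in `main` are just an example path, not measurements from this PR.

```rust
use std::time::Duration;

// RFC 9002 §6.2.1: PTO = smoothed_rtt + max(4 * rttvar, kGranularity) + max_ack_delay.
// A lost PMTUD probe is only detected once the PTO fires, so this is roughly
// the extra latency a subsequent flight can incur per failed probe.
fn pto(smoothed_rtt: Duration, rttvar: Duration, max_ack_delay: Duration) -> Duration {
    const GRANULARITY: Duration = Duration::from_millis(1); // kGranularity
    smoothed_rtt + (4 * rttvar).max(GRANULARITY) + max_ack_delay
}

fn main() {
    // Example: a 100 ms RTT path with 25 ms rttvar and 25 ms max_ack_delay
    // yields a PTO of 225 ms, the rough worst-case stall per failed probe.
    let p = pto(
        Duration::from_millis(100),
        Duration::from_millis(25),
        Duration::from_millis(25),
    );
    println!("PTO = {p:?}");
}
```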
This would need to change. What I think we should do is roughly this: optimistically start the search at the interface MTU and, if the first n probes at that size are lost, fall back to the regular upward search from 1280 (see the sketch below).
n should probably be something like 2, so we don't cause undue delay.
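A hedged sketch of that fallback logic; the type and method names are illustrative, not neqo's actual `Pmtud` implementation.

```rust
// Probe optimistically at the interface MTU; after `max_failures` consecutive
// losses at that size, give up and let the regular upward search from the
// 1280-byte baseline take over.

const BASE_PLPMTU: usize = 1280; // IPv6 minimum MTU, the QUIC baseline

struct OptimisticProbe {
    interface_mtu: usize,
    failures: usize,
    max_failures: usize, // "something like 2, so we don't cause undue delay"
}

impl OptimisticProbe {
    fn new(interface_mtu: usize) -> Self {
        Self { interface_mtu, failures: 0, max_failures: 2 }
    }

    /// The next optimistic probe size, or `None` once we should fall back to
    /// the regular search starting from `BASE_PLPMTU`.
    fn next_probe(&self) -> Option<usize> {
        (self.failures < self.max_failures).then_some(self.interface_mtu)
    }

    fn on_probe_lost(&mut self) {
        self.failures += 1;
    }
}
```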
In the case where a client's path MTU is smaller than their local interface MTU, this would add a delay of 2*PTO to every connection establishment, right? If so, isn't that a high cost for the potential increase in throughput? Or is such a scenario just very rare?
Yes. I think this is a rare case, but maybe we should add some telemetry first to confirm? We could also cache a probed MTU towards a destination IP, like the OS does for a TCP MSS it has determined.
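That caching idea could look roughly like this (a hypothetical sketch, not an existing neqo API): remember the PMTU discovered towards a destination address, the way an OS caches a validated TCP MSS, so later connections can skip or shorten the search.

```rust
use std::collections::HashMap;
use std::net::IpAddr;
use std::time::{Duration, Instant};

/// Cache of previously discovered PMTUs, keyed by destination address.
struct MtuCache {
    entries: HashMap<IpAddr, (usize, Instant)>,
    ttl: Duration, // routes change, so cached values must expire
}

impl MtuCache {
    fn new(ttl: Duration) -> Self {
        Self { entries: HashMap::new(), ttl }
    }

    /// Record the PMTU that a finished search discovered for `dst`.
    fn insert(&mut self, dst: IpAddr, pmtu: usize) {
        self.entries.insert(dst, (pmtu, Instant::now()));
    }

    /// A still-fresh cached PMTU for `dst`, if we have one.
    fn lookup(&self, dst: &IpAddr) -> Option<usize> {
        self.entries
            .get(dst)
            .filter(|(_, t)| t.elapsed() < self.ttl)
            .map(|&(pmtu, _)| pmtu)
    }
}
```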
How about we: …
I was thinking we just log the local interface MTU together with the discovered PMTU, and check for differences.
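A minimal sketch of that logging, assuming the `log` crate's macros; the function name and message format are made up.

```rust
/// Log when the discovered PMTU differs from the local interface MTU, i.e.
/// when some hop on the path, not the local interface, was the bottleneck.
fn log_mtu_outcome(interface_mtu: usize, discovered_pmtu: usize) {
    if discovered_pmtu < interface_mtu {
        log::info!("PMTUD stopped below interface MTU: pmtu={discovered_pmtu}, if_mtu={interface_mtu}");
    }
}
```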
Generally looking good to me. Minor comments.
…in-reviewers,valentin [mozilla/neqo#2135](mozilla/neqo#2135) adds the `mtu` crate to `neqo-*`. The `mtu` crate depends on `windows-bindgen`, and `windows-bindgen` depends on `rayon` `1.7`. On the other hand, mozilla-central depends on [`rayon` `v1.6.1`](https://searchfox.org/mozilla-central/rev/7987501f2c2ed1914e5c682bd328ace9c4a7c6cd/Cargo.lock#5149-5157). Given that mozilla-central allows at most one version of each crate, let's update mozilla-central to `rayon` `1.10.0`, i.e. the most recent version. See mozilla/neqo#2135 (comment) for details. Differential Revision: https://phabricator.services.mozilla.com/D230127
@mxinden should we update more deps on m-c before merging this, or do we want to handle that when we vendor neqo into m-c the next time?
Other than the …
I think …
If it is audits only, I suggest we "handle that when we vendor neqo into m-c the next time". Though note, auditing …
Do any of the existing unit tests cover the case where `Pmtud` stops at the interface MTU?
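One way such a test could look, sketched against a tiny stand-in model: the `Pmtud` type here is made up so the assertion is concrete, and neqo's real type and test setup differ.

```rust
/// Minimal stand-in for a PMTUD searcher that is capped at the interface MTU.
struct Pmtud {
    interface_mtu: usize,
    plpmtu: usize,
}

impl Pmtud {
    fn new(interface_mtu: usize) -> Self {
        Self { interface_mtu, plpmtu: 1280 }
    }

    /// Next size to probe, or `None` once the search has hit the interface MTU.
    fn next_probe(&self) -> Option<usize> {
        (self.plpmtu < self.interface_mtu)
            .then(|| (self.plpmtu + 64).min(self.interface_mtu))
    }

    fn on_probe_acked(&mut self, size: usize) {
        self.plpmtu = size;
    }
}

#[test]
fn search_stops_at_interface_mtu() {
    let interface_mtu = 1500;
    let mut pmtud = Pmtud::new(interface_mtu);
    // Simulate a fully permissive path: every probe is acknowledged.
    while let Some(size) = pmtud.next_probe() {
        pmtud.on_probe_acked(size);
    }
    // Even then, the search must stop at the interface MTU rather than
    // probing on toward 65535.
    assert_eq!(pmtud.plpmtu, interface_mtu);
}
```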
Co-authored-by: Max Inden <[email protected]> Signed-off-by: Lars Eggert <[email protected]>
I don't think we need to; the author is trusted.
> Do any of the existing unit tests cover the case where `Pmtud` stops at the interface MTU?
Only remaining question. Otherwise this is ready to merge from my end.
Should we also optimistically start the search at the interface MTU, and only start from 1280 when that fails?