[Feature] RB compatibility with compile #2426

Open
wants to merge 13 commits into
base: gh/vmoens/25/base

vmoens
Contributor

@vmoens vmoens commented Sep 9, 2024

[ghstack-poisoned]
@facebook-github-bot facebook-github-bot added the CLA Signed label Sep 9, 2024
vmoens added a commit that referenced this pull request Sep 9, 2024
ghstack-source-id: 08433d2e158f604ffa58d18e6e9bdee73585bc69
Pull Request resolved: #2426

pytorch-bot bot commented Sep 9, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2426

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There is 1 currently active SEV. If your PR is affected, please view it below:

❌ 11 New Failures, 1 Cancelled Job, 8 Unrelated Failures

As of commit 0f1c06a with merge base e294c68:

NEW FAILURES - The following jobs have failed:

CANCELLED JOB - The following job was cancelled. Please retry:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@vmoens vmoens added the enhancement label Sep 11, 2024
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Sep 12, 2024
ghstack-source-id: 417f8298328100389209feb0340d18b599c0a16f
Pull Request resolved: #2426
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Sep 12, 2024
ghstack-source-id: 8f95c3526ddfe9f5bc91e8f4fdb7bf88fa6f5673
Pull Request resolved: #2426
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Sep 13, 2024
ghstack-source-id: ae562f3418852f6b21f61666b6230d149897cc87
Pull Request resolved: #2426
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Sep 13, 2024
ghstack-source-id: 65c3b848941214b4df85717c7ca1683d419ef9a3
Pull Request resolved: #2426
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Sep 13, 2024
ghstack-source-id: 06ddce1e213016f26da0c1de7baf7fcefeac3dcc
Pull Request resolved: #2426
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Sep 16, 2024
ghstack-source-id: 95a53dc8d434909d930b12a5bb9c69028caec918
Pull Request resolved: #2426
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Sep 17, 2024
ghstack-source-id: 803de44200c0e113df8abe17b059265ea794c627
Pull Request resolved: #2426
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Sep 17, 2024
ghstack-source-id: 33433a9eb9aa83e3ca603a56c7aa2bb066f1b9da
Pull Request resolved: #2426

github-actions bot commented Sep 17, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 146. Improved: $\large\color{#35bf28}10$. Worsened: $\large\color{#d91a1a}1$.

Name Max Mean Ops Ops on Repo HEAD Change
test_single 59.4400ms 59.1340ms 16.9107 Ops/s 16.8804 Ops/s $\color{#35bf28}+0.18\%$
test_sync 48.8756ms 32.1334ms 31.1203 Ops/s 31.1645 Ops/s $\color{#d91a1a}-0.14\%$
test_async 60.2030ms 30.8354ms 32.4303 Ops/s 32.0947 Ops/s $\color{#35bf28}+1.05\%$
test_simple 0.4904s 0.4214s 2.3731 Ops/s 2.4769 Ops/s $\color{#d91a1a}-4.19\%$
test_transformed 0.5605s 0.5571s 1.7950 Ops/s 1.7728 Ops/s $\color{#35bf28}+1.26\%$
test_serial 1.2752s 1.2692s 0.7879 Ops/s 0.7774 Ops/s $\color{#35bf28}+1.34\%$
test_parallel 1.1828s 1.1107s 0.9003 Ops/s 0.8937 Ops/s $\color{#35bf28}+0.74\%$
test_step_mdp_speed[True-True-True-True-True] 0.3600ms 27.0282μs 36.9984 KOps/s 36.3851 KOps/s $\color{#35bf28}+1.69\%$
test_step_mdp_speed[True-True-True-True-False] 47.0280μs 15.9289μs 62.7788 KOps/s 61.1313 KOps/s $\color{#35bf28}+2.70\%$
test_step_mdp_speed[True-True-True-False-True] 41.8180μs 15.6611μs 63.8523 KOps/s 62.4433 KOps/s $\color{#35bf28}+2.26\%$
test_step_mdp_speed[True-True-True-False-False] 32.5610μs 9.2280μs 108.3661 KOps/s 106.4307 KOps/s $\color{#35bf28}+1.82\%$
test_step_mdp_speed[True-True-False-True-True] 87.4340μs 28.9867μs 34.4985 KOps/s 33.5183 KOps/s $\color{#35bf28}+2.92\%$
test_step_mdp_speed[True-True-False-True-False] 59.0010μs 17.4422μs 57.3323 KOps/s 55.4692 KOps/s $\color{#35bf28}+3.36\%$
test_step_mdp_speed[True-True-False-False-True] 46.2260μs 17.1413μs 58.3385 KOps/s 57.0434 KOps/s $\color{#35bf28}+2.27\%$
test_step_mdp_speed[True-True-False-False-False] 52.5600μs 10.7258μs 93.2334 KOps/s 89.4337 KOps/s $\color{#35bf28}+4.25\%$
test_step_mdp_speed[True-False-True-True-True] 68.6390μs 30.7054μs 32.5675 KOps/s 31.9481 KOps/s $\color{#35bf28}+1.94\%$
test_step_mdp_speed[True-False-True-True-False] 66.9740μs 19.0180μs 52.5818 KOps/s 51.0739 KOps/s $\color{#35bf28}+2.95\%$
test_step_mdp_speed[True-False-True-False-True] 57.3880μs 17.1627μs 58.2658 KOps/s 56.9233 KOps/s $\color{#35bf28}+2.36\%$
test_step_mdp_speed[True-False-True-False-False] 35.8380μs 10.6529μs 93.8708 KOps/s 89.8737 KOps/s $\color{#35bf28}+4.45\%$
test_step_mdp_speed[True-False-False-True-True] 75.8320μs 32.2925μs 30.9669 KOps/s 30.4552 KOps/s $\color{#35bf28}+1.68\%$
test_step_mdp_speed[True-False-False-True-False] 63.5090μs 20.7021μs 48.3043 KOps/s 46.9123 KOps/s $\color{#35bf28}+2.97\%$
test_step_mdp_speed[True-False-False-False-True] 66.2540μs 18.6272μs 53.6850 KOps/s 52.7706 KOps/s $\color{#35bf28}+1.73\%$
test_step_mdp_speed[True-False-False-False-False] 65.4350μs 12.1999μs 81.9679 KOps/s 78.7651 KOps/s $\color{#35bf28}+4.07\%$
test_step_mdp_speed[False-True-True-True-True] 62.8990μs 30.1962μs 33.1168 KOps/s 31.8970 KOps/s $\color{#35bf28}+3.82\%$
test_step_mdp_speed[False-True-True-True-False] 45.2250μs 18.9875μs 52.6663 KOps/s 50.1618 KOps/s $\color{#35bf28}+4.99\%$
test_step_mdp_speed[False-True-True-False-True] 53.6500μs 19.4710μs 51.3585 KOps/s 49.2904 KOps/s $\color{#35bf28}+4.20\%$
test_step_mdp_speed[False-True-True-False-False] 71.2830μs 12.1018μs 82.6321 KOps/s 81.9654 KOps/s $\color{#35bf28}+0.81\%$
test_step_mdp_speed[False-True-False-True-True] 81.6740μs 32.0295μs 31.2212 KOps/s 30.7836 KOps/s $\color{#35bf28}+1.42\%$
test_step_mdp_speed[False-True-False-True-False] 51.0150μs 20.6858μs 48.3423 KOps/s 47.2530 KOps/s $\color{#35bf28}+2.31\%$
test_step_mdp_speed[False-True-False-False-True] 3.1994ms 21.2816μs 46.9890 KOps/s 46.0890 KOps/s $\color{#35bf28}+1.95\%$
test_step_mdp_speed[False-True-False-False-False] 40.2760μs 13.4687μs 74.2464 KOps/s 71.2501 KOps/s $\color{#35bf28}+4.21\%$
test_step_mdp_speed[False-False-True-True-True] 92.0030μs 33.7967μs 29.5887 KOps/s 28.7163 KOps/s $\color{#35bf28}+3.04\%$
test_step_mdp_speed[False-False-True-True-False] 51.3360μs 22.2836μs 44.8760 KOps/s 43.3890 KOps/s $\color{#35bf28}+3.43\%$
test_step_mdp_speed[False-False-True-False-True] 57.2980μs 22.0697μs 45.3110 KOps/s 46.2029 KOps/s $\color{#d91a1a}-1.93\%$
test_step_mdp_speed[False-False-True-False-False] 40.7070μs 13.4204μs 74.5133 KOps/s 72.3049 KOps/s $\color{#35bf28}+3.05\%$
test_step_mdp_speed[False-False-False-True-True] 70.1110μs 35.0183μs 28.5565 KOps/s 27.9638 KOps/s $\color{#35bf28}+2.12\%$
test_step_mdp_speed[False-False-False-True-False] 55.7250μs 23.5344μs 42.4910 KOps/s 41.1385 KOps/s $\color{#35bf28}+3.29\%$
test_step_mdp_speed[False-False-False-False-True] 0.5519ms 22.5501μs 44.3456 KOps/s 43.3895 KOps/s $\color{#35bf28}+2.20\%$
test_step_mdp_speed[False-False-False-False-False] 41.7480μs 14.7937μs 67.5963 KOps/s 65.3296 KOps/s $\color{#35bf28}+3.47\%$
test_values[generalized_advantage_estimate-True-True] 11.9759ms 9.5901ms 104.2738 Ops/s 103.1534 Ops/s $\color{#35bf28}+1.09\%$
test_values[vec_generalized_advantage_estimate-True-True] 36.9134ms 32.8704ms 30.4225 Ops/s 28.5524 Ops/s $\textbf{\color{#35bf28}+6.55\%}$
test_values[td0_return_estimate-False-False] 0.2129ms 0.1678ms 5.9579 KOps/s 6.0061 KOps/s $\color{#d91a1a}-0.80\%$
test_values[td1_return_estimate-False-False] 24.3473ms 24.0317ms 41.6117 Ops/s 41.8557 Ops/s $\color{#d91a1a}-0.58\%$
test_values[vec_td1_return_estimate-False-False] 34.9579ms 32.9833ms 30.3184 Ops/s 28.4258 Ops/s $\textbf{\color{#35bf28}+6.66\%}$
test_values[td_lambda_return_estimate-True-False] 37.6909ms 34.1886ms 29.2495 Ops/s 29.0285 Ops/s $\color{#35bf28}+0.76\%$
test_values[vec_td_lambda_return_estimate-True-False] 46.0500ms 33.3857ms 29.9529 Ops/s 28.4321 Ops/s $\textbf{\color{#35bf28}+5.35\%}$
test_gae_speed[generalized_advantage_estimate-False-1-512] 18.1904ms 8.4447ms 118.4178 Ops/s 116.7410 Ops/s $\color{#35bf28}+1.44\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.5540ms 1.9326ms 517.4479 Ops/s 500.8306 Ops/s $\color{#35bf28}+3.32\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.6688ms 0.3598ms 2.7793 KOps/s 2.7952 KOps/s $\color{#d91a1a}-0.57\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 42.6946ms 39.4709ms 25.3351 Ops/s 22.1121 Ops/s $\textbf{\color{#35bf28}+14.58\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 4.3106ms 3.0649ms 326.2708 Ops/s 330.5679 Ops/s $\color{#d91a1a}-1.30\%$
test_dqn_speed[False-None] 6.2135ms 1.3116ms 762.4055 Ops/s 760.0543 Ops/s $\color{#35bf28}+0.31\%$
test_dqn_speed[False-backward] 1.7990ms 1.7586ms 568.6467 Ops/s 565.6315 Ops/s $\color{#35bf28}+0.53\%$
test_dqn_speed[True-None] 1.2122ms 0.4571ms 2.1876 KOps/s 2.1896 KOps/s $\color{#d91a1a}-0.09\%$
test_dqn_speed[True-backward] 0.9338ms 0.8700ms 1.1494 KOps/s 1.0425 KOps/s $\textbf{\color{#35bf28}+10.26\%}$
test_dqn_speed[reduce-overhead-None] 0.6678ms 0.4645ms 2.1526 KOps/s 2.1656 KOps/s $\color{#d91a1a}-0.60\%$
test_dqn_speed[reduce-overhead-backward] 0.9876ms 0.8703ms 1.1491 KOps/s 1.1418 KOps/s $\color{#35bf28}+0.64\%$
test_ddpg_speed[False-None] 3.5658ms 2.7341ms 365.7517 Ops/s 363.8658 Ops/s $\color{#35bf28}+0.52\%$
test_ddpg_speed[False-backward] 4.0503ms 3.8455ms 260.0462 Ops/s 259.2307 Ops/s $\color{#35bf28}+0.31\%$
test_ddpg_speed[True-None] 1.3553ms 0.9964ms 1.0037 KOps/s 984.9181 Ops/s $\color{#35bf28}+1.90\%$
test_ddpg_speed[True-backward] 1.9434ms 1.8759ms 533.0780 Ops/s 436.1926 Ops/s $\textbf{\color{#35bf28}+22.21\%}$
test_ddpg_speed[reduce-overhead-None] 1.2029ms 0.9969ms 1.0031 KOps/s 984.9910 Ops/s $\color{#35bf28}+1.84\%$
test_ddpg_speed[reduce-overhead-backward] 1.9140ms 1.8686ms 535.1681 Ops/s 516.2357 Ops/s $\color{#35bf28}+3.67\%$
test_sac_speed[False-None] 8.6223ms 7.7314ms 129.3423 Ops/s 129.2768 Ops/s $\color{#35bf28}+0.05\%$
test_sac_speed[False-backward] 10.8079ms 10.4175ms 95.9922 Ops/s 96.2021 Ops/s $\color{#d91a1a}-0.22\%$
test_sac_speed[True-None] 2.4000ms 1.8349ms 544.9837 Ops/s 538.9348 Ops/s $\color{#35bf28}+1.12\%$
test_sac_speed[True-backward] 3.8784ms 3.5234ms 283.8171 Ops/s 280.9428 Ops/s $\color{#35bf28}+1.02\%$
test_sac_speed[reduce-overhead-None] 2.0253ms 1.8314ms 546.0375 Ops/s 527.9417 Ops/s $\color{#35bf28}+3.43\%$
test_sac_speed[reduce-overhead-backward] 3.5422ms 3.4936ms 286.2341 Ops/s 280.9712 Ops/s $\color{#35bf28}+1.87\%$
test_redq_speed[False-None] 14.1734ms 12.6071ms 79.3203 Ops/s 78.9883 Ops/s $\color{#35bf28}+0.42\%$
test_redq_speed[False-backward] 22.8912ms 21.8483ms 45.7702 Ops/s 45.6688 Ops/s $\color{#35bf28}+0.22\%$
test_redq_speed[True-None] 5.1890ms 4.5128ms 221.5911 Ops/s 217.1581 Ops/s $\color{#35bf28}+2.04\%$
test_redq_speed[True-backward] 13.5657ms 11.9112ms 83.9543 Ops/s 83.4122 Ops/s $\color{#35bf28}+0.65\%$
test_redq_speed[reduce-overhead-None] 5.2081ms 4.4454ms 224.9494 Ops/s 218.4377 Ops/s $\color{#35bf28}+2.98\%$
test_redq_speed[reduce-overhead-backward] 11.9542ms 11.6027ms 86.1870 Ops/s 84.0235 Ops/s $\color{#35bf28}+2.57\%$
test_redq_deprec_speed[False-None] 13.9124ms 12.3692ms 80.8460 Ops/s 56.4807 Ops/s $\textbf{\color{#35bf28}+43.14\%}$
test_redq_deprec_speed[False-backward] 19.3723ms 18.0113ms 55.5206 Ops/s 55.4410 Ops/s $\color{#35bf28}+0.14\%$
test_redq_deprec_speed[True-None] 4.0891ms 3.5100ms 284.8966 Ops/s 279.2383 Ops/s $\color{#35bf28}+2.03\%$
test_redq_deprec_speed[True-backward] 8.0169ms 7.8178ms 127.9140 Ops/s 125.8033 Ops/s $\color{#35bf28}+1.68\%$
test_redq_deprec_speed[reduce-overhead-None] 4.2623ms 3.5176ms 284.2842 Ops/s 278.1043 Ops/s $\color{#35bf28}+2.22\%$
test_redq_deprec_speed[reduce-overhead-backward] 8.6467ms 7.8308ms 127.7011 Ops/s 125.2991 Ops/s $\color{#35bf28}+1.92\%$
test_td3_speed[False-None] 32.2790ms 7.8141ms 127.9739 Ops/s 129.0423 Ops/s $\color{#d91a1a}-0.83\%$
test_td3_speed[False-backward] 11.2242ms 9.9912ms 100.0879 Ops/s 98.5224 Ops/s $\color{#35bf28}+1.59\%$
test_td3_speed[True-None] 2.0147ms 1.9084ms 524.0063 Ops/s 503.4777 Ops/s $\color{#35bf28}+4.08\%$
test_td3_speed[True-backward] 3.5878ms 3.5233ms 283.8281 Ops/s 279.0328 Ops/s $\color{#35bf28}+1.72\%$
test_td3_speed[reduce-overhead-None] 2.1878ms 1.9093ms 523.7499 Ops/s 503.5176 Ops/s $\color{#35bf28}+4.02\%$
test_td3_speed[reduce-overhead-backward] 3.9997ms 3.4806ms 287.3028 Ops/s 279.4101 Ops/s $\color{#35bf28}+2.82\%$
test_cql_speed[False-None] 38.3249ms 34.9823ms 28.5859 Ops/s 28.2266 Ops/s $\color{#35bf28}+1.27\%$
test_cql_speed[False-backward] 47.1367ms 44.5045ms 22.4697 Ops/s 22.0382 Ops/s $\color{#35bf28}+1.96\%$
test_cql_speed[True-None] 16.2691ms 15.2243ms 65.6844 Ops/s 64.3894 Ops/s $\color{#35bf28}+2.01\%$
test_cql_speed[True-backward] 24.4449ms 21.4987ms 46.5145 Ops/s 45.5053 Ops/s $\color{#35bf28}+2.22\%$
test_cql_speed[reduce-overhead-None] 16.4927ms 15.2642ms 65.5130 Ops/s 64.1231 Ops/s $\color{#35bf28}+2.17\%$
test_cql_speed[reduce-overhead-backward] 22.3724ms 21.6651ms 46.1571 Ops/s 45.7297 Ops/s $\color{#35bf28}+0.93\%$
test_a2c_speed[False-None] 8.2114ms 6.9933ms 142.9934 Ops/s 140.0789 Ops/s $\color{#35bf28}+2.08\%$
test_a2c_speed[False-backward] 14.7141ms 13.7638ms 72.6545 Ops/s 71.1617 Ops/s $\color{#35bf28}+2.10\%$
test_a2c_speed[True-None] 3.6046ms 3.2852ms 304.3946 Ops/s 299.8929 Ops/s $\color{#35bf28}+1.50\%$
test_a2c_speed[True-backward] 9.9190ms 9.5588ms 104.6162 Ops/s 103.4296 Ops/s $\color{#35bf28}+1.15\%$
test_a2c_speed[reduce-overhead-None] 3.6504ms 3.2881ms 304.1235 Ops/s 299.6899 Ops/s $\color{#35bf28}+1.48\%$
test_a2c_speed[reduce-overhead-backward] 10.1574ms 9.5686ms 104.5086 Ops/s 103.1269 Ops/s $\color{#35bf28}+1.34\%$
test_ppo_speed[False-None] 8.3856ms 7.2716ms 137.5220 Ops/s 134.8254 Ops/s $\color{#35bf28}+2.00\%$
test_ppo_speed[False-backward] 15.0939ms 14.1538ms 70.6525 Ops/s 69.0706 Ops/s $\color{#35bf28}+2.29\%$
test_ppo_speed[True-None] 3.9814ms 3.6608ms 273.1663 Ops/s 267.3169 Ops/s $\color{#35bf28}+2.19\%$
test_ppo_speed[True-backward] 9.9739ms 9.4425ms 105.9045 Ops/s 105.1934 Ops/s $\color{#35bf28}+0.68\%$
test_ppo_speed[reduce-overhead-None] 4.3623ms 3.6652ms 272.8367 Ops/s 261.9883 Ops/s $\color{#35bf28}+4.14\%$
test_ppo_speed[reduce-overhead-backward] 10.3148ms 9.4684ms 105.6144 Ops/s 104.8475 Ops/s $\color{#35bf28}+0.73\%$
test_reinforce_speed[False-None] 7.2494ms 6.3886ms 156.5283 Ops/s 153.7750 Ops/s $\color{#35bf28}+1.79\%$
test_reinforce_speed[False-backward] 11.4477ms 9.5792ms 104.3927 Ops/s 103.5066 Ops/s $\color{#35bf28}+0.86\%$
test_reinforce_speed[True-None] 2.9956ms 2.6033ms 384.1300 Ops/s 377.8198 Ops/s $\color{#35bf28}+1.67\%$
test_reinforce_speed[True-backward] 9.5829ms 8.4324ms 118.5903 Ops/s 117.7530 Ops/s $\color{#35bf28}+0.71\%$
test_reinforce_speed[reduce-overhead-None] 3.0805ms 2.5969ms 385.0779 Ops/s 375.7357 Ops/s $\color{#35bf28}+2.49\%$
test_reinforce_speed[reduce-overhead-backward] 8.7982ms 8.3970ms 119.0900 Ops/s 117.3259 Ops/s $\color{#35bf28}+1.50\%$
test_iql_speed[False-None] 32.0032ms 31.5883ms 31.6572 Ops/s 30.9282 Ops/s $\color{#35bf28}+2.36\%$
test_iql_speed[False-backward] 45.8016ms 44.1677ms 22.6410 Ops/s 22.5180 Ops/s $\color{#35bf28}+0.55\%$
test_iql_speed[True-None] 14.9567ms 13.1600ms 75.9880 Ops/s 75.8510 Ops/s $\color{#35bf28}+0.18\%$
test_iql_speed[True-backward] 25.6875ms 23.7961ms 42.0238 Ops/s 41.5100 Ops/s $\color{#35bf28}+1.24\%$
test_iql_speed[reduce-overhead-None] 15.4067ms 13.1224ms 76.2055 Ops/s 75.0384 Ops/s $\color{#35bf28}+1.56\%$
test_iql_speed[reduce-overhead-backward] 24.5924ms 23.5990ms 42.3747 Ops/s 41.6926 Ops/s $\color{#35bf28}+1.64\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.8979ms 4.8857ms 204.6801 Ops/s 196.2882 Ops/s $\color{#35bf28}+4.28\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.0051ms 0.4782ms 2.0911 KOps/s 2.0977 KOps/s $\color{#d91a1a}-0.31\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6578ms 0.4513ms 2.2157 KOps/s 2.2038 KOps/s $\color{#35bf28}+0.54\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.1657ms 4.8253ms 207.2395 Ops/s 204.1685 Ops/s $\color{#35bf28}+1.50\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.9109ms 0.4674ms 2.1394 KOps/s 884.0062 Ops/s $\textbf{\color{#35bf28}+142.01\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6306ms 0.4421ms 2.2618 KOps/s 2.2291 KOps/s $\color{#35bf28}+1.47\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.9178ms 1.6070ms 622.2917 Ops/s 612.6850 Ops/s $\color{#35bf28}+1.57\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.0452ms 1.5177ms 658.8863 Ops/s 648.4717 Ops/s $\color{#35bf28}+1.61\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.2989ms 5.0052ms 199.7939 Ops/s 190.6060 Ops/s $\color{#35bf28}+4.82\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.8579ms 0.6055ms 1.6516 KOps/s 1.6154 KOps/s $\color{#35bf28}+2.24\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7980ms 0.5756ms 1.7372 KOps/s 1.7125 KOps/s $\color{#35bf28}+1.45\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.8749ms 4.9062ms 203.8240 Ops/s 194.8205 Ops/s $\color{#35bf28}+4.62\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.5848ms 0.4806ms 2.0809 KOps/s 2.0524 KOps/s $\color{#35bf28}+1.39\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6662ms 0.4535ms 2.2052 KOps/s 2.2084 KOps/s $\color{#d91a1a}-0.15\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.1715ms 4.7989ms 208.3808 Ops/s 199.0358 Ops/s $\color{#35bf28}+4.70\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.7043ms 0.4711ms 2.1228 KOps/s 2.1239 KOps/s $\color{#d91a1a}-0.05\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 7.3053ms 0.4536ms 2.2045 KOps/s 2.2098 KOps/s $\color{#d91a1a}-0.24\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.2733ms 5.0144ms 199.4268 Ops/s 192.3044 Ops/s $\color{#35bf28}+3.70\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.3002ms 0.6100ms 1.6393 KOps/s 1.6339 KOps/s $\color{#35bf28}+0.33\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7459ms 0.5773ms 1.7323 KOps/s 1.6979 KOps/s $\color{#35bf28}+2.02\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.7540ms 4.1800ms 239.2323 Ops/s 232.3989 Ops/s $\color{#35bf28}+2.94\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 17.1573ms 12.9391ms 77.2852 Ops/s 75.2602 Ops/s $\color{#35bf28}+2.69\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.7797ms 1.2544ms 797.1689 Ops/s 759.0832 Ops/s $\textbf{\color{#35bf28}+5.02\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.3418s 10.9293ms 91.4976 Ops/s 239.7535 Ops/s $\textbf{\color{#d91a1a}-61.84\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 17.7496ms 12.9676ms 77.1154 Ops/s 75.1776 Ops/s $\color{#35bf28}+2.58\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 6.7147ms 1.3857ms 721.6552 Ops/s 724.1087 Ops/s $\color{#d91a1a}-0.34\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 5.5081ms 4.3231ms 231.3181 Ops/s 37.7422 Ops/s $\textbf{\color{#35bf28}+512.89\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 17.2801ms 13.0207ms 76.8008 Ops/s 73.4667 Ops/s $\color{#35bf28}+4.54\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 5.9911ms 1.5244ms 655.9835 Ops/s 673.6951 Ops/s $\color{#d91a1a}-2.63\%$
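
The Change column above appears to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column. The sketch below recomputes it for three rows copied from the table, converting KOps/s to Ops/s first. The helper names (`to_ops_per_sec`, `pct_change`, `classify`) are hypothetical and not part of the benchmark tooling. The 5% cutoff is an assumption inferred from which rows are bolded in the table, not a documented setting.

```python
# Hypothetical helpers for re-deriving the "Change" column; not the bot's code.

# Multipliers for the throughput units that appear in the table.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3}


def to_ops_per_sec(cell: str) -> float:
    """Parse a cell such as '2.1394 KOps/s' into operations per second."""
    value, unit = cell.split()
    return float(value) * UNIT_SCALE[unit]


def pct_change(ops_pr: str, ops_head: str) -> float:
    """Relative throughput change of the PR versus repo HEAD, in percent."""
    pr = to_ops_per_sec(ops_pr)
    head = to_ops_per_sec(ops_head)
    return 100.0 * (pr - head) / head


def classify(change: float, threshold: float = 5.0) -> str:
    """Label a change; the 5% threshold is an assumption, see lead-in."""
    if change > threshold:
        return "improved"
    if change < -threshold:
        return "worsened"
    return "unchanged"


# (name, Ops, Ops on Repo HEAD) copied from rows of the CPU table.
rows = [
    ("test_single", "16.9107 Ops/s", "16.8804 Ops/s"),
    (
        "test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-"
        "SamplerWithoutReplacement-10000]",
        "2.1394 KOps/s",
        "884.0062 Ops/s",
    ),
    (
        "test_rb_populate[TensorDictReplayBuffer-ListStorage-"
        "SamplerWithoutReplacement-400]",
        "91.4976 Ops/s",
        "239.7535 Ops/s",
    ),
]

for name, ops_pr, ops_head in rows:
    change = pct_change(ops_pr, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under the assumed 5% cutoff, counting the rows of the CPU table whose change exceeds the threshold in either direction gives 10 improved and 1 worsened, which agrees with the summary line of that comment.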

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Sep 17, 2024
ghstack-source-id: 65f013f0ea3531ce0b6f778bc771bcc845d961d1
Pull Request resolved: #2426

github-actions bot commented Sep 17, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}15$. Worsened: $\large\color{#d91a1a}5$.

Name Max Mean Ops Ops on Repo HEAD Change
test_single 0.1036s 0.1018s 9.8249 Ops/s 9.8670 Ops/s $\color{#d91a1a}-0.43\%$
test_sync 89.3994ms 87.0710ms 11.4849 Ops/s 11.3383 Ops/s $\color{#35bf28}+1.29\%$
test_async 0.2604s 84.5743ms 11.8239 Ops/s 12.0116 Ops/s $\color{#d91a1a}-1.56\%$
test_single_pixels 0.1084s 0.1080s 9.2604 Ops/s 9.3006 Ops/s $\color{#d91a1a}-0.43\%$
test_sync_pixels 71.7855ms 70.5214ms 14.1801 Ops/s 14.1185 Ops/s $\color{#35bf28}+0.44\%$
test_async_pixels 0.2021s 66.3508ms 15.0714 Ops/s 15.0553 Ops/s $\color{#35bf28}+0.11\%$
test_simple 0.7398s 0.7253s 1.3787 Ops/s 1.3290 Ops/s $\color{#35bf28}+3.73\%$
test_transformed 0.9566s 0.9501s 1.0525 Ops/s 1.0382 Ops/s $\color{#35bf28}+1.37\%$
test_serial 2.1193s 2.0718s 0.4827 Ops/s 0.4912 Ops/s $\color{#d91a1a}-1.73\%$
test_parallel 1.8573s 1.8407s 0.5433 Ops/s 0.5358 Ops/s $\color{#35bf28}+1.40\%$
test_step_mdp_speed[True-True-True-True-True] 0.1848ms 35.6284μs 28.0675 KOps/s 27.8658 KOps/s $\color{#35bf28}+0.72\%$
test_step_mdp_speed[True-True-True-True-False] 55.3310μs 20.5130μs 48.7497 KOps/s 48.4277 KOps/s $\color{#35bf28}+0.66\%$
test_step_mdp_speed[True-True-True-False-True] 45.4510μs 19.7832μs 50.5478 KOps/s 48.7690 KOps/s $\color{#35bf28}+3.65\%$
test_step_mdp_speed[True-True-True-False-False] 38.0910μs 11.6314μs 85.9745 KOps/s 84.7535 KOps/s $\color{#35bf28}+1.44\%$
test_step_mdp_speed[True-True-False-True-True] 72.9420μs 37.7241μs 26.5083 KOps/s 26.4843 KOps/s $\color{#35bf28}+0.09\%$
test_step_mdp_speed[True-True-False-True-False] 47.0910μs 22.3404μs 44.7619 KOps/s 43.8857 KOps/s $\color{#35bf28}+2.00\%$
test_step_mdp_speed[True-True-False-False-True] 55.7220μs 22.4009μs 44.6410 KOps/s 45.3678 KOps/s $\color{#d91a1a}-1.60\%$
test_step_mdp_speed[True-True-False-False-False] 50.8010μs 13.7039μs 72.9721 KOps/s 72.6055 KOps/s $\color{#35bf28}+0.50\%$
test_step_mdp_speed[True-False-True-True-True] 73.7610μs 40.3216μs 24.8006 KOps/s 24.8588 KOps/s $\color{#d91a1a}-0.23\%$
test_step_mdp_speed[True-False-True-True-False] 43.6710μs 24.8496μs 40.2422 KOps/s 40.2032 KOps/s $\color{#35bf28}+0.10\%$
test_step_mdp_speed[True-False-True-False-True] 63.1620μs 22.6778μs 44.0960 KOps/s 45.0370 KOps/s $\color{#d91a1a}-2.09\%$
test_step_mdp_speed[True-False-True-False-False] 43.7810μs 13.7477μs 72.7393 KOps/s 72.5380 KOps/s $\color{#35bf28}+0.28\%$
test_step_mdp_speed[True-False-False-True-True] 68.5620μs 42.2892μs 23.6467 KOps/s 23.6644 KOps/s $\color{#d91a1a}-0.07\%$
test_step_mdp_speed[True-False-False-True-False] 56.0910μs 26.4666μs 37.7834 KOps/s 37.5948 KOps/s $\color{#35bf28}+0.50\%$
test_step_mdp_speed[True-False-False-False-True] 52.4510μs 24.2210μs 41.2865 KOps/s 41.5255 KOps/s $\color{#d91a1a}-0.58\%$
test_step_mdp_speed[True-False-False-False-False] 50.1310μs 15.6728μs 63.8046 KOps/s 62.8542 KOps/s $\color{#35bf28}+1.51\%$
test_step_mdp_speed[False-True-True-True-True] 70.1820μs 40.1355μs 24.9156 KOps/s 24.7894 KOps/s $\color{#35bf28}+0.51\%$
test_step_mdp_speed[False-True-True-True-False] 58.2310μs 24.7609μs 40.3863 KOps/s 39.5658 KOps/s $\color{#35bf28}+2.07\%$
test_step_mdp_speed[False-True-True-False-True] 55.8810μs 25.8924μs 38.6214 KOps/s 39.6375 KOps/s $\color{#d91a1a}-2.56\%$
test_step_mdp_speed[False-True-True-False-False] 38.5710μs 15.2011μs 65.7849 KOps/s 64.2801 KOps/s $\color{#35bf28}+2.34\%$
test_step_mdp_speed[False-True-False-True-True] 72.9720μs 41.9072μs 23.8623 KOps/s 23.9696 KOps/s $\color{#d91a1a}-0.45\%$
test_step_mdp_speed[False-True-False-True-False] 54.7710μs 26.3901μs 37.8931 KOps/s 37.1065 KOps/s $\color{#35bf28}+2.12\%$
test_step_mdp_speed[False-True-False-False-True] 3.5592ms 27.4954μs 36.3698 KOps/s 36.3641 KOps/s $\color{#35bf28}+0.02\%$
test_step_mdp_speed[False-True-False-False-False] 42.1310μs 17.3407μs 57.6679 KOps/s 57.3170 KOps/s $\color{#35bf28}+0.61\%$
test_step_mdp_speed[False-False-True-True-True] 79.1520μs 43.6488μs 22.9101 KOps/s 22.5131 KOps/s $\color{#35bf28}+1.76\%$
test_step_mdp_speed[False-False-True-True-False] 55.0610μs 28.4097μs 35.1992 KOps/s 34.2951 KOps/s $\color{#35bf28}+2.64\%$
test_step_mdp_speed[False-False-True-False-True] 55.3820μs 27.5875μs 36.2483 KOps/s 36.3613 KOps/s $\color{#d91a1a}-0.31\%$
test_step_mdp_speed[False-False-True-False-False] 48.7410μs 17.0952μs 58.4959 KOps/s 57.5235 KOps/s $\color{#35bf28}+1.69\%$
test_step_mdp_speed[False-False-False-True-True] 81.5320μs 45.3623μs 22.0447 KOps/s 21.6560 KOps/s $\color{#35bf28}+1.80\%$
test_step_mdp_speed[False-False-False-True-False] 71.2210μs 29.9799μs 33.3556 KOps/s 32.4400 KOps/s $\color{#35bf28}+2.82\%$
test_step_mdp_speed[False-False-False-False-True] 52.0510μs 27.9375μs 35.7941 KOps/s 35.1026 KOps/s $\color{#35bf28}+1.97\%$
test_step_mdp_speed[False-False-False-False-False] 47.4210μs 18.8610μs 53.0194 KOps/s 52.0081 KOps/s $\color{#35bf28}+1.94\%$
test_values[generalized_advantage_estimate-True-True] 25.9757ms 24.5649ms 40.7085 Ops/s 42.8252 Ops/s $\color{#d91a1a}-4.94\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1086s 3.0540ms 327.4419 Ops/s 344.7722 Ops/s $\textbf{\color{#d91a1a}-5.03\%}$
test_values[td0_return_estimate-False-False] 95.8920μs 65.4893μs 15.2697 KOps/s 15.2303 KOps/s $\color{#35bf28}+0.26\%$
test_values[td1_return_estimate-False-False] 57.3846ms 53.5467ms 18.6753 Ops/s 18.9755 Ops/s $\color{#d91a1a}-1.58\%$
test_values[vec_td1_return_estimate-False-False] 1.3488ms 1.0615ms 942.0204 Ops/s 947.3363 Ops/s $\color{#d91a1a}-0.56\%$
test_values[td_lambda_return_estimate-True-False] 89.6038ms 84.8194ms 11.7898 Ops/s 11.9884 Ops/s $\color{#d91a1a}-1.66\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.3883ms 1.0591ms 944.1592 Ops/s 953.4763 Ops/s $\color{#d91a1a}-0.98\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 23.4873ms 23.3693ms 42.7912 Ops/s 42.8921 Ops/s $\color{#d91a1a}-0.24\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 0.9810ms 0.7064ms 1.4157 KOps/s 1.4275 KOps/s $\color{#d91a1a}-0.83\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7142ms 0.6459ms 1.5482 KOps/s 1.5525 KOps/s $\color{#d91a1a}-0.28\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.4903ms 1.4524ms 688.5018 Ops/s 693.2611 Ops/s $\color{#d91a1a}-0.69\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7203ms 0.6622ms 1.5101 KOps/s 1.5206 KOps/s $\color{#d91a1a}-0.69\%$
test_dqn_speed[False-None] 6.9769ms 1.3014ms 768.3782 Ops/s 723.0953 Ops/s $\textbf{\color{#35bf28}+6.26\%}$
test_dqn_speed[False-backward] 1.9324ms 1.8316ms 545.9727 Ops/s 552.9544 Ops/s $\color{#d91a1a}-1.26\%$
test_dqn_speed[True-None] 0.7467ms 0.5486ms 1.8228 KOps/s 1.7818 KOps/s $\color{#35bf28}+2.30\%$
test_dqn_speed[True-backward] 1.0191ms 0.9817ms 1.0187 KOps/s 838.5104 Ops/s $\textbf{\color{#35bf28}+21.49\%}$
test_dqn_speed[reduce-overhead-None] 0.9231ms 0.5546ms 1.8030 KOps/s 1.7567 KOps/s $\color{#35bf28}+2.64\%$
test_dqn_speed[reduce-overhead-backward] 1.0178ms 0.9859ms 1.0143 KOps/s 1.0176 KOps/s $\color{#d91a1a}-0.33\%$
test_ddpg_speed[False-None] 3.3269ms 2.6583ms 376.1837 Ops/s 373.1850 Ops/s $\color{#35bf28}+0.80\%$
test_ddpg_speed[False-backward] 3.9527ms 3.8782ms 257.8543 Ops/s 256.7565 Ops/s $\color{#35bf28}+0.43\%$
test_ddpg_speed[True-None] 1.5992ms 1.2299ms 813.0685 Ops/s 820.8133 Ops/s $\color{#d91a1a}-0.94\%$
test_ddpg_speed[True-backward] 2.2721ms 2.1927ms 456.0538 Ops/s 456.8327 Ops/s $\color{#d91a1a}-0.17\%$
test_ddpg_speed[reduce-overhead-None] 1.5995ms 1.2416ms 805.4064 Ops/s 819.0041 Ops/s $\color{#d91a1a}-1.66\%$
test_ddpg_speed[reduce-overhead-backward] 2.2480ms 2.1908ms 456.4516 Ops/s 458.2327 Ops/s $\color{#d91a1a}-0.39\%$
test_sac_speed[False-None] 7.8329ms 7.4597ms 134.0530 Ops/s 134.5867 Ops/s $\color{#d91a1a}-0.40\%$
test_sac_speed[False-backward] 10.9844ms 10.7080ms 93.3885 Ops/s 94.3460 Ops/s $\color{#d91a1a}-1.01\%$
test_sac_speed[True-None] 2.4178ms 1.9974ms 500.6426 Ops/s 497.6937 Ops/s $\color{#35bf28}+0.59\%$
test_sac_speed[True-backward] 4.0090ms 3.9161ms 255.3587 Ops/s 244.5579 Ops/s $\color{#35bf28}+4.42\%$
test_sac_speed[reduce-overhead-None] 2.3752ms 2.0073ms 498.1873 Ops/s 493.1668 Ops/s $\color{#35bf28}+1.02\%$
test_sac_speed[reduce-overhead-backward] 3.9810ms 3.8987ms 256.4961 Ops/s 252.0557 Ops/s $\color{#35bf28}+1.76\%$
test_redq_speed[False-None] 10.9025ms 10.0261ms 99.7396 Ops/s 101.2224 Ops/s $\color{#d91a1a}-1.46\%$
test_redq_speed[False-backward] 18.3181ms 17.2993ms 57.8059 Ops/s 58.0554 Ops/s $\color{#d91a1a}-0.43\%$
test_redq_speed[True-None] 3.6945ms 3.3751ms 296.2903 Ops/s 279.2065 Ops/s $\textbf{\color{#35bf28}+6.12\%}$
test_redq_speed[True-backward] 8.6341ms 8.3101ms 120.3351 Ops/s 116.2389 Ops/s $\color{#35bf28}+3.52\%$
test_redq_speed[reduce-overhead-None] 3.7749ms 3.3506ms 298.4558 Ops/s 281.5190 Ops/s $\textbf{\color{#35bf28}+6.02\%}$
test_redq_speed[reduce-overhead-backward] 9.3871ms 8.3360ms 119.9613 Ops/s 118.6213 Ops/s $\color{#35bf28}+1.13\%$
test_redq_deprec_speed[False-None] 10.7918ms 10.2261ms 97.7892 Ops/s 93.9116 Ops/s $\color{#35bf28}+4.13\%$
test_redq_deprec_speed[False-backward] 15.1515ms 14.6854ms 68.0949 Ops/s 65.0067 Ops/s $\color{#35bf28}+4.75\%$
test_redq_deprec_speed[True-None] 3.5047ms 3.1094ms 321.6066 Ops/s 301.9357 Ops/s $\textbf{\color{#35bf28}+6.51\%}$
test_redq_deprec_speed[True-backward] 6.9551ms 6.6840ms 149.6109 Ops/s 141.4464 Ops/s $\textbf{\color{#35bf28}+5.77\%}$
test_redq_deprec_speed[reduce-overhead-None] 3.5516ms 3.1374ms 318.7369 Ops/s 308.1619 Ops/s $\color{#35bf28}+3.43\%$
test_redq_deprec_speed[reduce-overhead-backward] 6.9542ms 6.6896ms 149.4857 Ops/s 140.7235 Ops/s $\textbf{\color{#35bf28}+6.23\%}$
test_td3_speed[False-None] 7.5444ms 7.3651ms 135.7755 Ops/s 134.2348 Ops/s $\color{#35bf28}+1.15\%$
test_td3_speed[False-backward] 10.6366ms 10.2637ms 97.4308 Ops/s 96.0416 Ops/s $\color{#35bf28}+1.45\%$
test_td3_speed[True-None] 2.0747ms 2.0361ms 491.1251 Ops/s 482.2346 Ops/s $\color{#35bf28}+1.84\%$
test_td3_speed[True-backward] 4.0282ms 3.8561ms 259.3262 Ops/s 256.2128 Ops/s $\color{#35bf28}+1.22\%$
test_td3_speed[reduce-overhead-None] 2.0721ms 2.0275ms 493.2197 Ops/s 483.6917 Ops/s $\color{#35bf28}+1.97\%$
test_td3_speed[reduce-overhead-backward] 3.9511ms 3.8386ms 260.5090 Ops/s 258.1930 Ops/s $\color{#35bf28}+0.90\%$
test_cql_speed[False-None] 26.6539ms 24.1824ms 41.3524 Ops/s 40.4828 Ops/s $\color{#35bf28}+2.15\%$
test_cql_speed[False-backward] 38.3618ms 34.1152ms 29.3125 Ops/s 29.5573 Ops/s $\color{#d91a1a}-0.83\%$
test_cql_speed[True-None] 11.2266ms 10.8316ms 92.3228 Ops/s 94.1392 Ops/s $\color{#d91a1a}-1.93\%$
test_cql_speed[True-backward] 16.9726ms 16.3231ms 61.2627 Ops/s 62.4583 Ops/s $\color{#d91a1a}-1.91\%$
test_cql_speed[reduce-overhead-None] 11.2267ms 10.7270ms 93.2230 Ops/s 93.1649 Ops/s $\color{#35bf28}+0.06\%$
test_cql_speed[reduce-overhead-backward] 17.0723ms 16.2192ms 61.6552 Ops/s 62.3585 Ops/s $\color{#d91a1a}-1.13\%$
test_a2c_speed[False-None] 5.7757ms 5.2179ms 191.6477 Ops/s 187.4182 Ops/s $\color{#35bf28}+2.26\%$
test_a2c_speed[False-backward] 11.7953ms 11.4509ms 87.3296 Ops/s 86.4707 Ops/s $\color{#35bf28}+0.99\%$
test_a2c_speed[True-None] 3.1633ms 3.0268ms 330.3830 Ops/s 324.2458 Ops/s $\color{#35bf28}+1.89\%$
test_a2c_speed[True-backward] 8.7994ms 8.4615ms 118.1819 Ops/s 110.6028 Ops/s $\textbf{\color{#35bf28}+6.85\%}$
test_a2c_speed[reduce-overhead-None] 3.3865ms 3.0204ms 331.0824 Ops/s 334.1178 Ops/s $\color{#d91a1a}-0.91\%$
test_a2c_speed[reduce-overhead-backward] 8.7718ms 8.4116ms 118.8836 Ops/s 117.2822 Ops/s $\color{#35bf28}+1.37\%$
test_ppo_speed[False-None] 6.1381ms 5.5130ms 181.3888 Ops/s 177.5046 Ops/s $\color{#35bf28}+2.19\%$
test_ppo_speed[False-backward] 12.2331ms 11.9881ms 83.4162 Ops/s 82.1189 Ops/s $\color{#35bf28}+1.58\%$
test_ppo_speed[True-None] 3.5624ms 3.4163ms 292.7143 Ops/s 293.3727 Ops/s $\color{#d91a1a}-0.22\%$
test_ppo_speed[True-backward] 8.4943ms 8.0477ms 124.2590 Ops/s 122.1261 Ops/s $\color{#35bf28}+1.75\%$
test_ppo_speed[reduce-overhead-None] 3.5752ms 3.4078ms 293.4467 Ops/s 293.0196 Ops/s $\color{#35bf28}+0.15\%$
test_ppo_speed[reduce-overhead-backward] 8.3711ms 8.1027ms 123.4152 Ops/s 121.4074 Ops/s $\color{#35bf28}+1.65\%$
test_reinforce_speed[False-None] 6.0919ms 4.3639ms 229.1549 Ops/s 229.5303 Ops/s $\color{#d91a1a}-0.16\%$
test_reinforce_speed[False-backward] 7.3641ms 7.0970ms 140.9047 Ops/s 141.0111 Ops/s $\color{#d91a1a}-0.08\%$
test_reinforce_speed[True-None] 2.5681ms 2.1711ms 460.5916 Ops/s 447.9360 Ops/s $\color{#35bf28}+2.83\%$
test_reinforce_speed[True-backward] 7.4798ms 6.9691ms 143.4897 Ops/s 142.0635 Ops/s $\color{#35bf28}+1.00\%$
test_reinforce_speed[reduce-overhead-None] 2.3278ms 2.1837ms 457.9457 Ops/s 453.8810 Ops/s $\color{#35bf28}+0.90\%$
test_reinforce_speed[reduce-overhead-backward] 7.1726ms 6.9590ms 143.6979 Ops/s 144.2760 Ops/s $\color{#d91a1a}-0.40\%$
test_iql_speed[False-None] 24.4695ms 19.4444ms 51.4287 Ops/s 54.4171 Ops/s $\textbf{\color{#d91a1a}-5.49\%}$
test_iql_speed[False-backward] 30.0144ms 29.2855ms 34.1466 Ops/s 34.8840 Ops/s $\color{#d91a1a}-2.11\%$
test_iql_speed[True-None] 8.2116ms 7.7175ms 129.5751 Ops/s 128.4216 Ops/s $\color{#35bf28}+0.90\%$
test_iql_speed[True-backward] 16.8725ms 16.2187ms 61.6571 Ops/s 59.5682 Ops/s $\color{#35bf28}+3.51\%$
test_iql_speed[reduce-overhead-None] 9.4887ms 7.8869ms 126.7917 Ops/s 129.7603 Ops/s $\color{#d91a1a}-2.29\%$
test_iql_speed[reduce-overhead-backward] 16.6501ms 16.2525ms 61.5290 Ops/s 61.3377 Ops/s $\color{#35bf28}+0.31\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.6817ms 6.5531ms 152.6006 Ops/s 151.3241 Ops/s $\color{#35bf28}+0.84\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.5265ms 0.2423ms 4.1277 KOps/s 4.2384 KOps/s $\color{#d91a1a}-2.61\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6331ms 0.2208ms 4.5284 KOps/s 3.2698 KOps/s $\textbf{\color{#35bf28}+38.49\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.7265ms 6.4753ms 154.4323 Ops/s 154.2648 Ops/s $\color{#35bf28}+0.11\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.3268ms 0.2365ms 4.2290 KOps/s 4.3481 KOps/s $\color{#d91a1a}-2.74\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.4334ms 0.2102ms 4.7579 KOps/s 3.3434 KOps/s $\textbf{\color{#35bf28}+42.31\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.3695ms 1.2068ms 828.6333 Ops/s 794.2204 Ops/s $\color{#35bf28}+4.33\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.2882ms 1.1152ms 896.7395 Ops/s 845.1430 Ops/s $\textbf{\color{#35bf28}+6.11\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.7091ms 6.5863ms 151.8293 Ops/s 149.7119 Ops/s $\color{#35bf28}+1.41\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.0533ms 0.4721ms 2.1184 KOps/s 2.6791 KOps/s $\textbf{\color{#d91a1a}-20.93\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6314ms 0.4532ms 2.2064 KOps/s 2.2654 KOps/s $\color{#d91a1a}-2.60\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.8937ms 6.5308ms 153.1213 Ops/s 152.9491 Ops/s $\color{#35bf28}+0.11\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.5138ms 0.2429ms 4.1170 KOps/s 2.6476 KOps/s $\textbf{\color{#35bf28}+55.50\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.4170ms 0.2223ms 4.4981 KOps/s 2.7544 KOps/s $\textbf{\color{#35bf28}+63.30\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.8089ms 6.5280ms 153.1856 Ops/s 154.3221 Ops/s $\color{#d91a1a}-0.74\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.5840ms 0.2328ms 4.2955 KOps/s 3.1732 KOps/s $\textbf{\color{#35bf28}+35.37\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.4207ms 0.2128ms 4.6991 KOps/s 4.5252 KOps/s $\color{#35bf28}+3.84\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.9936ms 6.7084ms 149.0669 Ops/s 149.8951 Ops/s $\color{#d91a1a}-0.55\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.7754ms 0.3818ms 2.6189 KOps/s 2.6390 KOps/s $\color{#d91a1a}-0.76\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.5932ms 0.3598ms 2.7790 KOps/s 2.8427 KOps/s $\color{#d91a1a}-2.24\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.4219s 13.6710ms 73.1477 Ops/s 186.3365 Ops/s $\textbf{\color{#d91a1a}-60.74\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 21.3040ms 15.7612ms 63.4470 Ops/s 64.0655 Ops/s $\color{#d91a1a}-0.97\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 2.3301ms 1.1973ms 835.2078 Ops/s 822.9209 Ops/s $\color{#35bf28}+1.49\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 9.5988ms 5.4108ms 184.8149 Ops/s 190.5728 Ops/s $\color{#d91a1a}-3.02\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 21.4871ms 15.5757ms 64.2025 Ops/s 63.5525 Ops/s $\color{#35bf28}+1.02\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 2.3427ms 1.1921ms 838.8368 Ops/s 832.5178 Ops/s $\color{#35bf28}+0.76\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.3546s 12.5903ms 79.4264 Ops/s 33.3953 Ops/s $\textbf{\color{#35bf28}+137.84\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 21.5990ms 15.8040ms 63.2752 Ops/s 63.5858 Ops/s $\color{#d91a1a}-0.49\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 9.4404ms 1.5941ms 627.2940 Ops/s 795.1057 Ops/s $\textbf{\color{#d91a1a}-21.11\%}$
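
The column headers are not visible in this part of the table, so the following is only a sketch under an assumption: the final percentage in each row appears to be the relative change of the first Ops/s figure with respect to the second. The helper name `relative_change` is hypothetical and not part of the benchmark tooling. The values are copied from rows above.

```python
def relative_change(first_ops: float, second_ops: float) -> float:
    """Percentage change of first_ops relative to second_ops (assumed formula)."""
    return (first_ops - second_ops) / second_ops * 100.0


# (first Ops/s figure, second Ops/s figure, percentage reported in the row)
rows = {
    "test_sac_speed[reduce-overhead-backward]": (256.4961, 252.0557, 1.76),
    "test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400]": (73.1477, 186.3365, -60.74),
    "test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400]": (79.4264, 33.3953, 137.84),
}

for name, (first_ops, second_ops, reported) in rows.items():
    computed = relative_change(first_ops, second_ops)
    # Print the recomputed value next to the value reported in the table.
    print(f"{name}: computed {computed:+.2f}%, reported {reported:+.2f}%")
```

Rows reported in KOps/s can be passed through unchanged as long as both figures share the same unit, since the ratio does not depend on the unit.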

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Sep 17, 2024
ghstack-source-id: f3bc11cd46440629f8e0fb51799c683a2679c0b9
Pull Request resolved: #2426
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Sep 17, 2024
ghstack-source-id: d79df3dc9462e05c5d26ba8f2510796d8f245c37
Pull Request resolved: #2426
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Sep 17, 2024
ghstack-source-id: 2f65c39e339ce808813ec977ce0121d6fb6057de
Pull Request resolved: #2426