Revert "[BugFix] Allow expanding TensorDictPrimer transforms shape with parent batch size" #2544

Merged on Nov 8, 2024 (1 commit)

@vmoens (Contributor) commented Nov 8, 2024

Reverts #2521

cc @albertbou92: #2521 was causing `test_tensordictprimer_batching` to break. Can you have a look and resubmit a PR?

pytorch-bot bot commented Nov 8, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2544

Note: Links to docs will display an error until the docs builds have been completed.

❌ 18 New Failures, 4 Unrelated Failures

As of commit 8a5c278 with merge base e9d1677:

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label Nov 8, 2024
@vmoens added the revert label Nov 8, 2024
@vmoens merged commit 8a8b4c3 into main Nov 8, 2024
43 of 61 checks passed
@vmoens deleted the revert-2521-primer_batch branch November 8, 2024 14:23
github-actions bot commented Nov 8, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}1$. Worsened: $\large\color{#d91a1a}18$.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.4391s 0.4370s 2.2881 Ops/s 2.2733 Ops/s $\color{#35bf28}+0.65\%$
test_transformed 0.7106s 0.6309s 1.5851 Ops/s 1.7013 Ops/s $\textbf{\color{#d91a1a}-6.83\%}$
test_serial 1.4657s 1.3768s 0.7263 Ops/s 0.7391 Ops/s $\color{#d91a1a}-1.73\%$
test_parallel 1.4163s 1.3127s 0.7618 Ops/s 0.7576 Ops/s $\color{#35bf28}+0.55\%$
test_step_mdp_speed[True-True-True-True-True] 0.3115ms 26.8242μs 37.2798 KOps/s 36.5515 KOps/s $\color{#35bf28}+1.99\%$
test_step_mdp_speed[True-True-True-True-False] 49.9440μs 15.8989μs 62.8974 KOps/s 61.7836 KOps/s $\color{#35bf28}+1.80\%$
test_step_mdp_speed[True-True-True-False-True] 51.3060μs 15.3932μs 64.9638 KOps/s 64.4236 KOps/s $\color{#35bf28}+0.84\%$
test_step_mdp_speed[True-True-True-False-False] 30.3370μs 9.0180μs 110.8892 KOps/s 109.4466 KOps/s $\color{#35bf28}+1.32\%$
test_step_mdp_speed[True-True-False-True-True] 66.7950μs 29.4033μs 34.0098 KOps/s 34.2038 KOps/s $\color{#d91a1a}-0.57\%$
test_step_mdp_speed[True-True-False-True-False] 62.3770μs 17.5961μs 56.8308 KOps/s 56.4619 KOps/s $\color{#35bf28}+0.65\%$
test_step_mdp_speed[True-True-False-False-True] 57.8090μs 17.1627μs 58.2658 KOps/s 58.9102 KOps/s $\color{#d91a1a}-1.09\%$
test_step_mdp_speed[True-True-False-False-False] 37.6710μs 10.6285μs 94.0863 KOps/s 94.3503 KOps/s $\color{#d91a1a}-0.28\%$
test_step_mdp_speed[True-False-True-True-True] 75.2210μs 31.0493μs 32.2068 KOps/s 32.7170 KOps/s $\color{#d91a1a}-1.56\%$
test_step_mdp_speed[True-False-True-True-False] 63.2190μs 19.4671μs 51.3687 KOps/s 52.0003 KOps/s $\color{#d91a1a}-1.21\%$
test_step_mdp_speed[True-False-True-False-True] 0.7270ms 17.3743μs 57.5564 KOps/s 59.0149 KOps/s $\color{#d91a1a}-2.47\%$
test_step_mdp_speed[True-False-True-False-False] 44.5140μs 10.7666μs 92.8801 KOps/s 94.0767 KOps/s $\color{#d91a1a}-1.27\%$
test_step_mdp_speed[True-False-False-True-True] 71.9350μs 32.3410μs 30.9205 KOps/s 31.0760 KOps/s $\color{#d91a1a}-0.50\%$
test_step_mdp_speed[True-False-False-True-False] 48.5110μs 21.0369μs 47.5354 KOps/s 48.2470 KOps/s $\color{#d91a1a}-1.47\%$
test_step_mdp_speed[True-False-False-False-True] 81.2720μs 18.5991μs 53.7661 KOps/s 53.6960 KOps/s $\color{#35bf28}+0.13\%$
test_step_mdp_speed[True-False-False-False-False] 43.1100μs 12.1879μs 82.0487 KOps/s 82.3833 KOps/s $\color{#d91a1a}-0.41\%$
test_step_mdp_speed[False-True-True-True-True] 0.1047ms 30.5726μs 32.7091 KOps/s 32.3991 KOps/s $\color{#35bf28}+0.96\%$
test_step_mdp_speed[False-True-True-True-False] 58.4890μs 19.2544μs 51.9362 KOps/s 52.4042 KOps/s $\color{#d91a1a}-0.89\%$
test_step_mdp_speed[False-True-True-False-True] 60.4730μs 19.6232μs 50.9600 KOps/s 51.8240 KOps/s $\color{#d91a1a}-1.67\%$
test_step_mdp_speed[False-True-True-False-False] 62.1560μs 12.0392μs 83.0622 KOps/s 84.5866 KOps/s $\color{#d91a1a}-1.80\%$
test_step_mdp_speed[False-True-False-True-True] 85.3510μs 32.1578μs 31.0967 KOps/s 30.8255 KOps/s $\color{#35bf28}+0.88\%$
test_step_mdp_speed[False-True-False-True-False] 62.2060μs 20.9878μs 47.6468 KOps/s 48.2863 KOps/s $\color{#d91a1a}-1.32\%$
test_step_mdp_speed[False-True-False-False-True] 2.8379ms 21.3540μs 46.8295 KOps/s 47.5812 KOps/s $\color{#d91a1a}-1.58\%$
test_step_mdp_speed[False-True-False-False-False] 55.5040μs 13.5183μs 73.9738 KOps/s 73.6502 KOps/s $\color{#35bf28}+0.44\%$
test_step_mdp_speed[False-False-True-True-True] 0.1325ms 34.5974μs 28.9039 KOps/s 29.4556 KOps/s $\color{#d91a1a}-1.87\%$
test_step_mdp_speed[False-False-True-True-False] 56.7060μs 22.4507μs 44.5421 KOps/s 44.1322 KOps/s $\color{#35bf28}+0.93\%$
test_step_mdp_speed[False-False-True-False-True] 59.0810μs 21.0452μs 47.5168 KOps/s 47.1825 KOps/s $\color{#35bf28}+0.71\%$
test_step_mdp_speed[False-False-True-False-False] 47.1980μs 13.6213μs 73.4142 KOps/s 73.6942 KOps/s $\color{#d91a1a}-0.38\%$
test_step_mdp_speed[False-False-False-True-True] 91.7930μs 34.9566μs 28.6069 KOps/s 28.5005 KOps/s $\color{#35bf28}+0.37\%$
test_step_mdp_speed[False-False-False-True-False] 0.5156ms 23.5756μs 42.4168 KOps/s 42.0033 KOps/s $\color{#35bf28}+0.98\%$
test_step_mdp_speed[False-False-False-False-True] 56.3250μs 22.6188μs 44.2110 KOps/s 44.6870 KOps/s $\color{#d91a1a}-1.07\%$
test_step_mdp_speed[False-False-False-False-False] 63.4090μs 15.2568μs 65.5444 KOps/s 66.8512 KOps/s $\color{#d91a1a}-1.95\%$
test_values[generalized_advantage_estimate-True-True] 9.6861ms 9.5265ms 104.9702 Ops/s 104.4360 Ops/s $\color{#35bf28}+0.51\%$
test_values[vec_generalized_advantage_estimate-True-True] 37.9870ms 35.8811ms 27.8698 Ops/s 28.7448 Ops/s $\color{#d91a1a}-3.04\%$
test_values[td0_return_estimate-False-False] 0.2421ms 0.1792ms 5.5790 KOps/s 5.6506 KOps/s $\color{#d91a1a}-1.27\%$
test_values[td1_return_estimate-False-False] 24.1341ms 23.8415ms 41.9437 Ops/s 41.9455 Ops/s $-0.00\%$
test_values[vec_td1_return_estimate-False-False] 44.2448ms 36.3739ms 27.4923 Ops/s 28.7942 Ops/s $\color{#d91a1a}-4.52\%$
test_values[td_lambda_return_estimate-True-False] 34.8415ms 34.2733ms 29.1772 Ops/s 28.7351 Ops/s $\color{#35bf28}+1.54\%$
test_values[vec_td_lambda_return_estimate-True-False] 40.4809ms 36.1763ms 27.6424 Ops/s 29.2679 Ops/s $\textbf{\color{#d91a1a}-5.55\%}$
test_gae_speed[generalized_advantage_estimate-False-1-512] 10.3390ms 8.2686ms 120.9396 Ops/s 121.6153 Ops/s $\color{#d91a1a}-0.56\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.5225ms 1.9151ms 522.1608 Ops/s 498.2789 Ops/s $\color{#35bf28}+4.79\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.4793ms 0.3674ms 2.7221 KOps/s 2.7241 KOps/s $\color{#d91a1a}-0.07\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 49.0120ms 46.7680ms 21.3821 Ops/s 22.0547 Ops/s $\color{#d91a1a}-3.05\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 4.0623ms 3.1015ms 322.4223 Ops/s 327.9540 Ops/s $\color{#d91a1a}-1.69\%$
test_dqn_speed[False-None] 5.9662ms 1.3818ms 723.7003 Ops/s 751.6873 Ops/s $\color{#d91a1a}-3.72\%$
test_dqn_speed[False-backward] 2.0418ms 1.8819ms 531.3743 Ops/s 554.9042 Ops/s $\color{#d91a1a}-4.24\%$
test_dqn_speed[True-None] 1.4854ms 0.4729ms 2.1148 KOps/s 2.1479 KOps/s $\color{#d91a1a}-1.54\%$
test_dqn_speed[True-backward] 0.9583ms 0.8945ms 1.1180 KOps/s 1.1019 KOps/s $\color{#35bf28}+1.46\%$
test_dqn_speed[reduce-overhead-None] 0.7223ms 0.4780ms 2.0921 KOps/s 2.1430 KOps/s $\color{#d91a1a}-2.37\%$
test_dqn_speed[reduce-overhead-backward] 1.0536ms 0.9572ms 1.0447 KOps/s 1.0760 KOps/s $\color{#d91a1a}-2.91\%$
test_ddpg_speed[False-None] 3.7514ms 2.8776ms 347.5107 Ops/s 353.7121 Ops/s $\color{#d91a1a}-1.75\%$
test_ddpg_speed[False-backward] 4.2651ms 4.0846ms 244.8235 Ops/s 250.7775 Ops/s $\color{#d91a1a}-2.37\%$
test_ddpg_speed[True-None] 1.1736ms 1.0070ms 993.0222 Ops/s 987.6992 Ops/s $\color{#35bf28}+0.54\%$
test_ddpg_speed[True-backward] 2.0700ms 1.9591ms 510.4498 Ops/s 511.2017 Ops/s $\color{#d91a1a}-0.15\%$
test_ddpg_speed[reduce-overhead-None] 1.5377ms 1.0248ms 975.7955 Ops/s 995.8154 Ops/s $\color{#d91a1a}-2.01\%$
test_ddpg_speed[reduce-overhead-backward] 2.2150ms 1.9855ms 503.6415 Ops/s 520.6761 Ops/s $\color{#d91a1a}-3.27\%$
test_sac_speed[False-None] 9.8476ms 8.1848ms 122.1772 Ops/s 127.3322 Ops/s $\color{#d91a1a}-4.05\%$
test_sac_speed[False-backward] 14.0010ms 11.3918ms 87.7828 Ops/s 93.5960 Ops/s $\textbf{\color{#d91a1a}-6.21\%}$
test_sac_speed[True-None] 2.4235ms 1.8664ms 535.7819 Ops/s 541.9936 Ops/s $\color{#d91a1a}-1.15\%$
test_sac_speed[True-backward] 3.7342ms 3.6069ms 277.2435 Ops/s 281.0629 Ops/s $\color{#d91a1a}-1.36\%$
test_sac_speed[reduce-overhead-None] 2.4242ms 1.8580ms 538.2186 Ops/s 535.6039 Ops/s $\color{#35bf28}+0.49\%$
test_sac_speed[reduce-overhead-backward] 3.8754ms 3.6281ms 275.6277 Ops/s 271.2612 Ops/s $\color{#35bf28}+1.61\%$
test_redq_speed[False-None] 14.9398ms 13.7138ms 72.9192 Ops/s 77.7289 Ops/s $\textbf{\color{#d91a1a}-6.19\%}$
test_redq_speed[False-backward] 24.9201ms 22.7895ms 43.8798 Ops/s 43.9460 Ops/s $\color{#d91a1a}-0.15\%$
test_redq_speed[True-None] 5.7571ms 4.9509ms 201.9847 Ops/s 206.2642 Ops/s $\color{#d91a1a}-2.07\%$
test_redq_speed[True-backward] 14.6242ms 13.1131ms 76.2595 Ops/s 81.8374 Ops/s $\textbf{\color{#d91a1a}-6.82\%}$
test_redq_speed[reduce-overhead-None] 6.9071ms 5.6380ms 177.3667 Ops/s 212.9106 Ops/s $\textbf{\color{#d91a1a}-16.69\%}$
test_redq_speed[reduce-overhead-backward] 13.9437ms 13.1818ms 75.8620 Ops/s 83.5367 Ops/s $\textbf{\color{#d91a1a}-9.19\%}$
test_redq_deprec_speed[False-None] 17.4389ms 14.1148ms 70.8478 Ops/s 79.1314 Ops/s $\textbf{\color{#d91a1a}-10.47\%}$
test_redq_deprec_speed[False-backward] 22.9705ms 20.5087ms 48.7599 Ops/s 54.5306 Ops/s $\textbf{\color{#d91a1a}-10.58\%}$
test_redq_deprec_speed[True-None] 4.6669ms 4.1889ms 238.7259 Ops/s 271.2263 Ops/s $\textbf{\color{#d91a1a}-11.98\%}$
test_redq_deprec_speed[True-backward] 9.5224ms 9.2675ms 107.9041 Ops/s 122.0213 Ops/s $\textbf{\color{#d91a1a}-11.57\%}$
test_redq_deprec_speed[reduce-overhead-None] 6.1914ms 4.1692ms 239.8562 Ops/s 277.3626 Ops/s $\textbf{\color{#d91a1a}-13.52\%}$
test_redq_deprec_speed[reduce-overhead-backward] 9.4523ms 9.1365ms 109.4509 Ops/s 114.9579 Ops/s $\color{#d91a1a}-4.79\%$
test_td3_speed[False-None] 8.7266ms 8.2641ms 121.0046 Ops/s 125.2779 Ops/s $\color{#d91a1a}-3.41\%$
test_td3_speed[False-backward] 11.3304ms 10.6456ms 93.9352 Ops/s 96.0879 Ops/s $\color{#d91a1a}-2.24\%$
test_td3_speed[True-None] 1.9781ms 1.7495ms 571.5998 Ops/s 578.9123 Ops/s $\color{#d91a1a}-1.26\%$
test_td3_speed[True-backward] 4.0565ms 3.6132ms 276.7636 Ops/s 272.0302 Ops/s $\color{#35bf28}+1.74\%$
test_td3_speed[reduce-overhead-None] 2.3092ms 1.7548ms 569.8782 Ops/s 580.6084 Ops/s $\color{#d91a1a}-1.85\%$
test_td3_speed[reduce-overhead-backward] 3.5872ms 3.4507ms 289.7981 Ops/s 300.6964 Ops/s $\color{#d91a1a}-3.62\%$
test_cql_speed[False-None] 39.2348ms 36.3696ms 27.4955 Ops/s 27.6863 Ops/s $\color{#d91a1a}-0.69\%$
test_cql_speed[False-backward] 55.3363ms 47.0576ms 21.2505 Ops/s 21.4125 Ops/s $\color{#d91a1a}-0.76\%$
test_cql_speed[True-None] 16.9617ms 16.0108ms 62.4577 Ops/s 61.6492 Ops/s $\color{#35bf28}+1.31\%$
test_cql_speed[True-backward] 24.4340ms 22.8433ms 43.7765 Ops/s 45.1382 Ops/s $\color{#d91a1a}-3.02\%$
test_cql_speed[reduce-overhead-None] 17.2188ms 16.2644ms 61.4841 Ops/s 61.7585 Ops/s $\color{#d91a1a}-0.44\%$
test_cql_speed[reduce-overhead-backward] 24.3464ms 23.2845ms 42.9470 Ops/s 43.2896 Ops/s $\color{#d91a1a}-0.79\%$
test_a2c_speed[False-None] 8.6797ms 7.5261ms 132.8716 Ops/s 140.1342 Ops/s $\textbf{\color{#d91a1a}-5.18\%}$
test_a2c_speed[False-backward] 16.8192ms 15.3137ms 65.3008 Ops/s 66.0273 Ops/s $\color{#d91a1a}-1.10\%$
test_a2c_speed[True-None] 4.1747ms 3.4357ms 291.0607 Ops/s 299.2180 Ops/s $\color{#d91a1a}-2.73\%$
test_a2c_speed[True-backward] 11.1461ms 10.4633ms 95.5724 Ops/s 98.8238 Ops/s $\color{#d91a1a}-3.29\%$
test_a2c_speed[reduce-overhead-None] 4.0876ms 3.3990ms 294.2039 Ops/s 300.6360 Ops/s $\color{#d91a1a}-2.14\%$
test_a2c_speed[reduce-overhead-backward] 10.8636ms 10.3525ms 96.5955 Ops/s 101.8039 Ops/s $\textbf{\color{#d91a1a}-5.12\%}$
test_ppo_speed[False-None] 9.5356ms 8.0136ms 124.7882 Ops/s 129.3203 Ops/s $\color{#d91a1a}-3.50\%$
test_ppo_speed[False-backward] 0.3259s 22.2254ms 44.9935 Ops/s 63.0814 Ops/s $\textbf{\color{#d91a1a}-28.67\%}$
test_ppo_speed[True-None] 4.7004ms 3.9345ms 254.1599 Ops/s 264.0425 Ops/s $\color{#d91a1a}-3.74\%$
test_ppo_speed[True-backward] 11.5926ms 10.3490ms 96.6276 Ops/s 102.4128 Ops/s $\textbf{\color{#d91a1a}-5.65\%}$
test_ppo_speed[reduce-overhead-None] 4.5316ms 3.8587ms 259.1515 Ops/s 265.9234 Ops/s $\color{#d91a1a}-2.55\%$
test_ppo_speed[reduce-overhead-backward] 10.7377ms 10.2665ms 97.4044 Ops/s 99.4899 Ops/s $\color{#d91a1a}-2.10\%$
test_reinforce_speed[False-None] 7.8837ms 7.0112ms 142.6282 Ops/s 149.9560 Ops/s $\color{#d91a1a}-4.89\%$
test_reinforce_speed[False-backward] 11.7366ms 10.2003ms 98.0360 Ops/s 101.1821 Ops/s $\color{#d91a1a}-3.11\%$
test_reinforce_speed[True-None] 3.5006ms 2.7816ms 359.5054 Ops/s 369.5653 Ops/s $\color{#d91a1a}-2.72\%$
test_reinforce_speed[True-backward] 10.3409ms 9.0422ms 110.5926 Ops/s 112.0140 Ops/s $\color{#d91a1a}-1.27\%$
test_reinforce_speed[reduce-overhead-None] 3.0763ms 2.7026ms 370.0094 Ops/s 358.9805 Ops/s $\color{#35bf28}+3.07\%$
test_reinforce_speed[reduce-overhead-backward] 9.5971ms 8.9663ms 111.5292 Ops/s 111.6625 Ops/s $\color{#d91a1a}-0.12\%$
test_iql_speed[False-None] 34.3034ms 33.0661ms 30.2425 Ops/s 29.8420 Ops/s $\color{#35bf28}+1.34\%$
test_iql_speed[False-backward] 47.3511ms 45.6807ms 21.8911 Ops/s 21.3306 Ops/s $\color{#35bf28}+2.63\%$
test_iql_speed[True-None] 11.5267ms 11.1108ms 90.0024 Ops/s 88.3091 Ops/s $\color{#35bf28}+1.92\%$
test_iql_speed[True-backward] 23.3648ms 22.3386ms 44.7656 Ops/s 43.6556 Ops/s $\color{#35bf28}+2.54\%$
test_iql_speed[reduce-overhead-None] 12.1873ms 11.1339ms 89.8161 Ops/s 88.0379 Ops/s $\color{#35bf28}+2.02\%$
test_iql_speed[reduce-overhead-backward] 24.6736ms 22.6352ms 44.1790 Ops/s 42.5728 Ops/s $\color{#35bf28}+3.77\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.9837ms 5.1457ms 194.3359 Ops/s 194.9006 Ops/s $\color{#d91a1a}-0.29\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.7722ms 0.5367ms 1.8634 KOps/s 1.9175 KOps/s $\color{#d91a1a}-2.82\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7675ms 0.5070ms 1.9725 KOps/s 1.9836 KOps/s $\color{#d91a1a}-0.56\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.7251ms 4.8189ms 207.5163 Ops/s 201.2274 Ops/s $\color{#35bf28}+3.13\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 3.2973ms 0.5170ms 1.9344 KOps/s 2.0017 KOps/s $\color{#d91a1a}-3.36\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7266ms 0.4940ms 2.0245 KOps/s 2.0916 KOps/s $\color{#d91a1a}-3.21\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.7652ms 1.6799ms 595.2859 Ops/s 610.5823 Ops/s $\color{#d91a1a}-2.51\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.9984ms 1.6065ms 622.4905 Ops/s 633.6255 Ops/s $\color{#d91a1a}-1.76\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.5952ms 5.0001ms 199.9960 Ops/s 203.4018 Ops/s $\color{#d91a1a}-1.67\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.6236ms 0.6653ms 1.5030 KOps/s 1.5188 KOps/s $\color{#d91a1a}-1.04\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.0699ms 0.6403ms 1.5618 KOps/s 1.5739 KOps/s $\color{#d91a1a}-0.77\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.5031ms 4.8633ms 205.6214 Ops/s 204.5652 Ops/s $\color{#35bf28}+0.52\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8150ms 0.5380ms 1.8587 KOps/s 1.9251 KOps/s $\color{#d91a1a}-3.45\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 8.1738ms 0.5176ms 1.9318 KOps/s 1.9989 KOps/s $\color{#d91a1a}-3.35\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.7695ms 4.8522ms 206.0921 Ops/s 207.1904 Ops/s $\color{#d91a1a}-0.53\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.3700ms 0.5162ms 1.9371 KOps/s 1.9817 KOps/s $\color{#d91a1a}-2.25\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6826ms 0.5009ms 1.9963 KOps/s 2.0682 KOps/s $\color{#d91a1a}-3.48\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.7679ms 5.1010ms 196.0409 Ops/s 200.1950 Ops/s $\color{#d91a1a}-2.08\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 3.2427ms 0.6729ms 1.4862 KOps/s 1.5145 KOps/s $\color{#d91a1a}-1.87\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9229ms 0.6516ms 1.5346 KOps/s 1.5691 KOps/s $\color{#d91a1a}-2.20\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 5.7143ms 4.4012ms 227.2101 Ops/s 218.0573 Ops/s $\color{#35bf28}+4.20\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 3.7320ms 2.3575ms 424.1871 Ops/s 413.9290 Ops/s $\color{#35bf28}+2.48\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 7.6712ms 1.3495ms 740.9975 Ops/s 778.0839 Ops/s $\color{#d91a1a}-4.77\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.5123s 14.5593ms 68.6848 Ops/s 233.1302 Ops/s $\textbf{\color{#d91a1a}-70.54\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 8.0730ms 2.3050ms 433.8401 Ops/s 443.3021 Ops/s $\color{#d91a1a}-2.13\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 5.6220ms 1.3067ms 765.2748 Ops/s 755.3831 Ops/s $\color{#35bf28}+1.31\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 6.0246ms 4.4532ms 224.5601 Ops/s 222.6353 Ops/s $\color{#35bf28}+0.86\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 0.4202s 11.0602ms 90.4141 Ops/s 417.4230 Ops/s $\textbf{\color{#d91a1a}-78.34\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 4.8501ms 1.4914ms 670.5324 Ops/s 691.5935 Ops/s $\color{#d91a1a}-3.05\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 11.5838ms 11.1449ms 89.7270 Ops/s 84.3184 Ops/s $\textbf{\color{#35bf28}+6.41\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 15.9145ms 14.8209ms 67.4722 Ops/s 68.7551 Ops/s $\color{#d91a1a}-1.87\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 20.6944ms 20.2291ms 49.4338 Ops/s 47.1409 Ops/s $\color{#35bf28}+4.86\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 16.6701ms 15.1612ms 65.9579 Ops/s 66.6776 Ops/s $\color{#d91a1a}-1.08\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 20.7666ms 20.0567ms 49.8588 Ops/s 48.8310 Ops/s $\color{#35bf28}+2.10\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 17.4958ms 16.2455ms 61.5555 Ops/s 61.7445 Ops/s $\color{#d91a1a}-0.31\%$
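
For readers checking the table, the Change column appears to be the relative difference between the Ops and Ops on Repo HEAD columns, and the rows shown in bold are consistent with a 5% cutoff (for example, -4.89% is plain while -5.12% is bold). The sketch below reproduces three CPU rows under that reading. The function names and the 5% threshold are assumptions inferred from the numbers above, not taken from the benchmark workflow itself.

```python
# Hypothetical helpers: names and the 5% threshold are assumptions
# inferred from the table, not the benchmark workflow's actual code.

def relative_change(ops: float, ops_head: float) -> float:
    """Percent change of this run's throughput relative to repo HEAD."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a row using the assumed +/-5% cutoff."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "within noise"


# (Ops, Ops on Repo HEAD) copied from the CPU table; both values in Ops/s.
rows = {
    "test_simple": (2.2881, 2.2733),
    "test_transformed": (1.5851, 1.7013),
    "test_ppo_speed[False-backward]": (44.9935, 63.0814),
}

for name, (ops, ops_head) in rows.items():
    change = relative_change(ops, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
# test_simple: +0.65% (within noise)
# test_transformed: -6.83% (worsened)
# test_ppo_speed[False-backward]: -28.67% (worsened)
```

Note that some rows report KOps/s rather than Ops/s, so both columns need to be in the same unit before the values are compared.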

github-actions bot commented Nov 8, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}14$. Worsened: $\large\color{#d91a1a}9$.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.7442s 0.7430s 1.3460 Ops/s 1.3247 Ops/s $\color{#35bf28}+1.60\%$
test_transformed 1.0835s 1.0101s 0.9900 Ops/s 0.9954 Ops/s $\color{#d91a1a}-0.54\%$
test_serial 2.2205s 2.1395s 0.4674 Ops/s 0.4613 Ops/s $\color{#35bf28}+1.33\%$
test_parallel 2.0524s 1.9749s 0.5064 Ops/s 0.5196 Ops/s $\color{#d91a1a}-2.55\%$
test_step_mdp_speed[True-True-True-True-True] 0.4183ms 36.0728μs 27.7217 KOps/s 27.2511 KOps/s $\color{#35bf28}+1.73\%$
test_step_mdp_speed[True-True-True-True-False] 0.1281ms 20.3232μs 49.2048 KOps/s 47.8901 KOps/s $\color{#35bf28}+2.75\%$
test_step_mdp_speed[True-True-True-False-True] 48.5700μs 19.5208μs 51.2273 KOps/s 49.3329 KOps/s $\color{#35bf28}+3.84\%$
test_step_mdp_speed[True-True-True-False-False] 0.4062ms 11.4853μs 87.0675 KOps/s 84.9264 KOps/s $\color{#35bf28}+2.52\%$
test_step_mdp_speed[True-True-False-True-True] 0.4822ms 38.0549μs 26.2778 KOps/s 25.7756 KOps/s $\color{#35bf28}+1.95\%$
test_step_mdp_speed[True-True-False-True-False] 73.1810μs 22.1705μs 45.1049 KOps/s 43.6845 KOps/s $\color{#35bf28}+3.25\%$
test_step_mdp_speed[True-True-False-False-True] 0.4146ms 22.3084μs 44.8261 KOps/s 46.7552 KOps/s $\color{#d91a1a}-4.13\%$
test_step_mdp_speed[True-True-False-False-False] 0.4132ms 13.5169μs 73.9817 KOps/s 73.2138 KOps/s $\color{#35bf28}+1.05\%$
test_step_mdp_speed[True-False-True-True-True] 85.4210μs 40.1656μs 24.8969 KOps/s 24.7362 KOps/s $\color{#35bf28}+0.65\%$
test_step_mdp_speed[True-False-True-True-False] 0.4186ms 24.4202μs 40.9498 KOps/s 40.2835 KOps/s $\color{#35bf28}+1.65\%$
test_step_mdp_speed[True-False-True-False-True] 0.4029ms 21.9412μs 45.5765 KOps/s 44.9500 KOps/s $\color{#35bf28}+1.39\%$
test_step_mdp_speed[True-False-True-False-False] 0.4065ms 13.7146μs 72.9147 KOps/s 72.1028 KOps/s $\color{#35bf28}+1.13\%$
test_step_mdp_speed[True-False-False-True-True] 0.2380ms 41.9144μs 23.8582 KOps/s 23.9985 KOps/s $\color{#d91a1a}-0.58\%$
test_step_mdp_speed[True-False-False-True-False] 0.4260ms 26.1205μs 38.2841 KOps/s 37.3912 KOps/s $\color{#35bf28}+2.39\%$
test_step_mdp_speed[True-False-False-False-True] 0.4135ms 23.6274μs 42.3237 KOps/s 41.6333 KOps/s $\color{#35bf28}+1.66\%$
test_step_mdp_speed[True-False-False-False-False] 43.2000μs 15.4983μs 64.5234 KOps/s 63.1710 KOps/s $\color{#35bf28}+2.14\%$
test_step_mdp_speed[False-True-True-True-True] 0.4269ms 40.3928μs 24.7569 KOps/s 24.9817 KOps/s $\color{#d91a1a}-0.90\%$
test_step_mdp_speed[False-True-True-True-False] 0.4052ms 24.3427μs 41.0800 KOps/s 40.1126 KOps/s $\color{#35bf28}+2.41\%$
test_step_mdp_speed[False-True-True-False-True] 0.4172ms 25.4463μs 39.2984 KOps/s 39.1914 KOps/s $\color{#35bf28}+0.27\%$
test_step_mdp_speed[False-True-True-False-False] 48.0200μs 14.8324μs 67.4199 KOps/s 65.7058 KOps/s $\color{#35bf28}+2.61\%$
test_step_mdp_speed[False-True-False-True-True] 0.4395ms 42.3752μs 23.5987 KOps/s 23.5814 KOps/s $\color{#35bf28}+0.07\%$
test_step_mdp_speed[False-True-False-True-False] 0.4043ms 26.2666μs 38.0711 KOps/s 37.6517 KOps/s $\color{#35bf28}+1.11\%$
test_step_mdp_speed[False-True-False-False-True] 3.6122ms 27.3207μs 36.6023 KOps/s 36.6009 KOps/s $+0.00\%$
test_step_mdp_speed[False-True-False-False-False] 0.4098ms 16.6862μs 59.9299 KOps/s 57.8187 KOps/s $\color{#35bf28}+3.65\%$
test_step_mdp_speed[False-False-True-True-True] 0.4273ms 43.5325μs 22.9713 KOps/s 22.6735 KOps/s $\color{#35bf28}+1.31\%$
test_step_mdp_speed[False-False-True-True-False] 95.3310μs 28.3286μs 35.3000 KOps/s 34.4242 KOps/s $\color{#35bf28}+2.54\%$
test_step_mdp_speed[False-False-True-False-True] 0.4236ms 27.2133μs 36.7468 KOps/s 36.8085 KOps/s $\color{#d91a1a}-0.17\%$
test_step_mdp_speed[False-False-True-False-False] 0.3910ms 16.5780μs 60.3208 KOps/s 57.7983 KOps/s $\color{#35bf28}+4.36\%$
test_step_mdp_speed[False-False-False-True-True] 0.5906ms 44.5212μs 22.4612 KOps/s 21.9532 KOps/s $\color{#35bf28}+2.31\%$
test_step_mdp_speed[False-False-False-True-False] 57.8310μs 29.6834μs 33.6889 KOps/s 32.4019 KOps/s $\color{#35bf28}+3.97\%$
test_step_mdp_speed[False-False-False-False-True] 0.4107ms 28.5090μs 35.0767 KOps/s 34.7784 KOps/s $\color{#35bf28}+0.86\%$
test_step_mdp_speed[False-False-False-False-False] 0.3933ms 18.2547μs 54.7805 KOps/s 52.7630 KOps/s $\color{#35bf28}+3.82\%$
test_values[generalized_advantage_estimate-True-True] 27.0701ms 26.5701ms 37.6363 Ops/s 39.9665 Ops/s $\textbf{\color{#d91a1a}-5.83\%}$
test_values[vec_generalized_advantage_estimate-True-True] 94.8426ms 2.7934ms 357.9924 Ops/s 366.4198 Ops/s $\color{#d91a1a}-2.30\%$
test_values[td0_return_estimate-False-False] 88.8110μs 69.0215μs 14.4882 KOps/s 14.8023 KOps/s $\color{#d91a1a}-2.12\%$
test_values[td1_return_estimate-False-False] 59.6499ms 58.1635ms 17.1929 Ops/s 17.8497 Ops/s $\color{#d91a1a}-3.68\%$
test_values[vec_td1_return_estimate-False-False] 1.3032ms 1.0807ms 925.3668 Ops/s 925.6001 Ops/s $\color{#d91a1a}-0.03\%$
test_values[td_lambda_return_estimate-True-False] 95.2700ms 93.0154ms 10.7509 Ops/s 11.2429 Ops/s $\color{#d91a1a}-4.38\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.3376ms 1.0866ms 920.2740 Ops/s 927.1137 Ops/s $\color{#d91a1a}-0.74\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 26.6296ms 25.8618ms 38.6670 Ops/s 40.8395 Ops/s $\textbf{\color{#d91a1a}-5.32\%}$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0941ms 0.7576ms 1.3200 KOps/s 1.3119 KOps/s $\color{#35bf28}+0.62\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.8211ms 0.6693ms 1.4941 KOps/s 1.4999 KOps/s $\color{#d91a1a}-0.39\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.7792ms 1.4877ms 672.1871 Ops/s 674.1338 Ops/s $\color{#d91a1a}-0.29\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 1.0720ms 0.6824ms 1.4655 KOps/s 1.4737 KOps/s $\color{#d91a1a}-0.56\%$
test_dqn_speed[False-None] 7.0634ms 1.3520ms 739.6513 Ops/s 730.5080 Ops/s $\color{#35bf28}+1.25\%$
test_dqn_speed[False-backward] 2.3128ms 1.9112ms 523.2422 Ops/s 530.7285 Ops/s $\color{#d91a1a}-1.41\%$
test_dqn_speed[True-None] 0.8937ms 0.5670ms 1.7637 KOps/s 1.6821 KOps/s $\color{#35bf28}+4.85\%$
test_dqn_speed[True-backward] 1.1893ms 1.0263ms 974.4116 Ops/s 806.0520 Ops/s $\textbf{\color{#35bf28}+20.89\%}$
test_dqn_speed[reduce-overhead-None] 0.9394ms 0.5623ms 1.7783 KOps/s 1.7047 KOps/s $\color{#35bf28}+4.31\%$
test_dqn_speed[reduce-overhead-backward] 1.1646ms 1.0258ms 974.8037 Ops/s 956.7320 Ops/s $\color{#35bf28}+1.89\%$
test_ddpg_speed[False-None] 9.1039ms 2.7652ms 361.6412 Ops/s 364.9208 Ops/s $\color{#d91a1a}-0.90\%$
test_ddpg_speed[False-backward] 4.2023ms 3.9769ms 251.4529 Ops/s 248.8916 Ops/s $\color{#35bf28}+1.03\%$
test_ddpg_speed[True-None] 1.4807ms 1.2569ms 795.6116 Ops/s 770.9756 Ops/s $\color{#35bf28}+3.20\%$
test_ddpg_speed[True-backward] 2.4328ms 2.2614ms 442.1998 Ops/s 400.7169 Ops/s $\textbf{\color{#35bf28}+10.35\%}$
test_ddpg_speed[reduce-overhead-None] 1.5113ms 1.2668ms 789.4183 Ops/s 753.1005 Ops/s $\color{#35bf28}+4.82\%$
test_ddpg_speed[reduce-overhead-backward] 2.4066ms 2.2605ms 442.3831 Ops/s 439.8624 Ops/s $\color{#35bf28}+0.57\%$
test_sac_speed[False-None] 8.8231ms 7.6791ms 130.2231 Ops/s 130.7104 Ops/s $\color{#d91a1a}-0.37\%$
test_sac_speed[False-backward] 11.6634ms 10.9864ms 91.0214 Ops/s 91.4723 Ops/s $\color{#d91a1a}-0.49\%$
test_sac_speed[True-None] 2.2952ms 2.0385ms 490.5514 Ops/s 472.8967 Ops/s $\color{#35bf28}+3.73\%$
test_sac_speed[True-backward] 4.1414ms 3.9846ms 250.9652 Ops/s 243.7239 Ops/s $\color{#35bf28}+2.97\%$
test_sac_speed[reduce-overhead-None] 2.3027ms 2.0685ms 483.4310 Ops/s 484.9998 Ops/s $\color{#d91a1a}-0.32\%$
test_sac_speed[reduce-overhead-backward] 4.3174ms 4.1114ms 243.2272 Ops/s 247.3968 Ops/s $\color{#d91a1a}-1.69\%$
test_redq_speed[False-None] 16.6104ms 10.8117ms 92.4926 Ops/s 79.1625 Ops/s $\textbf{\color{#35bf28}+16.84\%}$
test_redq_speed[False-backward] 18.6129ms 17.6877ms 56.5366 Ops/s 54.2947 Ops/s $\color{#35bf28}+4.13\%$
test_redq_speed[True-None] 3.8948ms 3.5737ms 279.8194 Ops/s 275.1720 Ops/s $\color{#35bf28}+1.69\%$
test_redq_speed[True-backward] 9.0309ms 8.7506ms 114.2779 Ops/s 110.7546 Ops/s $\color{#35bf28}+3.18\%$
test_redq_speed[reduce-overhead-None] 3.9361ms 3.6840ms 271.4440 Ops/s 288.4524 Ops/s $\textbf{\color{#d91a1a}-5.90\%}$
test_redq_speed[reduce-overhead-backward] 9.0452ms 8.7844ms 113.8377 Ops/s 115.1209 Ops/s $\color{#d91a1a}-1.11\%$
test_redq_deprec_speed[False-None] 11.2461ms 10.7858ms 92.7144 Ops/s 91.3097 Ops/s $\color{#35bf28}+1.54\%$
test_redq_deprec_speed[False-backward] 16.2574ms 15.6211ms 64.0161 Ops/s 62.5624 Ops/s $\color{#35bf28}+2.32\%$
test_redq_deprec_speed[True-None] 3.5273ms 3.3090ms 302.2084 Ops/s 292.1461 Ops/s $\color{#35bf28}+3.44\%$
test_redq_deprec_speed[True-backward] 7.6768ms 7.3823ms 135.4595 Ops/s 133.2320 Ops/s $\color{#35bf28}+1.67\%$
test_redq_deprec_speed[reduce-overhead-None] 3.5529ms 3.2553ms 307.1951 Ops/s 298.2594 Ops/s $\color{#35bf28}+3.00\%$
test_redq_deprec_speed[reduce-overhead-backward] 7.7560ms 7.3287ms 136.4507 Ops/s 132.4745 Ops/s $\color{#35bf28}+3.00\%$
test_td3_speed[False-None] 7.6839ms 7.5507ms 132.4385 Ops/s 131.9825 Ops/s $\color{#35bf28}+0.35\%$
test_td3_speed[False-backward] 11.1471ms 10.5483ms 94.8021 Ops/s 93.0199 Ops/s $\color{#35bf28}+1.92\%$
test_td3_speed[True-None] 1.9796ms 1.9278ms 518.7223 Ops/s 510.9538 Ops/s $\color{#35bf28}+1.52\%$
test_td3_speed[True-backward] 3.9240ms 3.7464ms 266.9228 Ops/s 259.3373 Ops/s $\color{#35bf28}+2.92\%$
test_td3_speed[reduce-overhead-None] 1.9643ms 1.9275ms 518.8056 Ops/s 501.1966 Ops/s $\color{#35bf28}+3.51\%$
test_td3_speed[reduce-overhead-backward] 3.9294ms 3.7347ms 267.7608 Ops/s 267.9758 Ops/s $\color{#d91a1a}-0.08\%$
test_cql_speed[False-None] 28.5839ms 25.7491ms 38.8363 Ops/s 39.7975 Ops/s $\color{#d91a1a}-2.42\%$
test_cql_speed[False-backward] 35.5432ms 34.6393ms 28.8689 Ops/s 21.5749 Ops/s $\textbf{\color{#35bf28}+33.81\%}$
test_cql_speed[True-None] 12.0630ms 11.1063ms 90.0392 Ops/s 87.1933 Ops/s $\color{#35bf28}+3.26\%$
test_cql_speed[True-backward] 17.8337ms 17.1066ms 58.4569 Ops/s 57.3776 Ops/s $\color{#35bf28}+1.88\%$
test_cql_speed[reduce-overhead-None] 11.3650ms 11.0378ms 90.5974 Ops/s 89.6271 Ops/s $\color{#35bf28}+1.08\%$
test_cql_speed[reduce-overhead-backward] 17.9207ms 17.1371ms 58.3531 Ops/s 57.6133 Ops/s $\color{#35bf28}+1.28\%$
test_a2c_speed[False-None] 5.5769ms 5.2556ms 190.2729 Ops/s 181.8816 Ops/s $\color{#35bf28}+4.61\%$
test_a2c_speed[False-backward] 12.3885ms 11.9135ms 83.9383 Ops/s 79.5510 Ops/s $\textbf{\color{#35bf28}+5.52\%}$
test_a2c_speed[True-None] 3.4354ms 3.0864ms 324.0069 Ops/s 323.0002 Ops/s $\color{#35bf28}+0.31\%$
test_a2c_speed[True-backward] 8.8959ms 8.6741ms 115.2861 Ops/s 110.4571 Ops/s $\color{#35bf28}+4.37\%$
test_a2c_speed[reduce-overhead-None] 3.3813ms 3.0431ms 328.6133 Ops/s 322.8250 Ops/s $\color{#35bf28}+1.79\%$
test_a2c_speed[reduce-overhead-backward] 9.0816ms 8.6184ms 116.0303 Ops/s 117.2643 Ops/s $\color{#d91a1a}-1.05\%$
test_ppo_speed[False-None] 6.0846ms 5.7720ms 173.2495 Ops/s 172.4289 Ops/s $\color{#35bf28}+0.48\%$
test_ppo_speed[False-backward] 13.0994ms 12.6585ms 78.9982 Ops/s 78.9764 Ops/s $\color{#35bf28}+0.03\%$
test_ppo_speed[True-None] 3.6433ms 3.4470ms 290.1068 Ops/s 286.8755 Ops/s $\color{#35bf28}+1.13\%$
test_ppo_speed[True-backward] 8.7126ms 8.3139ms 120.2811 Ops/s 119.7207 Ops/s $\color{#35bf28}+0.47\%$
test_ppo_speed[reduce-overhead-None] 3.8015ms 3.4625ms 288.8117 Ops/s 290.8223 Ops/s $\color{#d91a1a}-0.69\%$
test_ppo_speed[reduce-overhead-backward] 8.7606ms 8.3313ms 120.0297 Ops/s 119.0397 Ops/s $\color{#35bf28}+0.83\%$
test_reinforce_speed[False-None] 6.2918ms 4.5527ms 219.6504 Ops/s 220.6008 Ops/s $\color{#d91a1a}-0.43\%$
test_reinforce_speed[False-backward] 7.8336ms 7.5099ms 133.1570 Ops/s 134.6003 Ops/s $\color{#d91a1a}-1.07\%$
test_reinforce_speed[True-None] 2.6519ms 2.2362ms 447.1899 Ops/s 450.5431 Ops/s $\color{#d91a1a}-0.74\%$
test_reinforce_speed[True-backward] 7.6052ms 7.2617ms 137.7095 Ops/s 137.2616 Ops/s $\color{#35bf28}+0.33\%$
test_reinforce_speed[reduce-overhead-None] 2.6782ms 2.2580ms 442.8782 Ops/s 453.2123 Ops/s $\color{#d91a1a}-2.28\%$
test_reinforce_speed[reduce-overhead-backward] 7.5215ms 7.2625ms 137.6935 Ops/s 137.7221 Ops/s $\color{#d91a1a}-0.02\%$
test_iql_speed[False-None] 20.5273ms 19.7831ms 50.5481 Ops/s 49.5607 Ops/s $\color{#35bf28}+1.99\%$
test_iql_speed[False-backward] 33.0085ms 31.0367ms 32.2199 Ops/s 32.3143 Ops/s $\color{#d91a1a}-0.29\%$
test_iql_speed[True-None] 7.3008ms 6.8828ms 145.2887 Ops/s 146.0786 Ops/s $\color{#d91a1a}-0.54\%$
test_iql_speed[True-backward] 16.2794ms 15.6605ms 63.8551 Ops/s 62.5495 Ops/s $\color{#35bf28}+2.09\%$
test_iql_speed[reduce-overhead-None] 7.3479ms 6.8950ms 145.0319 Ops/s 143.6177 Ops/s $\color{#35bf28}+0.98\%$
test_iql_speed[reduce-overhead-backward] 16.2773ms 15.7072ms 63.6649 Ops/s 62.7320 Ops/s $\color{#35bf28}+1.49\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.5368ms 6.3138ms 158.3830 Ops/s 157.2500 Ops/s $\color{#35bf28}+0.72\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.7902ms 0.3597ms 2.7804 KOps/s 2.9297 KOps/s $\textbf{\color{#d91a1a}-5.10\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6831ms 0.3325ms 3.0077 KOps/s 3.1143 KOps/s $\color{#d91a1a}-3.42\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.4520ms 6.1125ms 163.5985 Ops/s 162.4725 Ops/s $\color{#35bf28}+0.69\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.2896ms 0.3358ms 2.9782 KOps/s 3.0380 KOps/s $\color{#d91a1a}-1.97\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5509ms 0.3151ms 3.1732 KOps/s 3.2211 KOps/s $\color{#d91a1a}-1.49\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6981ms 1.3795ms 724.8907 Ops/s 742.3201 Ops/s $\color{#d91a1a}-2.35\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5659ms 1.3676ms 731.2141 Ops/s 770.2926 Ops/s $\textbf{\color{#d91a1a}-5.07\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.5026ms 6.2858ms 159.0887 Ops/s 159.9016 Ops/s $\color{#d91a1a}-0.51\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.6480ms 0.4250ms 2.3529 KOps/s 2.1717 KOps/s $\textbf{\color{#35bf28}+8.34\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6229ms 0.4040ms 2.4750 KOps/s 2.2537 KOps/s $\textbf{\color{#35bf28}+9.82\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.3357ms 6.1643ms 162.2247 Ops/s 161.4637 Ops/s $\color{#35bf28}+0.47\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.3211ms 0.2866ms 3.4897 KOps/s 3.0443 KOps/s $\textbf{\color{#35bf28}+14.63\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6484ms 0.3556ms 2.8122 KOps/s 3.3786 KOps/s $\textbf{\color{#d91a1a}-16.76\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.4605ms 6.1160ms 163.5065 Ops/s 163.8598 Ops/s $\color{#d91a1a}-0.22\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.6821ms 0.2648ms 3.7769 KOps/s 3.2829 KOps/s $\textbf{\color{#35bf28}+15.05\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6109ms 0.3319ms 3.0131 KOps/s 2.7619 KOps/s $\textbf{\color{#35bf28}+9.10\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.4514ms 6.2958ms 158.8358 Ops/s 158.5833 Ops/s $\color{#35bf28}+0.16\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.9749ms 0.4485ms 2.2298 KOps/s 2.0934 KOps/s $\textbf{\color{#35bf28}+6.51\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7102ms 0.4315ms 2.3174 KOps/s 2.2439 KOps/s $\color{#35bf28}+3.28\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.4550s 14.3951ms 69.4679 Ops/s 191.0328 Ops/s $\textbf{\color{#d91a1a}-63.64\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 9.8475ms 2.1175ms 472.2454 Ops/s 450.4197 Ops/s $\color{#35bf28}+4.85\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 7.3157ms 1.2389ms 807.1420 Ops/s 908.6344 Ops/s $\textbf{\color{#d91a1a}-11.17\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 7.1804ms 5.3633ms 186.4521 Ops/s 32.6159 Ops/s $\textbf{\color{#35bf28}+471.66\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 9.2528ms 2.0706ms 482.9417 Ops/s 493.2199 Ops/s $\color{#d91a1a}-2.08\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 7.0371ms 1.1546ms 866.1274 Ops/s 782.9747 Ops/s $\textbf{\color{#35bf28}+10.62\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.3794s 13.0569ms 76.5881 Ops/s 171.7467 Ops/s $\textbf{\color{#d91a1a}-55.41\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 11.7300ms 2.2743ms 439.6951 Ops/s 442.8806 Ops/s $\color{#d91a1a}-0.72\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 2.2667ms 1.2826ms 779.6792 Ops/s 704.8316 Ops/s $\textbf{\color{#35bf28}+10.62\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.6829ms 13.1829ms 75.8561 Ops/s 77.4724 Ops/s $\color{#d91a1a}-2.09\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 18.0388ms 17.0183ms 58.7603 Ops/s 57.7457 Ops/s $\color{#35bf28}+1.76\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.1322ms 17.5636ms 56.9360 Ops/s 56.0215 Ops/s $\color{#35bf28}+1.63\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 20.3566ms 17.4847ms 57.1927 Ops/s 58.9718 Ops/s $\color{#d91a1a}-3.02\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 19.9877ms 17.7931ms 56.2014 Ops/s 55.1873 Ops/s $\color{#35bf28}+1.84\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 22.0092ms 19.0830ms 52.4026 Ops/s 53.6880 Ops/s $\color{#d91a1a}-2.39\%$
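
For readers checking the table, the Change column appears to be the relative difference between the Ops column and the Ops on Repo HEAD column. That formula is inferred from the numbers, not taken from the benchmark tooling, and the bold rows seem to be those moving by more than about 5%. A minimal sketch under those assumptions (`relative_change`, `is_notable`, and the 5% threshold are hypothetical names and values, not part of the benchmark action):

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change of the PR throughput relative to repo HEAD (assumed formula)."""
    return (ops_pr - ops_head) / ops_head * 100.0


def is_notable(change_pct: float, threshold_pct: float = 5.0) -> bool:
    """Flag a row whose change exceeds the threshold (5% is an assumption)."""
    return abs(change_pct) > threshold_pct


# (Ops, Ops on Repo HEAD) pairs copied from three rows of the table above.
rows = {
    "test_cql_speed[False-backward]": (28.8689, 21.5749),
    "test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400]": (69.4679, 191.0328),
    "test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400]": (186.4521, 32.6159),
}

for name, (ops_pr, ops_head) in rows.items():
    change = relative_change(ops_pr, ops_head)
    marker = " (notable)" if is_notable(change) else ""
    print(f"{name}: {change:+.2f}%{marker}")
```

The two ListStorage `test_rb_populate` rows swing in opposite directions by large margins and both have a Max time far above their Mean, so they may reflect a single slow iteration rather than a real regression.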

vmoens added a commit that referenced this pull request Nov 14, 2024
…th parent batch size" (#2544)

(cherry picked from commit 8a8b4c3)