[BugFix] Same behaviour in compiled and non-compiled versions of _new_unsafe #1197

Merged on Jan 30, 2025 (1 commit)

vmoens (Contributor) commented Jan 30, 2025

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Jan 30, 2025
…_unsafe

ghstack-source-id: 075f953ca7fbd7fc54d797e12437db59b44bde03
Pull Request resolved: #1197
@facebook-github-bot added the CLA Signed label Jan 30, 2025
vmoens (Contributor, Author) commented Jan 30, 2025

@anijain2305 Just reporting a very minor graph break: `self = cls.__new__(cls)` (which, I admit, is a very hacky way of creating a Python object) causes a graph break. In general we don't care too much, but in a few cases people want to subclass TensorDict, and then we need that call. It's far from being a high-priority issue :)
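For readers unfamiliar with the pattern mentioned above, here is a minimal sketch of why `cls.__new__(cls)` matters for subclasses. It is an illustration only: `Container`, `MyContainer`, and the `_data` attribute are hypothetical stand-ins, and the real `_new_unsafe` in TensorDict is not shown in this thread and may differ. The point is that allocating through `cls` skips `__init__` while still returning an instance of whichever subclass the classmethod was called on.

```python
class Container:
    """Hypothetical stand-in for a dict-like container such as TensorDict."""

    def __init__(self, data=None):
        # The regular constructor copies its input.
        self._data = dict(data or {})

    @classmethod
    def _new_unsafe(cls, data):
        # Allocate an instance of `cls` (which may be a subclass)
        # without running __init__.
        self = cls.__new__(cls)
        # Store the input as-is: no copy, no validation.
        self._data = data
        return self


class MyContainer(Container):
    """Hypothetical user-defined subclass."""


source = {"a": 1}
obj = MyContainer._new_unsafe(source)
print(type(obj).__name__)
```

Calling `Container(...)` directly inside `_new_unsafe` would instead always return the base class, which is presumably why the `cls.__new__(cls)` call is needed when users subclass.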

@vmoens added the bug label Jan 30, 2025
@vmoens merged commit 90e8f8b into gh/vmoens/48/base Jan 30, 2025
34 of 40 checks passed
@vmoens deleted the gh/vmoens/48/head branch January 30, 2025 21:20

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: $\large\color{#35bf28}10$. Worsened: $\large\color{#d91a1a}7$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 46.9180μs 20.7259μs 48.2489 KOps/s 48.5726 KOps/s $\color{#d91a1a}-0.67\%$
test_plain_set_stack_nested 52.3370μs 20.8117μs 48.0500 KOps/s 48.1692 KOps/s $\color{#d91a1a}-0.25\%$
test_plain_set_nested_inplace 50.4850μs 22.4886μs 44.4670 KOps/s 44.1645 KOps/s $\color{#35bf28}+0.68\%$
test_plain_set_stack_nested_inplace 75.7330μs 22.4712μs 44.5013 KOps/s 44.3009 KOps/s $\color{#35bf28}+0.45\%$
test_items 28.1730μs 4.1951μs 238.3716 KOps/s 234.6552 KOps/s $\color{#35bf28}+1.58\%$
test_items_nested 0.5130ms 0.4007ms 2.4956 KOps/s 2.4890 KOps/s $\color{#35bf28}+0.26\%$
test_items_nested_locked 0.4630ms 0.4033ms 2.4793 KOps/s 2.4715 KOps/s $\color{#35bf28}+0.32\%$
test_items_nested_leaf 0.1383ms 75.9129μs 13.1730 KOps/s 12.7091 KOps/s $\color{#35bf28}+3.65\%$
test_items_stack_nested 0.5129ms 0.4066ms 2.4593 KOps/s 2.4892 KOps/s $\color{#d91a1a}-1.20\%$
test_items_stack_nested_leaf 0.1271ms 78.6654μs 12.7121 KOps/s 12.2665 KOps/s $\color{#35bf28}+3.63\%$
test_items_stack_nested_locked 0.5981ms 0.4049ms 2.4698 KOps/s 2.4649 KOps/s $\color{#35bf28}+0.20\%$
test_keys 22.8930μs 3.5299μs 283.2960 KOps/s 286.6233 KOps/s $\color{#d91a1a}-1.16\%$
test_keys_nested 0.2217ms 0.1640ms 6.0965 KOps/s 5.9659 KOps/s $\color{#35bf28}+2.19\%$
test_keys_nested_locked 1.7364ms 0.1715ms 5.8324 KOps/s 5.8456 KOps/s $\color{#d91a1a}-0.23\%$
test_keys_nested_leaf 0.2690ms 0.1439ms 6.9486 KOps/s 7.0013 KOps/s $\color{#d91a1a}-0.75\%$
test_keys_stack_nested 0.2683ms 0.1644ms 6.0819 KOps/s 6.1663 KOps/s $\color{#d91a1a}-1.37\%$
test_keys_stack_nested_leaf 0.3256ms 0.1436ms 6.9658 KOps/s 7.0528 KOps/s $\color{#d91a1a}-1.23\%$
test_keys_stack_nested_locked 0.2979ms 0.1701ms 5.8795 KOps/s 5.9300 KOps/s $\color{#d91a1a}-0.85\%$
test_values 8.9366μs 1.0399μs 961.6549 KOps/s 945.6124 KOps/s $\color{#35bf28}+1.70\%$
test_values_nested 0.1468ms 62.0415μs 16.1183 KOps/s 16.2113 KOps/s $\color{#d91a1a}-0.57\%$
test_values_nested_locked 0.1128ms 61.5156μs 16.2561 KOps/s 16.1643 KOps/s $\color{#35bf28}+0.57\%$
test_values_nested_leaf 0.1202ms 70.7795μs 14.1284 KOps/s 13.3768 KOps/s $\textbf{\color{#35bf28}+5.62\%}$
test_values_stack_nested 0.1110ms 62.7095μs 15.9466 KOps/s 15.8486 KOps/s $\color{#35bf28}+0.62\%$
test_values_stack_nested_leaf 0.1389ms 71.2636μs 14.0324 KOps/s 14.0895 KOps/s $\color{#d91a1a}-0.41\%$
test_values_stack_nested_locked 0.1123ms 63.3787μs 15.7782 KOps/s 15.8525 KOps/s $\color{#d91a1a}-0.47\%$
test_membership 36.0070μs 0.8697μs 1.1498 MOps/s 1.1625 MOps/s $\color{#d91a1a}-1.09\%$
test_membership_nested 20.1370μs 2.8857μs 346.5338 KOps/s 348.0655 KOps/s $\color{#d91a1a}-0.44\%$
test_membership_nested_leaf 41.7770μs 2.8981μs 345.0562 KOps/s 345.0971 KOps/s $\color{#d91a1a}-0.01\%$
test_membership_stacked_nested 31.8590μs 2.8621μs 349.3984 KOps/s 348.7326 KOps/s $\color{#35bf28}+0.19\%$
test_membership_stacked_nested_leaf 50.3140μs 2.8699μs 348.4494 KOps/s 340.3026 KOps/s $\color{#35bf28}+2.39\%$
test_membership_nested_last 26.1490μs 4.3197μs 231.4973 KOps/s 234.1807 KOps/s $\color{#d91a1a}-1.15\%$
test_membership_nested_leaf_last 34.2040μs 4.3000μs 232.5558 KOps/s 234.6854 KOps/s $\color{#d91a1a}-0.91\%$
test_membership_stacked_nested_last 45.0740μs 4.2780μs 233.7567 KOps/s 106.5999 KOps/s $\textbf{\color{#35bf28}+119.28\%}$
test_membership_stacked_nested_leaf_last 25.7880μs 4.2969μs 232.7233 KOps/s 107.4461 KOps/s $\textbf{\color{#35bf28}+116.60\%}$
test_nested_getleaf 60.5620μs 10.4576μs 95.6244 KOps/s 95.8948 KOps/s $\color{#d91a1a}-0.28\%$
test_nested_get 34.4940μs 9.9610μs 100.3918 KOps/s 100.4640 KOps/s $\color{#d91a1a}-0.07\%$
test_stacked_getleaf 31.0880μs 10.4674μs 95.5349 KOps/s 94.9007 KOps/s $\color{#35bf28}+0.67\%$
test_stacked_get 30.6770μs 9.9423μs 100.5806 KOps/s 99.5029 KOps/s $\color{#35bf28}+1.08\%$
test_nested_getitemleaf 31.8990μs 11.0662μs 90.3656 KOps/s 89.7426 KOps/s $\color{#35bf28}+0.69\%$
test_nested_getitem 44.8030μs 10.5509μs 94.7791 KOps/s 93.7805 KOps/s $\color{#35bf28}+1.06\%$
test_stacked_getitemleaf 49.6940μs 11.0455μs 90.5344 KOps/s 89.5265 KOps/s $\color{#35bf28}+1.13\%$
test_stacked_getitem 35.9870μs 10.6902μs 93.5436 KOps/s 92.7883 KOps/s $\color{#35bf28}+0.81\%$
test_lock_nested 0.5312ms 0.4107ms 2.4350 KOps/s 2.4099 KOps/s $\color{#35bf28}+1.04\%$
test_lock_stack_nested 0.5833ms 0.4156ms 2.4060 KOps/s 2.3903 KOps/s $\color{#35bf28}+0.66\%$
test_unlock_nested 0.7877ms 0.3477ms 2.8763 KOps/s 2.9849 KOps/s $\color{#d91a1a}-3.64\%$
test_unlock_stack_nested 0.6661ms 0.3422ms 2.9219 KOps/s 2.9711 KOps/s $\color{#d91a1a}-1.66\%$
test_flatten_speed 0.2007ms 98.9567μs 10.1054 KOps/s 9.8063 KOps/s $\color{#35bf28}+3.05\%$
test_unflatten_speed 0.8887ms 0.5205ms 1.9212 KOps/s 1.9279 KOps/s $\color{#d91a1a}-0.35\%$
test_common_ops 3.9522ms 0.8146ms 1.2276 KOps/s 1.2228 KOps/s $\color{#35bf28}+0.40\%$
test_creation 37.6010μs 2.4701μs 404.8499 KOps/s 397.9331 KOps/s $\color{#35bf28}+1.74\%$
test_creation_empty 38.3120μs 12.3892μs 80.7154 KOps/s 81.8137 KOps/s $\color{#d91a1a}-1.34\%$
test_creation_nested_1 45.0950μs 15.7290μs 63.5770 KOps/s 66.5186 KOps/s $\color{#d91a1a}-4.42\%$
test_creation_nested_2 45.1340μs 20.2803μs 49.3089 KOps/s 50.6393 KOps/s $\color{#d91a1a}-2.63\%$
test_clone 39.5940μs 13.2640μs 75.3918 KOps/s 73.4811 KOps/s $\color{#35bf28}+2.60\%$
test_getitem[int] 0.7230ms 12.6816μs 78.8544 KOps/s 79.5262 KOps/s $\color{#d91a1a}-0.84\%$
test_getitem[slice_int] 0.1349ms 24.8962μs 40.1668 KOps/s 40.7346 KOps/s $\color{#d91a1a}-1.39\%$
test_getitem[range] 0.1580ms 50.9748μs 19.6176 KOps/s 19.3407 KOps/s $\color{#35bf28}+1.43\%$
test_getitem[tuple] 0.1236ms 19.9340μs 50.1654 KOps/s 49.7412 KOps/s $\color{#35bf28}+0.85\%$
test_getitem[list] 0.1597ms 45.5170μs 21.9698 KOps/s 21.6485 KOps/s $\color{#35bf28}+1.48\%$
test_setitem_dim[int] 47.0780μs 26.3623μs 37.9330 KOps/s 36.4213 KOps/s $\color{#35bf28}+4.15\%$
test_setitem_dim[slice_int] 86.6710μs 52.5550μs 19.0277 KOps/s 18.4651 KOps/s $\color{#35bf28}+3.05\%$
test_setitem_dim[range] 0.1404ms 78.4312μs 12.7500 KOps/s 12.6381 KOps/s $\color{#35bf28}+0.89\%$
test_setitem_dim[tuple] 74.2980μs 41.5362μs 24.0754 KOps/s 23.5306 KOps/s $\color{#35bf28}+2.32\%$
test_setitem 67.8360μs 21.5631μs 46.3756 KOps/s 48.0337 KOps/s $\color{#d91a1a}-3.45\%$
test_set 56.1250μs 20.4058μs 49.0056 KOps/s 49.2652 KOps/s $\color{#d91a1a}-0.53\%$
test_set_shared 0.3063ms 0.1776ms 5.6310 KOps/s 5.3343 KOps/s $\textbf{\color{#35bf28}+5.56\%}$
test_update 0.1210ms 23.6318μs 42.3159 KOps/s 42.7137 KOps/s $\color{#d91a1a}-0.93\%$
test_update_nested 91.0300μs 33.7202μs 29.6558 KOps/s 29.6000 KOps/s $\color{#35bf28}+0.19\%$
test_update__nested 0.4307ms 34.4236μs 29.0499 KOps/s 29.0671 KOps/s $\color{#d91a1a}-0.06\%$
test_set_nested 84.7080μs 22.6843μs 44.0834 KOps/s 44.6290 KOps/s $\color{#d91a1a}-1.22\%$
test_set_nested_new 69.3890μs 27.9376μs 35.7941 KOps/s 36.4281 KOps/s $\color{#d91a1a}-1.74\%$
test_select 94.8970μs 44.2573μs 22.5952 KOps/s 22.2233 KOps/s $\color{#35bf28}+1.67\%$
test_select_nested 0.1280ms 62.4304μs 16.0178 KOps/s 15.9265 KOps/s $\color{#35bf28}+0.57\%$
test_exclude_nested 0.1561ms 80.3586μs 12.4442 KOps/s 12.3591 KOps/s $\color{#35bf28}+0.69\%$
test_empty[True] 0.5694ms 0.4085ms 2.4477 KOps/s 2.4303 KOps/s $\color{#35bf28}+0.71\%$
test_empty[False] 9.0243μs 1.3692μs 730.3283 KOps/s 719.7549 KOps/s $\color{#35bf28}+1.47\%$
test_unbind_speed 0.4622ms 0.2755ms 3.6293 KOps/s 3.6865 KOps/s $\color{#d91a1a}-1.55\%$
test_unbind_speed_stack0 0.4135ms 0.2714ms 3.6840 KOps/s 3.7602 KOps/s $\color{#d91a1a}-2.03\%$
test_unbind_speed_stack1 94.4239ms 0.7239ms 1.3815 KOps/s 1.3961 KOps/s $\color{#d91a1a}-1.05\%$
test_split 98.6612ms 1.7134ms 583.6266 Ops/s 527.3089 Ops/s $\textbf{\color{#35bf28}+10.68\%}$
test_chunk 0.1021s 1.7310ms 577.7174 Ops/s 636.3724 Ops/s $\textbf{\color{#d91a1a}-9.22\%}$
test_consolidate_njt[False-None] 8.9671ms 8.3954ms 119.1132 Ops/s 120.5016 Ops/s $\color{#d91a1a}-1.15\%$
test_creation[device0] 0.2320ms 91.5815μs 10.9192 KOps/s 11.0135 KOps/s $\color{#d91a1a}-0.86\%$
test_creation_from_tensor 0.2290ms 94.1887μs 10.6170 KOps/s 10.5388 KOps/s $\color{#35bf28}+0.74\%$
test_add_one[memmap_tensor0] 92.7730μs 5.1953μs 192.4802 KOps/s 197.2354 KOps/s $\color{#d91a1a}-2.41\%$
test_contiguous[memmap_tensor0] 21.6110μs 0.5065μs 1.9744 MOps/s 1.8168 MOps/s $\textbf{\color{#35bf28}+8.67\%}$
test_stack[memmap_tensor0] 25.7480μs 3.3855μs 295.3786 KOps/s 295.1337 KOps/s $\color{#35bf28}+0.08\%$
test_memmaptd_index 1.2558ms 0.2301ms 4.3462 KOps/s 4.4180 KOps/s $\color{#d91a1a}-1.63\%$
test_memmaptd_index_astensor 0.6637ms 0.3175ms 3.1499 KOps/s 3.2274 KOps/s $\color{#d91a1a}-2.40\%$
test_memmaptd_index_op 0.9360ms 0.5909ms 1.6923 KOps/s 1.6802 KOps/s $\color{#35bf28}+0.72\%$
test_serialize_model 0.2070s 0.1294s 7.7261 Ops/s 8.7998 Ops/s $\textbf{\color{#d91a1a}-12.20\%}$
test_serialize_model_pickle 0.4600s 0.3904s 2.5616 Ops/s 2.5701 Ops/s $\color{#d91a1a}-0.33\%$
test_serialize_weights 0.1194s 0.1123s 8.9030 Ops/s 8.7159 Ops/s $\color{#35bf28}+2.15\%$
test_serialize_weights_returnearly 0.1779s 0.1607s 6.2227 Ops/s 6.4811 Ops/s $\color{#d91a1a}-3.99\%$
test_serialize_weights_pickle 0.5729s 0.4310s 2.3200 Ops/s 2.4981 Ops/s $\textbf{\color{#d91a1a}-7.13\%}$
test_serialize_weights_filesystem 0.2341s 0.1558s 6.4180 Ops/s 6.9585 Ops/s $\textbf{\color{#d91a1a}-7.77\%}$
test_serialize_model_filesystem 0.1586s 0.1463s 6.8370 Ops/s 6.5465 Ops/s $\color{#35bf28}+4.44\%$
test_reshape_pytree 70.9720μs 26.1963μs 38.1733 KOps/s 36.9297 KOps/s $\color{#35bf28}+3.37\%$
test_reshape_td 0.1373ms 33.3935μs 29.9459 KOps/s 29.6352 KOps/s $\color{#35bf28}+1.05\%$
test_view_pytree 73.4970μs 26.1928μs 38.1784 KOps/s 38.5272 KOps/s $\color{#d91a1a}-0.91\%$
test_view_td 81.2420μs 38.4767μs 25.9898 KOps/s 25.4867 KOps/s $\color{#35bf28}+1.97\%$
test_unbind_pytree 95.4060μs 29.4831μs 33.9178 KOps/s 33.9332 KOps/s $\color{#d91a1a}-0.05\%$
test_unbind_td 0.3391ms 40.9415μs 24.4251 KOps/s 24.9783 KOps/s $\color{#d91a1a}-2.21\%$
test_split_pytree 74.6790μs 29.3479μs 34.0740 KOps/s 34.2796 KOps/s $\color{#d91a1a}-0.60\%$
test_split_td 0.4673ms 45.3862μs 22.0331 KOps/s 22.3567 KOps/s $\color{#d91a1a}-1.45\%$
test_add_pytree 79.2080μs 35.4503μs 28.2085 KOps/s 28.1414 KOps/s $\color{#35bf28}+0.24\%$
test_add_td 0.1664ms 59.1011μs 16.9202 KOps/s 16.9177 KOps/s $\color{#35bf28}+0.01\%$
test_compile_add_one_nested[tensordict-compile] 0.1489ms 67.0023μs 14.9249 KOps/s 14.6499 KOps/s $\color{#35bf28}+1.88\%$
test_compile_add_one_nested[tensordict-eager] 0.3589ms 0.1747ms 5.7247 KOps/s 5.7156 KOps/s $\color{#35bf28}+0.16\%$
test_compile_add_one_nested[pytree-compile] 0.1155ms 45.9728μs 21.7520 KOps/s 21.9363 KOps/s $\color{#d91a1a}-0.84\%$
test_compile_add_one_nested[pytree-eager] 0.2323ms 0.1186ms 8.4348 KOps/s 8.4700 KOps/s $\color{#d91a1a}-0.42\%$
test_compile_copy_nested[tensordict-compile] 95.4080μs 29.9771μs 33.3588 KOps/s 34.4664 KOps/s $\color{#d91a1a}-3.21\%$
test_compile_copy_nested[tensordict-eager] 0.1179ms 57.7106μs 17.3278 KOps/s 16.8148 KOps/s $\color{#35bf28}+3.05\%$
test_compile_copy_nested[pytree-compile] 0.1694ms 79.5268μs 12.5744 KOps/s 12.2701 KOps/s $\color{#35bf28}+2.48\%$
test_compile_copy_nested[pytree-eager] 0.1261ms 65.9651μs 15.1595 KOps/s 14.7854 KOps/s $\color{#35bf28}+2.53\%$
test_compile_add_one_flat[tensordict-compile] 0.2451ms 0.1070ms 9.3447 KOps/s 9.2633 KOps/s $\color{#35bf28}+0.88\%$
test_compile_add_one_flat[tensordict-eager] 0.3143ms 0.2179ms 4.5886 KOps/s 4.6365 KOps/s $\color{#d91a1a}-1.03\%$
test_compile_add_one_flat[tensorclass-compile] 0.1012ms 46.2876μs 21.6041 KOps/s 21.3114 KOps/s $\color{#35bf28}+1.37\%$
test_compile_add_one_flat[tensorclass-eager] 0.1558ms 67.8201μs 14.7449 KOps/s 14.6224 KOps/s $\color{#35bf28}+0.84\%$
test_compile_add_one_flat[pytree-compile] 0.1813ms 99.9242μs 10.0076 KOps/s 9.9343 KOps/s $\color{#35bf28}+0.74\%$
test_compile_add_one_flat[pytree-eager] 0.3612ms 0.2065ms 4.8432 KOps/s 4.9183 KOps/s $\color{#d91a1a}-1.53\%$
test_compile_add_self_flat[tensordict-eager] 1.3993ms 0.2376ms 4.2093 KOps/s 4.2387 KOps/s $\color{#d91a1a}-0.69\%$
test_compile_add_self_flat[tensordict-compile] 0.2370ms 0.1097ms 9.1186 KOps/s 9.3799 KOps/s $\color{#d91a1a}-2.79\%$
test_compile_add_self_flat[tensorclass-eager] 0.1671ms 65.3462μs 15.3031 KOps/s 15.4605 KOps/s $\color{#d91a1a}-1.02\%$
test_compile_add_self_flat[tensorclass-compile] 0.1113ms 48.9036μs 20.4484 KOps/s 20.6459 KOps/s $\color{#d91a1a}-0.96\%$
test_compile_add_self_flat[pytree-eager] 0.2919ms 0.1600ms 6.2498 KOps/s 6.2755 KOps/s $\color{#d91a1a}-0.41\%$
test_compile_add_self_flat[pytree-compile] 0.2280ms 0.1005ms 9.9477 KOps/s 9.9199 KOps/s $\color{#35bf28}+0.28\%$
test_compile_copy_flat[tensordict-compile] 78.0330μs 21.8261μs 45.8166 KOps/s 47.0918 KOps/s $\color{#d91a1a}-2.71\%$
test_compile_copy_flat[tensordict-eager] 0.1606ms 67.3843μs 14.8402 KOps/s 14.4707 KOps/s $\color{#35bf28}+2.55\%$
test_compile_copy_flat[pytree-compile] 0.1632ms 81.3273μs 12.2960 KOps/s 11.7956 KOps/s $\color{#35bf28}+4.24\%$
test_compile_copy_flat[pytree-eager] 0.1670ms 66.9902μs 14.9276 KOps/s 14.2072 KOps/s $\textbf{\color{#35bf28}+5.07\%}$
test_compile_assign_and_add[tensordict-compile] 0.4300ms 0.2148ms 4.6564 KOps/s 4.7312 KOps/s $\color{#d91a1a}-1.58\%$
test_compile_assign_and_add[tensordict-eager] 2.4188ms 1.4146ms 706.9270 Ops/s 743.1811 Ops/s $\color{#d91a1a}-4.88\%$
test_compile_assign_and_add[pytree-compile] 0.4246ms 0.2115ms 4.7277 KOps/s 4.7463 KOps/s $\color{#d91a1a}-0.39\%$
test_compile_assign_and_add[pytree-eager] 1.0114ms 0.8345ms 1.1983 KOps/s 1.2119 KOps/s $\color{#d91a1a}-1.12\%$
test_compile_assign_and_add_stack[compile] 0.5576ms 0.4575ms 2.1857 KOps/s 2.1724 KOps/s $\color{#35bf28}+0.61\%$
test_compile_assign_and_add_stack[eager] 4.3735ms 2.7952ms 357.7538 Ops/s 360.0877 Ops/s $\color{#d91a1a}-0.65\%$
test_compile_indexing[tensor-tensordict-compile] 86.3210μs 38.1298μs 26.2262 KOps/s 25.4827 KOps/s $\color{#35bf28}+2.92\%$
test_compile_indexing[tensor-tensordict-eager] 0.5470ms 32.4742μs 30.7937 KOps/s 29.4404 KOps/s $\color{#35bf28}+4.60\%$
test_compile_indexing[tensor-tensorclass-compile] 96.1990μs 30.7084μs 32.5643 KOps/s 32.0452 KOps/s $\color{#35bf28}+1.62\%$
test_compile_indexing[tensor-tensorclass-eager] 64.8010μs 23.0635μs 43.3585 KOps/s 43.3485 KOps/s $\color{#35bf28}+0.02\%$
test_compile_indexing[tensor-pytree-compile] 81.8230μs 31.5004μs 31.7457 KOps/s 31.3761 KOps/s $\color{#35bf28}+1.18\%$
test_compile_indexing[tensor-pytree-eager] 57.9890μs 23.2594μs 42.9934 KOps/s 43.5737 KOps/s $\color{#d91a1a}-1.33\%$
test_compile_indexing[slice-tensordict-compile] 0.1402ms 54.6625μs 18.2941 KOps/s 18.4405 KOps/s $\color{#d91a1a}-0.79\%$
test_compile_indexing[slice-tensordict-eager] 0.4362ms 19.7113μs 50.7322 KOps/s 48.8079 KOps/s $\color{#35bf28}+3.94\%$
test_compile_indexing[slice-tensorclass-compile] 94.5270μs 47.2268μs 21.1744 KOps/s 21.2165 KOps/s $\color{#d91a1a}-0.20\%$
test_compile_indexing[slice-tensorclass-eager] 68.8480μs 18.8470μs 53.0588 KOps/s 54.1759 KOps/s $\color{#d91a1a}-2.06\%$
test_compile_indexing[slice-pytree-compile] 0.1117ms 47.3816μs 21.1052 KOps/s 20.8187 KOps/s $\color{#35bf28}+1.38\%$
test_compile_indexing[slice-pytree-eager] 71.5730μs 18.9364μs 52.8084 KOps/s 53.9820 KOps/s $\color{#d91a1a}-2.17\%$
test_compile_indexing[int-tensordict-compile] 0.1277ms 56.1459μs 17.8107 KOps/s 18.1029 KOps/s $\color{#d91a1a}-1.61\%$
test_compile_indexing[int-tensordict-eager] 0.9063ms 20.2408μs 49.4052 KOps/s 50.4727 KOps/s $\color{#d91a1a}-2.11\%$
test_compile_indexing[int-tensorclass-compile] 96.6400μs 46.8570μs 21.3415 KOps/s 20.8684 KOps/s $\color{#35bf28}+2.27\%$
test_compile_indexing[int-tensorclass-eager] 60.0520μs 18.5919μs 53.7868 KOps/s 53.8311 KOps/s $\color{#d91a1a}-0.08\%$
test_compile_indexing[int-pytree-compile] 0.1262ms 48.1630μs 20.7628 KOps/s 20.8311 KOps/s $\color{#d91a1a}-0.33\%$
test_compile_indexing[int-pytree-eager] 59.8410μs 18.6300μs 53.6767 KOps/s 53.8431 KOps/s $\color{#d91a1a}-0.31\%$
test_mod_add[eager] 97.6020μs 35.0677μs 28.5163 KOps/s 27.4788 KOps/s $\color{#35bf28}+3.78\%$
test_mod_add[compile] 0.1255ms 66.3932μs 15.0618 KOps/s 15.5799 KOps/s $\color{#d91a1a}-3.33\%$
test_mod_add[compile-overhead] 0.1117ms 64.7728μs 15.4386 KOps/s 15.5293 KOps/s $\color{#d91a1a}-0.58\%$
test_mod_wrap[eager] 0.3246ms 0.2237ms 4.4697 KOps/s 4.4417 KOps/s $\color{#35bf28}+0.63\%$
test_mod_wrap[compile] 1.5771ms 0.2295ms 4.3578 KOps/s 4.4105 KOps/s $\color{#d91a1a}-1.19\%$
test_mod_wrap[compile-overhead] 0.3723ms 0.2245ms 4.4540 KOps/s 4.4680 KOps/s $\color{#d91a1a}-0.32\%$
test_mod_wrap_and_backward[eager] 18.9306ms 12.9410ms 77.2740 Ops/s 72.7614 Ops/s $\textbf{\color{#35bf28}+6.20\%}$
test_mod_wrap_and_backward[compile] 14.5410ms 11.4313ms 87.4793 Ops/s 88.2068 Ops/s $\color{#d91a1a}-0.82\%$
test_mod_wrap_and_backward[compile-overhead] 13.6019ms 11.4003ms 87.7169 Ops/s 85.6350 Ops/s $\color{#35bf28}+2.43\%$
test_seq_add[eager] 0.2535ms 0.1174ms 8.5209 KOps/s 8.2414 KOps/s $\color{#35bf28}+3.39\%$
test_seq_add[compile] 0.1422ms 77.2482μs 12.9453 KOps/s 12.9695 KOps/s $\color{#d91a1a}-0.19\%$
test_seq_add[compile-overhead] 0.1515ms 77.1385μs 12.9637 KOps/s 13.3900 KOps/s $\color{#d91a1a}-3.18\%$
test_seq_wrap[eager] 0.6694ms 0.4477ms 2.2339 KOps/s 2.2286 KOps/s $\color{#35bf28}+0.24\%$
test_seq_wrap[compile] 0.4077ms 0.2415ms 4.1416 KOps/s 4.1656 KOps/s $\color{#d91a1a}-0.58\%$
test_seq_wrap[compile-overhead] 0.4317ms 0.2423ms 4.1271 KOps/s 4.1794 KOps/s $\color{#d91a1a}-1.25\%$
test_func_call_runtime[False-eager] 0.9530ms 0.5485ms 1.8233 KOps/s 1.7884 KOps/s $\color{#35bf28}+1.95\%$
test_func_call_runtime[False-compile] 0.5479ms 0.4462ms 2.2409 KOps/s 2.2652 KOps/s $\color{#d91a1a}-1.07\%$
test_func_call_runtime[False-compile-overhead] 0.6240ms 0.4428ms 2.2581 KOps/s 2.2593 KOps/s $\color{#d91a1a}-0.05\%$
test_func_call_runtime[True-eager] 0.8819ms 0.7511ms 1.3314 KOps/s 1.2985 KOps/s $\color{#35bf28}+2.54\%$
test_func_call_runtime[True-compile] 0.7098ms 0.4652ms 2.1495 KOps/s 2.1610 KOps/s $\color{#d91a1a}-0.53\%$
test_func_call_runtime[True-compile-overhead] 0.6342ms 0.4665ms 2.1438 KOps/s 2.1512 KOps/s $\color{#d91a1a}-0.35\%$
test_func_call_cm_runtime[False-eager] 0.9220ms 0.5471ms 1.8279 KOps/s 1.8071 KOps/s $\color{#35bf28}+1.15\%$
test_func_call_cm_runtime[False-compile] 0.6130ms 0.4436ms 2.2541 KOps/s 2.2772 KOps/s $\color{#d91a1a}-1.02\%$
test_func_call_cm_runtime[False-compile-overhead] 0.8376ms 0.4466ms 2.2391 KOps/s 2.2632 KOps/s $\color{#d91a1a}-1.06\%$
test_func_call_cm_runtime[True-eager] 1.3700ms 0.9061ms 1.1036 KOps/s 1.0925 KOps/s $\color{#35bf28}+1.01\%$
test_func_call_cm_runtime[True-compile] 0.9398ms 0.8040ms 1.2437 KOps/s 1.2134 KOps/s $\color{#35bf28}+2.50\%$
test_func_call_cm_runtime[True-compile-overhead] 0.9213ms 0.8099ms 1.2347 KOps/s 1.2064 KOps/s $\color{#35bf28}+2.34\%$
test_vmap_func_call_cm_runtime[eager] 2.4445ms 1.9092ms 523.7852 Ops/s 510.2828 Ops/s $\color{#35bf28}+2.65\%$
test_vmap_func_call_cm_runtime[compile] 0.9166ms 0.5428ms 1.8424 KOps/s 1.8415 KOps/s $\color{#35bf28}+0.04\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.9632ms 0.5425ms 1.8432 KOps/s 1.8606 KOps/s $\color{#d91a1a}-0.94\%$
test_distributed 0.2693ms 0.1245ms 8.0321 KOps/s 7.7932 KOps/s $\color{#35bf28}+3.06\%$
test_tdmodule 83.1850μs 27.4331μs 36.4523 KOps/s 36.9092 KOps/s $\color{#d91a1a}-1.24\%$
test_tdmodule_dispatch 0.1178ms 61.1391μs 16.3562 KOps/s 20.5099 KOps/s $\textbf{\color{#d91a1a}-20.25\%}$
test_tdseq 51.3960μs 30.1103μs 33.2112 KOps/s 33.7387 KOps/s $\color{#d91a1a}-1.56\%$
test_tdseq_dispatch 86.1710μs 56.2184μs 17.7878 KOps/s 18.0814 KOps/s $\color{#d91a1a}-1.62\%$
test_instantiation_functorch 2.4234ms 1.5393ms 649.6437 Ops/s 649.3174 Ops/s $\color{#35bf28}+0.05\%$
test_exec_functorch 1.2351ms 0.1807ms 5.5352 KOps/s 5.5456 KOps/s $\color{#d91a1a}-0.19\%$
test_exec_functional_call 0.3145ms 0.1697ms 5.8935 KOps/s 5.7465 KOps/s $\color{#35bf28}+2.56\%$
test_exec_td_decorator 0.4873ms 0.2313ms 4.3235 KOps/s 4.2549 KOps/s $\color{#35bf28}+1.61\%$
test_vmap_mlp_speed_decorator[True-True] 1.1182ms 0.6667ms 1.4999 KOps/s 1.4867 KOps/s $\color{#35bf28}+0.89\%$
test_vmap_mlp_speed_decorator[True-False] 1.1693ms 0.6799ms 1.4708 KOps/s 1.4934 KOps/s $\color{#d91a1a}-1.51\%$
test_vmap_mlp_speed_decorator[False-True] 0.8533ms 0.5416ms 1.8464 KOps/s 1.8496 KOps/s $\color{#d91a1a}-0.17\%$
test_vmap_mlp_speed_decorator[False-False] 0.7391ms 0.5366ms 1.8637 KOps/s 1.8542 KOps/s $\color{#35bf28}+0.51\%$
test_to_module_speed[True] 1.9018ms 1.3540ms 738.5363 Ops/s 741.6160 Ops/s $\color{#d91a1a}-0.42\%$
test_to_module_speed[False] 2.1712ms 1.3238ms 755.4125 Ops/s 753.8535 Ops/s $\color{#35bf28}+0.21\%$
test_tc_init 85.6500μs 47.7872μs 20.9261 KOps/s 20.5212 KOps/s $\color{#35bf28}+1.97\%$
test_tc_init_nested 0.1794ms 96.2895μs 10.3853 KOps/s 10.5290 KOps/s $\color{#d91a1a}-1.36\%$
test_tc_first_layer_tensor 20.9790μs 1.5799μs 632.9619 KOps/s 625.3602 KOps/s $\color{#35bf28}+1.22\%$
test_tc_first_layer_nontensor 30.8380μs 4.6697μs 214.1444 KOps/s 209.6225 KOps/s $\color{#35bf28}+2.16\%$
test_tc_second_layer_tensor 16.6410μs 2.8555μs 350.1973 KOps/s 342.0548 KOps/s $\color{#35bf28}+2.38\%$
test_tc_second_layer_nontensor 35.5460μs 5.9924μs 166.8775 KOps/s 162.4516 KOps/s $\color{#35bf28}+2.72\%$
test_unbind 0.2317s 13.1289ms 76.1677 Ops/s 71.6354 Ops/s $\textbf{\color{#35bf28}+6.33\%}$
test_full_like 8.8656ms 7.5763ms 131.9906 Ops/s 142.5827 Ops/s $\textbf{\color{#d91a1a}-7.43\%}$
test_zeros_like 8.0023ms 4.5671ms 218.9566 Ops/s 360.1037 Ops/s $\textbf{\color{#d91a1a}-39.20\%}$
test_ones_like 3.8182ms 3.2096ms 311.5685 Ops/s 314.5215 Ops/s $\color{#d91a1a}-0.94\%$
test_clone 6.5950ms 4.9694ms 201.2330 Ops/s 145.8524 Ops/s $\textbf{\color{#35bf28}+37.97\%}$
test_squeeze 59.1100μs 11.9042μs 84.0038 KOps/s 82.3075 KOps/s $\color{#35bf28}+2.06\%$
test_unsqueeze 0.1559ms 94.2769μs 10.6071 KOps/s 10.7222 KOps/s $\color{#d91a1a}-1.07\%$
test_split 0.4806ms 0.1944ms 5.1447 KOps/s 5.1136 KOps/s $\color{#35bf28}+0.61\%$
test_permute 0.3356ms 0.2030ms 4.9266 KOps/s 4.9956 KOps/s $\color{#d91a1a}-1.38\%$
test_stack 30.8350ms 25.6202ms 39.0317 Ops/s 39.2689 Ops/s $\color{#d91a1a}-0.60\%$
test_cat 31.3121ms 25.3185ms 39.4968 Ops/s 39.1688 Ops/s $\color{#35bf28}+0.84\%$
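
The Ops and Change columns in the table above appear to follow from simple arithmetic: Ops is roughly the reciprocal of Mean, and Change is the relative difference between Ops and Ops on Repo HEAD. The sketch below reproduces a few rows under that assumption. The 5% threshold in `classify` is inferred from which rows are bolded and from the Improved/Worsened totals, not from any documented rule of the benchmark bot, and all function names are hypothetical.

```python
def ops_per_second(mean_seconds: float) -> float:
    """Throughput implied by a mean run time given in seconds."""
    return 1.0 / mean_seconds


def percent_change(ops: float, ops_on_head: float) -> float:
    """Relative change of this PR's throughput versus repo HEAD, in percent."""
    return (ops - ops_on_head) / ops_on_head * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    """Label a change; the 5% threshold is an assumption inferred from the table."""
    if change_pct > threshold_pct:
        return "improved"
    if change_pct < -threshold_pct:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) copied from three rows of the CPU table.
rows = {
    "test_plain_set_nested": (48.2489, 48.5726),
    "test_membership_stacked_nested_last": (233.7567, 106.5999),
    "test_zeros_like": (218.9566, 360.1037),
}

for name, (ops, head) in rows.items():
    change = percent_change(ops, head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Because both columns come from a single benchmark run, changes within a few percent in either direction are plausibly noise rather than an effect of the PR.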


$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 229. Improved: $\large\color{#35bf28}42$. Worsened: $\large\color{#d91a1a}13$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 34.2120μs 12.0118μs 83.2514 KOps/s 75.0706 KOps/s $\textbf{\color{#35bf28}+10.90\%}$
test_plain_set_stack_nested 35.6910μs 12.3043μs 81.2724 KOps/s 74.2687 KOps/s $\textbf{\color{#35bf28}+9.43\%}$
test_plain_set_nested_inplace 75.1730μs 13.0428μs 76.6709 KOps/s 70.0296 KOps/s $\textbf{\color{#35bf28}+9.48\%}$
test_plain_set_stack_nested_inplace 46.7930μs 13.1337μs 76.1399 KOps/s 69.1898 KOps/s $\textbf{\color{#35bf28}+10.04\%}$
test_items 22.8410μs 2.9414μs 339.9774 KOps/s 341.9676 KOps/s $\color{#d91a1a}-0.58\%$
test_items_nested 0.5157ms 0.3692ms 2.7082 KOps/s 2.7036 KOps/s $\color{#35bf28}+0.17\%$
test_items_nested_locked 0.4225ms 0.3693ms 2.7076 KOps/s 2.7010 KOps/s $\color{#35bf28}+0.24\%$
test_items_nested_leaf 0.1845ms 58.7249μs 17.0286 KOps/s 16.8228 KOps/s $\color{#35bf28}+1.22\%$
test_items_stack_nested 0.4190ms 0.3660ms 2.7324 KOps/s 2.7030 KOps/s $\color{#35bf28}+1.09\%$
test_items_stack_nested_leaf 89.5740μs 58.9740μs 16.9566 KOps/s 16.6034 KOps/s $\color{#35bf28}+2.13\%$
test_items_stack_nested_locked 0.4528ms 0.3603ms 2.7754 KOps/s 2.6918 KOps/s $\color{#35bf28}+3.11\%$
test_keys 35.8120μs 3.9629μs 252.3433 KOps/s 286.7386 KOps/s $\textbf{\color{#d91a1a}-12.00\%}$
test_keys_nested 0.1216ms 90.4293μs 11.0584 KOps/s 11.2360 KOps/s $\color{#d91a1a}-1.58\%$
test_keys_nested_locked 0.7082ms 97.0687μs 10.3020 KOps/s 10.5764 KOps/s $\color{#d91a1a}-2.59\%$
test_keys_nested_leaf 0.2720ms 81.4500μs 12.2775 KOps/s 12.4888 KOps/s $\color{#d91a1a}-1.69\%$
test_keys_stack_nested 0.1728ms 90.9651μs 10.9932 KOps/s 11.0427 KOps/s $\color{#d91a1a}-0.45\%$
test_keys_stack_nested_leaf 0.1129ms 81.1321μs 12.3256 KOps/s 12.2321 KOps/s $\color{#35bf28}+0.76\%$
test_keys_stack_nested_locked 0.1438ms 96.8199μs 10.3285 KOps/s 10.3849 KOps/s $\color{#d91a1a}-0.54\%$
test_values 4.8737μs 0.8573μs 1.1665 MOps/s 1.1606 MOps/s $\color{#35bf28}+0.51\%$
test_values_nested 0.1061ms 39.0741μs 25.5924 KOps/s 26.3603 KOps/s $\color{#d91a1a}-2.91\%$
test_values_nested_locked 0.1190ms 40.9359μs 24.4284 KOps/s 25.0165 KOps/s $\color{#d91a1a}-2.35\%$
test_values_nested_leaf 66.9040μs 44.3028μs 22.5719 KOps/s 23.7288 KOps/s $\color{#d91a1a}-4.88\%$
test_values_stack_nested 82.7840μs 39.1183μs 25.5635 KOps/s 26.1903 KOps/s $\color{#d91a1a}-2.39\%$
test_values_stack_nested_leaf 74.6830μs 44.2798μs 22.5836 KOps/s 23.3844 KOps/s $\color{#d91a1a}-3.42\%$
test_values_stack_nested_locked 81.5340μs 40.8572μs 24.4755 KOps/s 25.1003 KOps/s $\color{#d91a1a}-2.49\%$
test_membership 1.8251μs 0.5297μs 1.8878 MOps/s 1.9521 MOps/s $\color{#d91a1a}-3.30\%$
test_membership_nested 19.0710μs 2.0594μs 485.5692 KOps/s 469.6807 KOps/s $\color{#35bf28}+3.38\%$
test_membership_nested_leaf 14.2405μs 2.0721μs 482.6104 KOps/s 491.2522 KOps/s $\color{#d91a1a}-1.76\%$
test_membership_stacked_nested 21.4010μs 2.1962μs 455.3359 KOps/s 472.3534 KOps/s $\color{#d91a1a}-3.60\%$
test_membership_stacked_nested_leaf 29.3820μs 2.1299μs 469.5051 KOps/s 471.5876 KOps/s $\color{#d91a1a}-0.44\%$
test_membership_nested_last 57.5120μs 3.1789μs 314.5714 KOps/s 317.2171 KOps/s $\color{#d91a1a}-0.83\%$
test_membership_nested_leaf_last 25.2610μs 3.1745μs 315.0106 KOps/s 319.8643 KOps/s $\color{#d91a1a}-1.52\%$
test_membership_stacked_nested_last 31.3710μs 3.1971μs 312.7846 KOps/s 255.8945 KOps/s $\textbf{\color{#35bf28}+22.23\%}$
test_membership_stacked_nested_leaf_last 25.7410μs 3.2011μs 312.3887 KOps/s 256.1465 KOps/s $\textbf{\color{#35bf28}+21.96\%}$
test_nested_getleaf 36.4720μs 6.5121μs 153.5602 KOps/s 160.5756 KOps/s $\color{#d91a1a}-4.37\%$
test_nested_get 33.3610μs 6.1514μs 162.5647 KOps/s 169.9114 KOps/s $\color{#d91a1a}-4.32\%$
test_stacked_getleaf 27.4710μs 6.4499μs 155.0412 KOps/s 162.6193 KOps/s $\color{#d91a1a}-4.66\%$
test_stacked_get 28.2010μs 6.1709μs 162.0512 KOps/s 171.1267 KOps/s $\textbf{\color{#d91a1a}-5.30\%}$
test_nested_getitemleaf 38.8310μs 6.7562μs 148.0113 KOps/s 155.2945 KOps/s $\color{#d91a1a}-4.69\%$
test_nested_getitem 0.2087ms 6.4152μs 155.8795 KOps/s 160.4490 KOps/s $\color{#d91a1a}-2.85\%$
test_stacked_getitemleaf 28.0410μs 6.7181μs 148.8507 KOps/s 154.8143 KOps/s $\color{#d91a1a}-3.85\%$
test_stacked_getitem 35.2120μs 6.3900μs 156.4952 KOps/s 163.7727 KOps/s $\color{#d91a1a}-4.44\%$
test_lock_nested 0.7709ms 0.3461ms 2.8893 KOps/s 2.8310 KOps/s $\color{#35bf28}+2.06\%$
test_lock_stack_nested 0.4773ms 0.3499ms 2.8576 KOps/s 2.8831 KOps/s $\color{#d91a1a}-0.88\%$
test_unlock_nested 0.6335ms 0.2819ms 3.5474 KOps/s 3.5451 KOps/s $\color{#35bf28}+0.06\%$
test_unlock_stack_nested 0.3107ms 0.2834ms 3.5281 KOps/s 3.5268 KOps/s $\color{#35bf28}+0.04\%$
test_flatten_speed 0.1468ms 78.0574μs 12.8111 KOps/s 12.9948 KOps/s $\color{#d91a1a}-1.41\%$
test_unflatten_speed 0.3903ms 0.3363ms 2.9733 KOps/s 3.0706 KOps/s $\color{#d91a1a}-3.17\%$
test_common_ops 0.8694ms 0.5965ms 1.6764 KOps/s 1.4511 KOps/s $\textbf{\color{#35bf28}+15.53\%}$
test_creation 0.1499ms 1.7741μs 563.6724 KOps/s 563.5301 KOps/s $\color{#35bf28}+0.03\%$
test_creation_empty 28.2810μs 6.9772μs 143.3241 KOps/s 95.7935 KOps/s $\textbf{\color{#35bf28}+49.62\%}$
test_creation_nested_1 39.8720μs 8.6850μs 115.1414 KOps/s 82.4244 KOps/s $\textbf{\color{#35bf28}+39.69\%}$
test_creation_nested_2 33.8120μs 11.4431μs 87.3892 KOps/s 67.1998 KOps/s $\textbf{\color{#35bf28}+30.04\%}$
test_clone 0.1507ms 10.7655μs 92.8897 KOps/s 94.1331 KOps/s $\color{#d91a1a}-1.32\%$
test_getitem[int] 1.3586ms 10.7248μs 93.2422 KOps/s 93.5223 KOps/s $\color{#d91a1a}-0.30\%$
test_getitem[slice_int] 0.1214ms 20.9583μs 47.7138 KOps/s 47.1253 KOps/s $\color{#35bf28}+1.25\%$
test_getitem[range] 0.1553ms 37.1337μs 26.9297 KOps/s 26.8778 KOps/s $\color{#35bf28}+0.19\%$
test_getitem[tuple] 0.1062ms 18.1860μs 54.9873 KOps/s 54.0001 KOps/s $\color{#35bf28}+1.83\%$
test_getitem[list] 0.2269ms 33.0218μs 30.2830 KOps/s 30.2425 KOps/s $\color{#35bf28}+0.13\%$
test_setitem_dim[int] 39.9920μs 19.2927μs 51.8330 KOps/s 50.0419 KOps/s $\color{#35bf28}+3.58\%$
test_setitem_dim[slice_int] 61.1630μs 39.1329μs 25.5540 KOps/s 25.5384 KOps/s $\color{#35bf28}+0.06\%$
test_setitem_dim[range] 0.1479ms 54.6404μs 18.3015 KOps/s 18.5073 KOps/s $\color{#d91a1a}-1.11\%$
test_setitem_dim[tuple] 0.1364ms 32.2871μs 30.9722 KOps/s 29.6065 KOps/s $\color{#35bf28}+4.61\%$
test_setitem 72.2230μs 14.4226μs 69.3355 KOps/s 61.6379 KOps/s $\textbf{\color{#35bf28}+12.49\%}$
test_set 70.9530μs 14.0983μs 70.9306 KOps/s 63.2422 KOps/s $\textbf{\color{#35bf28}+12.16\%}$
test_set_shared 0.5136ms 0.1638ms 6.1058 KOps/s 5.9319 KOps/s $\color{#35bf28}+2.93\%$
test_update 0.4169ms 16.4257μs 60.8803 KOps/s 46.9188 KOps/s $\textbf{\color{#35bf28}+29.76\%}$
test_update_nested 0.1900ms 22.7882μs 43.8824 KOps/s 37.3343 KOps/s $\textbf{\color{#35bf28}+17.54\%}$
test_update__nested 0.5111ms 26.1770μs 38.2015 KOps/s 37.6947 KOps/s $\color{#35bf28}+1.34\%$
test_set_nested 0.1289ms 15.8060μs 63.2671 KOps/s 58.7112 KOps/s $\textbf{\color{#35bf28}+7.76\%}$
test_set_nested_new 0.1021ms 18.2085μs 54.9196 KOps/s 51.2996 KOps/s $\textbf{\color{#35bf28}+7.06\%}$
test_select 73.0730μs 30.3009μs 33.0024 KOps/s 31.6922 KOps/s $\color{#35bf28}+4.13\%$
test_select_nested 0.1942ms 44.3098μs 22.5684 KOps/s 22.4956 KOps/s $\color{#35bf28}+0.32\%$
test_exclude_nested 95.1740μs 64.3104μs 15.5496 KOps/s 15.7929 KOps/s $\color{#d91a1a}-1.54\%$
test_empty[True] 0.3731ms 0.3025ms 3.3056 KOps/s 3.3545 KOps/s $\color{#d91a1a}-1.46\%$
test_empty[False] 2.9872μs 0.8561μs 1.1681 MOps/s 1.1753 MOps/s $\color{#d91a1a}-0.61\%$
test_to 86.8340μs 60.2371μs 16.6011 KOps/s 17.8572 KOps/s $\textbf{\color{#d91a1a}-7.03\%}$
test_to_nonblocking 0.1949ms 47.4282μs 21.0845 KOps/s 21.0910 KOps/s $\color{#d91a1a}-0.03\%$
test_unbind_speed 0.3314ms 0.2455ms 4.0735 KOps/s 4.0648 KOps/s $\color{#35bf28}+0.21\%$
test_unbind_speed_stack0 0.3725ms 0.2413ms 4.1445 KOps/s 4.0807 KOps/s $\color{#35bf28}+1.56\%$
test_unbind_speed_stack1 98.5888ms 0.7470ms 1.3388 KOps/s 1.3378 KOps/s $\color{#35bf28}+0.08\%$
test_split 98.9011ms 1.6220ms 616.5277 Ops/s 623.8153 Ops/s $\color{#d91a1a}-1.17\%$
test_chunk 98.7181ms 1.6373ms 610.7648 Ops/s 623.8897 Ops/s $\color{#d91a1a}-2.10\%$
test_consolidate[False-None] 3.4176ms 2.6818ms 372.8808 Ops/s 368.9532 Ops/s $\color{#35bf28}+1.06\%$
test_consolidate[default-None] 1.8422ms 1.7260ms 579.3821 Ops/s 580.2726 Ops/s $\color{#d91a1a}-0.15\%$
test_consolidate[reduce-overhead-None] 1.8638ms 1.7608ms 567.9086 Ops/s 567.5432 Ops/s $\color{#35bf28}+0.06\%$
test_consolidate_njt[False-None] 6.7888ms 6.4913ms 154.0524 Ops/s 155.2707 Ops/s $\color{#d91a1a}-0.78\%$
test_to[False-False-None] 1.9183ms 1.7010ms 587.8905 Ops/s 590.7753 Ops/s $\color{#d91a1a}-0.49\%$
test_to[True-False-None] 1.5469ms 1.3304ms 751.6628 Ops/s 756.0876 Ops/s $\color{#d91a1a}-0.59\%$
test_to[within-False-None] 4.4248ms 4.1195ms 242.7498 Ops/s 242.0640 Ops/s $\color{#35bf28}+0.28\%$
test_to[True-default-None] 5.6568ms 5.3216ms 187.9119 Ops/s 187.2875 Ops/s $\color{#35bf28}+0.33\%$
test_to_njt[False-False-None] 7.3589ms 6.9654ms 143.5664 Ops/s 146.6790 Ops/s $\color{#d91a1a}-2.12\%$
test_to_njt[True-False-None] 5.6960ms 5.4300ms 184.1632 Ops/s 184.4491 Ops/s $\color{#d91a1a}-0.16\%$
test_to_njt[within-False-None] 12.4105ms 11.9386ms 83.7616 Ops/s 83.3080 Ops/s $\color{#35bf28}+0.54\%$
test_creation[device0] 0.4477ms 81.7769μs 12.2284 KOps/s 11.6585 KOps/s $\color{#35bf28}+4.89\%$
test_creation_from_tensor 0.5401ms 87.4568μs 11.4342 KOps/s 11.5381 KOps/s $\color{#d91a1a}-0.90\%$
test_add_one[memmap_tensor0] 0.3705ms 6.8345μs 146.3163 KOps/s 142.9702 KOps/s $\color{#35bf28}+2.34\%$
test_contiguous[memmap_tensor0] 9.0849μs 0.4218μs 2.3710 MOps/s 2.3226 MOps/s $\color{#35bf28}+2.09\%$
test_stack[memmap_tensor0] 38.2120μs 4.3137μs 231.8172 KOps/s 227.9275 KOps/s $\color{#35bf28}+1.71\%$
test_memmaptd_index 0.4400ms 0.2376ms 4.2096 KOps/s 4.1472 KOps/s $\color{#35bf28}+1.51\%$
test_memmaptd_index_astensor 0.4671ms 0.2986ms 3.3494 KOps/s 3.2957 KOps/s $\color{#35bf28}+1.63\%$
test_memmaptd_index_op 0.7839ms 0.5506ms 1.8164 KOps/s 1.6389 KOps/s $\textbf{\color{#35bf28}+10.83\%}$
test_serialize_model 0.1320s 0.1308s 7.6476 Ops/s 7.6563 Ops/s $\color{#d91a1a}-0.11\%$
test_serialize_model_pickle 1.3495s 1.2149s 0.8231 Ops/s 0.8224 Ops/s $\color{#35bf28}+0.09\%$
test_serialize_weights 0.4468s 0.1753s 5.7034 Ops/s 7.6900 Ops/s $\textbf{\color{#d91a1a}-25.83\%}$
test_serialize_weights_returnearly 0.3622s 56.7500ms 17.6212 Ops/s 24.2111 Ops/s $\textbf{\color{#d91a1a}-27.22\%}$
test_serialize_weights_pickle 1.3501s 1.2131s 0.8244 Ops/s 0.8209 Ops/s $\color{#35bf28}+0.42\%$
test_reshape_pytree 0.1731ms 22.4726μs 44.4985 KOps/s 44.1097 KOps/s $\color{#35bf28}+0.88\%$
test_reshape_td 0.1395ms 27.5861μs 36.2501 KOps/s 36.1132 KOps/s $\color{#35bf28}+0.38\%$
test_view_pytree 0.1840ms 22.2452μs 44.9536 KOps/s 45.3279 KOps/s $\color{#d91a1a}-0.83\%$
test_view_td 0.1533ms 30.7907μs 32.4774 KOps/s 31.4128 KOps/s $\color{#35bf28}+3.39\%$
test_unbind_pytree 0.1563ms 28.1853μs 35.4794 KOps/s 35.6514 KOps/s $\color{#d91a1a}-0.48\%$
test_unbind_td 0.7311ms 36.6020μs 27.3209 KOps/s 26.5673 KOps/s $\color{#35bf28}+2.84\%$
test_split_pytree 0.1938ms 30.9224μs 32.3390 KOps/s 32.8855 KOps/s $\color{#d91a1a}-1.66\%$
test_split_td 0.9572ms 38.4046μs 26.0385 KOps/s 25.8657 KOps/s $\color{#35bf28}+0.67\%$
test_add_pytree 0.2293ms 34.9955μs 28.5751 KOps/s 28.1095 KOps/s $\color{#35bf28}+1.66\%$
test_add_td 0.2910ms 46.0899μs 21.6967 KOps/s 19.3672 KOps/s $\textbf{\color{#35bf28}+12.03\%}$
test_compile_add_one_nested[tensordict-compile] 0.2701ms 0.1238ms 8.0792 KOps/s 7.8455 KOps/s $\color{#35bf28}+2.98\%$
test_compile_add_one_nested[tensordict-eager] 0.3493ms 0.1346ms 7.4272 KOps/s 7.5097 KOps/s $\color{#d91a1a}-1.10\%$
test_compile_add_one_nested[pytree-compile] 0.2861ms 0.1009ms 9.9142 KOps/s 10.1873 KOps/s $\color{#d91a1a}-2.68\%$
test_compile_add_one_nested[pytree-eager] 0.3349ms 0.1486ms 6.7298 KOps/s 6.5519 KOps/s $\color{#35bf28}+2.72\%$
test_compile_copy_nested[tensordict-compile] 0.1500ms 23.5499μs 42.4630 KOps/s 43.6537 KOps/s $\color{#d91a1a}-2.73\%$
test_compile_copy_nested[tensordict-eager] 0.1695ms 30.1330μs 33.1862 KOps/s 34.1564 KOps/s $\color{#d91a1a}-2.84\%$
test_compile_copy_nested[pytree-compile] 0.3494ms 66.4000μs 15.0602 KOps/s 15.2104 KOps/s $\color{#d91a1a}-0.99\%$
test_compile_copy_nested[pytree-eager] 83.1140μs 50.2172μs 19.9135 KOps/s 19.8337 KOps/s $\color{#35bf28}+0.40\%$
test_compile_add_one_flat[tensordict-compile] 0.2939ms 0.1473ms 6.7910 KOps/s 7.0400 KOps/s $\color{#d91a1a}-3.54\%$
test_compile_add_one_flat[tensordict-eager] 0.3762ms 0.2257ms 4.4297 KOps/s 4.5966 KOps/s $\color{#d91a1a}-3.63\%$
test_compile_add_one_flat[tensorclass-compile] 0.2442ms 96.9049μs 10.3194 KOps/s 10.3278 KOps/s $\color{#d91a1a}-0.08\%$
test_compile_add_one_flat[tensorclass-eager] 0.2215ms 57.3602μs 17.4337 KOps/s 17.7547 KOps/s $\color{#d91a1a}-1.81\%$
test_compile_add_one_flat[pytree-compile] 0.2948ms 0.1352ms 7.3986 KOps/s 7.1654 KOps/s $\color{#35bf28}+3.25\%$
test_compile_add_one_flat[pytree-eager] 0.6829ms 0.4964ms 2.0144 KOps/s 2.0244 KOps/s $\color{#d91a1a}-0.50\%$
test_compile_add_self_flat[tensordict-eager] 0.4024ms 0.2645ms 3.7805 KOps/s 3.8137 KOps/s $\color{#d91a1a}-0.87\%$
test_compile_add_self_flat[tensordict-compile] 0.2804ms 0.1510ms 6.6233 KOps/s 6.9983 KOps/s $\textbf{\color{#d91a1a}-5.36\%}$
test_compile_add_self_flat[tensorclass-eager] 0.2715ms 71.7829μs 13.9309 KOps/s 14.1747 KOps/s $\color{#d91a1a}-1.72\%$
test_compile_add_self_flat[tensorclass-compile] 0.2955ms 0.1060ms 9.4319 KOps/s 10.0355 KOps/s $\textbf{\color{#d91a1a}-6.01\%}$
test_compile_add_self_flat[pytree-eager] 0.6533ms 0.4333ms 2.3077 KOps/s 2.4655 KOps/s $\textbf{\color{#d91a1a}-6.40\%}$
test_compile_add_self_flat[pytree-compile] 0.3220ms 0.1342ms 7.4519 KOps/s 7.4772 KOps/s $\color{#d91a1a}-0.34\%$
test_compile_copy_flat[tensordict-compile] 0.1414ms 17.6060μs 56.7989 KOps/s 56.8437 KOps/s $\color{#d91a1a}-0.08\%$
test_compile_copy_flat[tensordict-eager] 0.4186ms 32.3800μs 30.8833 KOps/s 31.9558 KOps/s $\color{#d91a1a}-3.36\%$
test_compile_copy_flat[pytree-compile] 0.4627ms 71.7714μs 13.9331 KOps/s 13.8667 KOps/s $\color{#35bf28}+0.48\%$
test_compile_copy_flat[pytree-eager] 0.4460ms 52.9606μs 18.8820 KOps/s 18.7015 KOps/s $\color{#35bf28}+0.96\%$
test_compile_assign_and_add[tensordict-compile] 1.6386ms 0.3910ms 2.5576 KOps/s 2.2085 KOps/s $\textbf{\color{#35bf28}+15.81\%}$
test_compile_assign_and_add[tensordict-eager] 3.0008ms 2.5857ms 386.7473 Ops/s 385.9416 Ops/s $\color{#35bf28}+0.21\%$
test_compile_assign_and_add[pytree-compile] 1.6011ms 0.4323ms 2.3133 KOps/s 2.2752 KOps/s $\color{#35bf28}+1.68\%$
test_compile_assign_and_add[pytree-eager] 2.9770ms 2.5730ms 388.6504 Ops/s 379.5385 Ops/s $\color{#35bf28}+2.40\%$
test_compile_indexing[tensor-tensordict-compile] 0.2705ms 0.1159ms 8.6292 KOps/s 8.7542 KOps/s $\color{#d91a1a}-1.43\%$
test_compile_indexing[tensor-tensordict-eager] 0.5554ms 81.9159μs 12.2076 KOps/s 11.9913 KOps/s $\color{#35bf28}+1.80\%$
test_compile_indexing[tensor-tensorclass-compile] 0.2847ms 0.1108ms 9.0254 KOps/s 9.0007 KOps/s $\color{#35bf28}+0.27\%$
test_compile_indexing[tensor-tensorclass-eager] 0.2581ms 67.4881μs 14.8174 KOps/s 13.9636 KOps/s $\textbf{\color{#35bf28}+6.11\%}$
test_compile_indexing[tensor-pytree-compile] 0.5033ms 0.1091ms 9.1628 KOps/s 8.9326 KOps/s $\color{#35bf28}+2.58\%$
test_compile_indexing[tensor-pytree-eager] 0.4923ms 67.7124μs 14.7683 KOps/s 13.9352 KOps/s $\textbf{\color{#35bf28}+5.98\%}$
test_compile_indexing[slice-tensordict-compile] 0.2699ms 0.1045ms 9.5687 KOps/s 9.4489 KOps/s $\color{#35bf28}+1.27\%$
test_compile_indexing[slice-tensordict-eager] 0.1514ms 17.0987μs 58.4839 KOps/s 53.9109 KOps/s $\textbf{\color{#35bf28}+8.48\%}$
test_compile_indexing[slice-tensorclass-compile] 0.2484ms 95.4908μs 10.4722 KOps/s 9.8878 KOps/s $\textbf{\color{#35bf28}+5.91\%}$
test_compile_indexing[slice-tensorclass-eager] 0.1732ms 15.7442μs 63.5155 KOps/s 63.5568 KOps/s $\color{#d91a1a}-0.06\%$
test_compile_indexing[slice-pytree-compile] 0.2572ms 99.7202μs 10.0281 KOps/s 9.8449 KOps/s $\color{#35bf28}+1.86\%$
test_compile_indexing[slice-pytree-eager] 0.1403ms 15.6987μs 63.6996 KOps/s 61.3806 KOps/s $\color{#35bf28}+3.78\%$
test_compile_indexing[int-tensordict-compile] 0.5023ms 0.1037ms 9.6390 KOps/s 9.9440 KOps/s $\color{#d91a1a}-3.07\%$
test_compile_indexing[int-tensordict-eager] 0.6451ms 17.0503μs 58.6500 KOps/s 57.2745 KOps/s $\color{#35bf28}+2.40\%$
test_compile_indexing[int-tensorclass-compile] 0.3131ms 0.1017ms 9.8294 KOps/s 10.4307 KOps/s $\textbf{\color{#d91a1a}-5.76\%}$
test_compile_indexing[int-tensorclass-eager] 0.1434ms 15.7149μs 63.6340 KOps/s 63.1452 KOps/s $\color{#35bf28}+0.77\%$
test_compile_indexing[int-pytree-compile] 0.5141ms 96.4209μs 10.3712 KOps/s 9.9111 KOps/s $\color{#35bf28}+4.64\%$
test_compile_indexing[int-pytree-eager] 0.4164ms 15.6042μs 64.0855 KOps/s 63.2414 KOps/s $\color{#35bf28}+1.33\%$
test_mod_add[eager] 0.4433ms 37.2589μs 26.8392 KOps/s 25.5124 KOps/s $\textbf{\color{#35bf28}+5.20\%}$
test_mod_add[compile] 0.4841ms 79.6589μs 12.5535 KOps/s 12.5340 KOps/s $\color{#35bf28}+0.16\%$
test_mod_add[compile-overhead] 0.3353ms 0.1685ms 5.9338 KOps/s 5.5971 KOps/s $\textbf{\color{#35bf28}+6.01\%}$
test_mod_wrap[eager] 0.6639ms 0.2529ms 3.9546 KOps/s 3.9817 KOps/s $\color{#d91a1a}-0.68\%$
test_mod_wrap[compile] 0.4672ms 0.2912ms 3.4344 KOps/s 3.5187 KOps/s $\color{#d91a1a}-2.40\%$
test_mod_wrap[compile-overhead] 7.0850ms 3.7629ms 265.7550 Ops/s 275.3629 Ops/s $\color{#d91a1a}-3.49\%$
test_mod_wrap_and_backward[eager] 1.6659ms 1.4857ms 673.0870 Ops/s 691.2640 Ops/s $\color{#d91a1a}-2.63\%$
test_mod_wrap_and_backward[compile] 1.5995ms 1.3763ms 726.5766 Ops/s 731.5180 Ops/s $\color{#d91a1a}-0.68\%$
test_mod_wrap_and_backward[compile-overhead] 1.4470ms 0.9396ms 1.0643 KOps/s 868.6051 Ops/s $\textbf{\color{#35bf28}+22.53\%}$
test_seq_add[eager] 0.3191ms 0.1173ms 8.5276 KOps/s 8.4112 KOps/s $\color{#35bf28}+1.38\%$
test_seq_add[compile] 0.2595ms 87.0726μs 11.4847 KOps/s 11.3711 KOps/s $\color{#35bf28}+1.00\%$
test_seq_add[compile-overhead] 0.2972ms 0.1284ms 7.7881 KOps/s 7.7577 KOps/s $\color{#35bf28}+0.39\%$
test_seq_wrap[eager] 0.6122ms 0.4335ms 2.3068 KOps/s 2.3160 KOps/s $\color{#d91a1a}-0.40\%$
test_seq_wrap[compile] 0.4830ms 0.3080ms 3.2469 KOps/s 3.3434 KOps/s $\color{#d91a1a}-2.89\%$
test_seq_wrap[compile-overhead] 0.3825ms 0.2304ms 4.3408 KOps/s 4.4257 KOps/s $\color{#d91a1a}-1.92\%$
test_func_call_runtime[False-eager] 0.9595ms 0.7683ms 1.3016 KOps/s 1.3407 KOps/s $\color{#d91a1a}-2.92\%$
test_func_call_runtime[False-compile] 0.9516ms 0.7583ms 1.3188 KOps/s 1.3551 KOps/s $\color{#d91a1a}-2.68\%$
test_func_call_runtime[False-compile-overhead] 0.5578ms 0.3751ms 2.6661 KOps/s 2.6939 KOps/s $\color{#d91a1a}-1.03\%$
test_func_call_runtime[True-eager] 1.1182ms 0.9276ms 1.0781 KOps/s 1.0953 KOps/s $\color{#d91a1a}-1.57\%$
test_func_call_runtime[True-compile] 0.9471ms 0.8004ms 1.2494 KOps/s 1.3252 KOps/s $\textbf{\color{#d91a1a}-5.72\%}$
test_func_call_runtime[True-compile-overhead] 0.5356ms 0.3873ms 2.5817 KOps/s 2.5565 KOps/s $\color{#35bf28}+0.99\%$
test_func_call_cm_runtime[False-eager] 0.9596ms 0.7712ms 1.2967 KOps/s 1.3497 KOps/s $\color{#d91a1a}-3.93\%$
test_func_call_cm_runtime[False-compile] 0.9751ms 0.7733ms 1.2931 KOps/s 1.3518 KOps/s $\color{#d91a1a}-4.34\%$
test_func_call_cm_runtime[False-compile-overhead] 0.5837ms 0.3712ms 2.6942 KOps/s 2.7000 KOps/s $\color{#d91a1a}-0.21\%$
test_func_call_cm_runtime[True-eager] 1.2427ms 1.0208ms 979.6377 Ops/s 985.8582 Ops/s $\color{#d91a1a}-0.63\%$
test_func_call_cm_runtime[True-compile] 1.3086ms 1.0436ms 958.2337 Ops/s 1.0078 KOps/s $\color{#d91a1a}-4.92\%$
test_func_call_cm_runtime[True-compile-overhead] 1.1394ms 0.9759ms 1.0247 KOps/s 996.5990 Ops/s $\color{#35bf28}+2.82\%$
test_vmap_func_call_cm_runtime[eager] 2.5886ms 2.0948ms 477.3780 Ops/s 468.8813 Ops/s $\color{#35bf28}+1.81\%$
test_vmap_func_call_cm_runtime[compile] 0.9894ms 0.8105ms 1.2338 KOps/s 1.2370 KOps/s $\color{#d91a1a}-0.26\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.5651ms 0.4171ms 2.3977 KOps/s 2.3590 KOps/s $\color{#35bf28}+1.64\%$
test_distributed 6.4877ms 0.1805ms 5.5409 KOps/s 8.5432 KOps/s $\textbf{\color{#d91a1a}-35.14\%}$
test_tdmodule 65.7430μs 19.0254μs 52.5614 KOps/s 45.2409 KOps/s $\textbf{\color{#35bf28}+16.18\%}$
test_tdmodule_dispatch 0.1579ms 34.9655μs 28.5996 KOps/s 24.3446 KOps/s $\textbf{\color{#35bf28}+17.48\%}$
test_tdseq 28.5820μs 19.4152μs 51.5059 KOps/s 43.0688 KOps/s $\textbf{\color{#35bf28}+19.59\%}$
test_tdseq_dispatch 58.1430μs 37.1135μs 26.9443 KOps/s 22.9161 KOps/s $\textbf{\color{#35bf28}+17.58\%}$
test_instantiation_functorch 1.7815ms 1.5636ms 639.5429 Ops/s 639.0239 Ops/s $\color{#35bf28}+0.08\%$
test_exec_functorch 0.2121ms 0.1469ms 6.8093 KOps/s 6.7334 KOps/s $\color{#35bf28}+1.13\%$
test_exec_functional_call 0.2561ms 0.1375ms 7.2706 KOps/s 6.7981 KOps/s $\textbf{\color{#35bf28}+6.95\%}$
test_exec_td_decorator 0.3685ms 0.1869ms 5.3501 KOps/s 5.2402 KOps/s $\color{#35bf28}+2.10\%$
test_vmap_mlp_speed_decorator[True-True] 0.8372ms 0.6833ms 1.4635 KOps/s 1.3613 KOps/s $\textbf{\color{#35bf28}+7.51\%}$
test_vmap_mlp_speed_decorator[True-False] 0.8538ms 0.6816ms 1.4671 KOps/s 1.3620 KOps/s $\textbf{\color{#35bf28}+7.72\%}$
test_vmap_mlp_speed_decorator[False-True] 0.7440ms 0.5977ms 1.6731 KOps/s 1.5733 KOps/s $\textbf{\color{#35bf28}+6.34\%}$
test_vmap_mlp_speed_decorator[False-False] 0.7898ms 0.5979ms 1.6726 KOps/s 1.5726 KOps/s $\textbf{\color{#35bf28}+6.36\%}$
test_vmap_transformer_speed_decorator[True-True] 20.1120ms 19.1322ms 52.2678 Ops/s 51.3150 Ops/s $\color{#35bf28}+1.86\%$
test_vmap_transformer_speed_decorator[True-False] 19.5197ms 19.1575ms 52.1988 Ops/s 51.4560 Ops/s $\color{#35bf28}+1.44\%$
test_vmap_transformer_speed_decorator[False-True] 19.3251ms 19.0150ms 52.5901 Ops/s 51.7440 Ops/s $\color{#35bf28}+1.64\%$
test_vmap_transformer_speed_decorator[False-False] 19.2988ms 19.0153ms 52.5891 Ops/s 51.8842 Ops/s $\color{#35bf28}+1.36\%$
test_to_module_speed[True] 1.5315ms 0.9730ms 1.0277 KOps/s 1.0266 KOps/s $\color{#35bf28}+0.10\%$
test_to_module_speed[False] 1.1824ms 0.9563ms 1.0457 KOps/s 1.0338 KOps/s $\color{#35bf28}+1.16\%$
test_tc_init 0.1035ms 36.0018μs 27.7764 KOps/s 25.1817 KOps/s $\textbf{\color{#35bf28}+10.30\%}$
test_tc_init_nested 0.1339ms 71.8785μs 13.9124 KOps/s 12.3550 KOps/s $\textbf{\color{#35bf28}+12.61\%}$
test_tc_first_layer_tensor 27.5310μs 0.8653μs 1.1556 MOps/s 1.2211 MOps/s $\textbf{\color{#d91a1a}-5.36\%}$
test_tc_first_layer_nontensor 54.4820μs 2.3484μs 425.8304 KOps/s 439.8046 KOps/s $\color{#d91a1a}-3.18\%$
test_tc_second_layer_tensor 14.9973μs 1.4890μs 671.6043 KOps/s 686.1797 KOps/s $\color{#d91a1a}-2.12\%$
test_tc_second_layer_nontensor 42.7620μs 3.0825μs 324.4154 KOps/s 326.9839 KOps/s $\color{#d91a1a}-0.79\%$
test_unbind 0.2283s 12.4623ms 80.2421 Ops/s 142.2314 Ops/s $\textbf{\color{#d91a1a}-43.58\%}$
test_full_like 11.8287ms 10.1017ms 98.9936 Ops/s 100.2627 Ops/s $\color{#d91a1a}-1.27\%$
test_zeros_like 9.4727ms 7.3088ms 136.8211 Ops/s 112.6821 Ops/s $\textbf{\color{#35bf28}+21.42\%}$
test_ones_like 5.0411ms 4.5291ms 220.7945 Ops/s 221.5727 Ops/s $\color{#d91a1a}-0.35\%$
test_clone 8.1951ms 7.2428ms 138.0681 Ops/s 138.5103 Ops/s $\color{#d91a1a}-0.32\%$
test_squeeze 0.1798ms 9.7556μs 102.5052 KOps/s 94.7814 KOps/s $\textbf{\color{#35bf28}+8.15\%}$
test_unsqueeze 0.4869ms 73.3750μs 13.6286 KOps/s 12.1356 KOps/s $\textbf{\color{#35bf28}+12.30\%}$
test_split 0.3992ms 0.1600ms 6.2511 KOps/s 5.7611 KOps/s $\textbf{\color{#35bf28}+8.50\%}$
test_permute 0.3512ms 0.1764ms 5.6690 KOps/s 5.3671 KOps/s $\textbf{\color{#35bf28}+5.62\%}$
test_stack 53.4700ms 52.1513ms 19.1750 Ops/s 19.3262 Ops/s $\color{#d91a1a}-0.78\%$
test_cat 53.1597ms 51.9289ms 19.2571 Ops/s 19.2286 Ops/s $\color{#35bf28}+0.15\%$
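
The Change column appears to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column. This is inferred from the numbers in the table, not from the benchmark bot's source. A minimal sketch under that assumption (the helper name `relative_change` is hypothetical), checked against three rows above:

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change in throughput of the PR run relative to repo HEAD.

    Both arguments must be in the same unit (e.g. both in KOps/s).
    """
    return (ops_pr - ops_head) / ops_head * 100.0


# (Ops, Ops on Repo HEAD) pairs copied from the table, same unit within each pair.
rows = {
    "test_update": (60.8803, 46.9188),      # KOps/s
    "test_distributed": (5.5409, 8.5432),   # KOps/s
    "test_unbind": (80.2421, 142.2314),     # Ops/s
}

for name, (ops_pr, ops_head) in rows.items():
    print(f"{name}: {relative_change(ops_pr, ops_head):+.2f}%")
```

Some rows mix units between the two columns (for example `test_mod_wrap_and_backward[compile-overhead]` reports 1.0643 KOps/s against 868.6051 Ops/s), so the values need to be converted to a common unit before applying the formula.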

Labels: bug (Something isn't working), CLA Signed (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)