[Feature] More copy() refactors for compile friendliness #1516

Merged: vmoens merged 1 commit into main from solve-copy on Jan 6, 2026

@vmoens (Collaborator) commented Jan 6, 2026

No description provided.

@meta-cla (bot) added the CLA Signed label on Jan 6, 2026
@github-actions (bot, Contributor) commented Jan 6, 2026

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 233. Improved: $\large\color{#35bf28}8$. Worsened: $\large\color{#d91a1a}9$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 39.4110μs 14.2917μs 69.9706 KOps/s 67.8649 KOps/s $\color{#35bf28}+3.10\%$
test_plain_set_stack_nested 48.3610μs 14.6457μs 68.2792 KOps/s 66.7027 KOps/s $\color{#35bf28}+2.36\%$
test_plain_set_nested_inplace 45.2610μs 16.0877μs 62.1594 KOps/s 60.5927 KOps/s $\color{#35bf28}+2.59\%$
test_plain_set_stack_nested_inplace 61.1720μs 16.2388μs 61.5808 KOps/s 61.5126 KOps/s $\color{#35bf28}+0.11\%$
test_items 29.1500μs 5.5321μs 180.7628 KOps/s 176.3758 KOps/s $\color{#35bf28}+2.49\%$
test_items_nested 0.5658ms 0.5124ms 1.9518 KOps/s 1.9371 KOps/s $\color{#35bf28}+0.75\%$
test_items_nested_locked 0.5809ms 0.5150ms 1.9416 KOps/s 1.9482 KOps/s $\color{#d91a1a}-0.34\%$
test_items_nested_leaf 0.1270ms 90.9810μs 10.9913 KOps/s 11.1311 KOps/s $\color{#d91a1a}-1.26\%$
test_items_stack_nested 0.5759ms 0.5111ms 1.9567 KOps/s 1.9577 KOps/s $\color{#d91a1a}-0.06\%$
test_items_stack_nested_leaf 0.1370ms 90.5914μs 11.0386 KOps/s 10.8483 KOps/s $\color{#35bf28}+1.75\%$
test_items_stack_nested_locked 0.6200ms 0.5153ms 1.9405 KOps/s 1.9619 KOps/s $\color{#d91a1a}-1.09\%$
test_keys 29.6300μs 4.4454μs 224.9533 KOps/s 243.4742 KOps/s $\textbf{\color{#d91a1a}-7.61\%}$
test_keys_nested 0.1896ms 0.1162ms 8.6060 KOps/s 8.3833 KOps/s $\color{#35bf28}+2.66\%$
test_keys_nested_locked 0.7596ms 0.1248ms 8.0133 KOps/s 7.7883 KOps/s $\color{#35bf28}+2.89\%$
test_keys_nested_leaf 0.5159ms 0.1065ms 9.3896 KOps/s 9.1036 KOps/s $\color{#35bf28}+3.14\%$
test_keys_stack_nested 0.1681ms 0.1165ms 8.5811 KOps/s 8.4228 KOps/s $\color{#35bf28}+1.88\%$
test_keys_stack_nested_leaf 0.1530ms 0.1069ms 9.3514 KOps/s 9.1281 KOps/s $\color{#35bf28}+2.45\%$
test_keys_stack_nested_locked 0.1695ms 0.1250ms 7.9999 KOps/s 7.8489 KOps/s $\color{#35bf28}+1.92\%$
test_values 9.9642μs 0.9949μs 1.0052 MOps/s 998.2798 KOps/s $\color{#35bf28}+0.69\%$
test_values_nested 74.8810μs 46.1557μs 21.6658 KOps/s 21.3263 KOps/s $\color{#35bf28}+1.59\%$
test_values_nested_locked 86.5410μs 48.9560μs 20.4265 KOps/s 20.1557 KOps/s $\color{#35bf28}+1.34\%$
test_values_nested_leaf 82.3820μs 52.6010μs 19.0110 KOps/s 18.7264 KOps/s $\color{#35bf28}+1.52\%$
test_values_stack_nested 77.4710μs 46.7130μs 21.4073 KOps/s 21.3093 KOps/s $\color{#35bf28}+0.46\%$
test_values_stack_nested_leaf 87.5510μs 52.5056μs 19.0456 KOps/s 18.4974 KOps/s $\color{#35bf28}+2.96\%$
test_values_stack_nested_locked 81.5920μs 49.0837μs 20.3734 KOps/s 20.1358 KOps/s $\color{#35bf28}+1.18\%$
test_membership 5.1768μs 0.8021μs 1.2468 MOps/s 1.2394 MOps/s $\color{#35bf28}+0.60\%$
test_membership_nested 35.3400μs 2.9597μs 337.8666 KOps/s 332.9217 KOps/s $\color{#35bf28}+1.49\%$
test_membership_nested_leaf 31.4110μs 3.0059μs 332.6784 KOps/s 331.8123 KOps/s $\color{#35bf28}+0.26\%$
test_membership_stacked_nested 36.8700μs 3.0080μs 332.4422 KOps/s 332.6410 KOps/s $\color{#d91a1a}-0.06\%$
test_membership_stacked_nested_leaf 24.5010μs 2.9802μs 335.5514 KOps/s 334.5166 KOps/s $\color{#35bf28}+0.31\%$
test_membership_nested_last 38.2410μs 4.3628μs 229.2130 KOps/s 227.2465 KOps/s $\color{#35bf28}+0.87\%$
test_membership_nested_leaf_last 34.9810μs 4.3499μs 229.8906 KOps/s 226.1016 KOps/s $\color{#35bf28}+1.68\%$
test_membership_stacked_nested_last 38.6110μs 4.3729μs 228.6810 KOps/s 226.7032 KOps/s $\color{#35bf28}+0.87\%$
test_membership_stacked_nested_leaf_last 25.7910μs 4.3633μs 229.1837 KOps/s 230.2510 KOps/s $\color{#d91a1a}-0.46\%$
test_nested_getleaf 51.8310μs 20.8263μs 48.0161 KOps/s 48.5970 KOps/s $\color{#d91a1a}-1.20\%$
test_nested_get 56.9710μs 19.7159μs 50.7206 KOps/s 51.1790 KOps/s $\color{#d91a1a}-0.90\%$
test_stacked_getleaf 44.2310μs 20.5867μs 48.5751 KOps/s 48.5586 KOps/s $\color{#35bf28}+0.03\%$
test_stacked_get 45.1410μs 19.4989μs 51.2848 KOps/s 50.9654 KOps/s $\color{#35bf28}+0.63\%$
test_nested_getitemleaf 58.7610μs 21.3793μs 46.7743 KOps/s 47.1397 KOps/s $\color{#d91a1a}-0.78\%$
test_nested_getitem 49.0010μs 20.0216μs 49.9460 KOps/s 49.6388 KOps/s $\color{#35bf28}+0.62\%$
test_stacked_getitemleaf 47.4710μs 21.0812μs 47.4356 KOps/s 46.9325 KOps/s $\color{#35bf28}+1.07\%$
test_stacked_getitem 56.8410μs 19.6923μs 50.7813 KOps/s 49.4489 KOps/s $\color{#35bf28}+2.69\%$
test_lock_nested 7.9938ms 0.4566ms 2.1901 KOps/s 2.2271 KOps/s $\color{#d91a1a}-1.66\%$
test_lock_stack_nested 0.5013ms 0.4536ms 2.2047 KOps/s 2.1888 KOps/s $\color{#35bf28}+0.73\%$
test_unlock_nested 0.6410ms 0.3625ms 2.7590 KOps/s 2.7326 KOps/s $\color{#35bf28}+0.96\%$
test_unlock_stack_nested 0.4167ms 0.3618ms 2.7636 KOps/s 2.7195 KOps/s $\color{#35bf28}+1.62\%$
test_flatten_speed 0.1483ms 0.1160ms 8.6198 KOps/s 8.5904 KOps/s $\color{#35bf28}+0.34\%$
test_unflatten_speed 0.7009ms 0.5665ms 1.7651 KOps/s 1.7523 KOps/s $\color{#35bf28}+0.73\%$
test_common_ops 0.8464ms 0.7255ms 1.3783 KOps/s 1.3671 KOps/s $\color{#35bf28}+0.82\%$
test_creation 0.1017ms 2.5760μs 388.2019 KOps/s 387.7784 KOps/s $\color{#35bf28}+0.11\%$
test_creation_empty 33.4200μs 8.6166μs 116.0546 KOps/s 115.7373 KOps/s $\color{#35bf28}+0.27\%$
test_creation_nested_1 34.9910μs 11.3324μs 88.2428 KOps/s 87.7010 KOps/s $\color{#35bf28}+0.62\%$
test_creation_nested_2 46.0500μs 15.2289μs 65.6646 KOps/s 65.4914 KOps/s $\color{#35bf28}+0.26\%$
test_clone 51.8610μs 12.8003μs 78.1234 KOps/s 76.6450 KOps/s $\color{#35bf28}+1.93\%$
test_getitem[int] 1.1603ms 13.4946μs 74.1037 KOps/s 73.2298 KOps/s $\color{#35bf28}+1.19\%$
test_getitem[slice_int] 0.1357ms 23.4165μs 42.7049 KOps/s 42.6213 KOps/s $\color{#35bf28}+0.20\%$
test_getitem[range] 0.1659ms 56.9163μs 17.5697 KOps/s 17.0981 KOps/s $\color{#35bf28}+2.76\%$
test_getitem[tuple] 0.1571ms 23.4405μs 42.6612 KOps/s 42.9003 KOps/s $\color{#d91a1a}-0.56\%$
test_getitem[list] 0.1730ms 52.3349μs 19.1077 KOps/s 19.1545 KOps/s $\color{#d91a1a}-0.24\%$
test_setitem_dim[int] 48.7010μs 23.9593μs 41.7374 KOps/s 41.5266 KOps/s $\color{#35bf28}+0.51\%$
test_setitem_dim[slice_int] 80.9020μs 43.8717μs 22.7937 KOps/s 23.0591 KOps/s $\color{#d91a1a}-1.15\%$
test_setitem_dim[range] 0.1227ms 83.8834μs 11.9213 KOps/s 11.8525 KOps/s $\color{#35bf28}+0.58\%$
test_setitem_dim[tuple] 77.4810μs 40.3408μs 24.7888 KOps/s 24.6977 KOps/s $\color{#35bf28}+0.37\%$
test_setitem 55.8010μs 17.4751μs 57.2242 KOps/s 56.2016 KOps/s $\color{#35bf28}+1.82\%$
test_set 50.6810μs 16.7478μs 59.7095 KOps/s 59.8043 KOps/s $\color{#d91a1a}-0.16\%$
test_set_shared 0.4878ms 0.2042ms 4.8969 KOps/s 4.9306 KOps/s $\color{#d91a1a}-0.68\%$
test_update 0.3653ms 21.7170μs 46.0468 KOps/s 46.1854 KOps/s $\color{#d91a1a}-0.30\%$
test_update_nested 73.9910μs 33.4785μs 29.8699 KOps/s 29.9962 KOps/s $\color{#d91a1a}-0.42\%$
test_update__nested 0.4755ms 32.5405μs 30.7310 KOps/s 28.5983 KOps/s $\textbf{\color{#35bf28}+7.46\%}$
test_set_nested 57.8410μs 18.6393μs 53.6502 KOps/s 53.9088 KOps/s $\color{#d91a1a}-0.48\%$
test_set_nested_new 65.8710μs 23.7942μs 42.0270 KOps/s 41.6626 KOps/s $\color{#35bf28}+0.87\%$
test_select 83.6620μs 41.0057μs 24.3868 KOps/s 23.6990 KOps/s $\color{#35bf28}+2.90\%$
test_select_nested 0.1177ms 70.2030μs 14.2444 KOps/s 13.9679 KOps/s $\color{#35bf28}+1.98\%$
test_exclude_nested 0.1351ms 92.1692μs 10.8496 KOps/s 10.8672 KOps/s $\color{#d91a1a}-0.16\%$
test_empty[True] 0.4709ms 0.4161ms 2.4032 KOps/s 2.3980 KOps/s $\color{#35bf28}+0.22\%$
test_empty[False] 8.0450μs 1.2517μs 798.9207 KOps/s 793.6258 KOps/s $\color{#35bf28}+0.67\%$
test_to 0.1001ms 69.6001μs 14.3678 KOps/s 14.1144 KOps/s $\color{#35bf28}+1.80\%$
test_to_nonblocking 0.1559ms 63.7245μs 15.6925 KOps/s 15.9133 KOps/s $\color{#d91a1a}-1.39\%$
test_unbind_speed 0.3559ms 0.3102ms 3.2235 KOps/s 3.2177 KOps/s $\color{#35bf28}+0.18\%$
test_unbind_speed_stack0 0.3716ms 0.3080ms 3.2466 KOps/s 3.2282 KOps/s $\color{#35bf28}+0.57\%$
test_unbind_speed_stack1 98.7742ms 0.9027ms 1.1078 KOps/s 1.1910 KOps/s $\textbf{\color{#d91a1a}-6.98\%}$
test_split 1.1545ms 1.1098ms 901.0621 Ops/s 677.3472 Ops/s $\textbf{\color{#35bf28}+33.03\%}$
test_chunk 98.3580ms 1.1652ms 858.2038 Ops/s 949.5347 Ops/s $\textbf{\color{#d91a1a}-9.62\%}$
test_consolidate[False-None] 3.9466ms 3.7423ms 267.2189 Ops/s 269.1021 Ops/s $\color{#d91a1a}-0.70\%$
test_consolidate[default-None] 2.0974ms 1.9920ms 502.0201 Ops/s 484.0340 Ops/s $\color{#35bf28}+3.72\%$
test_consolidate[reduce-overhead-None] 1.9891ms 1.9017ms 525.8361 Ops/s 498.4121 Ops/s $\textbf{\color{#35bf28}+5.50\%}$
test_consolidate_njt[False-None] 8.8058ms 8.6156ms 116.0686 Ops/s 117.5496 Ops/s $\color{#d91a1a}-1.26\%$
test_to[False-False-None] 2.1072ms 2.0236ms 494.1697 Ops/s 493.7708 Ops/s $\color{#35bf28}+0.08\%$
test_to[True-False-None] 2.0824ms 1.7918ms 558.1081 Ops/s 556.5970 Ops/s $\color{#35bf28}+0.27\%$
test_to[within-False-None] 5.7096ms 5.6133ms 178.1474 Ops/s 181.2369 Ops/s $\color{#d91a1a}-1.70\%$
test_to[True-default-None] 11.9039ms 11.7398ms 85.1802 Ops/s 86.5040 Ops/s $\color{#d91a1a}-1.53\%$
test_to_njt[False-False-None] 8.4819ms 8.3897ms 119.1934 Ops/s 119.5545 Ops/s $\color{#d91a1a}-0.30\%$
test_to_njt[True-False-None] 7.3714ms 7.2244ms 138.4206 Ops/s 139.2561 Ops/s $\color{#d91a1a}-0.60\%$
test_to_njt[within-False-None] 16.1139ms 16.0300ms 62.3831 Ops/s 62.9222 Ops/s $\color{#d91a1a}-0.86\%$
test_creation[device0] 0.3496ms 0.1063ms 9.4040 KOps/s 9.2146 KOps/s $\color{#35bf28}+2.06\%$
test_creation_from_tensor 0.3540ms 0.1086ms 9.2117 KOps/s 9.0758 KOps/s $\color{#35bf28}+1.50\%$
test_add_one[memmap_tensor0] 0.1661ms 6.5742μs 152.1092 KOps/s 152.7741 KOps/s $\color{#d91a1a}-0.44\%$
test_contiguous[memmap_tensor0] 25.7200μs 0.6346μs 1.5758 MOps/s 2.1192 MOps/s $\textbf{\color{#d91a1a}-25.64\%}$
test_stack[memmap_tensor0] 31.7110μs 4.5066μs 221.8979 KOps/s 218.6604 KOps/s $\color{#35bf28}+1.48\%$
test_memmaptd_index 1.0695ms 0.2700ms 3.7035 KOps/s 3.6898 KOps/s $\color{#35bf28}+0.37\%$
test_memmaptd_index_astensor 0.5209ms 0.3551ms 2.8158 KOps/s 2.8187 KOps/s $\color{#d91a1a}-0.10\%$
test_memmaptd_index_op 0.8046ms 0.6008ms 1.6644 KOps/s 1.6541 KOps/s $\color{#35bf28}+0.62\%$
test_serialize_model 0.1349s 0.1337s 7.4771 Ops/s 7.4681 Ops/s $\color{#35bf28}+0.12\%$
test_serialize_model_pickle 1.3749s 1.2160s 0.8224 Ops/s 0.8431 Ops/s $\color{#d91a1a}-2.46\%$
test_serialize_weights 0.1341s 0.1337s 7.4810 Ops/s 5.1143 Ops/s $\textbf{\color{#35bf28}+46.28\%}$
test_serialize_weights_returnearly 67.7410ms 51.8055ms 19.3030 Ops/s 19.6459 Ops/s $\color{#d91a1a}-1.75\%$
test_serialize_weights_pickle 1.3648s 1.2212s 0.8188 Ops/s 0.8394 Ops/s $\color{#d91a1a}-2.45\%$
test_reshape_pytree 0.3613ms 32.4561μs 30.8109 KOps/s 31.3724 KOps/s $\color{#d91a1a}-1.79\%$
test_reshape_td 79.5620μs 38.3370μs 26.0845 KOps/s 26.9806 KOps/s $\color{#d91a1a}-3.32\%$
test_view_pytree 0.2180ms 31.4169μs 31.8300 KOps/s 31.6772 KOps/s $\color{#35bf28}+0.48\%$
test_view_td 77.2620μs 44.0102μs 22.7220 KOps/s 22.4781 KOps/s $\color{#35bf28}+1.09\%$
test_unbind_pytree 0.2342ms 37.0498μs 26.9907 KOps/s 27.4004 KOps/s $\color{#d91a1a}-1.50\%$
test_unbind_td 0.1613ms 45.6581μs 21.9019 KOps/s 21.0984 KOps/s $\color{#35bf28}+3.81\%$
test_split_pytree 0.1955ms 42.3427μs 23.6168 KOps/s 23.6828 KOps/s $\color{#d91a1a}-0.28\%$
test_split_td 0.1954ms 61.8069μs 16.1794 KOps/s 15.8221 KOps/s $\color{#35bf28}+2.26\%$
test_add_pytree 0.2478ms 45.0905μs 22.1776 KOps/s 23.2474 KOps/s $\color{#d91a1a}-4.60\%$
test_add_td 96.6720μs 55.8715μs 17.8982 KOps/s 19.0106 KOps/s $\textbf{\color{#d91a1a}-5.85\%}$
test_compile_add_one_nested[tensordict-compile] 0.2828ms 0.1767ms 5.6602 KOps/s 5.5834 KOps/s $\color{#35bf28}+1.38\%$
test_compile_add_one_nested[tensordict-eager] 0.2764ms 0.1886ms 5.3026 KOps/s 5.3799 KOps/s $\color{#d91a1a}-1.44\%$
test_compile_add_one_nested[pytree-compile] 0.1933ms 0.1482ms 6.7470 KOps/s 6.6412 KOps/s $\color{#35bf28}+1.59\%$
test_compile_add_one_nested[pytree-eager] 0.4451ms 0.1850ms 5.4067 KOps/s 5.4760 KOps/s $\color{#d91a1a}-1.27\%$
test_compile_copy_nested[tensordict-compile] 59.7910μs 27.4877μs 36.3799 KOps/s 35.1448 KOps/s $\color{#35bf28}+3.51\%$
test_compile_copy_nested[tensordict-eager] 79.0310μs 50.4877μs 19.8068 KOps/s 19.8471 KOps/s $\color{#d91a1a}-0.20\%$
test_compile_copy_nested[pytree-compile] 58.0510μs 14.0807μs 71.0190 KOps/s 68.4436 KOps/s $\color{#35bf28}+3.76\%$
test_compile_copy_nested[pytree-eager] 0.3954ms 71.6789μs 13.9511 KOps/s 13.8288 KOps/s $\color{#35bf28}+0.88\%$
test_compile_add_one_flat[tensordict-compile] 0.2396ms 0.2038ms 4.9060 KOps/s 4.7993 KOps/s $\color{#35bf28}+2.22\%$
test_compile_add_one_flat[tensordict-eager] 0.3024ms 0.2602ms 3.8436 KOps/s 3.9037 KOps/s $\color{#d91a1a}-1.54\%$
test_compile_add_one_flat[tensorclass-compile] 0.1974ms 0.1521ms 6.5731 KOps/s 6.4480 KOps/s $\color{#35bf28}+1.94\%$
test_compile_add_one_flat[tensorclass-eager] 0.1118ms 70.6916μs 14.1459 KOps/s 14.2113 KOps/s $\color{#d91a1a}-0.46\%$
test_compile_add_one_flat[pytree-compile] 0.2596ms 0.1995ms 5.0138 KOps/s 4.9192 KOps/s $\color{#35bf28}+1.92\%$
test_compile_add_one_flat[pytree-eager] 0.9006ms 0.5438ms 1.8390 KOps/s 1.8534 KOps/s $\color{#d91a1a}-0.77\%$
test_compile_add_self_flat[tensordict-eager] 0.3489ms 0.3093ms 3.2330 KOps/s 3.2364 KOps/s $\color{#d91a1a}-0.11\%$
test_compile_add_self_flat[tensordict-compile] 0.2439ms 0.2073ms 4.8244 KOps/s 4.6522 KOps/s $\color{#35bf28}+3.70\%$
test_compile_add_self_flat[tensorclass-eager] 0.1277ms 89.1271μs 11.2199 KOps/s 11.8617 KOps/s $\textbf{\color{#d91a1a}-5.41\%}$
test_compile_add_self_flat[tensorclass-compile] 0.2067ms 0.1563ms 6.3976 KOps/s 6.2562 KOps/s $\color{#35bf28}+2.26\%$
test_compile_add_self_flat[pytree-eager] 0.6791ms 0.4514ms 2.2151 KOps/s 2.2356 KOps/s $\color{#d91a1a}-0.92\%$
test_compile_add_self_flat[pytree-compile] 0.2559ms 0.2011ms 4.9717 KOps/s 4.7953 KOps/s $\color{#35bf28}+3.68\%$
test_compile_copy_flat[tensordict-compile] 57.8310μs 23.4353μs 42.6706 KOps/s 38.3995 KOps/s $\textbf{\color{#35bf28}+11.12\%}$
test_compile_copy_flat[tensordict-eager] 70.9520μs 40.4105μs 24.7461 KOps/s 25.1815 KOps/s $\color{#d91a1a}-1.73\%$
test_compile_copy_flat[pytree-compile] 56.6410μs 21.9194μs 45.6217 KOps/s 46.8941 KOps/s $\color{#d91a1a}-2.71\%$
test_compile_copy_flat[pytree-eager] 0.3552ms 66.6207μs 15.0104 KOps/s 14.9426 KOps/s $\color{#35bf28}+0.45\%$
test_compile_assign_and_add[tensordict-compile] 2.0216ms 0.2090ms 4.7857 KOps/s 4.7070 KOps/s $\color{#35bf28}+1.67\%$
test_compile_assign_and_add[tensordict-eager] 3.5527ms 3.3125ms 301.8873 Ops/s 310.8999 Ops/s $\color{#d91a1a}-2.90\%$
test_compile_assign_and_add[pytree-compile] 1.9975ms 0.2036ms 4.9116 KOps/s 4.8565 KOps/s $\color{#35bf28}+1.14\%$
test_compile_assign_and_add[pytree-eager] 3.2111ms 2.9686ms 336.8626 Ops/s 348.6521 Ops/s $\color{#d91a1a}-3.38\%$
test_compile_indexing[tensor-tensordict-compile] 0.2126ms 0.1462ms 6.8417 KOps/s 6.7336 KOps/s $\color{#35bf28}+1.60\%$
test_compile_indexing[tensor-tensordict-eager] 0.3025ms 70.2515μs 14.2346 KOps/s 14.4883 KOps/s $\color{#d91a1a}-1.75\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1892ms 0.1413ms 7.0766 KOps/s 7.3716 KOps/s $\color{#d91a1a}-4.00\%$
test_compile_indexing[tensor-tensorclass-eager] 0.2579ms 46.6914μs 21.4172 KOps/s 21.2026 KOps/s $\color{#35bf28}+1.01\%$
test_compile_indexing[tensor-pytree-compile] 0.1737ms 0.1351ms 7.4022 KOps/s 7.4243 KOps/s $\color{#d91a1a}-0.30\%$
test_compile_indexing[tensor-pytree-eager] 0.2843ms 47.7238μs 20.9539 KOps/s 20.9964 KOps/s $\color{#d91a1a}-0.20\%$
test_compile_indexing[slice-tensordict-compile] 0.1258ms 85.9901μs 11.6293 KOps/s 11.2498 KOps/s $\color{#35bf28}+3.37\%$
test_compile_indexing[slice-tensordict-eager] 0.2129ms 27.5569μs 36.2886 KOps/s 38.0379 KOps/s $\color{#d91a1a}-4.60\%$
test_compile_indexing[slice-tensorclass-compile] 0.1326ms 80.3813μs 12.4407 KOps/s 12.3665 KOps/s $\color{#35bf28}+0.60\%$
test_compile_indexing[slice-tensorclass-eager] 0.2255ms 22.8066μs 43.8470 KOps/s 43.3293 KOps/s $\color{#35bf28}+1.19\%$
test_compile_indexing[slice-pytree-compile] 0.1198ms 82.8216μs 12.0741 KOps/s 11.6586 KOps/s $\color{#35bf28}+3.56\%$
test_compile_indexing[slice-pytree-eager] 0.2539ms 22.8436μs 43.7759 KOps/s 43.9353 KOps/s $\color{#d91a1a}-0.36\%$
test_compile_indexing[int-tensordict-compile] 0.1345ms 90.2239μs 11.0835 KOps/s 11.0686 KOps/s $\color{#35bf28}+0.13\%$
test_compile_indexing[int-tensordict-eager] 0.2546ms 25.2749μs 39.5649 KOps/s 38.6627 KOps/s $\color{#35bf28}+2.33\%$
test_compile_indexing[int-tensorclass-compile] 0.1205ms 80.4968μs 12.4229 KOps/s 11.7668 KOps/s $\textbf{\color{#35bf28}+5.58\%}$
test_compile_indexing[int-tensorclass-eager] 0.2313ms 22.4667μs 44.5103 KOps/s 43.8204 KOps/s $\color{#35bf28}+1.57\%$
test_compile_indexing[int-pytree-compile] 0.1687ms 81.2074μs 12.3142 KOps/s 11.7397 KOps/s $\color{#35bf28}+4.89\%$
test_compile_indexing[int-pytree-eager] 0.2564ms 22.4236μs 44.5959 KOps/s 43.2019 KOps/s $\color{#35bf28}+3.23\%$
test_mod_add[eager] 92.6820μs 52.4787μs 19.0554 KOps/s 19.8376 KOps/s $\color{#d91a1a}-3.94\%$
test_mod_add[compile] 0.2041ms 0.1478ms 6.7644 KOps/s 6.3207 KOps/s $\textbf{\color{#35bf28}+7.02\%}$
test_mod_add[compile-overhead] 0.2824ms 0.1948ms 5.1341 KOps/s 4.9239 KOps/s $\color{#35bf28}+4.27\%$
test_mod_wrap[eager] 0.4071ms 0.3067ms 3.2605 KOps/s 3.1168 KOps/s $\color{#35bf28}+4.61\%$
test_mod_wrap[compile] 0.5221ms 0.3962ms 2.5239 KOps/s 2.5421 KOps/s $\color{#d91a1a}-0.71\%$
test_mod_wrap[compile-overhead] 7.4018ms 3.9177ms 255.2511 Ops/s 261.4842 Ops/s $\color{#d91a1a}-2.38\%$
test_mod_wrap_and_backward[eager] 1.6646ms 1.5680ms 637.7359 Ops/s 637.3953 Ops/s $\color{#35bf28}+0.05\%$
test_mod_wrap_and_backward[compile] 1.6502ms 1.5840ms 631.3066 Ops/s 629.0205 Ops/s $\color{#35bf28}+0.36\%$
test_mod_wrap_and_backward[compile-overhead] 1.3013ms 0.9647ms 1.0366 KOps/s 1.0309 KOps/s $\color{#35bf28}+0.56\%$
test_seq_add[eager] 0.2104ms 0.1580ms 6.3299 KOps/s 6.5648 KOps/s $\color{#d91a1a}-3.58\%$
test_seq_add[compile] 0.2127ms 0.1589ms 6.2924 KOps/s 6.3306 KOps/s $\color{#d91a1a}-0.60\%$
test_seq_add[compile-overhead] 0.2415ms 0.1989ms 5.0284 KOps/s 5.0080 KOps/s $\color{#35bf28}+0.41\%$
test_seq_wrap[eager] 0.7061ms 0.5594ms 1.7878 KOps/s 1.8904 KOps/s $\textbf{\color{#d91a1a}-5.43\%}$
test_seq_wrap[compile] 0.5087ms 0.4235ms 2.3613 KOps/s 2.4511 KOps/s $\color{#d91a1a}-3.66\%$
test_seq_wrap[compile-overhead] 0.3639ms 0.3063ms 3.2645 KOps/s 3.2636 KOps/s $\color{#35bf28}+0.03\%$
test_func_call_runtime[False-eager] 0.9672ms 0.8787ms 1.1380 KOps/s 1.1514 KOps/s $\color{#d91a1a}-1.16\%$
test_func_call_runtime[False-compile] 0.9791ms 0.9249ms 1.0812 KOps/s 1.0881 KOps/s $\color{#d91a1a}-0.64\%$
test_func_call_runtime[False-compile-overhead] 0.5216ms 0.4786ms 2.0895 KOps/s 2.0741 KOps/s $\color{#35bf28}+0.74\%$
test_func_call_runtime[True-eager] 1.2306ms 1.1155ms 896.4767 Ops/s 899.8950 Ops/s $\color{#d91a1a}-0.38\%$
test_func_call_runtime[True-compile] 1.1603ms 0.9868ms 1.0134 KOps/s 1.0538 KOps/s $\color{#d91a1a}-3.84\%$
test_func_call_runtime[True-compile-overhead] 0.5662ms 0.5018ms 1.9926 KOps/s 1.9868 KOps/s $\color{#35bf28}+0.29\%$
test_func_call_cm_runtime[False-eager] 1.0951ms 0.9239ms 1.0824 KOps/s 1.1483 KOps/s $\textbf{\color{#d91a1a}-5.74\%}$
test_func_call_cm_runtime[False-compile] 1.0289ms 0.9609ms 1.0406 KOps/s 1.0842 KOps/s $\color{#d91a1a}-4.02\%$
test_func_call_cm_runtime[False-compile-overhead] 0.5910ms 0.4837ms 2.0675 KOps/s 2.0670 KOps/s $\color{#35bf28}+0.02\%$
test_func_call_cm_runtime[True-eager] 1.4062ms 1.2532ms 797.9391 Ops/s 802.0192 Ops/s $\color{#d91a1a}-0.51\%$
test_func_call_cm_runtime[True-compile] 1.0743ms 0.9803ms 1.0200 KOps/s 1.0243 KOps/s $\color{#d91a1a}-0.42\%$
test_func_call_cm_runtime[True-compile-overhead] 0.6229ms 0.5338ms 1.8734 KOps/s 1.8593 KOps/s $\color{#35bf28}+0.76\%$
test_vmap_func_call_cm_runtime[eager] 2.8475ms 2.3560ms 424.4502 Ops/s 423.7846 Ops/s $\color{#35bf28}+0.16\%$
test_vmap_func_call_cm_runtime[compile] 1.2603ms 1.0017ms 998.2969 Ops/s 983.7563 Ops/s $\color{#35bf28}+1.48\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.5806ms 0.5300ms 1.8868 KOps/s 1.8803 KOps/s $\color{#35bf28}+0.35\%$
test_distributed 0.5479ms 0.1525ms 6.5595 KOps/s 6.4848 KOps/s $\color{#35bf28}+1.15\%$
test_tdmodule 0.2856ms 27.2107μs 36.7502 KOps/s 36.9967 KOps/s $\color{#d91a1a}-0.67\%$
test_tdmodule_dispatch 81.5910μs 46.3524μs 21.5739 KOps/s 21.5767 KOps/s $\color{#d91a1a}-0.01\%$
test_tdseq 47.1110μs 26.2718μs 38.0636 KOps/s 38.4374 KOps/s $\color{#d91a1a}-0.97\%$
test_tdseq_dispatch 84.9920μs 48.5391μs 20.6020 KOps/s 20.6134 KOps/s $\color{#d91a1a}-0.06\%$
test_instantiation_functorch 2.0849ms 1.9973ms 500.6755 Ops/s 504.5698 Ops/s $\color{#d91a1a}-0.77\%$
test_exec_functorch 0.2259ms 0.1793ms 5.5763 KOps/s 5.5039 KOps/s $\color{#35bf28}+1.32\%$
test_exec_functional_call 0.2293ms 0.1588ms 6.2961 KOps/s 6.1801 KOps/s $\color{#35bf28}+1.88\%$
test_exec_td_decorator 0.4721ms 0.2347ms 4.2615 KOps/s 4.2791 KOps/s $\color{#d91a1a}-0.41\%$
test_vmap_mlp_speed_decorator[True-True] 0.9794ms 0.7985ms 1.2524 KOps/s 1.2473 KOps/s $\color{#35bf28}+0.41\%$
test_vmap_mlp_speed_decorator[True-False] 0.9998ms 0.8054ms 1.2416 KOps/s 1.2385 KOps/s $\color{#35bf28}+0.26\%$
test_vmap_mlp_speed_decorator[False-True] 0.9037ms 0.6870ms 1.4556 KOps/s 1.4592 KOps/s $\color{#d91a1a}-0.25\%$
test_vmap_mlp_speed_decorator[False-False] 0.8792ms 0.6942ms 1.4405 KOps/s 1.4269 KOps/s $\color{#35bf28}+0.95\%$
test_vmap_transformer_speed_decorator[True-True] 20.7898ms 20.7011ms 48.3067 Ops/s 48.5750 Ops/s $\color{#d91a1a}-0.55\%$
test_vmap_transformer_speed_decorator[True-False] 21.7234ms 20.7364ms 48.2244 Ops/s 48.5881 Ops/s $\color{#d91a1a}-0.75\%$
test_vmap_transformer_speed_decorator[False-True] 21.1608ms 20.5243ms 48.7226 Ops/s 48.9938 Ops/s $\color{#d91a1a}-0.55\%$
test_vmap_transformer_speed_decorator[False-False] 20.8117ms 20.5137ms 48.7478 Ops/s 48.9937 Ops/s $\color{#d91a1a}-0.50\%$
test_to_module_speed[True] 1.5035ms 1.3969ms 715.8791 Ops/s 717.6078 Ops/s $\color{#d91a1a}-0.24\%$
test_to_module_speed[False] 1.4679ms 1.3662ms 731.9571 Ops/s 727.6130 Ops/s $\color{#35bf28}+0.60\%$
test_tc_init 84.4410μs 50.4089μs 19.8378 KOps/s 19.7888 KOps/s $\color{#35bf28}+0.25\%$
test_tc_init_tensor_only 47.3710μs 14.1487μs 70.6777 KOps/s 70.2866 KOps/s $\color{#35bf28}+0.56\%$
test_tc_init_nested 0.1507ms 99.0695μs 10.0939 KOps/s 10.0019 KOps/s $\color{#35bf28}+0.92\%$
test_tc_first_layer_tensor 33.3810μs 1.6775μs 596.1402 KOps/s 589.4324 KOps/s $\color{#35bf28}+1.14\%$
test_tc_first_layer_tensor_only 7.5089μs 0.6523μs 1.5331 MOps/s 1.4854 MOps/s $\color{#35bf28}+3.21\%$
test_tc_first_layer_tensor_set 35.4910μs 3.9539μs 252.9177 KOps/s 248.1826 KOps/s $\color{#35bf28}+1.91\%$
test_tc_first_layer_tensor_only_set 29.4200μs 2.9743μs 336.2183 KOps/s 332.8704 KOps/s $\color{#35bf28}+1.01\%$
test_tc_first_layer_nontensor 28.3000μs 5.6245μs 177.7932 KOps/s 177.0970 KOps/s $\color{#35bf28}+0.39\%$
test_tc_second_layer_tensor 23.9800μs 4.0028μs 249.8221 KOps/s 246.0809 KOps/s $\color{#35bf28}+1.52\%$
test_tc_second_layer_nontensor 38.1710μs 7.8463μs 127.4487 KOps/s 128.6794 KOps/s $\color{#d91a1a}-0.96\%$
test_unbind 0.2565s 13.0842ms 76.4282 Ops/s 66.5074 Ops/s $\textbf{\color{#35bf28}+14.92\%}$
test_full_like 4.9771ms 4.4000ms 227.2729 Ops/s 225.9471 Ops/s $\color{#35bf28}+0.59\%$
test_zeros_like 5.0191ms 4.3773ms 228.4525 Ops/s 228.3409 Ops/s $\color{#35bf28}+0.05\%$
test_ones_like 4.5904ms 4.3815ms 228.2336 Ops/s 227.8812 Ops/s $\color{#35bf28}+0.15\%$
test_clone 6.8983ms 6.5757ms 152.0759 Ops/s 152.2187 Ops/s $\color{#d91a1a}-0.09\%$
test_squeeze 0.2099ms 13.5285μs 73.9181 KOps/s 73.2633 KOps/s $\color{#35bf28}+0.89\%$
test_unsqueeze 0.2798ms 0.1076ms 9.2962 KOps/s 9.3099 KOps/s $\color{#d91a1a}-0.15\%$
test_split 0.2376ms 0.1772ms 5.6447 KOps/s 5.4820 KOps/s $\color{#35bf28}+2.97\%$
test_permute 0.2542ms 0.2017ms 4.9572 KOps/s 5.1247 KOps/s $\color{#d91a1a}-3.27\%$
test_stack 52.1193ms 51.6900ms 19.3461 Ops/s 19.3245 Ops/s $\color{#35bf28}+0.11\%$
test_cat 52.2121ms 51.8118ms 19.3006 Ops/s 23.0480 Ops/s $\textbf{\color{#d91a1a}-16.26\%}$
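
The Change column above appears to be the relative difference between Ops and Ops on Repo HEAD, and the bolded rows match the Improved/Worsened totals if a cutoff of ±5% is assumed. A minimal Python sketch of that reading follows. The function names and the 5% threshold are assumptions inferred from the table, not taken from the benchmark workflow itself.

```python
def relative_change(ops: float, head_ops: float) -> float:
    """Relative change of `ops` versus `head_ops`, in percent.

    Both values must be in the same unit (e.g. both KOps/s).
    """
    return (ops - head_ops) / head_ops * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    """Label a change; the 5% threshold is an assumption inferred from the table."""
    if change_pct > threshold_pct:
        return "improved"
    if change_pct < -threshold_pct:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) pairs copied from three rows of the GPU table.
rows = {
    "test_split": (901.0621, 677.3472),
    "test_cat": (19.3006, 23.0480),
    "test_plain_set_nested": (69.9706, 67.8649),
}

for name, (ops, head_ops) in rows.items():
    change = relative_change(ops, head_ops)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under this reading, small swings of a few percent count as noise and only the bolded rows contribute to the summary counts.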

@github-actions (bot, Contributor) commented Jan 6, 2026

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 233. Improved: $\large\color{#35bf28}12$. Worsened: $\large\color{#d91a1a}8$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 39.6810μs 15.2491μs 65.5776 KOps/s 66.5274 KOps/s $\color{#d91a1a}-1.43\%$
test_plain_set_stack_nested 59.8720μs 15.4145μs 64.8738 KOps/s 65.5916 KOps/s $\color{#d91a1a}-1.09\%$
test_plain_set_nested_inplace 47.6310μs 16.8351μs 59.3997 KOps/s 59.6539 KOps/s $\color{#d91a1a}-0.43\%$
test_plain_set_stack_nested_inplace 52.7610μs 16.8144μs 59.4730 KOps/s 59.4228 KOps/s $\color{#35bf28}+0.08\%$
test_items 40.7210μs 5.7736μs 173.2008 KOps/s 172.3257 KOps/s $\color{#35bf28}+0.51\%$
test_items_nested 0.5950ms 0.5398ms 1.8525 KOps/s 1.8723 KOps/s $\color{#d91a1a}-1.06\%$
test_items_nested_locked 0.6230ms 0.5438ms 1.8389 KOps/s 1.8595 KOps/s $\color{#d91a1a}-1.11\%$
test_items_nested_leaf 0.1425ms 95.3100μs 10.4921 KOps/s 10.3491 KOps/s $\color{#35bf28}+1.38\%$
test_items_stack_nested 0.6134ms 0.5343ms 1.8717 KOps/s 1.8544 KOps/s $\color{#35bf28}+0.94\%$
test_items_stack_nested_leaf 0.1241ms 97.0849μs 10.3003 KOps/s 10.2276 KOps/s $\color{#35bf28}+0.71\%$
test_items_stack_nested_locked 0.7592ms 0.5346ms 1.8706 KOps/s 1.8525 KOps/s $\color{#35bf28}+0.98\%$
test_keys 61.5620μs 4.1983μs 238.1928 KOps/s 238.3800 KOps/s $\color{#d91a1a}-0.08\%$
test_keys_nested 0.1536ms 0.1196ms 8.3637 KOps/s 8.2927 KOps/s $\color{#35bf28}+0.86\%$
test_keys_nested_locked 0.7452ms 0.1295ms 7.7198 KOps/s 7.7845 KOps/s $\color{#d91a1a}-0.83\%$
test_keys_nested_leaf 0.1394ms 0.1110ms 9.0082 KOps/s 9.0914 KOps/s $\color{#d91a1a}-0.92\%$
test_keys_stack_nested 0.1597ms 0.1213ms 8.2465 KOps/s 8.2688 KOps/s $\color{#d91a1a}-0.27\%$
test_keys_stack_nested_leaf 0.1490ms 0.1110ms 9.0118 KOps/s 8.9709 KOps/s $\color{#35bf28}+0.46\%$
test_keys_stack_nested_locked 0.1768ms 0.1288ms 7.7649 KOps/s 7.7613 KOps/s $\color{#35bf28}+0.05\%$
test_values 7.4440μs 1.0273μs 973.4640 KOps/s 973.9872 KOps/s $\color{#d91a1a}-0.05\%$
test_values_nested 72.4920μs 48.2981μs 20.7048 KOps/s 20.5968 KOps/s $\color{#35bf28}+0.52\%$
test_values_nested_locked 82.0220μs 50.9152μs 19.6405 KOps/s 19.2631 KOps/s $\color{#35bf28}+1.96\%$
test_values_nested_leaf 78.9010μs 54.7401μs 18.2681 KOps/s 18.1949 KOps/s $\color{#35bf28}+0.40\%$
test_values_stack_nested 73.5410μs 48.2555μs 20.7230 KOps/s 20.6595 KOps/s $\color{#35bf28}+0.31\%$
test_values_stack_nested_leaf 1.0766ms 54.5058μs 18.3467 KOps/s 18.1678 KOps/s $\color{#35bf28}+0.98\%$
test_values_stack_nested_locked 88.3520μs 50.8287μs 19.6739 KOps/s 19.4764 KOps/s $\color{#35bf28}+1.01\%$
test_membership 5.4785μs 0.8483μs 1.1788 MOps/s 1.1863 MOps/s $\color{#d91a1a}-0.63\%$
test_membership_nested 29.3910μs 3.1779μs 314.6748 KOps/s 313.3996 KOps/s $\color{#35bf28}+0.41\%$
test_membership_nested_leaf 28.8200μs 3.1948μs 313.0100 KOps/s 313.1632 KOps/s $\color{#d91a1a}-0.05\%$
test_membership_stacked_nested 37.0910μs 3.1955μs 312.9432 KOps/s 311.3553 KOps/s $\color{#35bf28}+0.51\%$
test_membership_stacked_nested_leaf 33.3710μs 3.1869μs 313.7876 KOps/s 313.0446 KOps/s $\color{#35bf28}+0.24\%$
test_membership_nested_last 29.3610μs 4.6914μs 213.1581 KOps/s 216.9031 KOps/s $\color{#d91a1a}-1.73\%$
test_membership_nested_leaf_last 39.0610μs 4.6398μs 215.5276 KOps/s 215.9277 KOps/s $\color{#d91a1a}-0.19\%$
test_membership_stacked_nested_last 37.9510μs 4.7140μs 212.1329 KOps/s 214.9179 KOps/s $\color{#d91a1a}-1.30\%$
test_membership_stacked_nested_leaf_last 37.5110μs 4.6504μs 215.0347 KOps/s 214.2371 KOps/s $\color{#35bf28}+0.37\%$
test_nested_getleaf 50.8810μs 21.5394μs 46.4266 KOps/s 46.8255 KOps/s $\color{#d91a1a}-0.85\%$
test_nested_get 48.1710μs 20.4094μs 48.9970 KOps/s 48.6764 KOps/s $\color{#35bf28}+0.66\%$
test_stacked_getleaf 45.7010μs 21.7926μs 45.8872 KOps/s 46.5525 KOps/s $\color{#d91a1a}-1.43\%$
test_stacked_get 49.2410μs 20.5657μs 48.6246 KOps/s 49.4609 KOps/s $\color{#d91a1a}-1.69\%$
test_nested_getitemleaf 66.7410μs 21.8199μs 45.8297 KOps/s 45.6686 KOps/s $\color{#35bf28}+0.35\%$
test_nested_getitem 65.5210μs 20.8628μs 47.9322 KOps/s 48.8069 KOps/s $\color{#d91a1a}-1.79\%$
test_stacked_getitemleaf 47.2500μs 22.2245μs 44.9953 KOps/s 45.3942 KOps/s $\color{#d91a1a}-0.88\%$
test_stacked_getitem 44.3110μs 21.0119μs 47.5921 KOps/s 48.0660 KOps/s $\color{#d91a1a}-0.99\%$
test_lock_nested 7.9643ms 0.4825ms 2.0725 KOps/s 2.0907 KOps/s $\color{#d91a1a}-0.87\%$
test_lock_stack_nested 0.6237ms 0.4791ms 2.0874 KOps/s 2.0912 KOps/s $\color{#d91a1a}-0.18\%$
test_unlock_nested 0.6410ms 0.3832ms 2.6094 KOps/s 2.5969 KOps/s $\color{#35bf28}+0.48\%$
test_unlock_stack_nested 0.4580ms 0.3821ms 2.6170 KOps/s 2.6038 KOps/s $\color{#35bf28}+0.51\%$
test_flatten_speed 0.1817ms 0.1212ms 8.2495 KOps/s 8.2691 KOps/s $\color{#d91a1a}-0.24\%$
test_unflatten_speed 0.6620ms 0.5988ms 1.6701 KOps/s 1.6826 KOps/s $\color{#d91a1a}-0.74\%$
test_common_ops 0.8441ms 0.7407ms 1.3500 KOps/s 1.3343 KOps/s $\color{#35bf28}+1.18\%$
test_creation 95.5220μs 2.7504μs 363.5792 KOps/s 367.0978 KOps/s $\color{#d91a1a}-0.96\%$
test_creation_empty 51.5610μs 9.1204μs 109.6448 KOps/s 110.4981 KOps/s $\color{#d91a1a}-0.77\%$
test_creation_nested_1 41.2610μs 11.9364μs 83.7774 KOps/s 83.7090 KOps/s $\color{#35bf28}+0.08\%$
test_creation_nested_2 62.3310μs 16.0818μs 62.1822 KOps/s 62.6666 KOps/s $\color{#d91a1a}-0.77\%$
test_clone 54.1110μs 13.4680μs 74.2503 KOps/s 75.0112 KOps/s $\color{#d91a1a}-1.01\%$
test_getitem[int] 1.1588ms 14.0459μs 71.1951 KOps/s 70.2571 KOps/s $\color{#35bf28}+1.34\%$
test_getitem[slice_int] 0.1444ms 24.7029μs 40.4811 KOps/s 40.0782 KOps/s $\color{#35bf28}+1.01\%$
test_getitem[range] 0.1729ms 60.1378μs 16.6285 KOps/s 16.7303 KOps/s $\color{#d91a1a}-0.61\%$
test_getitem[tuple] 0.1440ms 24.5876μs 40.6709 KOps/s 40.5711 KOps/s $\color{#35bf28}+0.25\%$
test_getitem[list] 0.1770ms 54.7612μs 18.2611 KOps/s 18.6590 KOps/s $\color{#d91a1a}-2.13\%$
test_setitem_dim[int] 46.6910μs 24.5039μs 40.8099 KOps/s 41.1236 KOps/s $\color{#d91a1a}-0.76\%$
test_setitem_dim[slice_int] 68.9610μs 45.3908μs 22.0309 KOps/s 21.8688 KOps/s $\color{#35bf28}+0.74\%$
test_setitem_dim[range] 0.1077ms 85.9221μs 11.6384 KOps/s 11.6358 KOps/s $\color{#35bf28}+0.02\%$
test_setitem_dim[tuple] 61.7010μs 41.1467μs 24.3033 KOps/s 23.7351 KOps/s $\color{#35bf28}+2.39\%$
test_setitem 61.9810μs 18.3417μs 54.5207 KOps/s 54.7117 KOps/s $\color{#d91a1a}-0.35\%$
test_set 61.8720μs 17.3932μs 57.4937 KOps/s 54.4256 KOps/s $\textbf{\color{#35bf28}+5.64\%}$
test_set_shared 0.4878ms 0.2067ms 4.8381 KOps/s 4.7924 KOps/s $\color{#35bf28}+0.95\%$
test_update 0.2219ms 22.1930μs 45.0593 KOps/s 45.0765 KOps/s $\color{#d91a1a}-0.04\%$
test_update_nested 77.1210μs 34.4232μs 29.0501 KOps/s 28.7666 KOps/s $\color{#35bf28}+0.99\%$
test_update__nested 0.4692ms 34.4117μs 29.0599 KOps/s 27.9399 KOps/s $\color{#35bf28}+4.01\%$
test_set_nested 65.0810μs 19.5238μs 51.2197 KOps/s 51.8631 KOps/s $\color{#d91a1a}-1.24\%$
test_set_nested_new 68.7320μs 24.3964μs 40.9896 KOps/s 41.3419 KOps/s $\color{#d91a1a}-0.85\%$
test_select 87.3410μs 43.0414μs 23.2334 KOps/s 23.3637 KOps/s $\color{#d91a1a}-0.56\%$
test_select_nested 0.1198ms 75.3120μs 13.2781 KOps/s 13.4287 KOps/s $\color{#d91a1a}-1.12\%$
test_exclude_nested 0.1290ms 97.7450μs 10.2307 KOps/s 10.2211 KOps/s $\color{#35bf28}+0.09\%$
test_empty[True] 0.5040ms 0.4354ms 2.2969 KOps/s 2.2795 KOps/s $\color{#35bf28}+0.76\%$
test_empty[False] 7.2102μs 1.3321μs 750.6684 KOps/s 752.1901 KOps/s $\color{#d91a1a}-0.20\%$
test_to 0.1047ms 73.2250μs 13.6565 KOps/s 13.4820 KOps/s $\color{#35bf28}+1.29\%$
test_to_nonblocking 0.1146ms 65.7830μs 15.2015 KOps/s 15.1850 KOps/s $\color{#35bf28}+0.11\%$
test_unbind_speed 0.3730ms 0.3276ms 3.0521 KOps/s 3.0963 KOps/s $\color{#d91a1a}-1.43\%$
test_unbind_speed_stack0 0.4793ms 0.3234ms 3.0922 KOps/s 3.1068 KOps/s $\color{#d91a1a}-0.47\%$
test_unbind_speed_stack1 99.7706ms 0.9452ms 1.0580 KOps/s 1.1508 KOps/s $\textbf{\color{#d91a1a}-8.07\%}$
test_split 1.2505ms 1.1565ms 864.6605 Ops/s 866.8946 Ops/s $\color{#d91a1a}-0.26\%$
test_chunk 99.3968ms 1.2175ms 821.3437 Ops/s 754.4221 Ops/s $\textbf{\color{#35bf28}+8.87\%}$
test_consolidate[False-None] 4.0166ms 3.8816ms 257.6259 Ops/s 256.6516 Ops/s $\color{#35bf28}+0.38\%$
test_consolidate[default-None] 2.1657ms 2.0552ms 486.5690 Ops/s 460.4803 Ops/s $\textbf{\color{#35bf28}+5.67\%}$
test_consolidate[reduce-overhead-None] 2.1229ms 1.9772ms 505.7621 Ops/s 473.2462 Ops/s $\textbf{\color{#35bf28}+6.87\%}$
test_consolidate_njt[False-None] 9.1576ms 8.8269ms 113.2901 Ops/s 111.2589 Ops/s $\color{#35bf28}+1.83\%$
test_to[False-False-None] 2.1856ms 2.0700ms 483.0924 Ops/s 477.4677 Ops/s $\color{#35bf28}+1.18\%$
test_to[True-False-None] 0.1754s 2.2159ms 451.2796 Ops/s 537.7088 Ops/s $\textbf{\color{#d91a1a}-16.07\%}$
test_to[within-False-None] 6.1657ms 5.8684ms 170.4044 Ops/s 170.5205 Ops/s $\color{#d91a1a}-0.07\%$
test_to[True-default-None] 13.0327ms 12.2995ms 81.3044 Ops/s 83.2909 Ops/s $\color{#d91a1a}-2.39\%$
test_to_njt[False-False-None] 9.0310ms 8.6061ms 116.1961 Ops/s 114.6585 Ops/s $\color{#35bf28}+1.34\%$
test_to_njt[True-False-None] 7.8852ms 7.6333ms 131.0055 Ops/s 133.7059 Ops/s $\color{#d91a1a}-2.02\%$
test_to_njt[within-False-None] 17.3925ms 16.5348ms 60.4786 Ops/s 59.1906 Ops/s $\color{#35bf28}+2.18\%$
test_creation[device0] 0.3398ms 0.1138ms 8.7852 KOps/s 9.0124 KOps/s $\color{#d91a1a}-2.52\%$
test_creation_from_tensor 0.3609ms 0.1141ms 8.7627 KOps/s 8.9393 KOps/s $\color{#d91a1a}-1.98\%$
test_add_one[memmap_tensor0] 0.1640ms 6.7522μs 148.0991 KOps/s 147.8935 KOps/s $\color{#35bf28}+0.14\%$
test_contiguous[memmap_tensor0] 26.6600μs 0.6933μs 1.4424 MOps/s 1.9975 MOps/s $\textbf{\color{#d91a1a}-27.79\%}$
test_stack[memmap_tensor0] 32.2800μs 4.8623μs 205.6644 KOps/s 209.2184 KOps/s $\color{#d91a1a}-1.70\%$
test_memmaptd_index 0.9880ms 0.2849ms 3.5103 KOps/s 3.6142 KOps/s $\color{#d91a1a}-2.87\%$
test_memmaptd_index_astensor 1.1176ms 0.3792ms 2.6369 KOps/s 2.7042 KOps/s $\color{#d91a1a}-2.49\%$
test_memmaptd_index_op 0.8071ms 0.6296ms 1.5883 KOps/s 1.6104 KOps/s $\color{#d91a1a}-1.37\%$
test_serialize_model 0.1355s 0.1342s 7.4536 Ops/s 7.5002 Ops/s $\color{#d91a1a}-0.62\%$
test_serialize_model_pickle 1.3479s 1.2111s 0.8257 Ops/s 0.8227 Ops/s $\color{#35bf28}+0.36\%$
test_serialize_weights 0.1346s 0.1329s 7.5254 Ops/s 7.5363 Ops/s $\color{#d91a1a}-0.14\%$
test_serialize_weights_returnearly 0.4108s 70.4879ms 14.1868 Ops/s 19.4168 Ops/s $\textbf{\color{#d91a1a}-26.94\%}$
test_serialize_weights_pickle 1.3745s 1.1977s 0.8349 Ops/s 0.8353 Ops/s $\color{#d91a1a}-0.05\%$
test_reshape_pytree 9.8355ms 34.0884μs 29.3355 KOps/s 30.1389 KOps/s $\color{#d91a1a}-2.67\%$
test_reshape_td 72.1710μs 39.0750μs 25.5918 KOps/s 26.6442 KOps/s $\color{#d91a1a}-3.95\%$
test_view_pytree 0.2272ms 33.2988μs 30.0311 KOps/s 30.1433 KOps/s $\color{#d91a1a}-0.37\%$
test_view_td 80.5310μs 46.2754μs 21.6097 KOps/s 21.7122 KOps/s $\color{#d91a1a}-0.47\%$
test_unbind_pytree 0.2436ms 38.3926μs 26.0467 KOps/s 26.0559 KOps/s $\color{#d91a1a}-0.04\%$
test_unbind_td 0.2127ms 49.0330μs 20.3944 KOps/s 20.4696 KOps/s $\color{#d91a1a}-0.37\%$
test_split_pytree 0.2553ms 45.5126μs 21.9719 KOps/s 22.4310 KOps/s $\color{#d91a1a}-2.05\%$
test_split_td 0.2038ms 65.0622μs 15.3699 KOps/s 15.4813 KOps/s $\color{#d91a1a}-0.72\%$
test_add_pytree 0.2321ms 46.7285μs 21.4002 KOps/s 22.5332 KOps/s $\textbf{\color{#d91a1a}-5.03\%}$
test_add_td 96.7020μs 57.5609μs 17.3729 KOps/s 19.1388 KOps/s $\textbf{\color{#d91a1a}-9.23\%}$
test_compile_add_one_nested[tensordict-compile] 0.3119ms 0.1847ms 5.4135 KOps/s 5.4786 KOps/s $\color{#d91a1a}-1.19\%$
test_compile_add_one_nested[tensordict-eager] 0.2513ms 0.1933ms 5.1741 KOps/s 5.1280 KOps/s $\color{#35bf28}+0.90\%$
test_compile_add_one_nested[pytree-compile] 0.2283ms 0.1559ms 6.4136 KOps/s 6.1010 KOps/s $\textbf{\color{#35bf28}+5.12\%}$
test_compile_add_one_nested[pytree-eager] 0.4346ms 0.1896ms 5.2747 KOps/s 5.2860 KOps/s $\color{#d91a1a}-0.21\%$
test_compile_copy_nested[tensordict-compile] 0.1313ms 28.3698μs 35.2488 KOps/s 35.8946 KOps/s $\color{#d91a1a}-1.80\%$
test_compile_copy_nested[tensordict-eager] 86.5020μs 52.9995μs 18.8681 KOps/s 18.9189 KOps/s $\color{#d91a1a}-0.27\%$
test_compile_copy_nested[pytree-compile] 0.1441ms 14.2704μs 70.0750 KOps/s 69.1206 KOps/s $\color{#35bf28}+1.38\%$
test_compile_copy_nested[pytree-eager] 0.3828ms 76.4335μs 13.0833 KOps/s 13.0012 KOps/s $\color{#35bf28}+0.63\%$
test_compile_add_one_flat[tensordict-compile] 0.3290ms 0.2074ms 4.8215 KOps/s 4.7359 KOps/s $\color{#35bf28}+1.81\%$
test_compile_add_one_flat[tensordict-eager] 0.3350ms 0.2611ms 3.8300 KOps/s 3.8155 KOps/s $\color{#35bf28}+0.38\%$
test_compile_add_one_flat[tensorclass-compile] 0.2624ms 0.1549ms 6.4566 KOps/s 6.3605 KOps/s $\color{#35bf28}+1.51\%$
test_compile_add_one_flat[tensorclass-eager] 0.1204ms 71.2063μs 14.0437 KOps/s 14.2562 KOps/s $\color{#d91a1a}-1.49\%$
test_compile_add_one_flat[pytree-compile] 0.2478ms 0.2036ms 4.9123 KOps/s 4.6381 KOps/s $\textbf{\color{#35bf28}+5.91\%}$
test_compile_add_one_flat[pytree-eager] 0.8213ms 0.5422ms 1.8445 KOps/s 1.8471 KOps/s $\color{#d91a1a}-0.15\%$
test_compile_add_self_flat[tensordict-eager] 0.4517ms 0.3166ms 3.1584 KOps/s 3.1319 KOps/s $\color{#35bf28}+0.85\%$
test_compile_add_self_flat[tensordict-compile] 0.3308ms 0.2116ms 4.7265 KOps/s 4.5346 KOps/s $\color{#35bf28}+4.23\%$
test_compile_add_self_flat[tensorclass-eager] 0.1275ms 86.2577μs 11.5932 KOps/s 11.5412 KOps/s $\color{#35bf28}+0.45\%$
test_compile_add_self_flat[tensorclass-compile] 0.2045ms 0.1570ms 6.3708 KOps/s 6.2213 KOps/s $\color{#35bf28}+2.40\%$
test_compile_add_self_flat[pytree-eager] 0.6780ms 0.4512ms 2.2164 KOps/s 2.2343 KOps/s $\color{#d91a1a}-0.80\%$
test_compile_add_self_flat[pytree-compile] 0.2509ms 0.2037ms 4.9086 KOps/s 4.7983 KOps/s $\color{#35bf28}+2.30\%$
test_compile_copy_flat[tensordict-compile] 0.1411ms 23.9691μs 41.7204 KOps/s 40.2482 KOps/s $\color{#35bf28}+3.66\%$
test_compile_copy_flat[tensordict-eager] 0.1417ms 41.1042μs 24.3284 KOps/s 24.0033 KOps/s $\color{#35bf28}+1.35\%$
test_compile_copy_flat[pytree-compile] 92.2810μs 20.8024μs 48.0715 KOps/s 47.5301 KOps/s $\color{#35bf28}+1.14\%$
test_compile_copy_flat[pytree-eager] 0.3655ms 70.4728μs 14.1899 KOps/s 14.3753 KOps/s $\color{#d91a1a}-1.29\%$
test_compile_assign_and_add[tensordict-compile] 2.0976ms 0.2127ms 4.7019 KOps/s 4.6276 KOps/s $\color{#35bf28}+1.60\%$
test_compile_assign_and_add[tensordict-eager] 3.6568ms 3.3959ms 294.4744 Ops/s 297.3034 Ops/s $\color{#d91a1a}-0.95\%$
test_compile_assign_and_add[pytree-compile] 2.0585ms 0.2105ms 4.7503 KOps/s 4.7211 KOps/s $\color{#35bf28}+0.62\%$
test_compile_assign_and_add[pytree-eager] 3.0658ms 2.9414ms 339.9774 Ops/s 356.8828 Ops/s $\color{#d91a1a}-4.74\%$
test_compile_indexing[tensor-tensordict-compile] 0.2117ms 0.1430ms 6.9921 KOps/s 6.9194 KOps/s $\color{#35bf28}+1.05\%$
test_compile_indexing[tensor-tensordict-eager] 0.2982ms 67.1802μs 14.8853 KOps/s 15.0236 KOps/s $\color{#d91a1a}-0.92\%$
test_compile_indexing[tensor-tensorclass-compile] 0.2820ms 0.1368ms 7.3126 KOps/s 7.2186 KOps/s $\color{#35bf28}+1.30\%$
test_compile_indexing[tensor-tensorclass-eager] 0.2578ms 46.3027μs 21.5970 KOps/s 21.3313 KOps/s $\color{#35bf28}+1.25\%$
test_compile_indexing[tensor-pytree-compile] 0.1997ms 0.1379ms 7.2538 KOps/s 7.1696 KOps/s $\color{#35bf28}+1.18\%$
test_compile_indexing[tensor-pytree-eager] 0.2776ms 49.7517μs 20.0998 KOps/s 21.3797 KOps/s $\textbf{\color{#d91a1a}-5.99\%}$
test_compile_indexing[slice-tensordict-compile] 0.2388ms 86.7788μs 11.5236 KOps/s 11.1641 KOps/s $\color{#35bf28}+3.22\%$
test_compile_indexing[slice-tensordict-eager] 0.2155ms 27.4071μs 36.4869 KOps/s 36.8401 KOps/s $\color{#d91a1a}-0.96\%$
test_compile_indexing[slice-tensorclass-compile] 0.1912ms 81.3403μs 12.2940 KOps/s 12.1938 KOps/s $\color{#35bf28}+0.82\%$
test_compile_indexing[slice-tensorclass-eager] 0.2373ms 24.1537μs 41.4015 KOps/s 42.0176 KOps/s $\color{#d91a1a}-1.47\%$
test_compile_indexing[slice-pytree-compile] 0.1349ms 81.7467μs 12.2329 KOps/s 12.1491 KOps/s $\color{#35bf28}+0.69\%$
test_compile_indexing[slice-pytree-eager] 0.2304ms 23.9748μs 41.7104 KOps/s 42.0352 KOps/s $\color{#d91a1a}-0.77\%$
test_compile_indexing[int-tensordict-compile] 0.1605ms 87.6873μs 11.4042 KOps/s 11.3795 KOps/s $\color{#35bf28}+0.22\%$
test_compile_indexing[int-tensordict-eager] 0.2641ms 28.4033μs 35.2072 KOps/s 37.0830 KOps/s $\textbf{\color{#d91a1a}-5.06\%}$
test_compile_indexing[int-tensorclass-compile] 0.1206ms 81.9988μs 12.1953 KOps/s 12.1056 KOps/s $\color{#35bf28}+0.74\%$
test_compile_indexing[int-tensorclass-eager] 0.2321ms 23.7510μs 42.1035 KOps/s 41.8062 KOps/s $\color{#35bf28}+0.71\%$
test_compile_indexing[int-pytree-compile] 0.1237ms 81.4219μs 12.2817 KOps/s 12.1520 KOps/s $\color{#35bf28}+1.07\%$
test_compile_indexing[int-pytree-eager] 0.2441ms 23.8345μs 41.9560 KOps/s 41.9823 KOps/s $\color{#d91a1a}-0.06\%$
test_mod_add[eager] 99.1920μs 52.3394μs 19.1061 KOps/s 19.2268 KOps/s $\color{#d91a1a}-0.63\%$
test_mod_add[compile] 0.2122ms 0.1532ms 6.5268 KOps/s 6.5024 KOps/s $\color{#35bf28}+0.37\%$
test_mod_add[compile-overhead] 0.6106ms 0.2008ms 4.9811 KOps/s 4.9360 KOps/s $\color{#35bf28}+0.91\%$
test_mod_wrap[eager] 0.4089ms 0.3134ms 3.1903 KOps/s 3.2223 KOps/s $\color{#d91a1a}-0.99\%$
test_mod_wrap[compile] 0.5179ms 0.4056ms 2.4654 KOps/s 2.4325 KOps/s $\color{#35bf28}+1.35\%$
test_mod_wrap[compile-overhead] 7.5845ms 4.0001ms 249.9931 Ops/s 253.5575 Ops/s $\color{#d91a1a}-1.41\%$
test_mod_wrap_and_backward[eager] 1.7003ms 1.5690ms 637.3298 Ops/s 603.1250 Ops/s $\textbf{\color{#35bf28}+5.67\%}$
test_mod_wrap_and_backward[compile] 1.8017ms 1.6266ms 614.7829 Ops/s 568.1210 Ops/s $\textbf{\color{#35bf28}+8.21\%}$
test_mod_wrap_and_backward[compile-overhead] 1.5117ms 0.9986ms 1.0014 KOps/s 894.1144 Ops/s $\textbf{\color{#35bf28}+12.00\%}$
test_seq_add[eager] 0.2154ms 0.1590ms 6.2876 KOps/s 6.3356 KOps/s $\color{#d91a1a}-0.76\%$
test_seq_add[compile] 0.2162ms 0.1623ms 6.1618 KOps/s 6.1159 KOps/s $\color{#35bf28}+0.75\%$
test_seq_add[compile-overhead] 0.2584ms 0.2108ms 4.7440 KOps/s 4.8906 KOps/s $\color{#d91a1a}-3.00\%$
test_seq_wrap[eager] 0.6397ms 0.5505ms 1.8164 KOps/s 1.8111 KOps/s $\color{#35bf28}+0.29\%$
test_seq_wrap[compile] 0.5354ms 0.4203ms 2.3790 KOps/s 2.3719 KOps/s $\color{#35bf28}+0.30\%$
test_seq_wrap[compile-overhead] 0.4137ms 0.3165ms 3.1591 KOps/s 3.1631 KOps/s $\color{#d91a1a}-0.13\%$
test_func_call_runtime[False-eager] 1.0052ms 0.8956ms 1.1165 KOps/s 1.1205 KOps/s $\color{#d91a1a}-0.35\%$
test_func_call_runtime[False-compile] 1.0202ms 0.9532ms 1.0491 KOps/s 1.0487 KOps/s $\color{#35bf28}+0.04\%$
test_func_call_runtime[False-compile-overhead] 0.5617ms 0.5015ms 1.9941 KOps/s 1.9761 KOps/s $\color{#35bf28}+0.91\%$
test_func_call_runtime[True-eager] 1.2328ms 1.1331ms 882.4972 Ops/s 885.2426 Ops/s $\color{#d91a1a}-0.31\%$
test_func_call_runtime[True-compile] 1.0828ms 0.9793ms 1.0212 KOps/s 1.0201 KOps/s $\color{#35bf28}+0.11\%$
test_func_call_runtime[True-compile-overhead] 0.5731ms 0.5208ms 1.9200 KOps/s 1.8929 KOps/s $\color{#35bf28}+1.43\%$
test_func_call_cm_runtime[False-eager] 0.9628ms 0.8957ms 1.1164 KOps/s 1.0804 KOps/s $\color{#35bf28}+3.33\%$
test_func_call_cm_runtime[False-compile] 1.1682ms 0.9637ms 1.0377 KOps/s 1.0447 KOps/s $\color{#d91a1a}-0.68\%$
test_func_call_cm_runtime[False-compile-overhead] 0.5495ms 0.4995ms 2.0022 KOps/s 1.9773 KOps/s $\color{#35bf28}+1.26\%$
test_func_call_cm_runtime[True-eager] 1.4043ms 1.2799ms 781.3394 Ops/s 775.8520 Ops/s $\color{#35bf28}+0.71\%$
test_func_call_cm_runtime[True-compile] 1.0617ms 1.0116ms 988.5148 Ops/s 992.0090 Ops/s $\color{#d91a1a}-0.35\%$
test_func_call_cm_runtime[True-compile-overhead] 0.6256ms 0.5544ms 1.8037 KOps/s 1.7719 KOps/s $\color{#35bf28}+1.79\%$
test_vmap_func_call_cm_runtime[eager] 3.0085ms 2.4027ms 416.1938 Ops/s 419.8471 Ops/s $\color{#d91a1a}-0.87\%$
test_vmap_func_call_cm_runtime[compile] 1.0969ms 1.0273ms 973.4159 Ops/s 980.0453 Ops/s $\color{#d91a1a}-0.68\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.6369ms 0.5471ms 1.8277 KOps/s 1.7961 KOps/s $\color{#35bf28}+1.76\%$
test_distributed 3.1234ms 0.1622ms 6.1667 KOps/s 6.4100 KOps/s $\color{#d91a1a}-3.80\%$
test_tdmodule 0.1990ms 27.9144μs 35.8238 KOps/s 35.0459 KOps/s $\color{#35bf28}+2.22\%$
test_tdmodule_dispatch 77.6520μs 48.0846μs 20.7967 KOps/s 20.7148 KOps/s $\color{#35bf28}+0.40\%$
test_tdseq 46.3010μs 27.4248μs 36.4634 KOps/s 36.6021 KOps/s $\color{#d91a1a}-0.38\%$
test_tdseq_dispatch 70.0220μs 50.4938μs 19.8044 KOps/s 19.6038 KOps/s $\color{#35bf28}+1.02\%$
test_instantiation_functorch 2.2485ms 2.0960ms 477.1013 Ops/s 476.0608 Ops/s $\color{#35bf28}+0.22\%$
test_exec_functorch 0.2282ms 0.1854ms 5.3937 KOps/s 5.4067 KOps/s $\color{#d91a1a}-0.24\%$
test_exec_functional_call 0.2603ms 0.1679ms 5.9563 KOps/s 6.1075 KOps/s $\color{#d91a1a}-2.48\%$
test_exec_td_decorator 0.4462ms 0.2419ms 4.1337 KOps/s 4.1607 KOps/s $\color{#d91a1a}-0.65\%$
test_vmap_mlp_speed_decorator[True-True] 0.9982ms 0.8188ms 1.2213 KOps/s 1.2194 KOps/s $\color{#35bf28}+0.15\%$
test_vmap_mlp_speed_decorator[True-False] 0.9727ms 0.8150ms 1.2270 KOps/s 1.2198 KOps/s $\color{#35bf28}+0.59\%$
test_vmap_mlp_speed_decorator[False-True] 0.8552ms 0.7036ms 1.4212 KOps/s 1.4216 KOps/s $\color{#d91a1a}-0.03\%$
test_vmap_mlp_speed_decorator[False-False] 0.8595ms 0.7040ms 1.4204 KOps/s 1.4184 KOps/s $\color{#35bf28}+0.14\%$
test_vmap_transformer_speed_decorator[True-True] 20.9935ms 20.9076ms 47.8296 Ops/s 47.8546 Ops/s $\color{#d91a1a}-0.05\%$
test_vmap_transformer_speed_decorator[True-False] 21.1929ms 20.8887ms 47.8727 Ops/s 48.0362 Ops/s $\color{#d91a1a}-0.34\%$
test_vmap_transformer_speed_decorator[False-True] 20.9547ms 20.7035ms 48.3010 Ops/s 48.5755 Ops/s $\color{#d91a1a}-0.57\%$
test_vmap_transformer_speed_decorator[False-False] 20.8796ms 20.7259ms 48.2487 Ops/s 48.2625 Ops/s $\color{#d91a1a}-0.03\%$
test_to_module_speed[True] 1.6514ms 1.4837ms 674.0051 Ops/s 670.0258 Ops/s $\color{#35bf28}+0.59\%$
test_to_module_speed[False] 1.5627ms 1.4712ms 679.7203 Ops/s 677.9734 Ops/s $\color{#35bf28}+0.26\%$
test_tc_init 80.5510μs 52.4490μs 19.0661 KOps/s 19.3271 KOps/s $\color{#d91a1a}-1.35\%$
test_tc_init_tensor_only 48.4010μs 14.9573μs 66.8568 KOps/s 66.7052 KOps/s $\color{#35bf28}+0.23\%$
test_tc_init_nested 0.1361ms 0.1054ms 9.4910 KOps/s 9.6992 KOps/s $\color{#d91a1a}-2.15\%$
test_tc_first_layer_tensor 16.2810μs 1.8043μs 554.2213 KOps/s 555.5795 KOps/s $\color{#d91a1a}-0.24\%$
test_tc_first_layer_tensor_only 4.2571μs 0.7053μs 1.4178 MOps/s 1.4114 MOps/s $\color{#35bf28}+0.45\%$
test_tc_first_layer_tensor_set 32.1410μs 4.2138μs 237.3159 KOps/s 234.5683 KOps/s $\color{#35bf28}+1.17\%$
test_tc_first_layer_tensor_only_set 24.5005μs 3.0635μs 326.4195 KOps/s 325.4673 KOps/s $\color{#35bf28}+0.29\%$
test_tc_first_layer_nontensor 30.2500μs 6.0395μs 165.5775 KOps/s 162.9135 KOps/s $\color{#35bf28}+1.64\%$
test_tc_second_layer_tensor 27.5000μs 4.3378μs 230.5323 KOps/s 231.0323 KOps/s $\color{#d91a1a}-0.22\%$
test_tc_second_layer_nontensor 41.8910μs 8.5649μs 116.7554 KOps/s 116.2420 KOps/s $\color{#35bf28}+0.44\%$
test_unbind 0.2564s 16.1047ms 62.0936 Ops/s 63.1422 Ops/s $\color{#d91a1a}-1.66\%$
test_full_like 4.4932ms 4.3768ms 228.4785 Ops/s 222.6022 Ops/s $\color{#35bf28}+2.64\%$
test_zeros_like 4.4628ms 4.3611ms 229.2986 Ops/s 223.1669 Ops/s $\color{#35bf28}+2.75\%$
test_ones_like 5.0356ms 4.4025ms 227.1414 Ops/s 226.7161 Ops/s $\color{#35bf28}+0.19\%$
test_clone 6.5554ms 6.4023ms 156.1929 Ops/s 103.3493 Ops/s $\textbf{\color{#35bf28}+51.13\%}$
test_squeeze 0.2001ms 14.3693μs 69.5929 KOps/s 69.5834 KOps/s $\color{#35bf28}+0.01\%$
test_unsqueeze 0.1516ms 0.1144ms 8.7434 KOps/s 9.0802 KOps/s $\color{#d91a1a}-3.71\%$
test_split 0.3603ms 0.1870ms 5.3474 KOps/s 5.3797 KOps/s $\color{#d91a1a}-0.60\%$
test_permute 0.2853ms 0.2110ms 4.7400 KOps/s 4.9360 KOps/s $\color{#d91a1a}-3.97\%$
test_stack 43.2234ms 42.9533ms 23.2811 Ops/s 19.0580 Ops/s $\textbf{\color{#35bf28}+22.16\%}$
test_cat 43.0785ms 42.8632ms 23.3300 Ops/s 19.0696 Ops/s $\textbf{\color{#35bf28}+22.34\%}$

@vmoens
Collaborator Author

vmoens commented Jan 6, 2026

The failing tests are unrelated to this PR.

@vmoens vmoens merged commit bb7d548 into main Jan 6, 2026
66 of 84 checks passed
@vmoens vmoens deleted the solve-copy branch January 6, 2026 15:15
