[Redis] Add RedisLazyStackedTensorDict for lazy stack storage #1573
Closed
- Apply ufmt formatting to bench_redis.py and redis.py
- Ignore PytestUnraisableExceptionWarning in test_tensorclass.py to prevent __del__ GC exceptions from failing unrelated tests
- Install torchvision nightly in docs workflow to fix torchvision compatibility errors in tutorial builds

Co-authored-by: Cursor <cursoragent@cursor.com>

- Ignore PytestUnraisableExceptionWarning in test_tensordict.py (fixes test_to_memory_leak failure on 3.14 and silicon 3.14)
- Make h5py install optional for Python 3.14t since it fails to build from source on free-threaded Python

Co-authored-by: Cursor <cursoragent@cursor.com>

orjson does not support free-threaded Python yet, similar to h5py.

Co-authored-by: Cursor <cursoragent@cursor.com>

- Add skipif for test_auto_batch_size when h5py is not installed
- Skip h5=True parametrizations in test_index_with_generator when h5py is missing
- Skip no-build-isolation install tests on free-threaded Python

Co-authored-by: Cursor <cursoragent@cursor.com>
Implement RedisLazyStackedTensorDict, a TensorDictBase subclass that stores LazyStackedTensorDict data in Redis as concatenated blobs with O(K) Redis keys for K leaf keys, regardless of the number of stack elements N.

Key features:
- Homogeneous mode: same-shape elements use arithmetic offsets, reusing existing GETRANGE/Lua byte-range infrastructure
- Heterogeneous mode: variable-shape elements use packed int64 offset tables per key, with per-element shapes in metadata
- Streaming upload: data processed in chunks of 10K elements to avoid materializing the full stack in memory
- Full indexing: td[int], td[slice], td[::step], td[tensor], td[bool] all work via _index_tensordict override and pipelined GETRANGE/Lua
- to_redis() convenience method on TensorDictBase
- Pickle support and from_redis() reconnection

Co-authored-by: Cursor <cursoragent@cursor.com>
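To make the two storage modes concrete, here is a minimal sketch of how a byte range for one stack element could be derived from a concatenated blob. It is not the PR's implementation: the helper names (`getrange`, `homogeneous_byte_range`, `pack_offset_table`, `heterogeneous_byte_range`) are hypothetical, the N + 1 entry layout of the offset table is an assumption, and a local `bytes` object stands in for the Redis value so the example runs without a server. The only Redis fact relied on is that GETRANGE takes an inclusive end offset.

```python
import math
import struct


def getrange(blob: bytes, start: int, end: int) -> bytes:
    """Local stand-in for Redis GETRANGE, whose end offset is inclusive."""
    return blob[start:end + 1]


def homogeneous_byte_range(index, shape, itemsize):
    """Same-shape elements: the range follows from arithmetic alone."""
    nbytes = math.prod(shape) * itemsize
    start = index * nbytes
    return start, start + nbytes - 1


def pack_offset_table(sizes_in_bytes):
    """Variable-shape elements: cumulative offsets packed as little-endian int64.

    Assumed (hypothetical) layout: N + 1 entries, so that element i spans
    offsets[i] up to, but not including, offsets[i + 1].
    """
    offsets = [0]
    for size in sizes_in_bytes:
        offsets.append(offsets[-1] + size)
    return struct.pack(f"<{len(offsets)}q", *offsets)


def heterogeneous_byte_range(table, index):
    """Read the two int64 entries (8 bytes each) that bracket element `index`."""
    start, stop = struct.unpack_from("<2q", table, 8 * index)
    return start, stop - 1


# Homogeneous: three float32 elements of shape (2,) in one blob.
homo_blob = struct.pack("<6f", 1.0, 2.0, 3.0, 4.0, 5.0, 6.0)
start, end = homogeneous_byte_range(1, (2,), 4)
second = struct.unpack("<2f", getrange(homo_blob, start, end))

# Heterogeneous: float32 elements of lengths 1, 3 and 2 in one blob.
hetero_blob = struct.pack("<6f", 1.0, 2.0, 3.0, 4.0, 5.0, 6.0)
table = pack_offset_table([4, 12, 8])
start, end = heterogeneous_byte_range(table, 1)
middle = struct.unpack("<3f", getrange(hetero_blob, start, end))
```

In both cases a single element costs one ranged read on one key, which is consistent with the O(K) key count described above. In the heterogeneous case the element's shape would still have to come from the per-element metadata, since the offsets only give its byte extent.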
Contributor
| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_plain_set_nested | 38.8020μs | 14.2163μs | 70.3420 KOps/s | 70.8511 KOps/s | |
| test_plain_set_stack_nested | 45.9930μs | 14.4827μs | 69.0480 KOps/s | 69.8096 KOps/s | |
| test_plain_set_nested_inplace | 41.4030μs | 15.9697μs | 62.6187 KOps/s | 64.1995 KOps/s | |
| test_plain_set_stack_nested_inplace | 53.2230μs | 15.7162μs | 63.6286 KOps/s | 64.2951 KOps/s | |
| test_items | 42.8730μs | 5.5304μs | 180.8195 KOps/s | 179.8177 KOps/s | |
| test_items_nested | 0.5094ms | 0.4398ms | 2.2740 KOps/s | 2.3125 KOps/s | |
| test_items_nested_locked | 0.5560ms | 0.4438ms | 2.2531 KOps/s | 2.2823 KOps/s | |
| test_items_nested_leaf | 0.1460ms | 90.9302μs | 10.9974 KOps/s | 10.8319 KOps/s | |
| test_items_stack_nested | 0.5553ms | 0.4415ms | 2.2650 KOps/s | 2.3029 KOps/s | |
| test_items_stack_nested_leaf | 0.1509ms | 91.0638μs | 10.9813 KOps/s | 10.8687 KOps/s | |
| test_items_stack_nested_locked | 0.4981ms | 0.4433ms | 2.2559 KOps/s | 2.2769 KOps/s | |
| test_keys | 23.2620μs | 4.1391μs | 241.5969 KOps/s | 242.1664 KOps/s | |
| test_keys_nested | 0.1665ms | 0.1171ms | 8.5424 KOps/s | 8.3874 KOps/s | |
| test_keys_nested_locked | 0.6483ms | 0.1272ms | 7.8638 KOps/s | 7.8337 KOps/s | |
| test_keys_nested_leaf | 0.1857ms | 0.1082ms | 9.2463 KOps/s | 9.2268 KOps/s | |
| test_keys_stack_nested | 0.1846ms | 0.1187ms | 8.4258 KOps/s | 8.4332 KOps/s | |
| test_keys_stack_nested_leaf | 0.1701ms | 0.1078ms | 9.2751 KOps/s | 9.2751 KOps/s | |
| test_keys_stack_nested_locked | 0.1991ms | 0.1261ms | 7.9292 KOps/s | 7.8282 KOps/s | |
| test_values | 5.6962μs | 0.9960μs | 1.0041 MOps/s | 1.0012 MOps/s | |
| test_values_nested | 82.1150μs | 46.4159μs | 21.5443 KOps/s | 21.6294 KOps/s | |
| test_values_nested_locked | 94.4460μs | 49.2198μs | 20.3170 KOps/s | 20.1921 KOps/s | |
| test_values_nested_leaf | 0.1167ms | 52.6465μs | 18.9946 KOps/s | 18.9836 KOps/s | |
| test_values_stack_nested | 78.0640μs | 46.5439μs | 21.4851 KOps/s | 21.5466 KOps/s | |
| test_values_stack_nested_leaf | 93.1060μs | 53.1245μs | 18.8237 KOps/s | 18.9468 KOps/s | |
| test_values_stack_nested_locked | 91.8750μs | 49.6338μs | 20.1476 KOps/s | 20.1246 KOps/s | |
| test_membership | 4.6287μs | 0.8170μs | 1.2239 MOps/s | 1.2313 MOps/s | |
| test_membership_nested | 22.9610μs | 2.7524μs | 363.3237 KOps/s | 361.7447 KOps/s | |
| test_membership_nested_leaf | 27.7510μs | 2.7727μs | 360.6562 KOps/s | 362.9467 KOps/s | |
| test_membership_stacked_nested | 38.5020μs | 2.7709μs | 360.8883 KOps/s | 362.2366 KOps/s | |
| test_membership_stacked_nested_leaf | 25.5010μs | 2.7628μs | 361.9503 KOps/s | 364.7556 KOps/s | |
| test_membership_nested_last | 39.1820μs | 4.1182μs | 242.8268 KOps/s | 242.8961 KOps/s | |
| test_membership_nested_leaf_last | 26.5910μs | 4.1813μs | 239.1605 KOps/s | 241.4803 KOps/s | |
| test_membership_stacked_nested_last | 21.5020μs | 4.1255μs | 242.3978 KOps/s | 243.1646 KOps/s | |
| test_membership_stacked_nested_leaf_last | 38.7020μs | 4.1514μs | 240.8822 KOps/s | 240.2708 KOps/s | |
| test_nested_getleaf | 51.0730μs | 20.6875μs | 48.3384 KOps/s | 48.1444 KOps/s | |
| test_nested_get | 60.2540μs | 18.9413μs | 52.7947 KOps/s | 51.2556 KOps/s | |
| test_stacked_getleaf | 77.8550μs | 20.3355μs | 49.1751 KOps/s | 47.8505 KOps/s | |
| test_stacked_get | 51.9230μs | 19.5242μs | 51.2186 KOps/s | 50.6685 KOps/s | |
| test_nested_getitemleaf | 45.8630μs | 21.4158μs | 46.6945 KOps/s | 47.2139 KOps/s | |
| test_nested_getitem | 56.4330μs | 19.7680μs | 50.5868 KOps/s | 50.5356 KOps/s | |
| test_stacked_getitemleaf | 55.2730μs | 20.7678μs | 48.1515 KOps/s | 46.9094 KOps/s | |
| test_stacked_getitem | 49.0430μs | 19.8087μs | 50.4830 KOps/s | 49.8940 KOps/s | |
| test_lock_nested | 8.3284ms | 0.4560ms | 2.1932 KOps/s | 2.1917 KOps/s | |
| test_lock_stack_nested | 0.5303ms | 0.4548ms | 2.1986 KOps/s | 2.1729 KOps/s | |
| test_unlock_nested | 0.6785ms | 0.3628ms | 2.7566 KOps/s | 2.7233 KOps/s | |
| test_unlock_stack_nested | 0.4198ms | 0.3648ms | 2.7414 KOps/s | 2.6831 KOps/s | |
| test_flatten_speed | 0.1502ms | 0.1168ms | 8.5634 KOps/s | 8.5749 KOps/s | |
| test_unflatten_speed | 0.5805ms | 0.5423ms | 1.8441 KOps/s | 1.8305 KOps/s | |
| test_common_ops | 0.8378ms | 0.6800ms | 1.4705 KOps/s | 1.4852 KOps/s | |
| test_creation | 0.1202ms | 2.6840μs | 372.5740 KOps/s | 365.6714 KOps/s | |
| test_creation_empty | 42.3020μs | 5.7681μs | 173.3666 KOps/s | 175.2517 KOps/s | |
| test_creation_nested_1 | 40.6720μs | 10.0118μs | 99.8824 KOps/s | 101.7073 KOps/s | |
| test_creation_nested_2 | 42.7020μs | 11.2887μs | 88.5843 KOps/s | 90.0344 KOps/s | |
| test_creation_many_keys[10] | 52.6630μs | 16.9297μs | 59.0679 KOps/s | 59.5467 KOps/s | |
| test_creation_many_keys[50] | 99.9260μs | 73.7166μs | 13.5655 KOps/s | 13.8992 KOps/s | |
| test_creation_many_keys[100] | 0.1953ms | 0.1435ms | 6.9673 KOps/s | 7.0862 KOps/s | |
| test_creation_nested_many_keys[10] | 73.0640μs | 36.3733μs | 27.4927 KOps/s | 27.5029 KOps/s | |
| test_creation_nested_many_keys[50] | 0.2064ms | 0.1495ms | 6.6910 KOps/s | 6.7331 KOps/s | |
| test_clone | 63.5140μs | 12.8777μs | 77.6538 KOps/s | 76.1892 KOps/s | |
| test_getitem[int] | 1.7832ms | 13.8875μs | 72.0071 KOps/s | 56.9723 KOps/s | |
| test_getitem[slice_int] | 0.1375ms | 23.4029μs | 42.7297 KOps/s | 42.5933 KOps/s | |
| test_getitem[range] | 0.1697ms | 61.1100μs | 16.3639 KOps/s | 16.0797 KOps/s | |
| test_getitem[tuple] | 0.1383ms | 22.8928μs | 43.6819 KOps/s | 43.3946 KOps/s | |
| test_getitem[list] | 0.1833ms | 56.0327μs | 17.8467 KOps/s | 17.6063 KOps/s | |
| test_setitem_dim[int] | 46.7130μs | 24.6376μs | 40.5883 KOps/s | 36.1009 KOps/s | |
| test_setitem_dim[slice_int] | 69.1740μs | 42.0388μs | 23.7876 KOps/s | 22.0354 KOps/s | |
| test_setitem_dim[range] | 0.1324ms | 94.4035μs | 10.5928 KOps/s | 10.9182 KOps/s | |
| test_setitem_dim[tuple] | 65.1840μs | 40.0937μs | 24.9416 KOps/s | 25.0901 KOps/s | |
| test_setitem | 55.6230μs | 16.7729μs | 59.6201 KOps/s | 57.3516 KOps/s | |
| test_set | 65.3340μs | 16.0938μs | 62.1356 KOps/s | 61.0051 KOps/s | |
| test_set_shared | 0.5693ms | 0.2027ms | 4.9336 KOps/s | 4.7778 KOps/s | |
| test_update | 0.3593ms | 20.4493μs | 48.9014 KOps/s | 44.2006 KOps/s | |
| test_update_nested | 77.3940μs | 32.4246μs | 30.8408 KOps/s | 30.5882 KOps/s | |
| test_update__nested | 0.4857ms | 33.4145μs | 29.9271 KOps/s | 29.7093 KOps/s | |
| test_set_nested | 60.5440μs | 18.1177μs | 55.1947 KOps/s | 54.2875 KOps/s | |
| test_set_nested_new | 68.5240μs | 22.9175μs | 43.6348 KOps/s | 42.1093 KOps/s | |
| test_select | 69.4140μs | 38.8749μs | 25.7235 KOps/s | 24.6930 KOps/s | |
| test_select_nested | 0.1173ms | 70.8523μs | 14.1139 KOps/s | 14.1591 KOps/s | |
| test_exclude_nested | 0.1444ms | 86.3379μs | 11.5824 KOps/s | 11.5399 KOps/s | |
| test_empty[True] | 0.4363ms | 0.3755ms | 2.6632 KOps/s | 2.6575 KOps/s | |
| test_empty[False] | 12.2882μs | 1.2650μs | 790.5211 KOps/s | 787.5069 KOps/s | |
| test_to | 0.1031ms | 72.4030μs | 13.8116 KOps/s | 14.1971 KOps/s | |
| test_to_nonblocking | 0.1041ms | 66.2112μs | 15.1032 KOps/s | 15.6977 KOps/s | |
| test_unbind_speed | 0.3701ms | 0.3113ms | 3.2125 KOps/s | 3.2003 KOps/s | |
| test_unbind_speed_stack0 | 0.3617ms | 0.3080ms | 3.2468 KOps/s | 3.2066 KOps/s | |
| test_unbind_speed_stack1 | 0.1032s | 0.8790ms | 1.1377 KOps/s | 1.1128 KOps/s | |
| test_split | 1.1780ms | 1.0845ms | 922.0795 Ops/s | 918.9973 Ops/s | |
| test_chunk | 0.1033s | 1.1520ms | 868.0179 Ops/s | 951.9561 Ops/s | |
| test_to_cpu_blocking | 19.1713ms | 18.7071ms | 53.4557 Ops/s | 48.3416 Ops/s | |
| test_to_cpu_global_sync | 11.0135ms | 10.8917ms | 91.8134 Ops/s | 91.4407 Ops/s | |
| test_to_cpu_event_sync | 0.1149s | 12.9870ms | 77.0000 Ops/s | 84.9671 Ops/s | |
| test_to_cpu_default | 12.1815ms | 11.8062ms | 84.7015 Ops/s | 84.7642 Ops/s | |
| test_consolidate[False-None] | 4.1354ms | 3.9451ms | 253.4774 Ops/s | 226.0395 Ops/s | |
| test_consolidate[default-None] | 2.0577ms | 1.9417ms | 515.0231 Ops/s | 505.9531 Ops/s | |
| test_consolidate[reduce-overhead-None] | 2.0499ms | 1.8658ms | 535.9653 Ops/s | 522.1745 Ops/s | |
| test_consolidate_njt[False-None] | 8.8069ms | 8.1169ms | 123.1995 Ops/s | 121.3843 Ops/s | |
| test_to[False-False-None] | 2.4097ms | 1.9973ms | 500.6642 Ops/s | 490.9072 Ops/s | |
| test_to[True-False-None] | 2.1041ms | 1.8632ms | 536.6986 Ops/s | 535.7286 Ops/s | |
| test_to[within-False-None] | 6.1998ms | 5.8891ms | 169.8047 Ops/s | 169.5192 Ops/s | |
| test_to[True-default-None] | 7.6206ms | 7.3736ms | 135.6195 Ops/s | 131.4642 Ops/s | |
| test_to_njt[False-False-None] | 8.3518ms | 8.2638ms | 121.0091 Ops/s | 120.2883 Ops/s | |
| test_to_njt[True-False-None] | 7.1218ms | 6.7742ms | 147.6182 Ops/s | 147.5583 Ops/s | |
| test_to_njt[within-False-None] | 15.6212ms | 14.8593ms | 67.2977 Ops/s | 65.5190 Ops/s | |
| test_creation[device0] | 0.4474ms | 0.1151ms | 8.6890 KOps/s | 8.6966 KOps/s | |
| test_creation_from_tensor | 0.4506ms | 0.1127ms | 8.8751 KOps/s | 8.8807 KOps/s | |
| test_add_one[memmap_tensor0] | 0.3430ms | 6.3843μs | 156.6346 KOps/s | 154.7175 KOps/s | |
| test_contiguous[memmap_tensor0] | 30.0210μs | 0.6199μs | 1.6130 MOps/s | 2.2632 MOps/s | |
| test_stack[memmap_tensor0] | 30.6210μs | 4.5595μs | 219.3235 KOps/s | 220.0725 KOps/s | |
| test_memmaptd_index | 1.1633ms | 0.2549ms | 3.9225 KOps/s | 3.9897 KOps/s | |
| test_memmaptd_index_astensor | 0.4989ms | 0.3425ms | 2.9193 KOps/s | 2.9351 KOps/s | |
| test_memmaptd_index_op | 0.8564ms | 0.5800ms | 1.7241 KOps/s | 1.7025 KOps/s | |
| test_serialize_model | 0.1399s | 0.1374s | 7.2804 Ops/s | 7.3728 Ops/s | |
| test_serialize_model_pickle | 1.3491s | 1.2162s | 0.8222 Ops/s | 0.8233 Ops/s | |
| test_serialize_weights | 0.1408s | 0.1365s | 7.3246 Ops/s | 7.3880 Ops/s | |
| test_serialize_weights_returnearly | 0.2918s | 81.2918ms | 12.3014 Ops/s | 11.4219 Ops/s | |
| test_serialize_weights_pickle | 1.3472s | 1.2099s | 0.8265 Ops/s | 0.8228 Ops/s | |
| test_reshape_pytree | 0.2057ms | 31.8226μs | 31.4242 KOps/s | 31.1159 KOps/s | |
| test_reshape_td | 84.3450μs | 41.7576μs | 23.9477 KOps/s | 23.0599 KOps/s | |
| test_view_pytree | 0.2178ms | 31.4952μs | 31.7509 KOps/s | 31.4810 KOps/s | |
| test_view_td | 0.2162ms | 52.4068μs | 19.0815 KOps/s | 19.9509 KOps/s | |
| test_unbind_pytree | 0.2362ms | 35.2055μs | 28.4047 KOps/s | 28.0683 KOps/s | |
| test_unbind_td | 0.1626ms | 46.4393μs | 21.5335 KOps/s | 21.5637 KOps/s | |
| test_split_pytree | 0.2460ms | 40.7771μs | 24.5236 KOps/s | 24.6722 KOps/s | |
| test_split_td | 0.1852ms | 61.5353μs | 16.2508 KOps/s | 15.9034 KOps/s | |
| test_add_pytree | 0.1911ms | 42.6910μs | 23.4242 KOps/s | 24.3692 KOps/s | |
| test_add_td | 0.1075ms | 55.8257μs | 17.9129 KOps/s | 19.5253 KOps/s | |
| test_compile_add_one_nested[tensordict-compile] | 0.1965ms | 0.1363ms | 7.3358 KOps/s | 7.1104 KOps/s | |
| test_compile_add_one_nested[tensordict-eager] | 0.6244ms | 0.1888ms | 5.2960 KOps/s | 5.3708 KOps/s | |
| test_compile_add_one_nested[pytree-compile] | 0.6133ms | 0.1070ms | 9.3496 KOps/s | 9.1915 KOps/s | |
| test_compile_add_one_nested[pytree-eager] | 0.6004ms | 0.1753ms | 5.7036 KOps/s | 5.6842 KOps/s | |
| test_compile_copy_nested[tensordict-compile] | 0.1071ms | 26.9186μs | 37.1490 KOps/s | 32.4276 KOps/s | |
| test_compile_copy_nested[tensordict-eager] | 81.5550μs | 49.3067μs | 20.2812 KOps/s | 20.0443 KOps/s | |
| test_compile_copy_nested[pytree-compile] | 37.3320μs | 9.3439μs | 107.0218 KOps/s | 103.5485 KOps/s | |
| test_compile_copy_nested[pytree-eager] | 0.4606ms | 66.0763μs | 15.1340 KOps/s | 15.1210 KOps/s | |
| test_compile_add_one_flat[tensordict-compile] | 0.2133ms | 0.1742ms | 5.7416 KOps/s | 5.3773 KOps/s | |
| test_compile_add_one_flat[tensordict-eager] | 0.3252ms | 0.2505ms | 3.9926 KOps/s | 3.9419 KOps/s | |
| test_compile_add_one_flat[tensorclass-compile] | 0.1587ms | 0.1149ms | 8.7068 KOps/s | 8.3708 KOps/s | |
| test_compile_add_one_flat[tensorclass-eager] | 0.1156ms | 68.5670μs | 14.5843 KOps/s | 14.1236 KOps/s | |
| test_compile_add_one_flat[pytree-compile] | 0.2076ms | 0.1555ms | 6.4313 KOps/s | 6.0737 KOps/s | |
| test_compile_add_one_flat[pytree-eager] | 0.7890ms | 0.5194ms | 1.9253 KOps/s | 1.8689 KOps/s | |
| test_compile_add_self_flat[tensordict-eager] | 0.3490ms | 0.3027ms | 3.3039 KOps/s | 3.2571 KOps/s | |
| test_compile_add_self_flat[tensordict-compile] | 0.2422ms | 0.1768ms | 5.6545 KOps/s | 5.3892 KOps/s | |
| test_compile_add_self_flat[tensorclass-eager] | 0.1281ms | 84.0233μs | 11.9015 KOps/s | 11.8348 KOps/s | |
| test_compile_add_self_flat[tensorclass-compile] | 0.1635ms | 0.1175ms | 8.5087 KOps/s | 8.2985 KOps/s | |
| test_compile_add_self_flat[pytree-eager] | 0.6507ms | 0.4388ms | 2.2790 KOps/s | 2.3061 KOps/s | |
| test_compile_add_self_flat[pytree-compile] | 0.1966ms | 0.1554ms | 6.4364 KOps/s | 6.3256 KOps/s | |
| test_compile_copy_flat[tensordict-compile] | 50.2930μs | 22.2471μs | 44.9498 KOps/s | 40.9568 KOps/s | |
| test_compile_copy_flat[tensordict-eager] | 66.4340μs | 40.0775μs | 24.9516 KOps/s | 25.4097 KOps/s | |
| test_compile_copy_flat[pytree-compile] | 46.9820μs | 10.2531μs | 97.5314 KOps/s | 95.1442 KOps/s | |
| test_compile_copy_flat[pytree-eager] | 0.3909ms | 50.8320μs | 19.6727 KOps/s | 19.7468 KOps/s | |
| test_compile_assign_and_add[tensordict-compile] | 1.9244ms | 0.1688ms | 5.9225 KOps/s | 5.7534 KOps/s | |
| test_compile_assign_and_add[tensordict-eager] | 3.3052ms | 3.2381ms | 308.8233 Ops/s | 310.3973 Ops/s | |
| test_compile_assign_and_add[pytree-compile] | 1.8839ms | 0.1572ms | 6.3611 KOps/s | 6.1903 KOps/s | |
| test_compile_assign_and_add[pytree-eager] | 2.8819ms | 2.7512ms | 363.4764 Ops/s | 340.6210 Ops/s | |
| test_compile_indexing[tensor-tensordict-compile] | 0.1488ms | 0.1076ms | 9.2919 KOps/s | 8.9735 KOps/s | |
| test_compile_indexing[tensor-tensordict-eager] | 0.3109ms | 71.3070μs | 14.0239 KOps/s | 13.9946 KOps/s | |
| test_compile_indexing[tensor-tensorclass-compile] | 0.1387ms | 93.9590μs | 10.6429 KOps/s | 10.2392 KOps/s | |
| test_compile_indexing[tensor-tensorclass-eager] | 0.2496ms | 44.5448μs | 22.4493 KOps/s | 22.1174 KOps/s | |
| test_compile_indexing[tensor-pytree-compile] | 0.1345ms | 95.0368μs | 10.5222 KOps/s | 10.4696 KOps/s | |
| test_compile_indexing[tensor-pytree-eager] | 0.2736ms | 44.5678μs | 22.4377 KOps/s | 21.3667 KOps/s | |
| test_compile_indexing[slice-tensordict-compile] | 92.7250μs | 54.6991μs | 18.2818 KOps/s | 17.4196 KOps/s | |
| test_compile_indexing[slice-tensordict-eager] | 0.2262ms | 26.7034μs | 37.4484 KOps/s | 37.3752 KOps/s | |
| test_compile_indexing[slice-tensorclass-compile] | 84.0650μs | 44.1688μs | 22.6404 KOps/s | 21.8045 KOps/s | |
| test_compile_indexing[slice-tensorclass-eager] | 0.2560ms | 21.5948μs | 46.3074 KOps/s | 45.8312 KOps/s | |
| test_compile_indexing[slice-pytree-compile] | 81.8540μs | 44.2772μs | 22.5850 KOps/s | 22.1603 KOps/s | |
| test_compile_indexing[slice-pytree-eager] | 0.2552ms | 21.4938μs | 46.5251 KOps/s | 45.8079 KOps/s | |
| test_compile_indexing[int-tensordict-compile] | 94.7450μs | 55.5385μs | 18.0055 KOps/s | 17.2799 KOps/s | |
| test_compile_indexing[int-tensordict-eager] | 0.2554ms | 26.4233μs | 37.8454 KOps/s | 38.4350 KOps/s | |
| test_compile_indexing[int-tensorclass-compile] | 80.6450μs | 44.1220μs | 22.6644 KOps/s | 22.3507 KOps/s | |
| test_compile_indexing[int-tensorclass-eager] | 0.2635ms | 21.7073μs | 46.0675 KOps/s | 45.8761 KOps/s | |
| test_compile_indexing[int-pytree-compile] | 82.0650μs | 44.3379μs | 22.5541 KOps/s | 21.8061 KOps/s | |
| test_compile_indexing[int-pytree-eager] | 0.2760ms | 21.5017μs | 46.5078 KOps/s | 46.0939 KOps/s | |
| test_mod_add[eager] | 0.1069ms | 48.5403μs | 20.6014 KOps/s | 19.5326 KOps/s | |
| test_mod_add[compile] | 0.1496ms | 0.1019ms | 9.8113 KOps/s | 9.3356 KOps/s | |
| test_mod_add[compile-overhead] | 0.2280ms | 0.1450ms | 6.8981 KOps/s | 6.7452 KOps/s | |
| test_mod_wrap[eager] | 0.3931ms | 0.2984ms | 3.3513 KOps/s | 3.3444 KOps/s | |
| test_mod_wrap[compile] | 0.4744ms | 0.3400ms | 2.9411 KOps/s | 2.9223 KOps/s | |
| test_mod_wrap[compile-overhead] | 7.3953ms | 4.0360ms | 247.7685 Ops/s | 250.2542 Ops/s | |
| test_mod_wrap_and_backward[eager] | 2.3350ms | 1.5527ms | 644.0543 Ops/s | 677.0766 Ops/s | |
| test_mod_wrap_and_backward[compile] | 1.7531ms | 1.4193ms | 704.5646 Ops/s | 700.7280 Ops/s | |
| test_mod_wrap_and_backward[compile-overhead] | 1.2097ms | 0.8646ms | 1.1566 KOps/s | 1.1390 KOps/s | |
| test_seq_add[eager] | 0.2295ms | 0.1615ms | 6.1916 KOps/s | 6.3711 KOps/s | |
| test_seq_add[compile] | 0.1879ms | 0.1153ms | 8.6730 KOps/s | 8.2262 KOps/s | |
| test_seq_add[compile-overhead] | 0.1981ms | 0.1513ms | 6.6095 KOps/s | 6.4419 KOps/s | |
| test_seq_wrap[eager] | 0.5784ms | 0.5117ms | 1.9541 KOps/s | 1.9894 KOps/s | |
| test_seq_wrap[compile] | 0.3970ms | 0.3592ms | 2.7841 KOps/s | 2.7823 KOps/s | |
| test_seq_wrap[compile-overhead] | 0.3260ms | 0.2613ms | 3.8273 KOps/s | 3.8114 KOps/s | |
| test_func_call_runtime[False-eager] | 0.9353ms | 0.8290ms | 1.2063 KOps/s | 1.2170 KOps/s | |
| test_func_call_runtime[False-compile] | 0.9319ms | 0.8866ms | 1.1279 KOps/s | 1.0629 KOps/s | |
| test_func_call_runtime[False-compile-overhead] | 0.6591ms | 0.4487ms | 2.2287 KOps/s | 2.2420 KOps/s | |
| test_func_call_runtime[True-eager] | 1.3419ms | 1.0804ms | 925.6152 Ops/s | 949.9384 Ops/s | |
| test_func_call_runtime[True-compile] | 0.9923ms | 0.8982ms | 1.1134 KOps/s | 1.1121 KOps/s | |
| test_func_call_runtime[True-compile-overhead] | 0.5198ms | 0.4593ms | 2.1773 KOps/s | 2.1774 KOps/s | |
| test_func_call_cm_runtime[False-eager] | 0.8868ms | 0.8311ms | 1.2032 KOps/s | 1.1372 KOps/s | |
| test_func_call_cm_runtime[False-compile] | 0.9869ms | 0.8945ms | 1.1179 KOps/s | 1.1222 KOps/s | |
| test_func_call_cm_runtime[False-compile-overhead] | 0.5114ms | 0.4447ms | 2.2485 KOps/s | 2.2288 KOps/s | |
| test_func_call_cm_runtime[True-eager] | 1.3058ms | 1.2038ms | 830.7028 Ops/s | 831.7958 Ops/s | |
| test_func_call_cm_runtime[True-compile] | 1.1236ms | 0.9518ms | 1.0507 KOps/s | 1.0691 KOps/s | |
| test_func_call_cm_runtime[True-compile-overhead] | 0.6095ms | 0.4888ms | 2.0457 KOps/s | 2.0221 KOps/s | |
| test_vmap_func_call_cm_runtime[eager] | 3.1016ms | 2.3702ms | 421.9128 Ops/s | 429.0025 Ops/s | |
| test_vmap_func_call_cm_runtime[compile] | 1.0224ms | 0.9498ms | 1.0528 KOps/s | 1.0556 KOps/s | |
| test_vmap_func_call_cm_runtime[compile-overhead] | 0.5745ms | 0.4945ms | 2.0223 KOps/s | 2.0005 KOps/s | |
| test_distributed | 0.7112ms | 0.1515ms | 6.6019 KOps/s | 6.6239 KOps/s | |
| test_tdmodule | 0.3337ms | 27.5319μs | 36.3216 KOps/s | 35.8630 KOps/s | |
| test_tdmodule_dispatch | 75.1050μs | 44.5478μs | 22.4478 KOps/s | 22.4168 KOps/s | |
| test_tdseq | 47.9430μs | 27.0499μs | 36.9687 KOps/s | 37.6567 KOps/s | |
| test_tdseq_dispatch | 67.0040μs | 46.7413μs | 21.3944 KOps/s | 21.2810 KOps/s | |
| test_instantiation_functorch | 2.2118ms | 1.9791ms | 505.2786 Ops/s | 510.5523 Ops/s | |
| test_exec_functorch | 0.2199ms | 0.1745ms | 5.7306 KOps/s | 5.6423 KOps/s | |
| test_exec_functional_call | 0.2058ms | 0.1583ms | 6.3168 KOps/s | 6.2913 KOps/s | |
| test_exec_td_decorator | 0.4451ms | 0.2290ms | 4.3660 KOps/s | 4.3790 KOps/s | |
| test_vmap_mlp_speed_decorator[True-True] | 1.0039ms | 0.8155ms | 1.2262 KOps/s | 1.2410 KOps/s | |
| test_vmap_mlp_speed_decorator[True-False] | 0.9887ms | 0.8127ms | 1.2305 KOps/s | 1.2071 KOps/s | |
| test_vmap_mlp_speed_decorator[False-True] | 0.8898ms | 0.7211ms | 1.3867 KOps/s | 1.4166 KOps/s | |
| test_vmap_mlp_speed_decorator[False-False] | 0.9162ms | 0.7108ms | 1.4069 KOps/s | 1.4083 KOps/s | |
| test_vmap_transformer_speed_decorator[True-True] | 21.2442ms | 20.3329ms | 49.1813 Ops/s | 49.4106 Ops/s | |
| test_vmap_transformer_speed_decorator[True-False] | 21.0559ms | 20.3297ms | 49.1892 Ops/s | 49.2606 Ops/s | |
| test_vmap_transformer_speed_decorator[False-True] | 20.2393ms | 20.1439ms | 49.6429 Ops/s | 49.7491 Ops/s | |
| test_vmap_transformer_speed_decorator[False-False] | 20.4260ms | 20.1054ms | 49.7380 Ops/s | 49.7222 Ops/s | |
| test_to_module_speed[True] | 1.4969ms | 1.4198ms | 704.3187 Ops/s | 710.0716 Ops/s | |
| test_to_module_speed[False] | 1.4938ms | 1.3942ms | 717.2412 Ops/s | 729.6974 Ops/s | |
| test_tc_init | 96.2950μs | 44.0550μs | 22.6989 KOps/s | 22.6334 KOps/s | |
| test_tc_init_tensor_only | 35.5720μs | 9.4144μs | 106.2204 KOps/s | 107.4214 KOps/s | |
| test_tc_init_nested | 0.1327ms | 86.4906μs | 11.5620 KOps/s | 11.4163 KOps/s | |
| test_tc_init_many_fields | 61.5940μs | 15.6338μs | 63.9638 KOps/s | 63.8747 KOps/s | |
| test_tc_first_layer_tensor | 20.4910μs | 1.7314μs | 577.5658 KOps/s | 567.7724 KOps/s | |
| test_tc_first_layer_tensor_only | 5.1603μs | 0.7019μs | 1.4248 MOps/s | 1.3930 MOps/s | |
| test_tc_first_layer_tensor_set | 29.3020μs | 3.7248μs | 268.4693 KOps/s | 267.3582 KOps/s | |
| test_tc_first_layer_tensor_only_set | 25.0020μs | 2.9996μs | 333.3797 KOps/s | 331.7403 KOps/s | |
| test_tc_first_layer_nontensor | 31.1020μs | 5.8800μs | 170.0667 KOps/s | 172.7371 KOps/s | |
| test_tc_second_layer_tensor | 33.0020μs | 4.2019μs | 237.9882 KOps/s | 240.5104 KOps/s | |
| test_tc_second_layer_nontensor | 37.8820μs | 8.2966μs | 120.5318 KOps/s | 123.0094 KOps/s | |
| test_unbind | 0.2430s | 17.1602ms | 58.2745 Ops/s | 57.0724 Ops/s | |
| test_full_like | 4.8257ms | 4.3821ms | 228.1985 Ops/s | 73.5949 Ops/s | |
| test_zeros_like | 7.6051ms | 4.3880ms | 227.8950 Ops/s | 73.8555 Ops/s | |
| test_ones_like | 4.4952ms | 4.3870ms | 227.9481 Ops/s | 73.7064 Ops/s | |
| test_clone | 6.9797ms | 6.6413ms | 150.5732 Ops/s | 66.4622 Ops/s | |
| test_squeeze | 0.2048ms | 14.4443μs | 69.2315 KOps/s | 72.7994 KOps/s | |
| test_unsqueeze | 0.1717ms | 0.1134ms | 8.8195 KOps/s | 9.1978 KOps/s | |
| test_split | 0.2602ms | 0.1857ms | 5.3854 KOps/s | 5.4502 KOps/s | |
| test_permute | 0.2847ms | 0.2076ms | 4.8175 KOps/s | 5.0115 KOps/s | |
| test_stack | 52.0777ms | 51.5276ms | 19.4071 Ops/s | 19.3701 Ops/s | |
| test_cat | 52.0952ms | 51.7178ms | 19.3357 Ops/s | 19.3748 Ops/s |
Contributor
| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_plain_set_nested | 39.0600μs | 14.2166μs | 70.3403 KOps/s | 70.7641 KOps/s | |
| test_plain_set_stack_nested | 35.9000μs | 14.7375μs | 67.8542 KOps/s | 70.0937 KOps/s | |
| test_plain_set_nested_inplace | 45.9710μs | 16.0765μs | 62.2024 KOps/s | 63.2244 KOps/s | |
| test_plain_set_stack_nested_inplace | 47.4410μs | 15.8831μs | 62.9600 KOps/s | 63.9427 KOps/s | |
| test_items | 36.6410μs | 5.5635μs | 179.7430 KOps/s | 184.0275 KOps/s | |
| test_items_nested | 0.5310ms | 0.4454ms | 2.2453 KOps/s | 2.2974 KOps/s | |
| test_items_nested_locked | 0.5197ms | 0.4471ms | 2.2366 KOps/s | 2.2678 KOps/s | |
| test_items_nested_leaf | 0.1334ms | 91.9320μs | 10.8776 KOps/s | 10.8973 KOps/s | |
| test_items_stack_nested | 0.5125ms | 0.4462ms | 2.2410 KOps/s | 2.2885 KOps/s | |
| test_items_stack_nested_leaf | 0.1307ms | 92.7252μs | 10.7846 KOps/s | 10.8924 KOps/s | |
| test_items_stack_nested_locked | 0.5215ms | 0.4498ms | 2.2230 KOps/s | 2.2709 KOps/s | |
| test_keys | 29.9400μs | 4.1416μs | 241.4531 KOps/s | 241.3472 KOps/s | |
| test_keys_nested | 0.1653ms | 0.1200ms | 8.3348 KOps/s | 8.6154 KOps/s | |
| test_keys_nested_locked | 0.6222ms | 0.1282ms | 7.7976 KOps/s | 7.9738 KOps/s | |
| test_keys_nested_leaf | 0.1539ms | 0.1100ms | 9.0915 KOps/s | 9.3478 KOps/s | |
| test_keys_stack_nested | 0.1723ms | 0.1202ms | 8.3211 KOps/s | 8.5863 KOps/s | |
| test_keys_stack_nested_leaf | 0.1994ms | 0.1092ms | 9.1556 KOps/s | 9.3639 KOps/s | |
| test_keys_stack_nested_locked | 0.1759ms | 0.1271ms | 7.8689 KOps/s | 7.9958 KOps/s | |
| test_values | 6.6662μs | 1.0025μs | 997.4911 KOps/s | 1.0024 MOps/s | |
| test_values_nested | 96.1720μs | 47.0412μs | 21.2580 KOps/s | 21.6882 KOps/s | |
| test_values_nested_locked | 94.9320μs | 50.0234μs | 19.9906 KOps/s | 20.0559 KOps/s | |
| test_values_nested_leaf | 0.1309ms | 52.9994μs | 18.8682 KOps/s | 19.1661 KOps/s | |
| test_values_stack_nested | 81.1220μs | 46.7152μs | 21.4063 KOps/s | 21.5998 KOps/s | |
| test_values_stack_nested_leaf | 0.1107ms | 53.7329μs | 18.6106 KOps/s | 19.2423 KOps/s | |
| test_values_stack_nested_locked | 0.1039ms | 50.3239μs | 19.8713 KOps/s | 20.1424 KOps/s | |
| test_membership | 18.0910μs | 0.9306μs | 1.0746 MOps/s | 1.2240 MOps/s | |
| test_membership_nested | 38.2910μs | 2.7216μs | 367.4351 KOps/s | 362.1986 KOps/s | |
| test_membership_nested_leaf | 37.4910μs | 2.7455μs | 364.2276 KOps/s | 372.2336 KOps/s | |
| test_membership_stacked_nested | 31.6910μs | 2.7619μs | 362.0674 KOps/s | 360.5059 KOps/s | |
| test_membership_stacked_nested_leaf | 34.6310μs | 2.7645μs | 361.7265 KOps/s | 362.5720 KOps/s | |
| test_membership_nested_last | 40.1310μs | 4.1431μs | 241.3660 KOps/s | 241.8741 KOps/s | |
| test_membership_nested_leaf_last | 67.4310μs | 4.0790μs | 245.1577 KOps/s | 243.3582 KOps/s | |
| test_membership_stacked_nested_last | 33.9600μs | 4.1495μs | 240.9921 KOps/s | 243.5863 KOps/s | |
| test_membership_stacked_nested_leaf_last | 32.3600μs | 4.0713μs | 245.6206 KOps/s | 241.4491 KOps/s | |
| test_nested_getleaf | 56.6910μs | 20.5394μs | 48.6869 KOps/s | 47.9955 KOps/s | |
| test_nested_get | 53.7610μs | 18.9257μs | 52.8382 KOps/s | 50.7886 KOps/s | |
| test_stacked_getleaf | 56.2610μs | 20.4595μs | 48.8770 KOps/s | 48.6143 KOps/s | |
| test_stacked_get | 48.1510μs | 19.3513μs | 51.6760 KOps/s | 51.0617 KOps/s | |
| test_nested_getitemleaf | 54.9510μs | 20.9285μs | 47.7817 KOps/s | 47.5349 KOps/s | |
| test_nested_getitem | 67.4310μs | 19.4991μs | 51.2844 KOps/s | 49.6580 KOps/s | |
| test_stacked_getitemleaf | 55.8310μs | 21.0248μs | 47.5628 KOps/s | 47.5217 KOps/s | |
| test_stacked_getitem | 68.4710μs | 19.9166μs | 50.2093 KOps/s | 49.5984 KOps/s | |
| test_lock_nested | 8.2480ms | 0.4585ms | 2.1811 KOps/s | 2.1614 KOps/s | |
| test_lock_stack_nested | 0.5518ms | 0.4591ms | 2.1782 KOps/s | 2.1566 KOps/s | |
| test_unlock_nested | 0.7058ms | 0.3668ms | 2.7261 KOps/s | 2.7175 KOps/s | |
| test_unlock_stack_nested | 0.5048ms | 0.3710ms | 2.6952 KOps/s | 2.6760 KOps/s | |
| test_flatten_speed | 0.2695ms | 0.1153ms | 8.6715 KOps/s | 8.5136 KOps/s | |
| test_unflatten_speed | 0.6109ms | 0.5468ms | 1.8287 KOps/s | 1.8281 KOps/s | |
| test_common_ops | 0.7928ms | 0.6589ms | 1.5176 KOps/s | 1.4231 KOps/s | |
| test_creation | 71.6910μs | 2.7421μs | 364.6899 KOps/s | 364.7682 KOps/s | |
| test_creation_empty | 33.7810μs | 5.7643μs | 173.4829 KOps/s | 174.4172 KOps/s | |
| test_creation_nested_1 | 44.7510μs | 9.9507μs | 100.4953 KOps/s | 100.8636 KOps/s | |
| test_creation_nested_2 | 46.2110μs | 11.1050μs | 90.0499 KOps/s | 89.8189 KOps/s | |
| test_creation_many_keys[10] | 58.8010μs | 16.9923μs | 58.8503 KOps/s | 59.2977 KOps/s | |
| test_creation_many_keys[50] | 0.1056ms | 72.8090μs | 13.7346 KOps/s | 13.9732 KOps/s | |
| test_creation_many_keys[100] | 0.2010ms | 0.1369ms | 7.3026 KOps/s | 7.1235 KOps/s | |
| test_creation_nested_many_keys[10] | 78.5910μs | 36.8536μs | 27.1344 KOps/s | 27.4960 KOps/s | |
| test_creation_nested_many_keys[50] | 0.2035ms | 0.1490ms | 6.7135 KOps/s | 6.7784 KOps/s | |
| test_clone | 63.5410μs | 12.8129μs | 78.0462 KOps/s | 77.3519 KOps/s | |
| test_getitem[int] | 1.7111ms | 14.1395μs | 70.7239 KOps/s | 55.8188 KOps/s | |
| test_getitem[slice_int] | 0.1938ms | 23.8482μs | 41.9318 KOps/s | 42.5531 KOps/s | |
| test_getitem[range] | 0.2037ms | 59.5374μs | 16.7962 KOps/s | 16.4190 KOps/s | |
| test_getitem[tuple] | 0.2455ms | 22.7430μs | 43.9696 KOps/s | 42.7556 KOps/s | |
| test_getitem[list] | 0.2071ms | 55.5175μs | 18.0123 KOps/s | 16.6047 KOps/s | |
| test_setitem_dim[int] | 43.8110μs | 24.8536μs | 40.2356 KOps/s | 36.8024 KOps/s | |
| test_setitem_dim[slice_int] | 65.3610μs | 42.4331μs | 23.5665 KOps/s | 22.4107 KOps/s | |
| test_setitem_dim[range] | 0.1313ms | 90.0593μs | 11.1038 KOps/s | 10.9109 KOps/s | |
| test_setitem_dim[tuple] | 58.4010μs | 38.6158μs | 25.8961 KOps/s | 24.2947 KOps/s | |
| test_setitem | 59.4510μs | 16.5192μs | 60.5358 KOps/s | 52.2149 KOps/s | |
| test_set | 55.1610μs | 15.9037μs | 62.8783 KOps/s | 56.7109 KOps/s | |
| test_set_shared | 0.5527ms | 0.2026ms | 4.9366 KOps/s | 4.6995 KOps/s | |
| test_update | 0.3547ms | 20.7309μs | 48.2371 KOps/s | 43.7645 KOps/s | |
| test_update_nested | 78.9120μs | 31.9585μs | 31.2906 KOps/s | 28.8728 KOps/s | |
| test_update__nested | 0.5284ms | 33.3361μs | 29.9975 KOps/s | 29.8044 KOps/s | |
| test_set_nested | 85.3710μs | 17.7638μs | 56.2942 KOps/s | 49.5968 KOps/s | |
| test_set_nested_new | 68.0310μs | 23.2031μs | 43.0976 KOps/s | 39.2797 KOps/s | |
| test_select | 0.1088ms | 39.8834μs | 25.0731 KOps/s | 23.3423 KOps/s | |
| test_select_nested | 0.1089ms | 70.3482μs | 14.2150 KOps/s | 14.1073 KOps/s | |
| test_exclude_nested | 0.1217ms | 86.8919μs | 11.5085 KOps/s | 11.8183 KOps/s | |
| test_empty[True] | 0.7841ms | 0.3770ms | 2.6524 KOps/s | 2.7035 KOps/s | |
| test_empty[False] | 9.7603μs | 1.2693μs | 787.8492 KOps/s | 800.1885 KOps/s | |
| test_to | 0.1010ms | 69.1413μs | 14.4631 KOps/s | 13.8873 KOps/s | |
| test_to_nonblocking | 0.2182ms | 64.1616μs | 15.5856 KOps/s | 16.0661 KOps/s | |
| test_unbind_speed | 0.4373ms | 0.3110ms | 3.2150 KOps/s | 3.1957 KOps/s | |
| test_unbind_speed_stack0 | 0.3707ms | 0.3097ms | 3.2292 KOps/s | 3.2401 KOps/s | |
| test_unbind_speed_stack1 | 0.1025s | 0.8865ms | 1.1280 KOps/s | 1.1273 KOps/s | |
| test_split | 1.1797ms | 1.0986ms | 910.2843 Ops/s | 923.1877 Ops/s | |
| test_chunk | 0.1029s | 1.1723ms | 853.0100 Ops/s | 783.1420 Ops/s | |
| test_to_cpu_blocking | 28.2570ms | 28.0575ms | 35.6411 Ops/s | 35.4599 Ops/s | |
| test_to_cpu_global_sync | 11.0551ms | 10.9448ms | 91.3674 Ops/s | 91.7974 Ops/s | |
| test_to_cpu_event_sync | 0.1144s | 13.1801ms | 75.8717 Ops/s | 83.9700 Ops/s | |
| test_to_cpu_default | 12.2633ms | 11.9533ms | 83.6586 Ops/s | 75.6447 Ops/s | |
| test_consolidate[False-None] | 4.0979ms | 3.9319ms | 254.3296 Ops/s | 251.4071 Ops/s | |
| test_consolidate[default-None] | 2.3312ms | 1.9253ms | 519.3936 Ops/s | 503.1363 Ops/s | |
| test_consolidate[reduce-overhead-None] | 1.9841ms | 1.8556ms | 538.9126 Ops/s | 522.7708 Ops/s | |
| test_consolidate_njt[False-None] | 8.3678ms | 8.1812ms | 122.2310 Ops/s | 121.1437 Ops/s | |
| test_to[False-False-None] | 2.1855ms | 2.0023ms | 499.4241 Ops/s | 488.1182 Ops/s | |
| test_to[True-False-None] | 2.0597ms | 1.8794ms | 532.0744 Ops/s | 525.2560 Ops/s | |
| test_to[within-False-None] | 6.0681ms | 5.9280ms | 168.6901 Ops/s | 167.1774 Ops/s | |
| test_to[True-default-None] | 0.1760s | 8.5753ms | 116.6136 Ops/s | 131.0533 Ops/s | |
| test_to_njt[False-False-None] | 8.4461ms | 8.3291ms | 120.0606 Ops/s | 119.7455 Ops/s | |
| test_to_njt[True-False-None] | 6.9066ms | 6.7809ms | 147.4722 Ops/s | 147.3904 Ops/s | |
| test_to_njt[within-False-None] | 15.3354ms | 15.0876ms | 66.2797 Ops/s | 65.8415 Ops/s | |
| test_creation[device0] | 0.2864ms | 0.1140ms | 8.7720 KOps/s | 8.6597 KOps/s | |
| test_creation_from_tensor | 0.4007ms | 0.1121ms | 8.9208 KOps/s | 8.7670 KOps/s | |
| test_add_one[memmap_tensor0] | 0.2191ms | 6.3129μs | 158.4058 KOps/s | 153.8127 KOps/s | |
| test_contiguous[memmap_tensor0] | 13.8700μs | 0.5958μs | 1.6785 MOps/s | 2.2997 MOps/s | |
| test_stack[memmap_tensor0] | 87.6110μs | 4.3416μs | 230.3324 KOps/s | 216.1956 KOps/s | |
| test_memmaptd_index | 1.0963ms | 0.2517ms | 3.9724 KOps/s | 3.9188 KOps/s | |
| test_memmaptd_index_astensor | 0.7672ms | 0.3418ms | 2.9260 KOps/s | 2.9117 KOps/s | |
| test_memmaptd_index_op | 1.0082ms | 0.5743ms | 1.7412 KOps/s | 1.6836 KOps/s | |
| test_serialize_model | 0.1389s | 0.1362s | 7.3428 Ops/s | 7.3033 Ops/s | |
| test_serialize_model_pickle | 1.8803s | 1.3754s | 0.7270 Ops/s | 0.8265 Ops/s | |
| test_serialize_weights | 0.1374s | 0.1357s | 7.3675 Ops/s | 7.4459 Ops/s | |
| test_serialize_weights_returnearly | 0.3927s | 89.3600ms | 11.1907 Ops/s | 10.8933 Ops/s | |
| test_serialize_weights_pickle | 1.3675s | 1.1980s | 0.8347 Ops/s | 0.8228 Ops/s | |
| test_reshape_pytree | 0.2150ms | 32.1711μs | 31.0838 KOps/s | 31.6166 KOps/s | |
| test_reshape_td | 0.1620ms | 43.6420μs | 22.9137 KOps/s | 23.4757 KOps/s | |
| test_view_pytree | 0.2293ms | 31.2462μs | 32.0039 KOps/s | 31.8227 KOps/s | |
| test_view_td | 94.8410μs | 49.5472μs | 20.1828 KOps/s | 19.9872 KOps/s | |
| test_unbind_pytree | 0.2388ms | 35.5454μs | 28.1331 KOps/s | 28.6575 KOps/s | |
| test_unbind_td | 0.1083ms | 46.7531μs | 21.3890 KOps/s | 21.6590 KOps/s | |
| test_split_pytree | 0.2185ms | 40.3446μs | 24.7865 KOps/s | 25.0005 KOps/s | |
| test_split_td | 0.1835ms | 62.4153μs | 16.0217 KOps/s | 16.2064 KOps/s | |
| test_add_pytree | 0.1926ms | 40.5479μs | 24.6622 KOps/s | 24.7956 KOps/s | |
| test_add_td | 0.2041ms | 50.0554μs | 19.9779 KOps/s | 19.2763 KOps/s | |
| test_compile_add_one_nested[tensordict-compile] | 0.1878ms | 0.1353ms | 7.3916 KOps/s | 7.1890 KOps/s | |
| test_compile_add_one_nested[tensordict-eager] | 0.4883ms | 0.1888ms | 5.2969 KOps/s | 5.4047 KOps/s | |
| test_compile_add_one_nested[pytree-compile] | 0.1596ms | 0.1063ms | 9.4094 KOps/s | 9.2349 KOps/s | |
| test_compile_add_one_nested[pytree-eager] | 0.4332ms | 0.1748ms | 5.7221 KOps/s | 5.6905 KOps/s | |
| test_compile_copy_nested[tensordict-compile] | 89.1820μs | 30.8034μs | 32.4639 KOps/s | 33.2705 KOps/s | |
| test_compile_copy_nested[tensordict-eager] | 82.5820μs | 49.6635μs | 20.1355 KOps/s | 20.1165 KOps/s | |
| test_compile_copy_nested[pytree-compile] | 44.9710μs | 9.4944μs | 105.3251 KOps/s | 104.6554 KOps/s | |
| test_compile_copy_nested[pytree-eager] | 0.4612ms | 65.8688μs | 15.1817 KOps/s | 15.3888 KOps/s | |
| test_compile_add_one_flat[tensordict-compile] | 0.2424ms | 0.1727ms | 5.7897 KOps/s | 3.7195 KOps/s | |
| test_compile_add_one_flat[tensordict-eager] | 0.3456ms | 0.2505ms | 3.9917 KOps/s | 3.9066 KOps/s | |
| test_compile_add_one_flat[tensorclass-compile] | 0.1915ms | 0.1127ms | 8.8753 KOps/s | 8.4071 KOps/s | |
| test_compile_add_one_flat[tensorclass-eager] | 0.1637ms | 67.6665μs | 14.7784 KOps/s | 14.6184 KOps/s | |
| test_compile_add_one_flat[pytree-compile] | 0.1997ms | 0.1559ms | 6.4151 KOps/s | 6.1705 KOps/s | |
| test_compile_add_one_flat[pytree-eager] | 0.8061ms | 0.5160ms | 1.9379 KOps/s | 1.8732 KOps/s | |
| test_compile_add_self_flat[tensordict-eager] | 0.4669ms | 0.3036ms | 3.2937 KOps/s | 3.2313 KOps/s | |
| test_compile_add_self_flat[tensordict-compile] | 0.3231ms | 0.1777ms | 5.6288 KOps/s | 5.3611 KOps/s | |
| test_compile_add_self_flat[tensorclass-eager] | 0.1270ms | 83.0115μs | 12.0465 KOps/s | 11.9420 KOps/s | |
| test_compile_add_self_flat[tensorclass-compile] | 0.1614ms | 0.1159ms | 8.6271 KOps/s | 8.3257 KOps/s | |
| test_compile_add_self_flat[pytree-eager] | 0.6390ms | 0.4238ms | 2.3594 KOps/s | 2.2469 KOps/s | |
| test_compile_add_self_flat[pytree-compile] | 0.3078ms | 0.1575ms | 6.3493 KOps/s | 6.3362 KOps/s | |
| test_compile_copy_flat[tensordict-compile] | 0.1446ms | 24.1745μs | 41.3659 KOps/s | 40.8725 KOps/s | |
| test_compile_copy_flat[tensordict-eager] | 83.0620μs | 40.4502μs | 24.7218 KOps/s | 25.1112 KOps/s | |
| test_compile_copy_flat[pytree-compile] | 44.6810μs | 10.4272μs | 95.9030 KOps/s | 95.6917 KOps/s | |
| test_compile_copy_flat[pytree-eager] | 0.4132ms | 51.3126μs | 19.4884 KOps/s | 19.4970 KOps/s | |
| test_compile_assign_and_add[tensordict-compile] | 1.9465ms | 0.1686ms | 5.9325 KOps/s | 5.7717 KOps/s | |
| test_compile_assign_and_add[tensordict-eager] | 3.3166ms | 3.2268ms | 309.9053 Ops/s | 307.7682 Ops/s | |
| test_compile_assign_and_add[pytree-compile] | 1.9317ms | 0.1569ms | 6.3748 KOps/s | 6.2523 KOps/s | |
| test_compile_assign_and_add[pytree-eager] | 2.8836ms | 2.7538ms | 363.1330 Ops/s | 343.1867 Ops/s | |
| test_compile_indexing[tensor-tensordict-compile] | 0.1427ms | 0.1049ms | 9.5359 KOps/s | 9.2796 KOps/s | |
| test_compile_indexing[tensor-tensordict-eager] | 0.3068ms | 73.8455μs | 13.5418 KOps/s | 13.2200 KOps/s | |
| test_compile_indexing[tensor-tensorclass-compile] | 0.1435ms | 93.1412μs | 10.7364 KOps/s | 10.5678 KOps/s | |
| test_compile_indexing[tensor-tensorclass-eager] | 0.2649ms | 43.7299μs | 22.8676 KOps/s | 22.6228 KOps/s | |
| test_compile_indexing[tensor-pytree-compile] | 0.1294ms | 93.3928μs | 10.7075 KOps/s | 10.4530 KOps/s | |
| test_compile_indexing[tensor-pytree-eager] | 0.2488ms | 45.8396μs | 21.8152 KOps/s | 22.5805 KOps/s | |
| test_compile_indexing[slice-tensordict-compile] | 0.1551ms | 56.6657μs | 17.6474 KOps/s | 17.8835 KOps/s | |
| test_compile_indexing[slice-tensordict-eager] | 0.2110ms | 26.9079μs | 37.1638 KOps/s | 36.4170 KOps/s | |
| test_compile_indexing[slice-tensorclass-compile] | 0.1760ms | 44.7038μs | 22.3695 KOps/s | 22.1174 KOps/s | |
| test_compile_indexing[slice-tensorclass-eager] | 0.2583ms | 21.8515μs | 45.7633 KOps/s | 46.0050 KOps/s | |
| test_compile_indexing[slice-pytree-compile] | 85.2420μs | 44.3579μs | 22.5439 KOps/s | 22.2674 KOps/s | |
| test_compile_indexing[slice-pytree-eager] | 0.2603ms | 21.5609μs | 46.3803 KOps/s | 45.9606 KOps/s | |
| test_compile_indexing[int-tensordict-compile] | 0.1701ms | 55.9667μs | 17.8678 KOps/s | 18.0007 KOps/s | |
| test_compile_indexing[int-tensordict-eager] | 0.2503ms | 26.5312μs | 37.6915 KOps/s | 37.3566 KOps/s | |
| test_compile_indexing[int-tensorclass-compile] | 84.7820μs | 44.0286μs | 22.7125 KOps/s | 21.7658 KOps/s | |
| test_compile_indexing[int-tensorclass-eager] | 0.2586ms | 21.7088μs | 46.0643 KOps/s | 45.8554 KOps/s | |
| test_compile_indexing[int-pytree-compile] | 0.1173ms | 44.7329μs | 22.3549 KOps/s | 22.0467 KOps/s | |
| test_compile_indexing[int-pytree-eager] | 0.2629ms | 21.5707μs | 46.3591 KOps/s | 45.6702 KOps/s | |
| test_mod_add[eager] | 0.1966ms | 49.2430μs | 20.3074 KOps/s | 20.3599 KOps/s | |
| test_mod_add[compile] | 0.2219ms | 0.1009ms | 9.9132 KOps/s | 9.5363 KOps/s | |
| test_mod_add[compile-overhead] | 0.2301ms | 0.1444ms | 6.9249 KOps/s | 6.8508 KOps/s | |
| test_mod_wrap[eager] | 0.4396ms | 0.2939ms | 3.4022 KOps/s | 3.4457 KOps/s | |
| test_mod_wrap[compile] | 0.4224ms | 0.3376ms | 2.9623 KOps/s | 2.8449 KOps/s | |
| test_mod_wrap[compile-overhead] | 6.7072ms | 3.6869ms | 271.2316 Ops/s | 253.0873 Ops/s | |
| test_mod_wrap_and_backward[eager] | 1.6892ms | 1.4875ms | 672.2851 Ops/s | 670.6033 Ops/s | |
| test_mod_wrap_and_backward[compile] | 1.6164ms | 1.4187ms | 704.8871 Ops/s | 695.3365 Ops/s | |
| test_mod_wrap_and_backward[compile-overhead] | 1.6950ms | 0.8720ms | 1.1468 KOps/s | 1.1240 KOps/s | |
| test_seq_add[eager] | 0.2136ms | 0.1511ms | 6.6160 KOps/s | 6.5076 KOps/s | |
| test_seq_add[compile] | 0.2630ms | 0.1161ms | 8.6123 KOps/s | 8.6970 KOps/s | |
| test_seq_add[compile-overhead] | 0.2830ms | 0.1511ms | 6.6191 KOps/s | 6.4851 KOps/s | |
| test_seq_wrap[eager] | 0.8063ms | 0.5196ms | 1.9246 KOps/s | 1.8787 KOps/s | |
| test_seq_wrap[compile] | 0.4420ms | 0.3708ms | 2.6968 KOps/s | 2.6714 KOps/s | |
| test_seq_wrap[compile-overhead] | 0.3759ms | 0.2582ms | 3.8732 KOps/s | 3.8191 KOps/s | |
| test_func_call_runtime[False-eager] | 0.8819ms | 0.8181ms | 1.2223 KOps/s | 1.1991 KOps/s | |
| test_func_call_runtime[False-compile] | 1.0857ms | 0.8817ms | 1.1342 KOps/s | 1.1100 KOps/s | |
| test_func_call_runtime[False-compile-overhead] | 0.4972ms | 0.4408ms | 2.2684 KOps/s | 2.2454 KOps/s | |
| test_func_call_runtime[True-eager] | 1.2692ms | 1.0726ms | 932.3468 Ops/s | 931.7976 Ops/s | |
| test_func_call_runtime[True-compile] | 0.9639ms | 0.8934ms | 1.1193 KOps/s | 1.0999 KOps/s | |
| test_func_call_runtime[True-compile-overhead] | 0.5088ms | 0.4561ms | 2.1924 KOps/s | 2.1704 KOps/s | |
| test_func_call_cm_runtime[False-eager] | 0.8837ms | 0.8188ms | 1.2213 KOps/s | 1.1526 KOps/s | |
| test_func_call_cm_runtime[False-compile] | 1.1224ms | 0.8858ms | 1.1290 KOps/s | 1.0904 KOps/s | |
| test_func_call_cm_runtime[False-compile-overhead] | 0.5481ms | 0.4449ms | 2.2476 KOps/s | 2.2259 KOps/s | |
| test_func_call_cm_runtime[True-eager] | 1.3124ms | 1.1950ms | 836.8100 Ops/s | 821.3976 Ops/s | |
| test_func_call_cm_runtime[True-compile] | 1.0991ms | 0.9254ms | 1.0806 KOps/s | 1.0306 KOps/s | |
| test_func_call_cm_runtime[True-compile-overhead] | 0.6345ms | 0.4873ms | 2.0523 KOps/s | 2.0317 KOps/s | |
| test_vmap_func_call_cm_runtime[eager] | 2.8246ms | 2.3020ms | 434.3993 Ops/s | 424.3272 Ops/s | |
| test_vmap_func_call_cm_runtime[compile] | 1.0089ms | 0.9418ms | 1.0618 KOps/s | 1.0412 KOps/s | |
| test_vmap_func_call_cm_runtime[compile-overhead] | 0.5705ms | 0.4934ms | 2.0268 KOps/s | 1.9985 KOps/s | |
| test_distributed | 0.5652ms | 0.1516ms | 6.5970 KOps/s | 6.3745 KOps/s | |
| test_tdmodule | 53.7210μs | 27.4960μs | 36.3689 KOps/s | 35.6232 KOps/s | |
| test_tdmodule_dispatch | 75.2520μs | 44.2864μs | 22.5803 KOps/s | 22.3335 KOps/s | |
| test_tdseq | 0.1441ms | 27.5425μs | 36.3075 KOps/s | 38.0364 KOps/s | |
| test_tdseq_dispatch | 67.3410μs | 47.4045μs | 21.0950 KOps/s | 21.4530 KOps/s | |
| test_instantiation_functorch | 2.0650ms | 1.9476ms | 513.4548 Ops/s | 499.1947 Ops/s | |
| test_exec_functorch | 0.2269ms | 0.1727ms | 5.7891 KOps/s | 5.6000 KOps/s | |
| test_exec_functional_call | 0.2075ms | 0.1546ms | 6.4673 KOps/s | 6.2661 KOps/s | |
| test_exec_td_decorator | 0.4331ms | 0.2267ms | 4.4109 KOps/s | 4.2955 KOps/s | |
| test_vmap_mlp_speed_decorator[True-True] | 0.9998ms | 0.8042ms | 1.2434 KOps/s | 1.2246 KOps/s | |
| test_vmap_mlp_speed_decorator[True-False] | 0.9856ms | 0.8080ms | 1.2377 KOps/s | 1.2146 KOps/s | |
| test_vmap_mlp_speed_decorator[False-True] | 0.8688ms | 0.6972ms | 1.4344 KOps/s | 1.3989 KOps/s | |
| test_vmap_mlp_speed_decorator[False-False] | 0.8746ms | 0.6981ms | 1.4325 KOps/s | 1.3924 KOps/s | |
| test_vmap_transformer_speed_decorator[True-True] | 20.3068ms | 20.1310ms | 49.6745 Ops/s | 48.4874 Ops/s | |
| test_vmap_transformer_speed_decorator[True-False] | 20.8677ms | 20.1686ms | 49.5819 Ops/s | 48.7305 Ops/s | |
| test_vmap_transformer_speed_decorator[False-True] | 20.3886ms | 19.9515ms | 50.1216 Ops/s | 49.1416 Ops/s | |
| test_vmap_transformer_speed_decorator[False-False] | 20.6232ms | 19.9686ms | 50.0787 Ops/s | 49.2046 Ops/s | |
| test_to_module_speed[True] | 2.0026ms | 1.4114ms | 708.5092 Ops/s | 710.1866 Ops/s | |
| test_to_module_speed[False] | 1.9015ms | 1.3784ms | 725.4721 Ops/s | 708.2282 Ops/s | |
| test_tc_init | 87.6910μs | 43.8329μs | 22.8139 KOps/s | 22.7025 KOps/s | |
| test_tc_init_tensor_only | 31.4300μs | 9.3070μs | 107.4456 KOps/s | 107.0924 KOps/s | |
| test_tc_init_nested | 0.1280ms | 87.5069μs | 11.4277 KOps/s | 11.3360 KOps/s | |
| test_tc_init_many_fields | 61.3410μs | 15.4993μs | 64.5191 KOps/s | 62.6369 KOps/s | |
| test_tc_first_layer_tensor | 22.5200μs | 1.7320μs | 577.3576 KOps/s | 574.8230 KOps/s | |
| test_tc_first_layer_tensor_only | 4.4101μs | 0.6968μs | 1.4352 MOps/s | 1.3920 MOps/s | |
| test_tc_first_layer_tensor_set | 23.7510μs | 3.7153μs | 269.1589 KOps/s | 268.3424 KOps/s | |
| test_tc_first_layer_tensor_only_set | 28.5300μs | 2.9776μs | 335.8398 KOps/s | 334.2563 KOps/s | |
| test_tc_first_layer_nontensor | 46.0010μs | 5.7485μs | 173.9590 KOps/s | 171.9706 KOps/s | |
| test_tc_second_layer_tensor | 20.7500μs | 4.1198μs | 242.7316 KOps/s | 240.6608 KOps/s | |
| test_tc_second_layer_nontensor | 29.3300μs | 8.2446μs | 121.2921 KOps/s | 120.5393 KOps/s | |
| test_unbind | 0.2681s | 16.0338ms | 62.3682 Ops/s | 57.1747 Ops/s | |
| test_full_like | 4.4953ms | 4.3613ms | 229.2898 Ops/s | 234.9400 Ops/s | |
| test_zeros_like | 6.2779ms | 4.3680ms | 228.9398 Ops/s | 237.0518 Ops/s | |
| test_ones_like | 4.8173ms | 4.3537ms | 229.6883 Ops/s | 229.6175 Ops/s | |
| test_clone | 11.5841ms | 9.1876ms | 108.8421 Ops/s | 157.2682 Ops/s | |
| test_squeeze | 0.1591ms | 13.4189μs | 74.5217 KOps/s | 72.2869 KOps/s | |
| test_unsqueeze | 0.2223ms | 0.1062ms | 9.4155 KOps/s | 9.1380 KOps/s | |
| test_split | 0.3502ms | 0.1781ms | 5.6144 KOps/s | 5.5747 KOps/s | |
| test_permute | 0.2479ms | 0.1988ms | 5.0290 KOps/s | 4.9965 KOps/s | |
| test_stack | 51.6337ms | 51.2214ms | 19.5231 Ops/s | 19.6919 Ops/s | |
| test_cat | 51.4338ms | 51.1567ms | 19.5478 Ops/s | 19.5975 Ops/s |
…ixes - RedisLazyStackedTensorDict[int] now returns a _RedisStackElementView that propagates reads/writes directly through to Redis instead of returning a detached TensorDict copy - Fix redundant await-in-loop in _abatch_get_element, _abatch_get_element_keys and _aset_element (reuse already-fetched metadata instead of re-fetching) - Add 5 new tests for view write-through behavior Co-authored-by: Cursor <cursoragent@cursor.com>
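The difference between a write-through view and a detached copy can be illustrated with a minimal sketch. This is not the PR's `_RedisStackElementView`; `StackStore` and `ElementView` are hypothetical names, and a plain in-memory dict stands in for Redis.

```python
class StackStore:
    """Hypothetical stand-in for the Redis-backed stack: one list per leaf key."""

    def __init__(self, data):
        # Copy the input so the store owns its own storage.
        self._data = {key: list(values) for key, values in data.items()}

    def read(self, key, index):
        return self._data[key][index]

    def write(self, key, index, value):
        self._data[key][index] = value

    def __getitem__(self, index):
        # Integer indexing returns a view bound to the store, not a copy.
        return ElementView(self, index)


class ElementView:
    """View of one stack element; every access goes through to the store."""

    def __init__(self, store, index):
        self._store = store
        self._index = index

    def __getitem__(self, key):
        return self._store.read(key, self._index)

    def __setitem__(self, key, value):
        self._store.write(key, self._index, value)


store = StackStore({"obs": [1.0, 2.0, 3.0]})
view = store[1]
view["obs"] = 20.0  # the write lands in the backing store
```

With a detached copy, the assignment through `view` would have changed only a local object and left `store` untouched.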
Co-authored-by: Cursor <cursoragent@cursor.com>
Closing in favour of ghstack submission
Summary
- RedisLazyStackedTensorDict, a TensorDictBase subclass that stores LazyStackedTensorDict data in Redis as concatenated blobs, using only O(K) Redis keys for K leaf keys regardless of the number of stack elements N (e.g. 95 keys for 30 leaves and 1M elements, vs 30M with per-element storage).
- to_redis() convenience method on TensorDictBase, following the same pattern as to_h5().

Test plan
TestRedisLazyStackedTensorDict covering:
- LazyStackedTensorDict (homogeneous and heterogeneous)
- td[int], td[key], td[int][key], td[slice], td[::step], td[tensor_idx]
- td[int] = subtd, td[slice] = val, td[tensor_idx] = val
- to_tensordict(), to_local(), td[idx].to_tensordict()
- from_redis() reconnect
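The concatenated-blob idea from the summary can be sketched as follows. This is only an illustration under stated assumptions: the actual byte layout, key names, and metadata format in the PR are not shown here, the helper names are hypothetical, and a Python `bytes` object stands in for a Redis string value. Note that Redis `GETRANGE` takes an inclusive end offset, whereas the ranges below are half-open.

```python
import struct


def homogeneous_range(index, elem_nbytes):
    """Half-open byte range of element `index` when every element has the same size."""
    start = index * elem_nbytes
    return start, start + elem_nbytes


def pack_offsets(sizes):
    """Pack N+1 cumulative byte offsets as little-endian int64 (assumed table format)."""
    offsets = [0]
    for size in sizes:
        offsets.append(offsets[-1] + size)
    return struct.pack(f"<{len(offsets)}q", *offsets)


def heterogeneous_range(table, index):
    """Read two adjacent int64 offsets to get the half-open range of element `index`."""
    start, end = struct.unpack_from("<2q", table, index * 8)
    return start, end


# Homogeneous case: three float32 elements of shape (2,), 8 bytes each,
# stored back to back under a single key.
blob = struct.pack("<6f", 0.0, 1.0, 2.0, 3.0, 4.0, 5.0)
start, end = homogeneous_range(1, 8)
element = struct.unpack("<2f", blob[start:end])

# Heterogeneous case: elements of 4, 12 and 8 bytes need an offset table.
table = pack_offsets([4, 12, 8])
het_start, het_end = heterogeneous_range(table, 1)
```

In both modes the number of stored values depends on the number of leaf keys, not on N: the homogeneous case needs only the element size, and the heterogeneous case adds one offset table per key.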