Commit 8c46f0c

[Feature] Tabular data import/export (pandas, CSV, Parquet, JSON) (#1679)
1 parent c3d11c2 commit 8c46f0c

9 files changed

Lines changed: 1361 additions & 1 deletion

.github/unittest/linux/scripts/environment.yml

Lines changed: 2 additions & 0 deletions
@@ -22,4 +22,6 @@ dependencies:
  - ninja
  - numpy<2.0.0
  - mosaicml-streaming
+ - pandas
+ - pyarrow
  - redis

.github/unittest/linux/scripts/install.sh

Lines changed: 1 addition & 1 deletion
@@ -53,7 +53,7 @@ printf "* Installing tensordict\n"
  # then install tensordict without resolving dependencies to avoid any solver changing
  # the PyTorch build (stable vs nightly).
  python -m pip install -U packaging pyvers importlib_metadata
- python -m pip install redis
+ python -m pip install redis pandas pyarrow
  python -m pip install -e . --no-deps

  # smoke test

pyproject.toml

Lines changed: 3 additions & 0 deletions
@@ -55,6 +55,9 @@ tests = [
  "pytest-benchmark"
  ]
  h5 = ["h5py>=3.8"]
+ pandas = ["pandas>=1.5"]
+ parquet = ["pyarrow>=10.0"]
+ tabular = ["pandas>=1.5", "pyarrow>=10.0"]
  dev = ["pybind11>=2.13", "ninja"]
  typecheck = ["mypy>=1.0.0"]
  onnx = ["onnx", "onnxscript", "onnxruntime"]
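
The three new extras make pandas and pyarrow optional rather than required. As a rough illustration of what that means at runtime, the sketch below reports which extras are satisfied in the current environment. The `EXTRAS` mapping is copied from the hunk above; `available_extras` and its `is_installed` parameter are hypothetical helpers, not part of tensordict, and the check only tests importability, not the `>=1.5` / `>=10.0` version bounds.

```python
from importlib.util import find_spec

# Extras and the packages they pull in, copied from the pyproject.toml hunk.
EXTRAS = {
    "pandas": ["pandas"],
    "parquet": ["pyarrow"],
    "tabular": ["pandas", "pyarrow"],
}


def available_extras(extras=EXTRAS, is_installed=None):
    """Return the sorted names of extras whose packages are all importable.

    `is_installed` is a hypothetical hook (package name -> bool) so the
    logic can be exercised without touching the real environment.
    """
    if is_installed is None:
        # find_spec returns None when a top-level package cannot be found.
        def is_installed(pkg):
            return find_spec(pkg) is not None

    return sorted(
        name
        for name, pkgs in extras.items()
        if all(is_installed(pkg) for pkg in pkgs)
    )


if __name__ == "__main__":
    # The result depends on what is installed locally.
    print(available_extras())
```

Installing with `pip install "tensordict[tabular]"` would be expected to satisfy all three entries, since `tabular` lists both packages.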

tensordict/__init__.py

Lines changed: 8 additions & 0 deletions
@@ -28,9 +28,13 @@
  _default_is_leaf as default_is_leaf,
  _is_leaf_nontensor as is_leaf_nontensor,
  from_any,
+ from_csv,
  from_dict,
  from_h5,
+ from_json,
  from_namedtuple,
+ from_pandas,
+ from_parquet,
  from_struct_array,
  from_tuple,
  get_defaults_to_none,
@@ -113,10 +117,14 @@
  "LazyStackedTensorDictStore",
  "NestedKey",
  # Factory functions
+ "from_csv",
  "from_dict",
  "from_any",
  "from_h5",
+ "from_json",
  "from_namedtuple",
+ "from_pandas",
+ "from_parquet",
  "from_struct_array",
  "from_tuple",
  "from_dataclass",
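
Both hunks have to stay in sync: each new factory is imported at the top of the package and also listed in `__all__`. One possible way to check that for an installed build is sketched below. `missing_exports` is a hypothetical helper, not something this commit adds, and the sketch deliberately avoids calling the new functions, since their signatures are not shown in this part of the diff.

```python
import importlib

# Names added to tensordict/__init__.py by this commit (taken from the hunks).
NEW_FACTORIES = ["from_csv", "from_json", "from_pandas", "from_parquet"]


def missing_exports(module_name, names):
    """Return the names that the module lacks as attributes or omits from __all__."""
    module = importlib.import_module(module_name)
    exported = getattr(module, "__all__", None)
    missing = []
    for name in names:
        not_defined = not hasattr(module, name)
        # Only check __all__ when the module actually defines one.
        not_listed = exported is not None and name not in exported
        if not_defined or not_listed:
            missing.append(name)
    return missing
```

With a tensordict build that includes this commit, `missing_exports("tensordict", NEW_FACTORIES)` would be expected to return an empty list; on an older build it should list the four new names.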
