Add input data parameter to the encode_dataset.py #92

Open
ltsabadz wants to merge 2 commits into INRIA:dev from ltsabadz:feature/add_input_encode_dataset
@ltsabadz
Contributor

Hi! I had to follow the docs a couple of times and noticed that the input data path is hardcoded as data/era5_240/full/. It would be helpful to have it as an argument.
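
For illustration, a minimal sketch of what such an argument could look like, assuming the script parses its options with argparse (the code quoted later in this thread reads args.output_path, which suggests so). The flag name --input_path, the help texts, and the choice to keep the old hardcoded path as the default are assumptions, not the actual diff of this PR.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a configurable input path (illustrative sketch)."""
    parser = argparse.ArgumentParser(description="Encode a dataset (sketch).")
    # Hypothetical flag name. The default keeps the previously hardcoded
    # location, so existing invocations would behave as before.
    parser.add_argument(
        "--input_path",
        type=Path,
        default=Path("data/era5_240/full/"),
        help="Directory containing the input data.",
    )
    # args.output_path appears in the script; marking it required is an
    # assumption made only for this sketch.
    parser.add_argument(
        "--output_path",
        type=Path,
        required=True,
        help="Directory where encoded files are written.",
    )
    return parser


# Example invocation with an explicit argument list; both paths are made up.
args = build_parser().parse_args(
    ["--input_path", "data/my_subset/", "--output_path", "out/"]
)
print(f"Reading from {args.input_path}, writing to {args.output_path}")
```

Omitting --input_path falls back to data/era5_240/full/, so the documented workflow would not need to change.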

@ltsabadz ltsabadz requested a review from robert-DL as a code owner September 24, 2025 15:18
@robert-DL
Collaborator

Hi, thanks for the commit. In general, encode_dataset.py is far from perfect: it also does not allow parallel generation or the encoding of longer trajectories.

@ltsabadz
Contributor Author

I also noticed that when you run the script on a subset of the data (e.g. 2007–2018), it still starts by saving files named for 1979 rather than for the first year in the subset.

Another issue: if an older file like era5_240_pred_1979_0h.nc is already present in the output directory, and you try to run the script only for later years, the entire dataset gets skipped because of the following block:

current_year = 1979
xr_list = []
for i, batch in tqdm(enumerate(dl)):
    fname = Path(args.output_path).joinpath(f"era5_240_pred_{current_year}_0h.nc")
    if fname.exists():
        continue

But that's another topic.
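
To make the second issue concrete, here is a rough sketch of a per-year existence check, where a stale 1979 file no longer causes later years to be skipped. It assumes the years to encode are available as plain integers; how the real script derives the year from a batch is not shown in this thread, so the helper name years_to_encode and its signature are hypothetical. Only the file name pattern is taken from the block above.

```python
from pathlib import Path
from typing import Iterable, List


def years_to_encode(years: Iterable[int], output_path: Path) -> List[int]:
    """Return the years whose output file does not exist yet.

    Hypothetical helper: the file name is built from the year being
    processed instead of a fixed starting year such as 1979.
    """
    pending = []
    for year in years:
        fname = Path(output_path).joinpath(f"era5_240_pred_{year}_0h.nc")
        if fname.exists():
            # Skip only this year; other years are still considered.
            continue
        pending.append(year)
    return pending
```

With this check, running on 2007 through 2018 while era5_240_pred_1979_0h.nc sits in the output directory would still return all twelve years, since the 1979 file is never consulted.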

@robert-DL
Collaborator

The issue you mentioned here is actually already covered by another issue.
Could you rebase onto dev?

@ltsabadz ltsabadz force-pushed the feature/add_input_encode_dataset branch from 01d4abd to 0af9556 Compare October 1, 2025 12:18
@ltsabadz ltsabadz requested a review from gcouairon as a code owner October 1, 2025 12:18
@ltsabadz ltsabadz changed the base branch from main to dev October 1, 2025 12:23
@ltsabadz
Contributor Author

ltsabadz commented Oct 1, 2025

The issue you mentioned here is actually already covered by another issue. Could you rebase onto dev?

Rebased and changed the base branch to dev.
