Skip to content

mywhoosh: add file hash to .uploaded_files.json#77

Open
slingler wants to merge 2 commits into
jat255:mainfrom
slingler:mywhoosh-track-uploaded-files-with-file-hash
Open

mywhoosh: add file hash to .uploaded_files.json#77
slingler wants to merge 2 commits into
jat255:mainfrom
slingler:mywhoosh-track-uploaded-files-with-file-hash

Conversation

@slingler

Copy link
Copy Markdown

Problem
When uploading activities with fit-file-faker -ua only the first MyWhoosh activity is uploaded. Subsequent activity uploads are removed from the list of FIT files here https://github.com/jat255/Fit-File-Faker/blob/main/fit_file_faker/app.py#L387-L388 and therefore skipped.

This is because MyWhoosh does not store FIT files with a unique name but one per app version which overwrites previous FIT files:

 % ls -1 ~/Library/Containers/com.whoosh.whooshgame/Data/Library/Application\ Support/Epic/MyWhoosh/Content/Data/*.fit
MyNewActivity-5.6.1.fit
MyNewActivity-5.7.0.fit
MyNewActivity-5.7.1.fit

Fix
Instead of pushing the FIT filename to FILES_UPLOADED_NAME only, this PR adds objects per uploaded FIT file including a hash of the FIT file content.

[
  {
    "name": "MyNewActivity-5.6.1.fit",
    "hash": "7606e6afe7504d5c5e5737e5c5f47088976239dca9f757e8bb0f0a15c4ce4975"
  },
  {
    "name": "MyNewActivity-5.6.1.fit",
    "hash": "e2b3d2e0a1cc0ad2c855eeb58799ebbbbc4dc4c1df6b00f276e9a60a503013d8"
  },
  {
    "name": "MyNewActivity-5.7.0.fit",
    "hash": "c71376c2096b18deac1a4c462a6ee6900198a364a1161700707087908b5097a0"
  },
  {
    "name": "MyNewActivity-5.7.1.fit",
    "hash": "749e98409e26eacf76cc9d0964d873a6be149a486c4e98f385473b2715b7e76b"
  },
  {
    "name": "MyNewActivity-5.7.1.fit",
    "hash": "76bc00f4eca9026502f3c974bf4b3ab10efc57ef1092f10ff75b609be46035dc"
  }
] 

@jat255 jat255 left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution! In general, I like this idea, as it adds support for "upload all" for MyWhoosh, which wasn't added when I added support for MyWhoosh (since I always use the app in "monitor" mode).

There are a few problems that need to be addressed, though. There are more details in the in-line comments, but generally:

Backward compatibility (needs fixing before merge)

Existing .uploaded_files.json files contain plain strings. After the upgrade, the deduplication comparison:

files = [i for i in files if i not in uploaded_files]

...compares dicts against strings, which never match. Every previously-uploaded file will be re-uploaded on the first run after upgrading. A migration step is needed after loading uploaded_files in both upload_all and on_modified:

# Migrate legacy string entries (pre-hash format)
uploaded_files = [
    e if isinstance(e, dict) else {"name": e, "hash": None}
    for e in uploaded_files
]

And a name-only match for migrated entries where hash is None.

Minor: unclosed file handles

Both the on_modified handler and upload_all open files without a context manager:

open(dir.joinpath(i), "rb").read()

Since this project uses pathlib throughout, the idiomatic fix is:

with source_file.open("rb") as fh:          # in on_modified
    file_hash = sha256(fh.read()).hexdigest()

with dir.joinpath(i).open("rb") as fh:      # in upload_all
    file_hash = sha256(fh.read()).hexdigest()

Minor: misplaced log in upload_all

The debug log Adding "{f["name"]}" with hash... sits inside if output: but uploaded_files.append(f) runs unconditionally outside that block. The log should move to just before the append so it fires regardless of which branch executed:

        _logger.debug(
            f'Adding "{f["name"]}" with hash "{f["hash"]}" to "uploaded_files"'
        )
        uploaded_files.append(f)

No tests

Given the project requires 100% test coverage, please add tests for:

  • A file with the same name but different hash gets uploaded (the core bug fix)
  • Backward compatibility: a legacy string-format .uploaded_files.json does not cause re-uploads after upgrade

Also, I have some concerns that these changes were not tested at all, as running the ./run_tests.py script revealed the following failures:

FAILED tests/test_app.py::TestUploadAllFunction::test_upload_all_preinitialize - AssertionError: assert 'test1.fit' in [{'hash': '2c8c8e333842924d127d1c1cc267f51b0018659ff9307b08591b01bfdf34ff19', 'name': 'test1.fit'}, {'hash': '2c8c8e333842924d127d1c1cc267f51b0018659ff9307b08591b01b...
FAILED tests/test_app.py::TestNewFileEventHandler::test_mywhoosh_version_update_no_duplicate - AttributeError: 'builtin_function_or_method' object has no attribute 'joinpath'
FAILED tests/test_app.py::TestUploadAllFunction::test_upload_all_new_files - AssertionError: assert 'test_activity.fit' in [{'hash': '2c8c8e333842924d127d1c1cc267f51b0018659ff9307b08591b01bfdf34ff19', 'name': 'test_activity.fit'}]
FAILED tests/test_app.py::TestNewFileEventHandler::test_on_modified_processes_mywhoosh_file - AttributeError: 'builtin_function_or_method' object has no attribute 'joinpath'
FAILED tests/test_app.py::TestNewFileEventHandler::test_on_modified_adds_to_uploaded_files - AttributeError: 'builtin_function_or_method' object has no attribute 'joinpath'
FAILED tests/test_app.py::TestNewFileEventHandler::test_on_modified_tracking_idempotent - AttributeError: 'builtin_function_or_method' object has no attribute 'joinpath'
FAILED tests/test_app.py::TestUploadAllFunction::test_upload_all_skips_already_uploaded - AssertionError: Expected 'edit_fit' to not have been called. Called 1 times.
FAILED tests/test_integration_profiles.py::TestMultiProfileWorkflows::test_batch_upload_with_profile - AssertionError: assert 'activity1.fit' in [{'hash': '2c8c8e333842924d127d1c1cc267f51b0018659ff9307b08591b01bfdf34ff19', 'name': 'activity1.fit'}, {'hash': '2c8c8e333842924d127d1c1cc267f51b0018659ff9307b0...

Please add the tests for the new behavior and ensure the existing test suite passes locally.

Comment thread fit_file_faker/app.py
file = {
"name": source_file.name,
"hash": sha256(
open(dir.joinpath(source_file.name), "rb").read()

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should use Pathlib.Path and a context manager to ensure the file is properly closed. Also, I'm assuming this was copied from the change in upload_all() without testing? dir doesn't exist in this scope, and will be the built-in dir() function, so this will raise a NameError.

Comment thread fit_file_faker/app.py
files = [i for i in files if not i.endswith("_modified.fit")]
# append file hash for each file
files = [
{"name": i, "hash": sha256(open(dir.joinpath(i), "rb").read()).hexdigest()}

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should use pathlib.Path and a context manager to ensure the file is properly closed. Something like:

with open(dir.joinpath(source_file.name), "rb") as fh:
    file_hash = sha256(fh.read()).hexdigest()

Comment thread fit_file_faker/app.py
original_path=Path(f["name"]),
dryrun=dryrun,
)
_logger.debug(

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a pre-existing issue, but should be fixed here. The message saysAdding "{f["name"]}" with hash "{f["hash"]}" to "uploaded_files", but this log appears inside if output: (i.e. only when upload succeeds). However, uploaded_files.append(f) at line 413 runs unconditionally (outside that if block), which is consistent with existing behavior. The debug log placement is misleading but not broken. The log should go at line 433 right before uploaded_files.append(f)

Comment thread fit_file_faker/app.py
for i in files
]
# remove files found in the "already uploaded" list
files = [i for i in files if i not in uploaded_files]

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Backward compatibility break

When a user upgrades, their existing .uploaded_files.json will contain plain strings from the current code:

["MyNewActivity-5.6.1.fit", "MyNewActivity-5.7.0.fit"]

The new code in this PR converts files to a list of dicts, then runs:

files = [i for i in files if i not in uploaded_files]

A dict {"name": "MyNewActivity-5.6.1.fit", "hash": "..."} will never equal the string "MyNewActivity-5.6.1.fit", so all previously-uploaded files will be re-uploaded on the first run after upgrading.

A migration step is needed after loading uploaded_files:

# Migrate legacy string entries (pre-hash format)
uploaded_files = [
    e if isinstance(e, dict) else {"name": e, "hash": None}
    for e in uploaded_files
]

Then adjust deduplication to match on name-only for legacy entries where hash is None:

def _already_uploaded(f: dict, uploaded: list) -> bool:
    for entry in uploaded:
        if entry["name"] == f["name"]:
            if entry["hash"] is None or entry["hash"] == f["hash"]:
                return True
    return False

files = [f for f in files if not _already_uploaded(f, uploaded_files)]

The same migration is needed in on_modified, which loads the same file independently.

@jat255 jat255 added bug Something isn't working waiting for input MyWhoosh MyWhoosh labels Apr 26, 2026
@jat255 jat255 linked an issue Apr 26, 2026 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working MyWhoosh MyWhoosh waiting for input

Projects

None yet

Development

Successfully merging this pull request may close these issues.

MyWhoosh: upload-all mode not actually supported

2 participants