
Adding tools for managing test fixtures #199

Open
dgkf wants to merge 6 commits into RustyNova016:main from dgkf:110-test-fixtures

Conversation

dgkf commented Mar 6, 2026

Hello! Thanks for this great resource!

I recently noticed that the id field was missing from an entity that I want to use (#198), and then saw that there's an old issue about revising the testing strategy to not rely on static data (#110), which is probably why it's so easy to miss these changes.

I hope you don't mind that I went ahead and added some tools to fix this. I'm not entirely sure what the best practices are for things like this, so feel free to treat this like a starting point. If you have thoughts on a better way to approach it, I'm happy to see this evolve into something that is more idiomatic.

Just to outline the approach:

  • I added a new binary build target, which just builds a small CLI that developers can use to refresh specific files:

    # update all the test files itemized in `src/fixtures.rs`
    cargo run --features fixtures update
    
    # update specific files by glob pattern matching
    cargo run --features fixtures update --path '**/*Frozen*'

    I'm torn on this approach. It's nice to have an executable bundled with the package to support things like this, but it makes the dependencies and features a bit messy since it fuses developer needs with user-facing needs.

  • I introduced an initial commit that just re-formats the original JSON files into long-form JSON, which is easier to diff when checking for changes in API results. The JSON changes in the second commit then show up in a more interpretable format.
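To illustrate the selection step of the CLI described above, here is a minimal dependency-free sketch. The `Fixture` and `select` names are hypothetical, and a plain substring match stands in for real glob support (which would come from a crate like `glob`):

```rust
/// One entry in the fixture manifest (illustrative, not the PR's exact API).
struct Fixture {
    file_path: &'static str,
}

/// Select the fixtures to refresh. With no pattern, everything is
/// refreshed; with a pattern, only matching file paths are.
/// Substring matching is a stand-in for glob matching here.
fn select<'a>(fixtures: &'a [Fixture], pattern: Option<&str>) -> Vec<&'a Fixture> {
    fixtures
        .iter()
        .filter(|f| pattern.map_or(true, |p| f.file_path.contains(p)))
        .collect()
}
```

The real CLI would then fetch and rewrite each selected fixture file.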

I'll annotate other decisions in-line in the PR.

Comment thread src/fixtures.rs
Comment on lines +254 to +256
let client = &MUSICBRAINZ_CLIENT;
let apiurl = format!("{}/ws/2", client.musicbrainz_domain);
let agent = &client.api_client.agent;
Author

I used the internal request client directly to issue manually-constructed API requests. I went with this route instead of using the package itself because I wanted to store the test data as close as possible to how it gets received from the API.

Comment thread src/fixtures.rs
Comment on lines +300 to +301
// very simple rate limiting just for the purpose of getting some fixtures in place
std::thread::sleep(std::time::Duration::from_secs(1));
Author

Since I'm using the raw request client, rate limiting is done in a rather simple way.
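Not part of the PR, but as a possible refinement of the fixed one-second sleep: an interval-based limiter only sleeps for whatever remains of the minimum spacing since the last request, so time already spent on other work counts toward the wait. A stdlib-only sketch (hypothetical `RateLimiter`, not the PR's code):

```rust
use std::time::{Duration, Instant};

/// Spaces calls at least `min_interval` apart, sleeping only for
/// the remainder of the interval rather than a fixed duration.
struct RateLimiter {
    min_interval: Duration,
    last: Option<Instant>,
}

impl RateLimiter {
    fn new(min_interval: Duration) -> Self {
        Self { min_interval, last: None }
    }

    /// Blocks until at least `min_interval` has passed since the
    /// previous `wait`, then records the new request time.
    fn wait(&mut self) {
        if let Some(last) = self.last {
            let elapsed = last.elapsed();
            if elapsed < self.min_interval {
                std::thread::sleep(self.min_interval - elapsed);
            }
        }
        self.last = Some(Instant::now());
    }
}
```

Calling `wait()` before each request then enforces the spacing without over-sleeping when the work between requests already took time.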

Comment thread src/fixtures.rs
Comment on lines +48 to +52
Fixture {
file_path: "tests/serde/data/browse/release/by_label.json",
api_path: "release",
params: vec![("label", "47e718e1-7ee4-460c-b1cc-1192a841c6e5")],
},
Author

Fixtures are defined like this, just to get something actionable off the ground. There are probably more thoughtful ways of organizing this data, but I wanted to be minimally disruptive to other tests for now.
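For illustration, a fixture entry like the one above could be turned into a request URL with a small helper. The `url` method below is hypothetical (it is not the PR's code) and assumes the parameters need no percent-encoding:

```rust
/// One fixture: where the snapshot lives on disk, which API path it
/// queries, and its query parameters (mirrors the PR's `Fixture` shape).
struct Fixture {
    file_path: &'static str,
    api_path: &'static str,
    params: Vec<(&'static str, &'static str)>,
}

impl Fixture {
    /// Hypothetical helper: build the full request URL from the
    /// fixture definition. Assumes values need no percent-encoding.
    fn url(&self, api_root: &str) -> String {
        let query: Vec<String> = self
            .params
            .iter()
            .map(|(k, v)| format!("{k}={v}"))
            .collect();
        if query.is_empty() {
            format!("{}/{}", api_root, self.api_path)
        } else {
            format!("{}/{}?{}", api_root, self.api_path, query.join("&"))
        }
    }
}
```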

Comment thread src/fixtures.rs
Comment on lines +290 to +296
serde_json_fmt::JsonFormat::new()
.colon(": ")
.unwrap()
.indent_width(Some(2))
.ascii(true)
.format_to_writer(f, &json)
.unwrap();
Author

serde is used to parse the JSON and output pretty-printed long-form JSON for easier diffs. One consequence is that non-ASCII characters aren't always 1-to-1 with the original body. serde_json_fmt is used to encode them as similarly as possible, but some characters get escaped that aren't escaped in the original body (so far I've seen - and ~ escaped by serde_json_fmt where the original body leaves them as-is).
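The general behavior being described can be illustrated with a toy ASCII-only escaper (this is not serde_json_fmt's actual code, just a sketch of what forcing ASCII output does to a string):

```rust
/// Toy illustration of ASCII-only JSON string escaping: any
/// non-ASCII char becomes a \uXXXX escape, so the output may not
/// match the original byte-for-byte even when the parsed values
/// are identical. (Chars above U+FFFF would need surrogate pairs
/// in real JSON; this sketch ignores that.)
fn escape_ascii(s: &str) -> String {
    s.chars()
        .map(|c| {
            if c.is_ascii() {
                c.to_string()
            } else {
                format!("\\u{:04x}", c as u32)
            }
        })
        .collect()
}
```

Note that a plain escaper like this would leave ASCII characters such as `-` and `~` untouched, which is what makes serde_json_fmt's escaping of them surprising.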

Comment on lines -606 to -637
"relations": [
{
"artist": {
"disambiguation": "",
"id": "9d22a626-a5f6-4547-882c-a1ebddbc064b",
"name": "Dick Bezemer",
"rating": {
"value": null,
"votes-count": 0
},
"sort-name": "Bezemer, Dick",
"type": "Person",
"type-id": "b6e035f4-3ce9-331c-97df-83397230b0df"
},
"attribute-credits": {},
"attribute-ids": {
"bass": "6505f98c-f698-4406-8bf4-8ca43d05c36f"
},
"attribute-values": {},
"attributes": [
"bass"
],
"begin": "1956-02-16",
"direction": "backward",
"end": "1956-02-16",
"ended": true,
"source-credit": "",
"target-credit": "",
"target-type": "artist",
"type": "instrument",
"type-id": "59054b12-01ac-43ee-a618-285fd397e461"
},
Author

As far as I'm aware, this is the only part of a query that I couldn't reproduce. I tried enabling every type of relation exposed by the API but could not recover the instrumentalist information.

@RustyNova016
Owner

I haven't reviewed the code yet so this is just process talk for now.


While this is a pretty smart idea, I'm a bit torn on its usefulness. Usually, API bindings require three types of tests:

  • I can ping the API, and the API pongs back without issues
  • I can read the data from the API and turn it back into the original data (syntactically equivalent; it doesn't have to be the exact same formatting)
  • The API gives me correct data for what I asked

The second point is a solved issue. I've got some tests that serialize and deserialize the data and check it against a serde_json::Value of the original response. Not all the tests are migrated to it yet, as the two other points block it.

The first point is easy in principle, but a bit of a pain. Testing all the possible includes and endpoints requires a lot of queries, and with a limit of 1 request per second, it gets slow.

The final point is the hardest. You need to check that you get what you expect from the API (like the includes, or having the result you searched for), but without being so specific that routine data changes break the test.


Your PR would fix 2), but falls short on the other points. You only test whether the endpoints are correct by manually triggering a regeneration of the fixtures, and changing fixtures can defeat the point of a test (for example, you can't really test that a recording relation exists: if it gets removed from MB, the fixture and the tested data will still match).

Like I said in #110, I think a fully dynamic approach would work better. From what I already have as a new test framework, I can test both that the schemas are valid and that the endpoints work. But like you, I just can't guarantee whether the test passes because the schema is valid, or because the thing I'm testing doesn't exist anymore on MB.

I created an issue on cargo-mutants to hopefully help with that, by just removing each field and seeing if it breaks. Barbaric, but it would work!
sourcefrog/cargo-mutants#605

If you have any other ideas, please feel free to share.


dgkf commented Mar 10, 2026

> I'm a bit torn on its usefulness.

Yeah, I share that sentiment. I think a proper testing strategy would still need to go beyond this.

This is a bit of machinery to keep the tests for goal (2) up-to-date. I put it together to try to discover any other data model mismatches.

I'd say the value here is having parallel raw ureq queries and musicbrainz_rs queries that can be checked for equivalence. If nothing else, hopefully that helps make some progress toward goal (3).

A better solution might run a scheduled CI job with specific round-trip tests, querying (and caching?) API responses to catch data model mismatches, but I didn't want to contribute something that opinionated. This would avoid slowing down more frequent CI with a ton of rate-limited queries while still catching API response changes relatively quickly.
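Such a scheduled job might look roughly like this (hypothetical GitHub Actions workflow; the `fixtures` feature flag comes from this PR, while the job name, schedule, and the `--ignored` convention for the slow tests are assumptions):

```yaml
# Hypothetical scheduled workflow: run the slow, rate-limited
# round-trip tests weekly instead of on every push.
name: api-roundtrip
on:
  schedule:
    - cron: "0 3 * * 1"    # Mondays at 03:00 UTC
  workflow_dispatch: {}    # also allow manual runs
jobs:
  roundtrip:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: cargo test --features fixtures -- --ignored
```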

As it is now, it's really just a basic script - no worries if you decide it's not in the style you want. If that's the case, feel free to close and take this as food for thought.

@RustyNova016
Owner

Welp, as an update: I finally made the testing utils in api_bindium. They allow for testing proper URLs, deserialising, and round trips. Example here

It doesn't fix the issue that I will have to test against some static data, but I think I can reduce it, and cut the number of individual include checks down to just none and all.
