Adding tools for managing test fixtures #199
Conversation
```rust
let client = &MUSICBRAINZ_CLIENT;
let apiurl = format!("{}/ws/2", client.musicbrainz_domain);
let agent = &client.api_client.agent;
```
I used the internal request client directly to issue manually-constructed API requests. I went with this route instead of using the package itself because I wanted to store the test data as close as possible to how it gets received from the API.
```rust
// very simple rate limiting just for the purpose of getting some fixtures in place
std::thread::sleep(std::time::Duration::from_secs(1));
```
Since I'm using the raw request client, rate limiting is done in a rather simple way.
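For what it's worth, the flat one-second sleep could later be replaced by a limiter that only sleeps for the remainder of the interval. This is just a sketch using only `std`, not code from this PR:

```rust
use std::time::{Duration, Instant};

/// Minimal rate limiter: guarantees at least `interval` between calls to `wait`.
struct RateLimiter {
    interval: Duration,
    last: Option<Instant>,
}

impl RateLimiter {
    fn new(interval: Duration) -> Self {
        Self { interval, last: None }
    }

    /// Sleep only for whatever part of the interval hasn't already elapsed,
    /// then record this call's time.
    fn wait(&mut self) {
        if let Some(last) = self.last {
            let elapsed = last.elapsed();
            if elapsed < self.interval {
                std::thread::sleep(self.interval - elapsed);
            }
        }
        self.last = Some(Instant::now());
    }
}

fn main() {
    let mut limiter = RateLimiter::new(Duration::from_millis(50));
    let start = Instant::now();
    for _ in 0..3 {
        // first call returns immediately; the next two wait out the interval
        limiter.wait();
    }
    // three calls with a 50 ms interval take at least 100 ms total
    assert!(start.elapsed() >= Duration::from_millis(100));
}
```

The advantage over a fixed sleep is that time spent writing fixtures to disk counts toward the interval instead of adding to it.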
```rust
Fixture {
    file_path: "tests/serde/data/browse/release/by_label.json",
    api_path: "release",
    params: vec![("label", "47e718e1-7ee4-460c-b1cc-1192a841c6e5")],
},
```
Fixtures are defined like this, just to get something actionable off the ground. There are probably more thoughtful ways of organizing this data, but I wanted to be minimally disruptive to other tests for now.
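To illustrate how such a descriptor can drive a request, here is a self-contained sketch that turns a fixture into a URL. The `fixture_url` helper and the exact query assembly (including appending `fmt=json`) are my assumptions for the example, not necessarily what the PR does:

```rust
/// Mirror of the fixture descriptor from the diff (same field names).
struct Fixture {
    file_path: &'static str,
    api_path: &'static str,
    params: Vec<(&'static str, &'static str)>,
}

/// Hypothetical helper: build the request URL for a fixture.
/// `base` would be the `{domain}/ws/2` prefix seen earlier in the diff.
fn fixture_url(base: &str, fixture: &Fixture) -> String {
    let query: Vec<String> = fixture
        .params
        .iter()
        .map(|(k, v)| format!("{k}={v}"))
        .collect();
    format!("{}/{}?{}&fmt=json", base, fixture.api_path, query.join("&"))
}

fn main() {
    let f = Fixture {
        file_path: "tests/serde/data/browse/release/by_label.json",
        api_path: "release",
        params: vec![("label", "47e718e1-7ee4-460c-b1cc-1192a841c6e5")],
    };
    let url = fixture_url("https://musicbrainz.org/ws/2", &f);
    assert_eq!(
        url,
        "https://musicbrainz.org/ws/2/release?label=47e718e1-7ee4-460c-b1cc-1192a841c6e5&fmt=json"
    );
    // the response body would then be written to f.file_path
    let _ = f.file_path;
}
```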
```rust
serde_json_fmt::JsonFormat::new()
    .colon(": ")
    .unwrap()
    .indent_width(Some(2))
    .ascii(true)
    .format_to_writer(f, &json)
    .unwrap();
```
serde is used to parse the JSON and output pretty-printed, long-form JSON for easier diffs. One consequence of this is that non-ASCII characters aren't always one-to-one with the original body. serde_json_fmt is used to encode them as similarly as possible, but it escapes some characters that aren't escaped in the original body (so far I've seen `-` and `~` escaped by serde_json_fmt where the API leaves them as-is).
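For readers unfamiliar with ASCII-only JSON output, the basic mechanism is that any character outside ASCII becomes a `\uXXXX` escape. A minimal sketch of that idea (this does not reproduce serde_json_fmt's extra escaping of characters like `-` and `~` mentioned above, and it ignores surrogate pairs for characters above U+FFFF):

```rust
/// Escape every non-ASCII char as a \uXXXX sequence, the way an
/// ASCII-only JSON writer would (surrogate pairs omitted for brevity).
fn escape_non_ascii(s: &str) -> String {
    let mut out = String::new();
    for c in s.chars() {
        if c.is_ascii() {
            out.push(c);
        } else {
            out.push_str(&format!("\\u{:04x}", c as u32));
        }
    }
    out
}

fn main() {
    // Non-ASCII text normalizes to the same escapes no matter how the API
    // happened to encode it, which is what makes the fixtures diff cleanly.
    assert_eq!(escape_non_ascii("Béla Bartók"), "B\\u00e9la Bart\\u00f3k");
    assert_eq!(escape_non_ascii("plain ascii"), "plain ascii");
}
```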
```json
"relations": [
  {
    "artist": {
      "disambiguation": "",
      "id": "9d22a626-a5f6-4547-882c-a1ebddbc064b",
      "name": "Dick Bezemer",
      "rating": {
        "value": null,
        "votes-count": 0
      },
      "sort-name": "Bezemer, Dick",
      "type": "Person",
      "type-id": "b6e035f4-3ce9-331c-97df-83397230b0df"
    },
    "attribute-credits": {},
    "attribute-ids": {
      "bass": "6505f98c-f698-4406-8bf4-8ca43d05c36f"
    },
    "attribute-values": {},
    "attributes": [
      "bass"
    ],
    "begin": "1956-02-16",
    "direction": "backward",
    "end": "1956-02-16",
    "ended": true,
    "source-credit": "",
    "target-credit": "",
    "target-type": "artist",
    "type": "instrument",
    "type-id": "59054b12-01ac-43ee-a618-285fd397e461"
  },
```
As far as I'm aware, this is the only part of a query that I couldn't reproduce. I tried enabling every type of relation exposed by the API but could not recover the instrumentalist information.
|
I haven't reviewed the code yet, so this is just process talk for now. While this is a pretty smart idea, I'm a bit torn on its usefulness. Usually, API bindings require three types of tests:

1. that each endpoint (and its includes) is queried correctly;
2. that responses serialize and deserialize correctly;
3. that the API actually returns the data you expect.
The second point is a solved issue. I got some tests that serialize and deserialize the data and check it against a fixture.

The first point is easy in principle, but a bit of a pain. Testing all the possible includes and endpoints requires a lot of queries, and with a limit of 1 request per second, it gets slow.

The final point is the hardest. You need to check that you get what you expect from the API (like the includes, or having the result you searched for), but without being too specific about it, so that there's the least opportunity for the data to change.

Your PR would fix 2), but lacks in the other points. You only test whether the endpoints are correct by manually triggering a regeneration of the fixtures, and changing fixtures could defeat the point of a test (for example, you can't really test that there's a recording relation: if it gets removed from MB, the fixture and the tested data will still match).

Like I said in #110, I think a fully dynamic approach would work better. From what I already have as a new test framework, I can test both that the schemas are valid and that the endpoints work. But like you, I just can't guarantee that a test passes because the schema is valid, or that the thing I'm testing still exists on MB.

I created an issue on cargo-mutants to hopefully help with that, by just removing each field and seeing if it breaks. Barbaric, but it would work!

If you have any other ideas, please feel free to share.
Yeah, I share that sentiment. I think a proper testing strategy would still need to go beyond this. This is a bit of machinery to keep the tests for goal (2) up to date; I put it together to try to discover any other data-model mismatches. I'd say the value here is having parallel raw fixtures stored as close as possible to what the API returns.

A better solution might run a scheduled CI job with specific round-trip tests, querying (and caching?) API responses to catch data-model mismatches, but I didn't want to contribute something that opinionated. That would avoid slowing down more frequent CI with a ton of rate-limited queries while still catching API response changes relatively quickly.

As it is now, it's really just a basic script, so no worries if you decide it's not in the style you want. If that's the case, feel free to close and take this as food for thought.
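For illustration, the scheduled-job idea could look roughly like this as a cron-triggered GitHub Actions workflow. The file name, schedule, and test filter are all placeholders, not part of this PR:

```yaml
# .github/workflows/api-roundtrip.yml (hypothetical)
name: API round-trip checks
on:
  schedule:
    - cron: "0 3 * * 1"   # weekly, keeping rate-limited queries out of regular CI
  workflow_dispatch: {}    # allow manual runs too
jobs:
  roundtrip:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
      # assumes the live-API tests are marked #[ignore] so normal CI skips them
      - run: cargo test --test roundtrip -- --ignored
```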
|
Welp, as an update: I finally made the testing utils. It doesn't fix the issue that I will have to test against some static data, but I think I can reduce it, and reduce the number of individual include checks to just `none` and `all`.
Hello! Thanks for this great resource!
I recently noticed that the `id` field was missing from an entity that I want to use (#198), and then saw that there's an old issue about revising the testing strategy to not rely on static data (#110), which is probably why it's so easy to miss these changes.

I hope you don't mind that I went ahead and added some tools to fix this. I'm not entirely sure what the best practices are for things like this, so feel free to treat this like a starting point. If you have thoughts on a better way to approach it, I'm happy to see this evolve into something more idiomatic.
Just to outline the approach:
I added a new binary build target, which just builds a small CLI that developers can use to refresh specific files:
I'm torn on this approach. It's nice to have an executable bundled with the package to support things like this, but it makes the dependencies and features a bit messy since it fuses developer needs with user-facing needs.
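One way to contain that messiness would be to gate the binary behind an optional feature, so the fixture tooling's dependencies stay out of normal builds. A Cargo.toml sketch, where the target name, path, and feature name are all placeholders:

```toml
[dependencies]
# only pulled in when the fixture tooling is built
serde_json_fmt = { version = "0.1", optional = true }

[features]
fixtures = ["dep:serde_json_fmt"]

[[bin]]
name = "refresh-fixtures"
path = "src/bin/refresh_fixtures.rs"
required-features = ["fixtures"]
```

With `required-features`, `cargo build` for library users never compiles the CLI or its extra dependencies; developers opt in with `cargo run --features fixtures --bin refresh-fixtures`.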
I introduced an initial commit just to re-format the original JSON files, since long-form JSON is easier to diff when checking for changes in API results. That way, the JSON file changes in the second commit show up in a more interpretable format.
I'll annotate other decisions in-line in the PR.