Skip to content

dhis2: Add paging examples and default async:false in tracker.import#1681

Open
hunterachieng wants to merge 14 commits into
mainfrom
feature/1546-dhis2-fix
Open

dhis2: Add paging examples and default async:false in tracker.import#1681
hunterachieng wants to merge 14 commits into
mainfrom
feature/1546-dhis2-fix

Conversation

@hunterachieng

@hunterachieng hunterachieng commented May 28, 2026

Copy link
Copy Markdown
Contributor

Summary

Default async:false in tracker.import and add paging examples to tracker.export()

Fixes #1546

Related

Details

  • Default async:false in tracker.import function
  • Add paging examples in tracker.export function
  • Add integration tests for tracker.export pagination
  • Update unit tests for tracker.import (testing async modes), and tracker.export testing pagination params

AI Usage

Please disclose how you've used AI in this work (it's cool, we just want to
know!):

  • I have used Claude Code
  • I have used another model
  • I have not used AI

You can read more details in our
Responsible AI Policy

Review Checklist

Before merging, the reviewer should check the following items:

  • Does the PR do what it claims to do?
  • If this is a new adaptor, added the adaptor on marketing website ?
  • If this PR includes breaking changes, do we need to update any jobs in
    production? Is it safe to release?
  • Are there any unit tests?
  • Is there a changeset associated with this PR? Should there be? Note that
    dev only changes don't need a changeset.
  • Have you ticked a box under AI Usage?

Signed-off-by: Hunter Achieng <achienghunter@gmail.com>
Signed-off-by: Hunter Achieng <achienghunter@gmail.com>
Signed-off-by: Hunter Achieng <achienghunter@gmail.com>
@mtuchi mtuchi changed the title dhis2: Add paging examples and default async to false dhis2: Add paging examples Jun 3, 2026
@mtuchi mtuchi changed the title dhis2: Add paging examples dhis2: Add paging examples and default async:false in tracker.import Jun 3, 2026
@mtuchi mtuchi removed their assignment Jun 3, 2026

@mtuchi mtuchi left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @hunterachieng i have made couple of changes for dhis2 pagination

  • I removed the async:fasle in tracker.export. After looking into dhis2 docs there is no async for tracker.export only in tracker.import
  • I have added integration tests for pagination using tracker.export
  • update tracker.export to include paging examples and jsdocs
  • update tracker unit tests to test async mode for tracker.import and paging params for tracker.export

Please have a look and let me know if you have any feedback, If not we can request review from joe

@mtuchi

mtuchi commented Jun 3, 2026

Copy link
Copy Markdown
Collaborator

@hunterachieng One more thing i have added dhis2 tracker.export pagination examples on docs, See OpenFn/docs#797. Please review and leave your feedback

@mtuchi mtuchi requested a review from josephjclark June 5, 2026 13:07
@mtuchi

mtuchi commented Jun 5, 2026

Copy link
Copy Markdown
Collaborator

Hey @josephjclark i have added examples on docs as well and request your review OpenFn/docs#797

Comment thread .changeset/hip-planes-see.md Outdated
Comment thread packages/dhis2/src/tracker.js Outdated
Comment thread packages/dhis2/src/tracker.js Outdated
Comment thread packages/dhis2/src/tracker.js Outdated
Signed-off-by: Hunter Achieng <achienghunter@gmail.com>
Comment thread packages/dhis2/src/tracker.js Outdated
* @param {string} path - Path to the resource, relative to the /tracker endpoint
* @param {object} query - An object of query parameters to be encoded into the URL
* @param {object} query - An object of query parameters to be encoded into the URL. Can include [pagination parameters](https://docs.dhis2.org/en/develop/using-the-api/dhis-core-version-master/tracker.html#request-parameters-for-pagination), filters, etc.
* @param {string} [query.order] - Comma-separated field:sortDirection pairs, e.g. `createdAt:desc`

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only query property we're actually documenting now. Why is this one so important?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I must have missed it. i removed it

@josephjclark

Copy link
Copy Markdown
Collaborator

@hunterachieng can you update the PR description so that it's accurate?

We seem to have now removed most of the things this PR actually added 😅 So I'm a little unsure about the value or point of it.

  • We've set async: false in code but wasn't this the defacto default anyway? And couldn't users always pass it on the query? So what's new?
  • The PR claims to have added pagination examples - but they've now been removed (that's not what I was asking for!). All we have now is an example with paging: false to disable pagination and force all items to be downloaded in a single page. Is that enough? Is that OK? The changeset says "examples" plural, but all we've currently just changed one example.

I basically just want to be sure that the PR description and changeset are accurate, and that we haven't lost sight of what this PR is supposed to be about.

Signed-off-by: Hunter Achieng <achienghunter@gmail.com>

@mtuchi mtuchi left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hunterachieng See joe's comment here #1681 (comment)

@josephjclark

Copy link
Copy Markdown
Collaborator

@mtuchi what do you make of all this? I just have this weird feeling that this PR doesn't actually do anything?

I think the idea was to add examples but the examples we have are very thin now. Are they really telling the right story?

Is this example really all we need? If so, fine :)

tracker.export('events', { paging: false})

Even the async query is just written to a query param and works out of the box today (presumably with the same value). So it's meaningless code.

@hunterachieng

Copy link
Copy Markdown
Contributor Author

@josephjclark The only things we have added are links for pagination and the thin examples of how its done.

Since pagination is not handled by us, the best we can do is with the examples and doc links.

For async:false, it was working with the default as false before, we just now allowed users to change the values if needed. To be honest, it is true that there is not much going on in the PR

@mtuchi

mtuchi commented Jun 18, 2026

Copy link
Copy Markdown
Collaborator

@josephjclark
Just to add on what changed on this pr, If we put pagination stuff aside

  1. Allow user to specify async mode in tracker.import, before even if you specify in query it will always be set to false. Now user have the ability to set it to true. I think this is important we shouldn't limit user to always wait for results
  2. This pr remove async option in tracker.export because it's not supported. You will always get events, enrollments etc. There is not async: true or false for tracker.export. This is more of a clean up

I am totally fine moving the pagination stuff into our job library docs which is what this pr on docs is doing OpenFn/docs#797

But i would be very cautious on advocating for tracker.export('events', { paging: false}). We should express the performance implication of fetching all records by using paging: false because it's very noticeable when you have lots of data.

@mtuchi

mtuchi commented Jun 18, 2026

Copy link
Copy Markdown
Collaborator

I did some test on app to compare paging with query param paging:true vs paging with chunking

@mtuchi

mtuchi commented Jun 18, 2026

Copy link
Copy Markdown
Collaborator

Still getting a lost run when using tracker.export("events", { paging: false }). See lost run here

@josephjclark

Copy link
Copy Markdown
Collaborator

Ok- so those lost runs are probably occuring because we're downloading too much data and kubernetes is killing the process. The worker should really be robust to this but we know there are problems in that area.

So paging: false can be used to download all data at once, but doing so might cause the run to exceed its memory limit. I'm OK with that - it's a natural limitation of the system.

Doing manual paging in a workflow is very very hard, but possible.

Adding a better pagination API to dhis2 is possible of course. We closed down that PR on the grounds that a) it wasn't immediately needed and b) we can achieve the same thin through paging:false and good example (this only feels half true?).

There's also a c): I worry that, the way the closed pagination API works, it is still in danger of downloading too much data into a single run and causing memory problems. What we really need there (and on other adaptors) is to start introducing paging APIs which allow users to process a single page of a data at a time - rather than just downloading everything in one go. That can still cause problems but at least it gives us some options.

Anyway that's another story. Pagination is complicated.

So we all agree that this PR doesn't do much, but the extra bit of documentation it adds is probably worthwhile? But we need to do further and deeper work to paginate and safely process very large data sets (this problem is more than just pagination)

@mtuchi

mtuchi commented Jun 18, 2026

Copy link
Copy Markdown
Collaborator

I totally agree pagination is really hard 😮‍💨, The api we have are so easy to footgun yourself. We definitely need some deeper work on that area

But I still think these two points still need to be addressed

  1. Allow user to specify async mode in tracker.import, before even if you specify in query it will always be set to false. Now user have the ability to set it to true. I think this is important we shouldn't limit user to always wait for results
  2. This pr remove async option in tracker.export because it's not supported. You will always get events, enrollments etc. There is not async: true or false for tracker.export. This is more of a clean up

And this pr already address those two points.

Documentation does help but for now in think the job library pr is the best place for it because we have more room to say how to handle pagination in dhis2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

dhis2: add pagination support

4 participants