Evaluate CWL jobs that should be skipped on the leader by lonbar · Pull Request #5507 · DataBiosphere/toil

lonbar · 2026-04-30T21:07:48Z

CWL jobs whose when condition evaluates to False should not be executed. Currently, anything that is not a Workflow or ExpressionTool always runs on a worker node, so the conditional is only checked after a worker node has already been allocated. If the job instead runs on the leader (locally), the step is skipped without allocating a worker.

Checking the condition when the job is instantiated makes it possible to decide dynamically whether the step should run on the leader or on a worker. This avoids unnecessary overhead in scheduling systems.
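
The placement rule described above can be sketched in isolation. This is a minimal illustration, not the code in this PR: `runs_on_leader`, `LEADER_ONLY_CLASSES`, and the idea of passing the when condition as a plain callable are all assumptions made for the example.

```python
from typing import Any, Callable, Mapping, Optional

# Hypothetical labels for the CWL process classes that already run on the leader.
LEADER_ONLY_CLASSES = {"Workflow", "ExpressionTool"}


def runs_on_leader(
    process_class: str,
    when: Optional[Callable[[Mapping[str, Any]], bool]],
    inputs: Mapping[str, Any],
) -> bool:
    """Return True if the step can be handled on the leader."""
    if process_class in LEADER_ONLY_CLASSES:
        return True
    # A step whose when condition is false is skipped, so it never needs a worker.
    return when is not None and not when(inputs)
```

With this rule, a CommandLineTool step only goes to a worker when it has no when condition or the condition is true.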

Resolves #3990.

Changelog Entry

To be copied to the draft changelog by merger:

Toil checks the CWL when conditional on the leader.

Reviewer Checklist

- Make sure it is coming from issues/XXXX-fix-the-thing in the Toil repo, or from an external repo.
  - If it is coming from an external repo, make sure to pull it in for CI with:
    ```
    contrib/admin/test-pr otheruser theirbranchname issues/XXXX-fix-the-thing
    ```
  - If there is no associated issue, create one.
- Read through the code changes. Make sure that it doesn't have:
  - Addition of trailing whitespace.
  - New variable or member names in camelCase that want to be in snake_case.
  - New functions without type hints.
  - New functions or classes without informative docstrings.
  - Changes to semantics not reflected in the relevant docstrings.
  - New or changed command line options for Toil workflows that are not reflected in docs/running/{cliOptions,cwl,wdl}.rst
  - New features without tests.
- Comment on the lines of code where problems exist with a review comment. You can shift-click the line numbers in the diff to select multiple lines.
- Finish the review with an overall description of your opinion.

Merger Checklist

- Make sure the PR passed tests, including the Gitlab tests, for the most recent commit in its branch.
- Make sure the PR has been reviewed. If not, review it. If it has been reviewed and any requested changes seem to have been addressed, proceed.
- Merge with the Github "Squash and merge" feature.
  - If there are multiple authors' commits, add Co-authored-by to give credit to all contributing authors.
- Copy its recommended changelog entry to the Draft Changelog.
- Append the issue number in parentheses to the changelog entry.


annagiroti · 2026-05-15T18:01:52Z

        # If not using the Toil file store, output files just go directly to
        # their final homes their space doesn't need to be accounted per-job.

+        options_dict: dict = {} # type: ignore


Is the # type: ignore here intentional? As far as I can tell dict = {} is valid Python and shouldn't produce a type error. Would dict[str, Any] be a more precise annotation, and would that remove the need for the ignore comment entirely?
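
For reference, a minimal sketch of the parameterized annotation. One possible reason for the ignore comment is a strict mypy setting such as disallow_any_generics, which reports a bare `dict` as missing type parameters; whether Toil's mypy configuration enables that setting is an assumption here, and the keys below are only illustrative.

```python
from typing import Any

# Bare `dict` is treated as dict[Any, Any]; spelling out the key and
# value types needs no ignore comment, even under strict settings.
options_dict: dict[str, Any] = {}

# Mixed value types are still accepted because the value type is Any.
options_dict["cores"] = 1
options_dict["memory"] = "1GiB"
options_dict["preemptible"] = False
```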

annagiroti · 2026-05-15T18:05:29Z

        # their final homes their space doesn't need to be accounted per-job.

+        options_dict: dict = {} # type: ignore
+        run_local: bool = self.conditional.is_false(cwljob)


cwljob may still have unresolved Promise objects at init time if when references an output from an upstream step. Since Conditional.is_false resolves promises without a file store, could this either crash or return the wrong result in that case? The worst case I can think of is is_false incorrectly returning True here, setting local=True with no resources, but then the fully-resolved condition at run() time returning False, meaning real work runs on the leader with no reserved resources. Would wrapping this in a try/except that falls back to run_local = False be a safe way to handle that?
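
A minimal sketch of the fallback suggested above, using stand-ins rather than Toil's real classes. `UnresolvedPromise`, `has_unresolved`, and `safe_run_local` are hypothetical names, and the condition is passed in as a plain callable. The explicit promise check goes one step beyond a bare try/except, since an exception handler alone would not catch the case where evaluation succeeds but returns the wrong answer.

```python
from typing import Any, Callable, Mapping


class UnresolvedPromise:
    """Hypothetical stand-in for an upstream output that is not yet available."""


def has_unresolved(value: Any) -> bool:
    """Recursively check a job order for stand-in promise objects."""
    if isinstance(value, UnresolvedPromise):
        return True
    if isinstance(value, Mapping):
        return any(has_unresolved(v) for v in value.values())
    if isinstance(value, (list, tuple)):
        return any(has_unresolved(v) for v in value)
    return False


def safe_run_local(
    is_false: Callable[[Mapping[str, Any]], bool],
    cwljob: Mapping[str, Any],
) -> bool:
    """Decide at init time whether the step can be skipped on the leader.

    Falls back to False (schedule on a worker) whenever the condition
    cannot be evaluated reliably yet.
    """
    if has_unresolved(cwljob):
        return False
    try:
        return bool(is_false(cwljob))
    except Exception:
        # Any evaluation failure means we cannot prove the step is skipped.
        return False
```

The conservative default costs at most one unnecessary worker allocation, which is the behavior before this change.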

annagiroti · 2026-05-15T18:09:19Z

+                isinstance(tool, cwltool.command_line_tool.ExpressionTool)
+                or run_local
+                ),
+            **options_dict,


When run_local is True, options_dict is empty so cores, memory, disk, accelerators, and preemptible all fall back to Job defaults. CWLJobWrapper, which also runs locally, explicitly passes cores=1, memory="1GiB", disk="1MiB" for its local run. Would it be worth doing the same here for consistency, rather than relying on the defaults being equivalent?
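
One way to make the local case explicit, sketched with a hypothetical helper `job_options`. The resource values are the ones quoted above for CWLJobWrapper; the helper itself and the shape of `requested` are assumptions for illustration.

```python
from typing import Any

# Values quoted in the review comment for CWLJobWrapper's local run.
LOCAL_RESOURCES: dict[str, Any] = {"cores": 1, "memory": "1GiB", "disk": "1MiB"}


def job_options(run_local: bool, requested: dict[str, Any]) -> dict[str, Any]:
    """Return keyword arguments for the job constructor.

    Local runs get small, explicit resources instead of relying on
    whatever the Job defaults happen to be.
    """
    # Copy so callers can modify the result without touching the inputs.
    return dict(LOCAL_RESOURCES) if run_local else dict(requested)
```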

annagiroti

The overall approach appears to be clean, and the options_dict pattern for conditionally passing resources is a nice solution. My main concern is that is_false is called at init time, before promises are fully resolved. It is worth making sure this can't cause issues for when conditions that reference upstream step outputs. Would it also be worth adding test cases? For example, one where the when condition is false (verifying the job isn't submitted to the batch system) and one where it references an output from a previous step (to confirm it doesn't crash or mis-schedule).
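
The shape of those two tests might look like the sketch below. It uses a toy scheduler and a recording stand-in for the batch system rather than Toil's test harness, so `RecordingBatchSystem`, `schedule_step`, and `upstream_step` are all hypothetical.

```python
from typing import Any, Callable, Mapping, Optional


class RecordingBatchSystem:
    """Hypothetical stand-in that records which steps were submitted."""

    def __init__(self) -> None:
        self.submitted: list[str] = []

    def submit(self, name: str) -> None:
        self.submitted.append(name)


def schedule_step(
    name: str,
    when: Optional[Callable[[Mapping[str, Any]], bool]],
    inputs: Mapping[str, Any],
    batch: RecordingBatchSystem,
) -> str:
    """Toy scheduler: skip on the leader when the condition is false."""
    if when is not None and not when(inputs):
        return "skipped"
    batch.submit(name)
    return "submitted"


def upstream_step() -> dict[str, Any]:
    """Toy upstream step whose output feeds the downstream condition."""
    return {"count": 3}


def test_false_condition_is_not_submitted() -> None:
    batch = RecordingBatchSystem()
    result = schedule_step("step", lambda i: i["run"], {"run": False}, batch)
    assert result == "skipped"
    assert batch.submitted == []


def test_condition_on_upstream_output() -> None:
    batch = RecordingBatchSystem()
    inputs = upstream_step()
    result = schedule_step("step", lambda i: i["count"] > 0, inputs, batch)
    assert result == "submitted"
    assert batch.submitted == ["step"]
```

A real test would need to exercise the unresolved-promise path at init time, which this toy version does not model.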

adamnovak requested a review from annagiroti May 7, 2026 15:36

lonbar force-pushed the master branch from 60418bb to a078cd3 Compare May 12, 2026 11:24

lonbar marked this pull request as ready for review May 12, 2026 11:25

tikk3r mentioned this pull request May 13, 2026 in LOFAR-VLBI/pilot#113 (Open): Automatic config generation for automatic delay cal selection

annagiroti reviewed May 15, 2026

lonbar wants to merge 1 commit into DataBiosphere:master from lonbar:master

2 participants
