Skip to content

Inline convolution index bookkeeping in Stan hot path#1404

Open
ajh-oai wants to merge 2 commits into
epiforecasts:mainfrom
ajh-oai:codex/inline-convolution-indices
Open

Inline convolution index bookkeeping in Stan hot path#1404
ajh-oai wants to merge 2 commits into
epiforecasts:mainfrom
ajh-oai:codex/inline-convolution-indices

Conversation

@ajh-oai
Copy link
Copy Markdown

@ajh-oai ajh-oai commented May 12, 2026

Description

This PR follows #1273 with a focused cleanup in convolve_with_rev_pmf().

The convolution loop previously called small helper functions on every output element to build a temporary 4-integer index array before taking each dot product. This change keeps the convolution logic the same but computes those loop-local indices directly where they are used, avoiding repeated helper-call and temporary-array overhead inside the hot loop.

Benchmarks

I benchmarked the change locally with paired CmdStan runs on the package's infection-estimation workflow (example_confirmed[1:60], fixed generation time, standard incubation/reporting delays, 2 chains, 250 warmup + 250 sampling draws, seeds 101/202/303/404, order balanced across baseline and branch runs).

Metric Baseline mean Branch mean Mean paired change
reports Stan profile block 4.3150 s 2.6701 s -38.13%
CmdStan total runtime 10.2738 s 10.1897 s -0.85%

The reports profile block improved in all four paired runs (-25.78% to -53.59%). Total runtime was directionally slightly lower on average but noisy across seeds, so I view the main result here as a targeted reduction in convolution hot-path overhead rather than a strong claim about end-to-end runtime.

Validation

  • git diff --check
  • NOT_CRAN=true Rscript -e 'pkgload::load_all(".", quiet = TRUE); testthat::test_file("tests/testthat/test-stan-convole.R")'
    • 24 passed, 0 failed
  • Local paired CmdStan benchmark described above; both baseline and branch models compiled and sampled successfully.

AI assistance disclosure

Codex helped identify the optimization seam, benchmark the candidate change, and prepare this PR. I reviewed the final patch and benchmark results before submission.

Initial submission checklist

  • My PR is based on a package issue and I have explicitly linked it.
  • I have tested my changes locally (using devtools::test() and devtools::check()).
  • I have added or updated unit tests where necessary.
  • I have updated the documentation if required and rebuilt docs if yes (using devtools::document()).
  • I have followed the established coding standards (and checked using lintr::lint_package()).
  • I have added a news item linked to this PR.

After the initial Pull Request

  • I have reviewed Checks for this PR and addressed any issues as far as I am able.

Summary by CodeRabbit

  • Refactor
    • Optimised internal convolution computation performance by streamlining index calculation logic.

Review Change Stack

Co-authored-by: Codex <noreply@openai.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 12, 2026

Walkthrough

The PR refactors convolve_with_rev_pmf to compute convolution slice boundaries inline within its two summation loops instead of calling helper functions that build intermediate index arrays. The dot-product boundaries are now calculated per iteration and used immediately.

Changes

Convolution index inlining

Layer / File(s) Summary
Inlined index computation in summation loops
inst/stan/functions/convolve.stan
The s in 1:xlen loop inlines s_minus_ylen, start_x, end_x, and start_y calculations; the s in (xlen + 1):len loop inlines s_minus_ylen, start_x, start_y, and end_y (with end_x fixed to xlen). Both compute boundaries and apply them to dot_product slices immediately, replacing calls to calc_conv_indices_xlen and calc_conv_indices_len.
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately and specifically describes the main change: inlining convolution index bookkeeping computation in a Stan hot path function to improve performance.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown
Contributor

This is how benchmark results would change (along with a 95% confidence interval in relative change) if abe5767 is merged into main:

  • ✔️default: 24.1s -> 23s [-11.58%, +2.71%]
  • ✔️no_delays: 26.9s -> 24.8s [-20.05%, +4.35%]
  • ✔️random_walk: 9.63s -> 19.4s [-36.41%, +238.78%]
  • ✔️stationary: 13.1s -> 12.1s [-24.15%, +8.53%]
  • ✔️uncertain: 1.19m -> 1.17m [-10.73%, +6.5%]
    Further explanation regarding interpretation and methodology can be found in the documentation.

@ajh-oai ajh-oai marked this pull request as ready for review May 12, 2026 21:38
@ajh-oai ajh-oai marked this pull request as draft May 12, 2026 21:38
@ajh-oai ajh-oai marked this pull request as ready for review May 12, 2026 21:42
Co-authored-by: Codex <noreply@openai.com>
@github-actions
Copy link
Copy Markdown
Contributor

This is how benchmark results would change (along with a 95% confidence interval in relative change) if 7859a14 is merged into main:

  • ✔️default: 26.7s -> 21s [-62.47%, +20.28%]
  • ✔️no_delays: 23.8s -> 22.2s [-14.38%, +0.34%]
  • ✔️random_walk: 9.63s -> 9.14s [-10.98%, +0.85%]
  • ✔️stationary: 12.4s -> 21.4s [-106.69%, +251.56%]
  • ✔️uncertain: 1.06m -> 1.1m [-7.91%, +15.51%]
    Further explanation regarding interpretation and methodology can be found in the documentation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant