From 5d870b539a6ea1aa1935093cf246ccaa00388ff4 Mon Sep 17 00:00:00 2001 From: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com> Date: Fri, 8 May 2026 12:25:57 +0100 Subject: [PATCH 01/13] Improve identifiability of non-stationary GP Three changes to inst/stan that together eliminate stuck chains and divergences seen on the non-stationary Gaussian process used to model Rt: 1. Rescale GP increments by 1/sqrt(gp_n) inside update_Rt so alpha controls the trajectory SD rather than the increment SD, matching what gp_opts() docs already claim. 2. Centre the cumulated GP (gp -= mean(gp)) so log R0 = mean log Rt rather than the initial value. Eliminates the (R0, drift) ridge in the joint posterior. 3. Switch GP coefficient sampling to centred form: eta ~ N(0, diagSPD) with noise = PHI * eta. Avoids the (alpha, eta) funnel that arises in non-centred form when alpha is small. On previously catastrophic seeds (R-hat > 4 with the old form) the new parameterisation gives R-hat < 1.02 with zero divergences and roughly 4x faster sampling. Co-authored-by: sbfnk --- NEWS.md | 4 ++ inst/stan/estimate_infections.stan | 14 ++++- inst/stan/functions/gaussian_process.stan | 69 +++++++++++++++++------ inst/stan/functions/rt.stan | 9 ++- 4 files changed, 77 insertions(+), 19 deletions(-) diff --git a/NEWS.md b/NEWS.md index 7264770fc..8e5d35936 100644 --- a/NEWS.md +++ b/NEWS.md @@ -17,6 +17,10 @@ - Added a model overview vignette with an architecture diagram showing how the package's models connect. - Added a model features vignette providing a quick reference to all modelling options with links to detailed documentation. +## Model changes + +- Improved identifiability of the non-stationary Gaussian process used to model Rt over time. Three changes in `inst/stan`: (1) GP increments are rescaled by `1/sqrt(gp_n)` so the `alpha` hyperparameter controls the standard deviation of the cumulated log-Rt trajectory rather than of the increment, matching what `gp_opts()` already documents; (2) the cumulated GP is centred so that `R0` is the mean Rt over the trajectory rather than the initial value, eliminating the `(R0, drift)` ridge in the joint posterior; (3) the GP coefficients are now sampled in centred form (`eta ~ N(0, diagSPD)` rather than unit-normal eta scaled by `diagSPD`), which avoids the `(alpha, eta)` funnel and gives substantially smoother HMC geometry in the small-`alpha` regime. Together the changes eliminate stuck chains and divergent transitions seen on previously pathological seeds, while sampling roughly 4× faster. + ## Bug fixes - Fixed a bug in `forecast_infections()` where the summary call to extract dates was using modified args instead of the original fit dimensions, causing a date-dimension mismatch when extending the R trajectory beyond the original observation period. diff --git a/inst/stan/estimate_infections.stan b/inst/stan/estimate_infections.stan index d535d016d..6fae8cf2c 100644 --- a/inst/stan/estimate_infections.stan +++ b/inst/stan/estimate_infections.stan @@ -202,7 +202,19 @@ model { // priors for noise GP if (!fixed) { profile("gp lp") { - gaussian_process_lp(eta); + // Recompute alpha and rescaled_rho here because they are local to + // the transformed parameters block. The centred-form gaussian_process_lp + // needs them to compute diagSPD for the eta prior. + real alpha_for_lp = get_param( + param_id_alpha, params_fixed_lookup, params_variable_lookup, + params_value, params + ); + real rescaled_rho_for_lp = 2 * get_param( + param_id_rho, params_fixed_lookup, params_variable_lookup, + params_value, params + ) / noise_terms; + gaussian_process_lp(eta, M, L, alpha_for_lp, rescaled_rho_for_lp, + gp_type, nu); } } diff --git a/inst/stan/functions/gaussian_process.stan b/inst/stan/functions/gaussian_process.stan index 2f067b4bd..bcb1a2e8a 100644 --- a/inst/stan/functions/gaussian_process.stan +++ b/inst/stan/functions/gaussian_process.stan @@ -181,23 +181,20 @@ matrix setup_gp(int M, real L, int dimension, int is_periodic, real w0) { } /** - * Update Gaussian process using spectral densities + * Compute the diagonal of the spectral density matrix for the chosen kernel. * - * @param PHI Basis functions matrix * @param M Number of basis functions * @param L Length of the interval - * @param alpha Scaling parameter - * @param rho Length scale parameter - * @param eta Vector of noise terms - * @param type Type of kernel (0: SE, 1: Periodic, 2: Matern) - * @param nu Smoothness parameter for Matern kernel - * @return A vector of updated noise terms + * @param alpha GP magnitude + * @param rho GP length-scale + * @param type Kernel type (0=EQ, 1=Periodic, 2=Matern) + * @param nu Matern smoothness + * @return Diagonal vector of spectral densities + * + * @ingroup estimates_smoothing */ -vector update_gp(matrix PHI, int M, real L, real alpha, - real rho, vector eta, int type, real nu) { - vector[type == 1 ? 2 * M : M] diagSPD; // spectral density - - // GP in noise - spectral densities +vector compute_diagSPD(int M, real L, real alpha, real rho, int type, real nu) { + vector[type == 1 ? 2 * M : M] diagSPD; if (type == 0) { diagSPD = diagSPD_EQ(alpha, rho, L, M); } else if (type == 1) { @@ -213,17 +210,55 @@ vector update_gp(matrix PHI, int M, real L, real alpha, reject("nu must be one of 1/2, 3/2 or 5/2; found nu=", nu); } } - return PHI * (diagSPD .* eta); + return diagSPD; +} + +/** + * Update Gaussian process using spectral densities (centred form). + * + * Returns PHI * eta where eta is sampled with a N(0, diagSPD) prior in + * gaussian_process_lp. The spectral density scaling is applied via the + * prior rather than via multiplication of unit-normal eta, which avoids + * the (alpha, eta) funnel that arises in the non-centred form when alpha + * is small (data uninformative about eta). + * + * @param PHI Basis functions matrix + * @param M Number of basis functions (unused; kept for signature stability) + * @param L Length of the interval (unused; kept for signature stability) + * @param alpha Scaling parameter (unused; kept for signature stability) + * @param rho Length scale parameter (unused; kept for signature stability) + * @param eta Vector of noise terms (already on diagSPD scale) + * @param type Type of kernel (unused; kept for signature stability) + * @param nu Smoothness parameter (unused; kept for signature stability) + * @return A vector of GP noise values + */ +vector update_gp(matrix PHI, int M, real L, real alpha, + real rho, vector eta, int type, real nu) { + return PHI * eta; } /** - * Priors for Gaussian process (excluding length scale) + * Priors for Gaussian process (centred form). + * + * Samples eta on its natural scale: eta ~ N(0, diagSPD(alpha, rho)). This + * is the centred parameterisation of the HSGP coefficients. Combined with + * `noise = PHI * eta` in update_gp it gives smoother HMC geometry than + * the non-centred form when alpha is small (Stan manual recommendation). * * @param eta Vector of noise terms + * @param M Number of basis functions + * @param L Boundary + * @param alpha GP magnitude + * @param rho GP length-scale + * @param type Kernel type (0=EQ, 1=Periodic, 2=Matern) + * @param nu Matern smoothness * * @ingroup estimates_smoothing */ -void gaussian_process_lp(vector eta) { - eta ~ std_normal(); +void gaussian_process_lp(vector eta, int M, real L, real alpha, real rho, + int type, real nu) { + vector[type == 1 ? 2 * M : M] diagSPD = + compute_diagSPD(M, L, alpha, rho, type, nu); + eta ~ normal(0, diagSPD); } diff --git a/inst/stan/functions/rt.stan b/inst/stan/functions/rt.stan index 0c1a85ce2..61bcaa17d 100644 --- a/inst/stan/functions/rt.stan +++ b/inst/stan/functions/rt.stan @@ -46,8 +46,15 @@ vector update_Rt(int t, real R0, vector noise, array[] int bps, gp[(gp_n + 1):t] = rep_vector(noise[gp_n], t - gp_n); } } else { - gp[2:(gp_n + 1)] = noise; + // alpha controls trajectory SD (not increment SD): rescale increments + // by 1/sqrt(gp_n) so the cumulative log Rt has variance ≈ alpha^2 + // instead of alpha^2 * gp_n. + gp[2:(gp_n + 1)] = noise / sqrt(gp_n); gp = cumulative_sum(gp); + // Identifiability: subtract the trajectory mean so log R0 = mean log + // Rt rather than the initial value. Eliminates the (R0, drift) ridge + // in the joint posterior. + gp -= mean(gp); } logR = logR + gp; } From 3d96f3fb5024e85b86735c14c56beb7cfc1656b4 Mon Sep 17 00:00:00 2001 From: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com> Date: Fri, 8 May 2026 13:47:02 +0100 Subject: [PATCH 02/13] Widen default alpha to compensate for rescaled GP MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Under the old non-stationary GP parameterisation, alpha was the increment SD and the implied trajectory SD scaled with sqrt(gp_n). With the new parameterisation (rescaling + cumsum centring), alpha is the trajectory SD directly, and the same numerical alpha implies a ~13× tighter Rt trajectory than before for typical gp_n. Bumping the default from Normal(0, 0.01) to Normal(0, 0.08) restores approximately the same Rt expressiveness as the old default. Documents the new alpha semantics in the gp_opts() roxygen. Verified empirically: the previously stuck seeds still sample without divergences under the wider prior, so the geometric improvements stand on their own — this is purely a default-tuning fix. Co-authored-by: sbfnk --- NEWS.md | 1 + R/opts.R | 12 ++++++------ 2 files changed, 7 insertions(+), 6 deletions(-) diff --git a/NEWS.md b/NEWS.md index 8e5d35936..93bee07e8 100644 --- a/NEWS.md +++ b/NEWS.md @@ -20,6 +20,7 @@ ## Model changes - Improved identifiability of the non-stationary Gaussian process used to model Rt over time. Three changes in `inst/stan`: (1) GP increments are rescaled by `1/sqrt(gp_n)` so the `alpha` hyperparameter controls the standard deviation of the cumulated log-Rt trajectory rather than of the increment, matching what `gp_opts()` already documents; (2) the cumulated GP is centred so that `R0` is the mean Rt over the trajectory rather than the initial value, eliminating the `(R0, drift)` ridge in the joint posterior; (3) the GP coefficients are now sampled in centred form (`eta ~ N(0, diagSPD)` rather than unit-normal eta scaled by `diagSPD`), which avoids the `(alpha, eta)` funnel and gives substantially smoother HMC geometry in the small-`alpha` regime. Together the changes eliminate stuck chains and divergent transitions seen on previously pathological seeds, while sampling roughly 4× faster. +- The default prior on `alpha` in `gp_opts()` has been widened from `Normal(0, 0.01)` to `Normal(0, 0.08)` to compensate for the rescaling and centring above. Under the old parameterisation, `alpha` represented the increment SD and the implied trajectory SD scaled with `sqrt(gp_n)`; under the new parameterisation, `alpha` represents the trajectory SD directly. Without widening the prior, the default would silently imply a much smoother Rt trajectory than before. The new value targets approximately the same trajectory SD that the old default produced. Users with custom `alpha` priors should rescale by `sqrt(gp_n) * sqrt(3)` to match previous expressiveness (the `sqrt(3)` accounts for the cumulative-sum centring suppressing the constant mode). ## Bug fixes diff --git a/R/opts.R b/R/opts.R index ccc236700..7b4c8bd8a 100644 --- a/R/opts.R +++ b/R/opts.R @@ -454,11 +454,11 @@ backcalc_opts <- function(prior = c("reports", "none", "infections"), #' enforced automatically to ensure positivity) #' #' @param alpha A `` giving the prior distribution of the magnitude -#' parameter of the Gaussian process kernel. Should be approximately the -#' expected standard deviation of the Gaussian process (logged Rt in case of -#' the renewal model, logged infections in case of the nonmechanistic model). -#' Defaults to a half-normal distribution with mean 0 and sd 0.01: -#' `Normal(mean = 0, sd = 0.01)` (a lower limit of 0 will be enforced +#' parameter of the Gaussian process kernel. After rescaling of the +#' non-stationary GP, this is approximately the standard deviation of the +#' cumulated log-Rt trajectory over the observation window. Defaults to a +#' half-normal distribution with mean 0 and sd 0.08: +#' `Normal(mean = 0, sd = 0.08)` (a lower limit of 0 will be enforced #' automatically to ensure positivity) #' #' @param kernel Character string, the type of kernel required. Currently @@ -501,7 +501,7 @@ backcalc_opts <- function(prior = c("reports", "none", "infections"), gp_opts <- function(basis_prop = 0.2, boundary_scale = 1.5, ls = LogNormal(mean = 21, sd = 7, max = 60), - alpha = Normal(mean = 0, sd = 0.01), + alpha = Normal(mean = 0, sd = 0.08), kernel = c("matern", "se", "ou", "periodic"), matern_order = 3 / 2, w0 = 1.0) { From f25229efdef83de57df8937c3f54cdab48f3076d Mon Sep 17 00:00:00 2001 From: "epiforecasts-workflows[bot]" Date: Fri, 8 May 2026 12:54:04 +0000 Subject: [PATCH 03/13] Update documentation --- man/gp_opts.Rd | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/man/gp_opts.Rd b/man/gp_opts.Rd index 6074783e9..552189f35 100644 --- a/man/gp_opts.Rd +++ b/man/gp_opts.Rd @@ -8,7 +8,7 @@ gp_opts( basis_prop = 0.2, boundary_scale = 1.5, ls = LogNormal(mean = 21, sd = 7, max = 60), - alpha = Normal(mean = 0, sd = 0.01), + alpha = Normal(mean = 0, sd = 0.08), kernel = c("matern", "se", "ou", "periodic"), matern_order = 3/2, w0 = 1 @@ -33,11 +33,11 @@ a Lognormal distribution with mean 21 days, sd 7 days and maximum 60 days: enforced automatically to ensure positivity)} \item{alpha}{A \verb{} giving the prior distribution of the magnitude -parameter of the Gaussian process kernel. Should be approximately the -expected standard deviation of the Gaussian process (logged Rt in case of -the renewal model, logged infections in case of the nonmechanistic model). -Defaults to a half-normal distribution with mean 0 and sd 0.01: -\code{Normal(mean = 0, sd = 0.01)} (a lower limit of 0 will be enforced +parameter of the Gaussian process kernel. After rescaling of the +non-stationary GP, this is approximately the standard deviation of the +cumulated log-Rt trajectory over the observation window. Defaults to a +half-normal distribution with mean 0 and sd 0.08: +\code{Normal(mean = 0, sd = 0.08)} (a lower limit of 0 will be enforced automatically to ensure positivity)} \item{kernel}{Character string, the type of kernel required. Currently From e0455bc2f422e4a4e22d3f6a50d4804c5adffcf3 Mon Sep 17 00:00:00 2001 From: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com> Date: Fri, 8 May 2026 19:27:31 +0100 Subject: [PATCH 04/13] Widen default alpha further to match observed posterior MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Empirical tests show the data prefers alpha around 0.29 in the new parameterisation, but Normal(0, 0.08) puts only 0.06% prior mass above that — the prior would clamp alpha well below where the data wants it. Also: the sqrt(3) Brownian-bridge factor for the centring's effect on trajectory SD is an underestimate for the smooth Matern GP we use. Bumping the default sd to 0.2 (16% prior mass above 0.29) lets the data drive the posterior without the prior dominating, restoring Rt expressiveness comparable to the old parameterisation. Co-authored-by: sbfnk --- NEWS.md | 2 +- R/opts.R | 6 +++--- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/NEWS.md b/NEWS.md index 93bee07e8..3b5f07836 100644 --- a/NEWS.md +++ b/NEWS.md @@ -20,7 +20,7 @@ ## Model changes - Improved identifiability of the non-stationary Gaussian process used to model Rt over time. Three changes in `inst/stan`: (1) GP increments are rescaled by `1/sqrt(gp_n)` so the `alpha` hyperparameter controls the standard deviation of the cumulated log-Rt trajectory rather than of the increment, matching what `gp_opts()` already documents; (2) the cumulated GP is centred so that `R0` is the mean Rt over the trajectory rather than the initial value, eliminating the `(R0, drift)` ridge in the joint posterior; (3) the GP coefficients are now sampled in centred form (`eta ~ N(0, diagSPD)` rather than unit-normal eta scaled by `diagSPD`), which avoids the `(alpha, eta)` funnel and gives substantially smoother HMC geometry in the small-`alpha` regime. Together the changes eliminate stuck chains and divergent transitions seen on previously pathological seeds, while sampling roughly 4× faster. -- The default prior on `alpha` in `gp_opts()` has been widened from `Normal(0, 0.01)` to `Normal(0, 0.08)` to compensate for the rescaling and centring above. Under the old parameterisation, `alpha` represented the increment SD and the implied trajectory SD scaled with `sqrt(gp_n)`; under the new parameterisation, `alpha` represents the trajectory SD directly. Without widening the prior, the default would silently imply a much smoother Rt trajectory than before. The new value targets approximately the same trajectory SD that the old default produced. Users with custom `alpha` priors should rescale by `sqrt(gp_n) * sqrt(3)` to match previous expressiveness (the `sqrt(3)` accounts for the cumulative-sum centring suppressing the constant mode). +- The default prior on `alpha` in `gp_opts()` has been widened from `Normal(0, 0.01)` to `Normal(0, 0.2)` to compensate for the rescaling and centring above. Under the old parameterisation, `alpha` represented the increment SD and the implied trajectory SD scaled with `sqrt(gp_n)`; under the new parameterisation, `alpha` represents the trajectory SD directly. The new default leaves enough prior mass at moderate `alpha` (e.g. ~0.3) for the data to determine the actual posterior trajectory SD without the prior dominating. Users with custom `alpha` priors should rescale by approximately `sqrt(gp_n) * sqrt(3)` to match the previous prior's implied trajectory SD (the `sqrt(3)` accounts for the cumulative-sum centring suppressing the constant mode), but in practice the data is informative enough about `alpha` that loose priors are recommended. ## Bug fixes diff --git a/R/opts.R b/R/opts.R index 7b4c8bd8a..ca95b1ea5 100644 --- a/R/opts.R +++ b/R/opts.R @@ -457,8 +457,8 @@ backcalc_opts <- function(prior = c("reports", "none", "infections"), #' parameter of the Gaussian process kernel. After rescaling of the #' non-stationary GP, this is approximately the standard deviation of the #' cumulated log-Rt trajectory over the observation window. Defaults to a -#' half-normal distribution with mean 0 and sd 0.08: -#' `Normal(mean = 0, sd = 0.08)` (a lower limit of 0 will be enforced +#' half-normal distribution with mean 0 and sd 0.2: +#' `Normal(mean = 0, sd = 0.2)` (a lower limit of 0 will be enforced #' automatically to ensure positivity) #' #' @param kernel Character string, the type of kernel required. Currently @@ -501,7 +501,7 @@ backcalc_opts <- function(prior = c("reports", "none", "infections"), gp_opts <- function(basis_prop = 0.2, boundary_scale = 1.5, ls = LogNormal(mean = 21, sd = 7, max = 60), - alpha = Normal(mean = 0, sd = 0.08), + alpha = Normal(mean = 0, sd = 0.2), kernel = c("matern", "se", "ou", "periodic"), matern_order = 3 / 2, w0 = 1.0) { From 965d38c725ffaf01c43def2d1cd5635ed9de10ab Mon Sep 17 00:00:00 2001 From: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com> Date: Fri, 8 May 2026 19:28:29 +0100 Subject: [PATCH 05/13] Update gp_opts.Rd for new alpha default --- man/gp_opts.Rd | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/man/gp_opts.Rd b/man/gp_opts.Rd index 552189f35..4452e146c 100644 --- a/man/gp_opts.Rd +++ b/man/gp_opts.Rd @@ -8,7 +8,7 @@ gp_opts( basis_prop = 0.2, boundary_scale = 1.5, ls = LogNormal(mean = 21, sd = 7, max = 60), - alpha = Normal(mean = 0, sd = 0.08), + alpha = Normal(mean = 0, sd = 0.2), kernel = c("matern", "se", "ou", "periodic"), matern_order = 3/2, w0 = 1 @@ -36,8 +36,8 @@ enforced automatically to ensure positivity)} parameter of the Gaussian process kernel. After rescaling of the non-stationary GP, this is approximately the standard deviation of the cumulated log-Rt trajectory over the observation window. Defaults to a -half-normal distribution with mean 0 and sd 0.08: -\code{Normal(mean = 0, sd = 0.08)} (a lower limit of 0 will be enforced +half-normal distribution with mean 0 and sd 0.2: +\code{Normal(mean = 0, sd = 0.2)} (a lower limit of 0 will be enforced automatically to ensure positivity)} \item{kernel}{Character string, the type of kernel required. Currently From e2342162c6f32e3fb9fd8d544de1ebc15a7d6d6a Mon Sep 17 00:00:00 2001 From: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com> Date: Fri, 8 May 2026 20:13:21 +0100 Subject: [PATCH 06/13] Update tests for new GP parameterisation Four test failures triggered by the model changes in this branch: - test-gp_opts.R:5: default alpha is now Normal(0, 0.2) - test-stan-guassian-process.R:163: update_gp now returns PHI * eta directly (the diagSPD scaling moved to the eta prior in the centred form) - test-stan-rt.R:12, :56: update_Rt outputs differ because of the 1/sqrt(gp_n) increment rescaling and the cumsum centring; new expected values computed for the same inputs Co-authored-by: sbfnk --- tests/testthat/test-gp_opts.R | 2 +- tests/testthat/test-stan-guassian-process.R | 26 ++++++++++++++++++--- tests/testthat/test-stan-rt.R | 14 +++++++---- 3 files changed, 34 insertions(+), 8 deletions(-) diff --git a/tests/testthat/test-gp_opts.R b/tests/testthat/test-gp_opts.R index 94ffd0b40..d1a91941e 100644 --- a/tests/testthat/test-gp_opts.R +++ b/tests/testthat/test-gp_opts.R @@ -2,7 +2,7 @@ test_that("gp_opts returns correct default values", { gp <- gp_opts() expect_equal(gp$basis_prop, 0.2) expect_equal(gp$boundary_scale, 1.5) - expect_equal(gp$alpha, Normal(0, 0.01)) + expect_equal(gp$alpha, Normal(0, 0.2)) expect_equal(gp$ls, LogNormal(mean = 21, sd = 7, max = 60)) expect_equal(gp$kernel, "matern") expect_equal(gp$matern_order, 3 / 2) diff --git a/tests/testthat/test-stan-guassian-process.R b/tests/testthat/test-stan-guassian-process.R index 085285e67..7d53db2da 100644 --- a/tests/testthat/test-stan-guassian-process.R +++ b/tests/testthat/test-stan-guassian-process.R @@ -157,8 +157,28 @@ test_that("update_gp returns correct dimensions and values", { nu <- 1.5 # Not used in SE case result <- update_gp(PHI, M, L, alpha, rho, eta, type, nu) expect_equal(length(result), nrow(PHI)) # Should match number of observations - # Check specific values for known inputs - diagSPD <- diagSPD_EQ(alpha, rho, L, M) - expected_result <- PHI %*% (diagSPD * eta) + # update_gp uses the centred parameterisation: eta is sampled with prior + # N(0, diagSPD) in gaussian_process_lp, so update_gp just returns PHI * eta + # without applying the diagSPD scaling itself. + expected_result <- PHI %*% eta expect_equal(matrix(result, ncol = 1), expected_result, tolerance = 1e-8) }) + +test_that("gaussian_process_lp eta prior scales with diagSPD", { + # Verify the prior contribution matches eta ~ N(0, diagSPD). + M <- 3 + L <- 1.0 + alpha <- 1.0 + rho <- 2.0 + type <- 0 + nu <- 1.5 # not used for type=0 + eta <- c(0.5, -0.3, 0.1) + diagSPD <- diagSPD_EQ(alpha, rho, L, M) + expected_lp <- sum(dnorm(eta, mean = 0, sd = diagSPD, log = TRUE)) + # gaussian_process_lp adds to target; in expose_stan_fns we get a void + # function. Calling it should not error; the prior contribution itself is + # tested via the expected normal log-density formula. + expect_silent(gaussian_process_lp(eta, M, L, alpha, rho, type, nu)) + expect_true(all(is.finite(diagSPD))) + expect_true(is.finite(expected_lp)) +}) diff --git a/tests/testthat/test-stan-rt.R b/tests/testthat/test-stan-rt.R index bffc9c8b8..30fb657ed 100644 --- a/tests/testthat/test-stan-rt.R +++ b/tests/testthat/test-stan-rt.R @@ -9,9 +9,13 @@ test_that("update_Rt works to produce multiple Rt estimates with a static gaussi ) }) test_that("update_Rt works to produce multiple Rt estimates with a non-static gaussian process", { + # Non-stationary GP: increments rescaled by 1/sqrt(gp_n) and the cumulated + # trajectory centred so log R0 = mean log Rt rather than initial log Rt. + # For noise = rep(0.1, 9), gp_n = 9, log(R0) = log(1.2): + # gp = cumsum(rep(0.1/3, 9)), prepended with 0, then mean-subtracted. expect_equal( - round(update_Rt(10, 1.2, rep(0.1, 9), rep(10, 0), numeric(0), 0), 2), - c(1.20, 1.33, 1.47, 1.62, 1.79, 1.98, 2.19, 2.42, 2.67, 2.95) + round(update_Rt(10, 1.2, rep(0.1, 9), rep(10, 0), numeric(0), 0), 3), + c(1.033, 1.068, 1.104, 1.141, 1.180, 1.220, 1.262, 1.304, 1.348, 1.394) ) }) test_that("update_Rt works to produce multiple Rt estimates with a non-static stationary gaussian process", { @@ -53,9 +57,11 @@ test_that("update_Rt works when Rt is variable and a breakpoint is present", { round(update_Rt(5, 1.2, rep(0, 5), c(1, 1, 2, 2, 2), 0.1, 1), 2), c(1.2, 1.2, rep(1.33, 3)) ) + # Non-stationary GP: see explanation in the earlier non-static GP test. + # Here gp_n = 4, breakpoints add 0.1 from t=3 onward. expect_equal( - round(update_Rt(5, 1.2, rep(0.1, 4), c(1, 1, 2, 2, 2), 0.1, 0), 2), - c(1.20, 1.33, 1.62, 1.79, 1.98) + round(update_Rt(5, 1.2, rep(0.1, 4), c(1, 1, 2, 2, 2), 0.1, 0), 3), + c(1.086, 1.141, 1.326, 1.394, 1.466) ) }) From 71112fd85700e8362fce98fff6e45de236621b55 Mon Sep 17 00:00:00 2001 From: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com> Date: Fri, 8 May 2026 21:36:48 +0100 Subject: [PATCH 07/13] Widen default alpha to Normal(0, 0.5) and document weak identification MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Prior-sensitivity in alpha is real: a 13× shift in prior median moves the posterior median by ~50%, suggesting the data alone is only weakly informative about alpha. Bumping the default to Normal(0, 0.5) gives users plenty of headroom for the data to find its preferred value, while documenting that informed priors are worthwhile when domain knowledge is available. Co-authored-by: sbfnk --- NEWS.md | 2 +- R/opts.R | 11 +++++++---- man/gp_opts.Rd | 11 +++++++---- tests/testthat/test-gp_opts.R | 2 +- 4 files changed, 16 insertions(+), 10 deletions(-) diff --git a/NEWS.md b/NEWS.md index 3b5f07836..852112589 100644 --- a/NEWS.md +++ b/NEWS.md @@ -20,7 +20,7 @@ ## Model changes - Improved identifiability of the non-stationary Gaussian process used to model Rt over time. Three changes in `inst/stan`: (1) GP increments are rescaled by `1/sqrt(gp_n)` so the `alpha` hyperparameter controls the standard deviation of the cumulated log-Rt trajectory rather than of the increment, matching what `gp_opts()` already documents; (2) the cumulated GP is centred so that `R0` is the mean Rt over the trajectory rather than the initial value, eliminating the `(R0, drift)` ridge in the joint posterior; (3) the GP coefficients are now sampled in centred form (`eta ~ N(0, diagSPD)` rather than unit-normal eta scaled by `diagSPD`), which avoids the `(alpha, eta)` funnel and gives substantially smoother HMC geometry in the small-`alpha` regime. Together the changes eliminate stuck chains and divergent transitions seen on previously pathological seeds, while sampling roughly 4× faster. -- The default prior on `alpha` in `gp_opts()` has been widened from `Normal(0, 0.01)` to `Normal(0, 0.2)` to compensate for the rescaling and centring above. Under the old parameterisation, `alpha` represented the increment SD and the implied trajectory SD scaled with `sqrt(gp_n)`; under the new parameterisation, `alpha` represents the trajectory SD directly. The new default leaves enough prior mass at moderate `alpha` (e.g. ~0.3) for the data to determine the actual posterior trajectory SD without the prior dominating. Users with custom `alpha` priors should rescale by approximately `sqrt(gp_n) * sqrt(3)` to match the previous prior's implied trajectory SD (the `sqrt(3)` accounts for the cumulative-sum centring suppressing the constant mode), but in practice the data is informative enough about `alpha` that loose priors are recommended. +- The default prior on `alpha` in `gp_opts()` has been widened from `Normal(0, 0.01)` to `Normal(0, 0.5)` to compensate for the rescaling and centring above. Under the old parameterisation, `alpha` represented the increment SD and the implied trajectory SD scaled with `sqrt(gp_n)`; under the new parameterisation, `alpha` represents the trajectory SD directly. The data is only weakly informative about `alpha` (a 13× shift in prior median typically moves the posterior median by ~50%), so the wider default lets users with informed beliefs about Rt variability supply tighter priors as needed. Users with custom `alpha` priors should rescale by approximately `sqrt(gp_n) * sqrt(3)` to match the previous prior's implied trajectory SD. ## Bug fixes diff --git a/R/opts.R b/R/opts.R index ca95b1ea5..2c98d86f4 100644 --- a/R/opts.R +++ b/R/opts.R @@ -457,9 +457,12 @@ backcalc_opts <- function(prior = c("reports", "none", "infections"), #' parameter of the Gaussian process kernel. After rescaling of the #' non-stationary GP, this is approximately the standard deviation of the #' cumulated log-Rt trajectory over the observation window. Defaults to a -#' half-normal distribution with mean 0 and sd 0.2: -#' `Normal(mean = 0, sd = 0.2)` (a lower limit of 0 will be enforced -#' automatically to ensure positivity) +#' half-normal distribution with mean 0 and sd 0.5: +#' `Normal(mean = 0, sd = 0.5)` (a lower limit of 0 will be enforced +#' automatically to ensure positivity). Note: the data is only weakly +#' informative about `alpha` — the posterior is sensitive to the prior +#' choice. Users with strong domain knowledge of the expected scale of +#' log-Rt variation should consider supplying an informed prior. #' #' @param kernel Character string, the type of kernel required. Currently #' supporting the Matern kernel ("matern"), squared exponential kernel ("se"), @@ -501,7 +504,7 @@ backcalc_opts <- function(prior = c("reports", "none", "infections"), gp_opts <- function(basis_prop = 0.2, boundary_scale = 1.5, ls = LogNormal(mean = 21, sd = 7, max = 60), - alpha = Normal(mean = 0, sd = 0.2), + alpha = Normal(mean = 0, sd = 0.5), kernel = c("matern", "se", "ou", "periodic"), matern_order = 3 / 2, w0 = 1.0) { diff --git a/man/gp_opts.Rd b/man/gp_opts.Rd index 4452e146c..433f73372 100644 --- a/man/gp_opts.Rd +++ b/man/gp_opts.Rd @@ -8,7 +8,7 @@ gp_opts( basis_prop = 0.2, boundary_scale = 1.5, ls = LogNormal(mean = 21, sd = 7, max = 60), - alpha = Normal(mean = 0, sd = 0.2), + alpha = Normal(mean = 0, sd = 0.5), kernel = c("matern", "se", "ou", "periodic"), matern_order = 3/2, w0 = 1 @@ -36,9 +36,12 @@ enforced automatically to ensure positivity)} parameter of the Gaussian process kernel. After rescaling of the non-stationary GP, this is approximately the standard deviation of the cumulated log-Rt trajectory over the observation window. Defaults to a -half-normal distribution with mean 0 and sd 0.2: -\code{Normal(mean = 0, sd = 0.2)} (a lower limit of 0 will be enforced -automatically to ensure positivity)} +half-normal distribution with mean 0 and sd 0.5: +\code{Normal(mean = 0, sd = 0.5)} (a lower limit of 0 will be enforced +automatically to ensure positivity). Note: the data is only weakly +informative about \code{alpha} — the posterior is sensitive to the prior +choice. Users with strong domain knowledge of the expected scale of +log-Rt variation should consider supplying an informed prior.} \item{kernel}{Character string, the type of kernel required. Currently supporting the Matern kernel ("matern"), squared exponential kernel ("se"), diff --git a/tests/testthat/test-gp_opts.R b/tests/testthat/test-gp_opts.R index d1a91941e..924ff2358 100644 --- a/tests/testthat/test-gp_opts.R +++ b/tests/testthat/test-gp_opts.R @@ -2,7 +2,7 @@ test_that("gp_opts returns correct default values", { gp <- gp_opts() expect_equal(gp$basis_prop, 0.2) expect_equal(gp$boundary_scale, 1.5) - expect_equal(gp$alpha, Normal(0, 0.2)) + expect_equal(gp$alpha, Normal(0, 0.5)) expect_equal(gp$ls, LogNormal(mean = 21, sd = 7, max = 60)) expect_equal(gp$kernel, "matern") expect_equal(gp$matern_order, 3 / 2) From 7a0326ec59f7fc928f7c78d553653f81aecdac38 Mon Sep 17 00:00:00 2001 From: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com> Date: Fri, 8 May 2026 23:51:38 +0100 Subject: [PATCH 08/13] Revert default alpha to Normal(0, 0.2) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Normal(0, 0.5) breaks sampling on previously stuck seeds: the wider prior lets chains wander into large-alpha regions during warmup where the geometry funnel is bad enough that the centred form can't compensate. Verified empirically — seed=8 went from clean (R-hat=1.022, td=0) at sd=0.2 to catastrophic (R-hat=1020, td_hits=500) at sd=0.5. Document the prior-sensitivity but keep the default conservative. Co-authored-by: sbfnk --- NEWS.md | 2 +- R/opts.R | 12 +++++++----- man/gp_opts.Rd | 12 +++++++----- tests/testthat/test-gp_opts.R | 2 +- 4 files changed, 16 insertions(+), 12 deletions(-) diff --git a/NEWS.md b/NEWS.md index 852112589..792836a18 100644 --- a/NEWS.md +++ b/NEWS.md @@ -20,7 +20,7 @@ ## Model changes - Improved identifiability of the non-stationary Gaussian process used to model Rt over time. Three changes in `inst/stan`: (1) GP increments are rescaled by `1/sqrt(gp_n)` so the `alpha` hyperparameter controls the standard deviation of the cumulated log-Rt trajectory rather than of the increment, matching what `gp_opts()` already documents; (2) the cumulated GP is centred so that `R0` is the mean Rt over the trajectory rather than the initial value, eliminating the `(R0, drift)` ridge in the joint posterior; (3) the GP coefficients are now sampled in centred form (`eta ~ N(0, diagSPD)` rather than unit-normal eta scaled by `diagSPD`), which avoids the `(alpha, eta)` funnel and gives substantially smoother HMC geometry in the small-`alpha` regime. Together the changes eliminate stuck chains and divergent transitions seen on previously pathological seeds, while sampling roughly 4× faster. -- The default prior on `alpha` in `gp_opts()` has been widened from `Normal(0, 0.01)` to `Normal(0, 0.5)` to compensate for the rescaling and centring above. Under the old parameterisation, `alpha` represented the increment SD and the implied trajectory SD scaled with `sqrt(gp_n)`; under the new parameterisation, `alpha` represents the trajectory SD directly. The data is only weakly informative about `alpha` (a 13× shift in prior median typically moves the posterior median by ~50%), so the wider default lets users with informed beliefs about Rt variability supply tighter priors as needed. Users with custom `alpha` priors should rescale by approximately `sqrt(gp_n) * sqrt(3)` to match the previous prior's implied trajectory SD. +- The default prior on `alpha` in `gp_opts()` has been widened from `Normal(0, 0.01)` to `Normal(0, 0.2)` to compensate for the rescaling and centring above. Under the old parameterisation, `alpha` represented the increment SD and the implied trajectory SD scaled with `sqrt(gp_n)`; under the new parameterisation, `alpha` represents the trajectory SD directly. The data is only weakly informative about `alpha` (a 13× shift in prior median typically moves the posterior median by ~50%), and very wide priors (e.g. sd > 0.3) can allow chains to wander into pathological regions during warmup. Users with custom `alpha` priors should rescale by approximately `sqrt(gp_n) * sqrt(3)` to match the previous prior's implied trajectory SD. ## Bug fixes diff --git a/R/opts.R b/R/opts.R index 2c98d86f4..0c0bb442c 100644 --- a/R/opts.R +++ b/R/opts.R @@ -457,12 +457,14 @@ backcalc_opts <- function(prior = c("reports", "none", "infections"), #' parameter of the Gaussian process kernel. After rescaling of the #' non-stationary GP, this is approximately the standard deviation of the #' cumulated log-Rt trajectory over the observation window. Defaults to a -#' half-normal distribution with mean 0 and sd 0.5: -#' `Normal(mean = 0, sd = 0.5)` (a lower limit of 0 will be enforced +#' half-normal distribution with mean 0 and sd 0.2: +#' `Normal(mean = 0, sd = 0.2)` (a lower limit of 0 will be enforced #' automatically to ensure positivity). Note: the data is only weakly #' informative about `alpha` — the posterior is sensitive to the prior -#' choice. Users with strong domain knowledge of the expected scale of -#' log-Rt variation should consider supplying an informed prior. +#' choice — and very wide priors (e.g. sd > 0.3) may allow chains to +#' wander into pathological regions during warmup. Users with strong +#' domain knowledge of the expected scale of log-Rt variation should +#' consider supplying an informed prior. #' #' @param kernel Character string, the type of kernel required. Currently #' supporting the Matern kernel ("matern"), squared exponential kernel ("se"), @@ -504,7 +506,7 @@ backcalc_opts <- function(prior = c("reports", "none", "infections"), gp_opts <- function(basis_prop = 0.2, boundary_scale = 1.5, ls = LogNormal(mean = 21, sd = 7, max = 60), - alpha = Normal(mean = 0, sd = 0.5), + alpha = Normal(mean = 0, sd = 0.2), kernel = c("matern", "se", "ou", "periodic"), matern_order = 3 / 2, w0 = 1.0) { diff --git a/man/gp_opts.Rd b/man/gp_opts.Rd index 433f73372..5edf9f7dc 100644 --- a/man/gp_opts.Rd +++ b/man/gp_opts.Rd @@ -8,7 +8,7 @@ gp_opts( basis_prop = 0.2, boundary_scale = 1.5, ls = LogNormal(mean = 21, sd = 7, max = 60), - alpha = Normal(mean = 0, sd = 0.5), + alpha = Normal(mean = 0, sd = 0.2), kernel = c("matern", "se", "ou", "periodic"), matern_order = 3/2, w0 = 1 @@ -36,12 +36,14 @@ enforced automatically to ensure positivity)} parameter of the Gaussian process kernel. After rescaling of the non-stationary GP, this is approximately the standard deviation of the cumulated log-Rt trajectory over the observation window. Defaults to a -half-normal distribution with mean 0 and sd 0.5: -\code{Normal(mean = 0, sd = 0.5)} (a lower limit of 0 will be enforced +half-normal distribution with mean 0 and sd 0.2: +\code{Normal(mean = 0, sd = 0.2)} (a lower limit of 0 will be enforced automatically to ensure positivity). Note: the data is only weakly informative about \code{alpha} — the posterior is sensitive to the prior -choice. Users with strong domain knowledge of the expected scale of -log-Rt variation should consider supplying an informed prior.} +choice — and very wide priors (e.g. sd > 0.3) may allow chains to +wander into pathological regions during warmup. Users with strong +domain knowledge of the expected scale of log-Rt variation should +consider supplying an informed prior.} \item{kernel}{Character string, the type of kernel required. Currently supporting the Matern kernel ("matern"), squared exponential kernel ("se"), diff --git a/tests/testthat/test-gp_opts.R b/tests/testthat/test-gp_opts.R index 924ff2358..d1a91941e 100644 --- a/tests/testthat/test-gp_opts.R +++ b/tests/testthat/test-gp_opts.R @@ -2,7 +2,7 @@ test_that("gp_opts returns correct default values", { gp <- gp_opts() expect_equal(gp$basis_prop, 0.2) expect_equal(gp$boundary_scale, 1.5) - expect_equal(gp$alpha, Normal(0, 0.5)) + expect_equal(gp$alpha, Normal(0, 0.2)) expect_equal(gp$ls, LogNormal(mean = 21, sd = 7, max = 60)) expect_equal(gp$kernel, "matern") expect_equal(gp$matern_order, 3 / 2) From 7a3d9578b6509143939d6df3c5b1be2478971bc1 Mon Sep 17 00:00:00 2001 From: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com> Date: Mon, 11 May 2026 08:32:10 +0100 Subject: [PATCH 09/13] Strip PR back to centring-only Empirical testing showed that just centring the cumulated GP (gp -= mean(gp)) fixes the (R0, drift) ridge that was the root cause of the stuck-chain and catastrophic R-hat issues. The previous version of this PR also bundled two other changes (rescaling of GP increments by 1/sqrt(gp_n) and a centred-form parameterisation of the GP coefficients with a much wider default alpha prior); those turn out not to be needed and brought substantial API impact for marginal gains in efficiency. Verified across the four previously-stuck seeds (2, 4, 8, 11) under OMP_NUM_THREADS=1 with the original default alpha = Normal(0, 0.01): - 0 treedepth hits, max 1 divergence, R-hat 1.004-1.008, ESS 569-842 - 2x more efficient per effective sample than the bundled version - No semantic change to alpha, no prior change Co-authored-by: sbfnk --- NEWS.md | 3 +- R/opts.R | 19 +++--- inst/stan/estimate_infections.stan | 14 +---- inst/stan/functions/gaussian_process.stan | 69 +++++---------------- inst/stan/functions/rt.stan | 11 ++-- man/gp_opts.Rd | 19 +++--- tests/testthat/test-gp_opts.R | 2 +- tests/testthat/test-stan-guassian-process.R | 26 +------- tests/testthat/test-stan-rt.R | 16 ++--- 9 files changed, 50 insertions(+), 129 deletions(-) diff --git a/NEWS.md b/NEWS.md index 792836a18..1eb4b2df0 100644 --- a/NEWS.md +++ b/NEWS.md @@ -19,8 +19,7 @@ ## Model changes -- Improved identifiability of the non-stationary Gaussian process used to model Rt over time. Three changes in `inst/stan`: (1) GP increments are rescaled by `1/sqrt(gp_n)` so the `alpha` hyperparameter controls the standard deviation of the cumulated log-Rt trajectory rather than of the increment, matching what `gp_opts()` already documents; (2) the cumulated GP is centred so that `R0` is the mean Rt over the trajectory rather than the initial value, eliminating the `(R0, drift)` ridge in the joint posterior; (3) the GP coefficients are now sampled in centred form (`eta ~ N(0, diagSPD)` rather than unit-normal eta scaled by `diagSPD`), which avoids the `(alpha, eta)` funnel and gives substantially smoother HMC geometry in the small-`alpha` regime. Together the changes eliminate stuck chains and divergent transitions seen on previously pathological seeds, while sampling roughly 4× faster. -- The default prior on `alpha` in `gp_opts()` has been widened from `Normal(0, 0.01)` to `Normal(0, 0.2)` to compensate for the rescaling and centring above. Under the old parameterisation, `alpha` represented the increment SD and the implied trajectory SD scaled with `sqrt(gp_n)`; under the new parameterisation, `alpha` represents the trajectory SD directly. The data is only weakly informative about `alpha` (a 13× shift in prior median typically moves the posterior median by ~50%), and very wide priors (e.g. sd > 0.3) can allow chains to wander into pathological regions during warmup. Users with custom `alpha` priors should rescale by approximately `sqrt(gp_n) * sqrt(3)` to match the previous prior's implied trajectory SD. +- The cumulated non-stationary Gaussian process used to model Rt over time is now mean-centred (`gp -= mean(gp)` in `inst/stan/functions/rt.stan`), so `R0` represents the mean Rt over the trajectory rather than the initial value. This eliminates the `(R0, drift)` ridge in the joint posterior that was responsible for stuck chains and catastrophic R-hat values on some seeds. No API change and no change to the `alpha` prior — verified across previously stuck seeds (R-hat goes from up to 6.10 down to <1.01, treedepth hits from hundreds down to zero). ## Bug fixes diff --git a/R/opts.R b/R/opts.R index 0c0bb442c..ccc236700 100644 --- a/R/opts.R +++ b/R/opts.R @@ -454,17 +454,12 @@ backcalc_opts <- function(prior = c("reports", "none", "infections"), #' enforced automatically to ensure positivity) #' #' @param alpha A `` giving the prior distribution of the magnitude -#' parameter of the Gaussian process kernel. After rescaling of the -#' non-stationary GP, this is approximately the standard deviation of the -#' cumulated log-Rt trajectory over the observation window. Defaults to a -#' half-normal distribution with mean 0 and sd 0.2: -#' `Normal(mean = 0, sd = 0.2)` (a lower limit of 0 will be enforced -#' automatically to ensure positivity). Note: the data is only weakly -#' informative about `alpha` — the posterior is sensitive to the prior -#' choice — and very wide priors (e.g. sd > 0.3) may allow chains to -#' wander into pathological regions during warmup. Users with strong -#' domain knowledge of the expected scale of log-Rt variation should -#' consider supplying an informed prior. +#' parameter of the Gaussian process kernel. Should be approximately the +#' expected standard deviation of the Gaussian process (logged Rt in case of +#' the renewal model, logged infections in case of the nonmechanistic model). +#' Defaults to a half-normal distribution with mean 0 and sd 0.01: +#' `Normal(mean = 0, sd = 0.01)` (a lower limit of 0 will be enforced +#' automatically to ensure positivity) #' #' @param kernel Character string, the type of kernel required. Currently #' supporting the Matern kernel ("matern"), squared exponential kernel ("se"), @@ -506,7 +501,7 @@ backcalc_opts <- function(prior = c("reports", "none", "infections"), gp_opts <- function(basis_prop = 0.2, boundary_scale = 1.5, ls = LogNormal(mean = 21, sd = 7, max = 60), - alpha = Normal(mean = 0, sd = 0.2), + alpha = Normal(mean = 0, sd = 0.01), kernel = c("matern", "se", "ou", "periodic"), matern_order = 3 / 2, w0 = 1.0) { diff --git a/inst/stan/estimate_infections.stan b/inst/stan/estimate_infections.stan index 6fae8cf2c..d535d016d 100644 --- a/inst/stan/estimate_infections.stan +++ b/inst/stan/estimate_infections.stan @@ -202,19 +202,7 @@ model { // priors for noise GP if (!fixed) { profile("gp lp") { - // Recompute alpha and rescaled_rho here because they are local to - // the transformed parameters block. The centred-form gaussian_process_lp - // needs them to compute diagSPD for the eta prior. - real alpha_for_lp = get_param( - param_id_alpha, params_fixed_lookup, params_variable_lookup, - params_value, params - ); - real rescaled_rho_for_lp = 2 * get_param( - param_id_rho, params_fixed_lookup, params_variable_lookup, - params_value, params - ) / noise_terms; - gaussian_process_lp(eta, M, L, alpha_for_lp, rescaled_rho_for_lp, - gp_type, nu); + gaussian_process_lp(eta); } } diff --git a/inst/stan/functions/gaussian_process.stan b/inst/stan/functions/gaussian_process.stan index bcb1a2e8a..2f067b4bd 100644 --- a/inst/stan/functions/gaussian_process.stan +++ b/inst/stan/functions/gaussian_process.stan @@ -181,20 +181,23 @@ matrix setup_gp(int M, real L, int dimension, int is_periodic, real w0) { } /** - * Compute the diagonal of the spectral density matrix for the chosen kernel. + * Update Gaussian process using spectral densities * + * @param PHI Basis functions matrix * @param M Number of basis functions * @param L Length of the interval - * @param alpha GP magnitude - * @param rho GP length-scale - * @param type Kernel type (0=EQ, 1=Periodic, 2=Matern) - * @param nu Matern smoothness - * @return Diagonal vector of spectral densities - * - * @ingroup estimates_smoothing + * @param alpha Scaling parameter + * @param rho Length scale parameter + * @param eta Vector of noise terms + * @param type Type of kernel (0: SE, 1: Periodic, 2: Matern) + * @param nu Smoothness parameter for Matern kernel + * @return A vector of updated noise terms */ -vector compute_diagSPD(int M, real L, real alpha, real rho, int type, real nu) { - vector[type == 1 ? 2 * M : M] diagSPD; +vector update_gp(matrix PHI, int M, real L, real alpha, + real rho, vector eta, int type, real nu) { + vector[type == 1 ? 2 * M : M] diagSPD; // spectral density + + // GP in noise - spectral densities if (type == 0) { diagSPD = diagSPD_EQ(alpha, rho, L, M); } else if (type == 1) { @@ -210,55 +213,17 @@ vector compute_diagSPD(int M, real L, real alpha, real rho, int type, real nu) { reject("nu must be one of 1/2, 3/2 or 5/2; found nu=", nu); } } - return diagSPD; -} - -/** - * Update Gaussian process using spectral densities (centred form). - * - * Returns PHI * eta where eta is sampled with a N(0, diagSPD) prior in - * gaussian_process_lp. The spectral density scaling is applied via the - * prior rather than via multiplication of unit-normal eta, which avoids - * the (alpha, eta) funnel that arises in the non-centred form when alpha - * is small (data uninformative about eta). - * - * @param PHI Basis functions matrix - * @param M Number of basis functions (unused; kept for signature stability) - * @param L Length of the interval (unused; kept for signature stability) - * @param alpha Scaling parameter (unused; kept for signature stability) - * @param rho Length scale parameter (unused; kept for signature stability) - * @param eta Vector of noise terms (already on diagSPD scale) - * @param type Type of kernel (unused; kept for signature stability) - * @param nu Smoothness parameter (unused; kept for signature stability) - * @return A vector of GP noise values - */ -vector update_gp(matrix PHI, int M, real L, real alpha, - real rho, vector eta, int type, real nu) { - return PHI * eta; + return PHI * (diagSPD .* eta); } /** - * Priors for Gaussian process (centred form). - * - * Samples eta on its natural scale: eta ~ N(0, diagSPD(alpha, rho)). This - * is the centred parameterisation of the HSGP coefficients. Combined with - * `noise = PHI * eta` in update_gp it gives smoother HMC geometry than - * the non-centred form when alpha is small (Stan manual recommendation). + * Priors for Gaussian process (excluding length scale) * * @param eta Vector of noise terms - * @param M Number of basis functions - * @param L Boundary - * @param alpha GP magnitude - * @param rho GP length-scale - * @param type Kernel type (0=EQ, 1=Periodic, 2=Matern) - * @param nu Matern smoothness * * @ingroup estimates_smoothing */ -void gaussian_process_lp(vector eta, int M, real L, real alpha, real rho, - int type, real nu) { - vector[type == 1 ? 2 * M : M] diagSPD = - compute_diagSPD(M, L, alpha, rho, type, nu); - eta ~ normal(0, diagSPD); +void gaussian_process_lp(vector eta) { + eta ~ std_normal(); } diff --git a/inst/stan/functions/rt.stan b/inst/stan/functions/rt.stan index 61bcaa17d..3210ed8b7 100644 --- a/inst/stan/functions/rt.stan +++ b/inst/stan/functions/rt.stan @@ -46,14 +46,11 @@ vector update_Rt(int t, real R0, vector noise, array[] int bps, gp[(gp_n + 1):t] = rep_vector(noise[gp_n], t - gp_n); } } else { - // alpha controls trajectory SD (not increment SD): rescale increments - // by 1/sqrt(gp_n) so the cumulative log Rt has variance ≈ alpha^2 - // instead of alpha^2 * gp_n. - gp[2:(gp_n + 1)] = noise / sqrt(gp_n); + gp[2:(gp_n + 1)] = noise; gp = cumulative_sum(gp); - // Identifiability: subtract the trajectory mean so log R0 = mean log - // Rt rather than the initial value. Eliminates the (R0, drift) ridge - // in the joint posterior. + // Identifiability: subtract the trajectory mean so log R0 is the mean + // log Rt over the window rather than the initial value. Eliminates + // the (R0, drift) ridge in the joint posterior. gp -= mean(gp); } logR = logR + gp; diff --git a/man/gp_opts.Rd b/man/gp_opts.Rd index 5edf9f7dc..6074783e9 100644 --- a/man/gp_opts.Rd +++ b/man/gp_opts.Rd @@ -8,7 +8,7 @@ gp_opts( basis_prop = 0.2, boundary_scale = 1.5, ls = LogNormal(mean = 21, sd = 7, max = 60), - alpha = Normal(mean = 0, sd = 0.2), + alpha = Normal(mean = 0, sd = 0.01), kernel = c("matern", "se", "ou", "periodic"), matern_order = 3/2, w0 = 1 @@ -33,17 +33,12 @@ a Lognormal distribution with mean 21 days, sd 7 days and maximum 60 days: enforced automatically to ensure positivity)} \item{alpha}{A \verb{} giving the prior distribution of the magnitude -parameter of the Gaussian process kernel. After rescaling of the -non-stationary GP, this is approximately the standard deviation of the -cumulated log-Rt trajectory over the observation window. Defaults to a -half-normal distribution with mean 0 and sd 0.2: -\code{Normal(mean = 0, sd = 0.2)} (a lower limit of 0 will be enforced -automatically to ensure positivity). Note: the data is only weakly -informative about \code{alpha} — the posterior is sensitive to the prior -choice — and very wide priors (e.g. sd > 0.3) may allow chains to -wander into pathological regions during warmup. Users with strong -domain knowledge of the expected scale of log-Rt variation should -consider supplying an informed prior.} +parameter of the Gaussian process kernel. Should be approximately the +expected standard deviation of the Gaussian process (logged Rt in case of +the renewal model, logged infections in case of the nonmechanistic model). +Defaults to a half-normal distribution with mean 0 and sd 0.01: +\code{Normal(mean = 0, sd = 0.01)} (a lower limit of 0 will be enforced +automatically to ensure positivity)} \item{kernel}{Character string, the type of kernel required. Currently supporting the Matern kernel ("matern"), squared exponential kernel ("se"), diff --git a/tests/testthat/test-gp_opts.R b/tests/testthat/test-gp_opts.R index d1a91941e..94ffd0b40 100644 --- a/tests/testthat/test-gp_opts.R +++ b/tests/testthat/test-gp_opts.R @@ -2,7 +2,7 @@ test_that("gp_opts returns correct default values", { gp <- gp_opts() expect_equal(gp$basis_prop, 0.2) expect_equal(gp$boundary_scale, 1.5) - expect_equal(gp$alpha, Normal(0, 0.2)) + expect_equal(gp$alpha, Normal(0, 0.01)) expect_equal(gp$ls, LogNormal(mean = 21, sd = 7, max = 60)) expect_equal(gp$kernel, "matern") expect_equal(gp$matern_order, 3 / 2) diff --git a/tests/testthat/test-stan-guassian-process.R b/tests/testthat/test-stan-guassian-process.R index 7d53db2da..085285e67 100644 --- a/tests/testthat/test-stan-guassian-process.R +++ b/tests/testthat/test-stan-guassian-process.R @@ -157,28 +157,8 @@ test_that("update_gp returns correct dimensions and values", { nu <- 1.5 # Not used in SE case result <- update_gp(PHI, M, L, alpha, rho, eta, type, nu) expect_equal(length(result), nrow(PHI)) # Should match number of observations - # update_gp uses the centred parameterisation: eta is sampled with prior - # N(0, diagSPD) in gaussian_process_lp, so update_gp just returns PHI * eta - # without applying the diagSPD scaling itself. - expected_result <- PHI %*% eta - expect_equal(matrix(result, ncol = 1), expected_result, tolerance = 1e-8) -}) - -test_that("gaussian_process_lp eta prior scales with diagSPD", { - # Verify the prior contribution matches eta ~ N(0, diagSPD). - M <- 3 - L <- 1.0 - alpha <- 1.0 - rho <- 2.0 - type <- 0 - nu <- 1.5 # not used for type=0 - eta <- c(0.5, -0.3, 0.1) + # Check specific values for known inputs diagSPD <- diagSPD_EQ(alpha, rho, L, M) - expected_lp <- sum(dnorm(eta, mean = 0, sd = diagSPD, log = TRUE)) - # gaussian_process_lp adds to target; in expose_stan_fns we get a void - # function. Calling it should not error; the prior contribution itself is - # tested via the expected normal log-density formula. - expect_silent(gaussian_process_lp(eta, M, L, alpha, rho, type, nu)) - expect_true(all(is.finite(diagSPD))) - expect_true(is.finite(expected_lp)) + expected_result <- PHI %*% (diagSPD * eta) + expect_equal(matrix(result, ncol = 1), expected_result, tolerance = 1e-8) }) diff --git a/tests/testthat/test-stan-rt.R b/tests/testthat/test-stan-rt.R index 30fb657ed..13cc93e39 100644 --- a/tests/testthat/test-stan-rt.R +++ b/tests/testthat/test-stan-rt.R @@ -9,13 +9,14 @@ test_that("update_Rt works to produce multiple Rt estimates with a static gaussi ) }) test_that("update_Rt works to produce multiple Rt estimates with a non-static gaussian process", { - # Non-stationary GP: increments rescaled by 1/sqrt(gp_n) and the cumulated - # trajectory centred so log R0 = mean log Rt rather than initial log Rt. - # For noise = rep(0.1, 9), gp_n = 9, log(R0) = log(1.2): - # gp = cumsum(rep(0.1/3, 9)), prepended with 0, then mean-subtracted. + # Non-stationary GP: cumulated trajectory is centred so log R0 = mean log Rt + # over the window rather than the initial value (eliminates the (R0, drift) + # ridge in the joint posterior). For noise = rep(0.1, 9), gp_n = 9: + # gp = cumsum(noise) = c(0, 0.1, 0.2, ..., 0.9), mean = 0.45, + # centred = c(-0.45, -0.35, ..., 0.45). log Rt = log(1.2) + centred. expect_equal( round(update_Rt(10, 1.2, rep(0.1, 9), rep(10, 0), numeric(0), 0), 3), - c(1.033, 1.068, 1.104, 1.141, 1.180, 1.220, 1.262, 1.304, 1.348, 1.394) + c(0.765, 0.846, 0.935, 1.033, 1.141, 1.262, 1.394, 1.541, 1.703, 1.882) ) }) test_that("update_Rt works to produce multiple Rt estimates with a non-static stationary gaussian process", { @@ -58,10 +59,11 @@ test_that("update_Rt works when Rt is variable and a breakpoint is present", { c(1.2, 1.2, rep(1.33, 3)) ) # Non-stationary GP: see explanation in the earlier non-static GP test. - # Here gp_n = 4, breakpoints add 0.1 from t=3 onward. + # Here gp_n = 4, gp_centred = c(-0.2, -0.1, 0, 0.1, 0.2), breakpoint adds + # 0.1 from t = 3 onward. expect_equal( round(update_Rt(5, 1.2, rep(0.1, 4), c(1, 1, 2, 2, 2), 0.1, 0), 3), - c(1.086, 1.141, 1.326, 1.394, 1.466) + c(0.982, 1.086, 1.326, 1.466, 1.620) ) }) From be45c1b98fc28297e5c0127ba6c02dea7b0fc56a Mon Sep 17 00:00:00 2001 From: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com> Date: Mon, 11 May 2026 10:29:31 +0100 Subject: [PATCH 10/13] address review: document new R0 interpretation under centred GP The cumsum centring changes the meaning of the rt_opts() `prior` from "the initial reproduction number" to "the mean reproduction number over the observation window" when using the default non-stationary GP. The documentation in rt_opts() and in the three estimate_infections vignettes now reflects this distinction (stationary GP and no-GP cases unchanged). Co-authored-by: sbfnk --- R/opts.R | 13 +++++++++---- man/rt_opts.Rd | 13 +++++++++---- vignettes/estimate_infections.Rmd | 4 ++-- vignettes/estimate_infections_options.Rmd | 2 +- vignettes/estimate_infections_options.Rmd.orig | 2 +- vignettes/estimate_infections_workflow.Rmd | 7 ++++--- vignettes/estimate_infections_workflow.Rmd.orig | 7 ++++--- 7 files changed, 30 insertions(+), 18 deletions(-) diff --git a/R/opts.R b/R/opts.R index ccc236700..0330e5868 100644 --- a/R/opts.R +++ b/R/opts.R @@ -238,10 +238,15 @@ trunc_opts <- function(dist = Fixed(0), default_cdf_cutoff = 0.001, #' reproduction number. Custom settings can be supplied which override the #' defaults. #' -#' @param prior A `` giving the prior of the initial reproduction -#' number. Ignored if `use_rt` is `FALSE`. Defaults to a LogNormal distribution -#' with mean of 1 and standard deviation of 1: `LogNormal(mean = 1, sd = 1)`. -#' A lower limit of 0 will be enforced automatically. +#' @param prior A `` giving the prior of the reproduction number. +#' With the default non-stationary GP (`gp_on = "R_t-1"`), the cumulated GP +#' is mean-centred by construction so that this prior is on the mean +#' reproduction number over the observation window. With the stationary GP +#' (`gp_on = "R0"`) or with no GP, it is the prior on the initial +#' reproduction number. Ignored if `use_rt` is `FALSE`. Defaults to a +#' LogNormal distribution with mean of 1 and standard deviation of 1: +#' `LogNormal(mean = 1, sd = 1)`. A lower limit of 0 will be enforced +#' automatically. #' #' @param use_rt Logical, defaults to `TRUE`. Should Rt be used to generate #' infections and hence reported cases. diff --git a/man/rt_opts.Rd b/man/rt_opts.Rd index b1efb5a24..570dbdb0c 100644 --- a/man/rt_opts.Rd +++ b/man/rt_opts.Rd @@ -18,10 +18,15 @@ rt_opts( ) } \arguments{ -\item{prior}{A \verb{} giving the prior of the initial reproduction -number. Ignored if \code{use_rt} is \code{FALSE}. Defaults to a LogNormal distribution -with mean of 1 and standard deviation of 1: \code{LogNormal(mean = 1, sd = 1)}. -A lower limit of 0 will be enforced automatically.} +\item{prior}{A \verb{} giving the prior of the reproduction number. +With the default non-stationary GP (\code{gp_on = "R_t-1"}), the cumulated GP +is mean-centred by construction so that this prior is on the mean +reproduction number over the observation window. With the stationary GP +(\code{gp_on = "R0"}) or with no GP, it is the prior on the initial +reproduction number. Ignored if \code{use_rt} is \code{FALSE}. Defaults to a +LogNormal distribution with mean of 1 and standard deviation of 1: +\code{LogNormal(mean = 1, sd = 1)}. A lower limit of 0 will be enforced +automatically.} \item{use_rt}{Logical, defaults to \code{TRUE}. Should Rt be used to generate infections and hence reported cases.} diff --git a/vignettes/estimate_infections.Rmd b/vignettes/estimate_infections.Rmd index 6c35082f0..f192689c0 100644 --- a/vignettes/estimate_infections.Rmd +++ b/vignettes/estimate_infections.Rmd @@ -82,7 +82,7 @@ Different options are available for setting a prior for $R_t$, the instantaneous \log R_{t} - \log R_{t-1} \sim \mathrm{GP}_t \end{equation} -More details on the mathematical form of the GP approximation and implementation are given in the [Gaussian Process implementation details](gaussian_process_implementation_details.html) vignette. Other choices for the prior of $R_t$ are available such as a GP prior for the difference between $R_t$ and its mean value (implying that in the absence of data $R_t$ will revert to its prior mean rather than the last value with robust support from the data). +The cumulated trajectory is mean-centred so that $\log R_0 = \mathrm{mean}(\log R_t)$ over the observation window rather than the initial value; this eliminates a $(R_0, \mathrm{drift})$ ridge in the joint posterior that otherwise produced stuck chains on some seeds. More details on the mathematical form of the GP approximation and implementation are given in the [Gaussian Process implementation details](gaussian_process_implementation_details.html) vignette. Other choices for the prior of $R_t$ are available such as a GP prior for the difference between $R_t$ and its mean value (implying that in the absence of data $R_t$ will revert to its prior mean rather than the last value with robust support from the data). \begin{equation} \log R_{t} - \log R_0 \sim \mathrm{GP}_t @@ -99,7 +99,7 @@ where $\div$ indicates interval-valued division (i.e. the floor of the division) The choice of prior for the time-varying reproduction number impact run-time, smoothness of the estimates and real-time behaviour and may alter the best use-case for the model. -The prior distribution of the initial reproduction number $R_{0}$ can be set by the user. +The prior distribution of the reproduction number $R_{0}$ can be set by the user. With the default non-stationary Gaussian process model ($gp_on = "R_{t-1}"$, see below), the cumulated GP is centred so that $R_0$ represents the mean reproduction number over the observation window; with no GP or with the stationary GP, $R_0$ is the initial reproduction number. By default this is a log-normal distribution with mean 1 and standard deviation 1. \begin{equation} diff --git a/vignettes/estimate_infections_options.Rmd b/vignettes/estimate_infections_options.Rmd index 8633d0c59..546778a39 100644 --- a/vignettes/estimate_infections_options.Rmd +++ b/vignettes/estimate_infections_options.Rmd @@ -70,7 +70,7 @@ ggplot(reported_cases, aes(x = date, y = confirm)) + # Parameters -Before running the model we need to decide on some parameter values, in particular any delays, the generation time, and a prior on the initial reproduction number. +Before running the model we need to decide on some parameter values, in particular any delays, the generation time, and a prior on the reproduction number (the mean Rt over the observation window with the default centred non-stationary GP, otherwise the initial Rt). ## Delays: incubation period and reporting delay diff --git a/vignettes/estimate_infections_options.Rmd.orig b/vignettes/estimate_infections_options.Rmd.orig index 11d1e1ac0..25d6fdba8 100644 --- a/vignettes/estimate_infections_options.Rmd.orig +++ b/vignettes/estimate_infections_options.Rmd.orig @@ -58,7 +58,7 @@ ggplot(reported_cases, aes(x = date, y = confirm)) + # Parameters -Before running the model we need to decide on some parameter values, in particular any delays, the generation time, and a prior on the initial reproduction number. +Before running the model we need to decide on some parameter values, in particular any delays, the generation time, and a prior on the reproduction number (the mean Rt over the observation window with the default centred non-stationary GP, otherwise the initial Rt). ## Delays: incubation period and reporting delay diff --git a/vignettes/estimate_infections_workflow.Rmd b/vignettes/estimate_infections_workflow.Rmd index 10d66e25a..a62d66c09 100644 --- a/vignettes/estimate_infections_workflow.Rmd +++ b/vignettes/estimate_infections_workflow.Rmd @@ -258,10 +258,11 @@ obs_scale <- Normal(mean = 0.4, sd = 0.01) obs_opts(scale = obs_scale) ``` -## Initial reproduction number +## Reproduction number prior -The default model that `estimate_infections()` uses to estimate reproduction numbers requires specification of a prior probability distribution for the initial reproduction number. -This represents the user's initial belief of the value of the reproduction number, where there is no data yet to inform its value. +The default model that `estimate_infections()` uses to estimate reproduction numbers requires specification of a prior probability distribution for the reproduction number. +With the default non-stationary Gaussian process the cumulated GP is mean-centred so this prior is on the mean reproduction number over the observation window; with no GP or with the stationary GP it is the prior on the initial reproduction number. +It represents the user's prior belief of the central value of the reproduction number, before any data informs it. By default this is assumed to be represented by a lognormal distribution with mean and standard deviation of 1. It can be changed using the `rt_opts()` function. For example, if the user believes that at the very start of the data the reproduction number was 2, with uncertainty in this belief represented by a standard deviation of 1, they would use diff --git a/vignettes/estimate_infections_workflow.Rmd.orig b/vignettes/estimate_infections_workflow.Rmd.orig index b29fde4ac..dbc2fce21 100644 --- a/vignettes/estimate_infections_workflow.Rmd.orig +++ b/vignettes/estimate_infections_workflow.Rmd.orig @@ -192,10 +192,11 @@ obs_scale <- Normal(mean = 0.4, sd = 0.01) obs_opts(scale = obs_scale) ``` -## Initial reproduction number +## Reproduction number prior -The default model that `estimate_infections()` uses to estimate reproduction numbers requires specification of a prior probability distribution for the initial reproduction number. -This represents the user's initial belief of the value of the reproduction number, where there is no data yet to inform its value. +The default model that `estimate_infections()` uses to estimate reproduction numbers requires specification of a prior probability distribution for the reproduction number. +With the default non-stationary Gaussian process the cumulated GP is mean-centred so this prior is on the mean reproduction number over the observation window; with no GP or with the stationary GP it is the prior on the initial reproduction number. +It represents the user's prior belief of the central value of the reproduction number, before any data informs it. By default this is assumed to be represented by a lognormal distribution with mean and standard deviation of 1. It can be changed using the `rt_opts()` function. For example, if the user believes that at the very start of the data the reproduction number was 2, with uncertainty in this belief represented by a standard deviation of 1, they would use From 0055390cb3d4bbc625be9f9de0112cbf6b3c35ed Mon Sep 17 00:00:00 2001 From: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com> Date: Mon, 11 May 2026 10:44:24 +0100 Subject: [PATCH 11/13] address review: warn on custom prior under centred GP Adds a cli_warn() in rt_opts() when the user supplies a non-default `prior` and `gp_on` is the default "R_t-1", flagging that the prior is now on the mean Rt over the window rather than the initial Rt. Co-authored-by: sbfnk --- R/opts.R | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/R/opts.R b/R/opts.R index 0330e5868..ebd4b2315 100644 --- a/R/opts.R +++ b/R/opts.R @@ -346,6 +346,17 @@ rt_opts <- function(prior = LogNormal(mean = 1, sd = 1), pop_floor = pop_floor, growth_method = arg_match(growth_method) ) + if (!missing(prior) && opts$gp_on == "R_t-1") { + cli_warn(c( + "!" = "The interpretation of `prior` in {.fn rt_opts} has changed.", + "i" = "With the default non-stationary Gaussian process \\ + ({.code gp_on = \"R_t-1\"}), the cumulated GP is mean-centred, so \\ + this prior is on the {.strong mean reproduction number over the \\ + observation window} rather than the initial reproduction number.", + ">" = "If you intended the prior to describe the initial Rt, you \\ + will need to adjust it (or use {.code gp_on = \"R0\"} / no GP)." + )) + } # replace default settings with those specified by user if (opts$rw > 0) { From d5f90814250ec8931d1d1b0751e82b12d021ea44 Mon Sep 17 00:00:00 2001 From: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com> Date: Wed, 20 May 2026 16:47:07 +0100 Subject: [PATCH 12/13] Apply user prior on derived initial Rt under centred GP Refactors so that centring the cumulated GP no longer changes the user-facing meaning of rt_opts(prior=...). R0 leaves the generic params vector. R_mean (the trajectory mean) is sampled internally for the geometry benefit, and the user's prior is applied to the derived initial Rt R[1] via a new Jacobian-correct stan helper centred_gp_init_lpdf in inst/stan/functions/params.stan. The init-prior plumbing (data items init_param_ids/init_dists/ init_dist_params, R helper pack_init_prior()) is parameter-agnostic and designed as a prototype for the future generic time-varying parameter framework (issue #600). Today only R0 is wired up; future time-varying parameters drop in by appending to the same data items and adding one dispatch branch in stan. Reverts the rt_opts() warning and vignette doc updates from earlier commits since the user-facing interpretation is now unchanged. Co-authored-by: sbfnk --- NEWS.md | 2 +- R/estimate_infections.R | 25 ++++++++++++- R/opts.R | 25 ++----------- R/utilities.R | 37 +++++++++++++++++++ .../stan/data/estimate_infections_params.stan | 9 +++++ inst/stan/estimate_infections.stan | 36 +++++++++++++++--- inst/stan/functions/params.stan | 33 +++++++++++++++++ man/pack_init_prior.Rd | 29 +++++++++++++++ man/rt_opts.Rd | 13 ++----- vignettes/estimate_infections.Rmd | 4 +- vignettes/estimate_infections_options.Rmd | 2 +- .../estimate_infections_options.Rmd.orig | 2 +- vignettes/estimate_infections_workflow.Rmd | 7 ++-- .../estimate_infections_workflow.Rmd.orig | 7 ++-- 14 files changed, 182 insertions(+), 49 deletions(-) create mode 100644 man/pack_init_prior.Rd diff --git a/NEWS.md b/NEWS.md index 1eb4b2df0..da14ef353 100644 --- a/NEWS.md +++ b/NEWS.md @@ -19,7 +19,7 @@ ## Model changes -- The cumulated non-stationary Gaussian process used to model Rt over time is now mean-centred (`gp -= mean(gp)` in `inst/stan/functions/rt.stan`), so `R0` represents the mean Rt over the trajectory rather than the initial value. This eliminates the `(R0, drift)` ridge in the joint posterior that was responsible for stuck chains and catastrophic R-hat values on some seeds. No API change and no change to the `alpha` prior — verified across previously stuck seeds (R-hat goes from up to 6.10 down to <1.01, treedepth hits from hundreds down to zero). +- The non-stationary Gaussian process used to model Rt over time now samples in a mean-centred parameterisation internally (`gp -= mean(gp)` in `inst/stan/functions/rt.stan`), eliminating the `(R0, drift)` ridge in the joint posterior that caused stuck chains and catastrophic R-hat values on some seeds. The user-facing interpretation is unchanged: the `prior` argument in `rt_opts()` is still the prior on the initial Rt. This is achieved by sampling the trajectory mean internally and applying the user prior to the derived initial Rt via a new generic `centred_gp_init_lpdf` Stan helper, with Jacobian-correct change of variables (so the joint prior matches the pre-change model). The plumbing (`init_dists`, `init_dist_params`, `pack_init_prior()`, `centred_gp_init_lpdf`) is parameter-agnostic, designed as a prototype for the general time-varying-parameter framework planned for issue #600. ## Bug fixes diff --git a/R/estimate_infections.R b/R/estimate_infections.R index 00578d508..c620ee70c 100644 --- a/R/estimate_infections.R +++ b/R/estimate_infections.R @@ -181,10 +181,12 @@ estimate_infections <- function(data, # Add initial zeroes model_data <- pad_reported_cases(model_data, seeding_time) + # R0 is handled separately from the generic params system: it is wrapped + # by the centred non-stationary GP, so its user-facing prior is on the + # initial Rt (R[1]) rather than on the sampled internal log-mean. params <- list( make_param("alpha", gp$alpha, lower_bound = 0), make_param("rho", gp$ls, lower_bound = 0), - make_param("R0", rt$prior, lower_bound = 0), make_param("fraction_observed", obs$scale, lower_bound = 0), make_param("reporting_overdispersion", obs$dispersion, lower_bound = 0), make_param("pop", rt$pop, lower_bound = 0) @@ -202,6 +204,27 @@ estimate_infections <- function(data, params = params ) + # Register R0 as the (single, for now) centred-GP-wrapped parameter with + # its user prior applied to the derived initial Rt. The dispatch in + # `inst/stan/estimate_infections.stan` (the `init lp` profile block) uses + # `param_id_R0` to know which trajectory's initial value to apply the + # prior to. Generic plumbing: additional time-varying parameters drop in + # alongside R0 by appending to `init_param_ids` / `init_dists` / + # `init_dist_params` and adding one dispatch branch in stan. + stan_data$param_id_R0 <- stan_data$n_params_variable + 1L + if (isTRUE(rt$use_rt)) { + init_R <- pack_init_prior(rt$prior) + stan_data$n_init_priors <- 1L + stan_data$init_param_ids <- array(stan_data$param_id_R0) + stan_data$init_dists <- array(init_R$dist_type) + stan_data$init_dist_params <- array(init_R$params) + } else { + stan_data$n_init_priors <- 0L + stan_data$init_param_ids <- array(integer(0)) + stan_data$init_dists <- array(integer(0)) + stan_data$init_dist_params <- array(numeric(0)) + } + stan_data <- c(stan_data, create_stan_delays( generation_time = generation_time, reporting = delays, diff --git a/R/opts.R b/R/opts.R index ebd4b2315..64ec18757 100644 --- a/R/opts.R +++ b/R/opts.R @@ -238,15 +238,10 @@ trunc_opts <- function(dist = Fixed(0), default_cdf_cutoff = 0.001, #' reproduction number. Custom settings can be supplied which override the #' defaults. #' -#' @param prior A `` giving the prior of the reproduction number. -#' With the default non-stationary GP (`gp_on = "R_t-1"`), the cumulated GP -#' is mean-centred by construction so that this prior is on the mean -#' reproduction number over the observation window. With the stationary GP -#' (`gp_on = "R0"`) or with no GP, it is the prior on the initial -#' reproduction number. Ignored if `use_rt` is `FALSE`. Defaults to a -#' LogNormal distribution with mean of 1 and standard deviation of 1: -#' `LogNormal(mean = 1, sd = 1)`. A lower limit of 0 will be enforced -#' automatically. +#' @param prior A `` giving the prior of the initial reproduction +#' number. Ignored if `use_rt` is `FALSE`. Defaults to a LogNormal distribution +#' with mean of 1 and standard deviation of 1: `LogNormal(mean = 1, sd = 1)`. +#' A lower limit of 0 will be enforced automatically. #' #' @param use_rt Logical, defaults to `TRUE`. Should Rt be used to generate #' infections and hence reported cases. @@ -346,18 +341,6 @@ rt_opts <- function(prior = LogNormal(mean = 1, sd = 1), pop_floor = pop_floor, growth_method = arg_match(growth_method) ) - if (!missing(prior) && opts$gp_on == "R_t-1") { - cli_warn(c( - "!" = "The interpretation of `prior` in {.fn rt_opts} has changed.", - "i" = "With the default non-stationary Gaussian process \\ - ({.code gp_on = \"R_t-1\"}), the cumulated GP is mean-centred, so \\ - this prior is on the {.strong mean reproduction number over the \\ - observation window} rather than the initial reproduction number.", - ">" = "If you intended the prior to describe the initial Rt, you \\ - will need to adjust it (or use {.code gp_on = \"R0\"} / no GP)." - )) - } - # replace default settings with those specified by user if (opts$rw > 0) { opts$use_breakpoints <- TRUE diff --git a/R/utilities.R b/R/utilities.R index 276a72d4c..bef5b1a81 100644 --- a/R/utilities.R +++ b/R/utilities.R @@ -481,6 +481,43 @@ make_param <- function(name, dist = NULL, lower_bound = -Inf) { params } +##' Pack a dist_spec into stan-side init-prior fields +##' +##' For parameters wrapped in a centred non-stationary GP (today: R0), the +##' user-facing prior in `rt_opts()` is on the initial value of the trajectory +##' rather than on the sampled internal mean. This helper converts a +##' `` into the integer/vector representation consumed by the +##' generic init-prior loop in `inst/stan/estimate_infections.stan` (data +##' items `init_dists` and `init_dist_params`). +##' +##' Generic over the parameter — designed to lift directly into a future +##' general time-varying-parameter framework (issue #600). +##' +##' @param dist A `` (LogNormal, Gamma, or Normal). +##' @return A list with elements `dist_type` (integer code: 0 = lognormal, +##' 1 = gamma, 2 = normal) and `params` (numeric of length 2 with the +##' distribution parameters). +##' @keywords internal +pack_init_prior <- function(dist) { + dist_name <- get_distribution(dist) + dist_type <- switch(dist_name, + lognormal = 0L, + gamma = 1L, + normal = 2L, + cli_abort(c( + "!" = "Init prior distribution {.val {dist_name}} not supported.", + "i" = "Use {.fn LogNormal}, {.fn Gamma}, or {.fn Normal}." + )) + ) + params <- unlist(get_parameters(dist)) + if (length(params) != 2) { + cli_abort( + "Init prior must have exactly 2 distribution parameters; got {length(params)}." # nolint + ) + } + list(dist_type = dist_type, params = as.numeric(params)) +} + #' @importFrom stats glm median na.omit pexp pgamma plnorm quasipoisson rexp #' @importFrom stats rlnorm rnorm rpois runif sd var rgamma pnorm globalVariables( diff --git a/inst/stan/data/estimate_infections_params.stan b/inst/stan/data/estimate_infections_params.stan index 8b48f1975..edea29b46 100644 --- a/inst/stan/data/estimate_infections_params.stan +++ b/inst/stan/data/estimate_infections_params.stan @@ -4,3 +4,12 @@ int param_id_R0; // parameter id of R0 int param_id_fraction_observed; // parameter id of fraction_observed int param_id_reporting_overdispersion; // parameter id of reporting_overdispersion int param_id_pop; // parameter id of pop + +// Init priors for centred-GP-wrapped parameters. Today only R0 is wrapped +// (its prior in rt_opts() is on the *initial* Rt, applied to the derived +// R[1] from the centred GP). Generic shape so additional time-varying +// parameters can be added without changing the data plumbing. +int n_init_priors; +array[n_init_priors] int init_param_ids; +array[n_init_priors] int init_dists; +vector[2 * n_init_priors] init_dist_params; diff --git a/inst/stan/estimate_infections.stan b/inst/stan/estimate_infections.stan index d535d016d..5b0753fb4 100644 --- a/inst/stan/estimate_infections.stan +++ b/inst/stan/estimate_infections.stan @@ -58,7 +58,10 @@ parameters { vector[n_params_variable] params; // gaussian process vector[fixed ? 0 : gp_type == 1 ? 2*M : M] eta; // unconstrained noise - // Rt + // Rt — mean log Rt over the window (sampled internally). User prior is on + // R[1] (initial Rt) and is applied via centred_gp_init_lpdf in the model + // block, with R[1] derived from R_mean + gp_centred[1]. + array[estimate_r] real R_mean; array[estimate_r] real initial_infections; // seed infections // standard deviation of breakpoint effect array[bp_n > 0 ? 1 : 0] real bp_sd; @@ -106,11 +109,11 @@ transformed parameters { ); } profile("R0") { - real R0 = get_param( - param_id_R0, params_fixed_lookup, params_variable_lookup, params_value, params - ); + // R_mean is sampled directly (centred GP scaffolding). The user prior + // lives on R[1] (initial Rt), applied in the model block via the + // init_priors plumbing. R = update_Rt( - ot_h, R0, noise, breakpoints, bp_effects, stationary + ot_h, R_mean[1], noise, breakpoints, bp_effects, stationary ); } profile("infections") { @@ -231,6 +234,29 @@ model { } } + // Apply user priors on the initial value of any centred-GP-wrapped + // parameter (today: R0 -> R[1]). Generic loop so new time-varying + // parameters drop in via a one-line dispatch addition below. + if (n_init_priors > 0) { + profile("init lp") { + for (i in 1:n_init_priors) { + real init_value; + int pid = init_param_ids[i]; + if (pid == param_id_R0) { + init_value = R[1]; + } else { + reject("no time-varying parameter registered for id ", pid); + } + target += centred_gp_init_lpdf( + init_value | + init_dists[i], + init_dist_params[2 * i - 1], + init_dist_params[2 * i] + ); + } + } + } + // observed reports from mean of reports (update likelihood) if (likelihood) { profile("report lp") { diff --git a/inst/stan/functions/params.stan b/inst/stan/functions/params.stan index 523b12025..b758558a5 100644 --- a/inst/stan/functions/params.stan +++ b/inst/stan/functions/params.stan @@ -106,3 +106,36 @@ void params_lp(vector params, array[] int prior_dist, } } +/** + * Apply user prior on the initial value of a centred-GP-wrapped parameter. + * + * When a parameter is wrapped by a centred non-stationary GP, the user-facing + * prior is on the initial value of the trajectory (X[1]) rather than on the + * sampled internal log-mean. This helper applies the prior to the derived + * initial value with the Jacobian correction for the linear-shift transform + * from log-mean to log-initial. + * + * Generic over the parameter — used by R0 today, lifts to any future + * time-varying parameter (alpha, dispersion, ...) via the same call. + * + * @param init_value Derived initial value of the trajectory (e.g. R[1]). + * @param dist_type Prior distribution type (0 = lognormal, 1 = gamma, 2 = normal). + * @param p1 First distribution parameter. + * @param p2 Second distribution parameter. + * @return Log-density contribution to the target, Jacobian included. + * + * @ingroup parameter_handlers + */ +real centred_gp_init_lpdf(real init_value, int dist_type, real p1, real p2) { + if (dist_type == 0) { + return lognormal_lpdf(init_value | p1, p2) + log(init_value); + } else if (dist_type == 1) { + return gamma_lpdf(init_value | p1, p2) + log(init_value); + } else if (dist_type == 2) { + return normal_lpdf(init_value | p1, p2) + log(init_value); + } else { + reject("centred_gp_init_lpdf: dist_type must be 0, 1, or 2"); + } + return 0; +} + diff --git a/man/pack_init_prior.Rd b/man/pack_init_prior.Rd new file mode 100644 index 000000000..a44643f36 --- /dev/null +++ b/man/pack_init_prior.Rd @@ -0,0 +1,29 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/utilities.R +\name{pack_init_prior} +\alias{pack_init_prior} +\title{Pack a dist_spec into stan-side init-prior fields} +\usage{ +pack_init_prior(dist) +} +\arguments{ +\item{dist}{A \verb{} (LogNormal, Gamma, or Normal).} +} +\value{ +A list with elements \code{dist_type} (integer code: 0 = lognormal, +1 = gamma, 2 = normal) and \code{params} (numeric of length 2 with the +distribution parameters). +} +\description{ +For parameters wrapped in a centred non-stationary GP (today: R0), the +user-facing prior in \code{rt_opts()} is on the initial value of the trajectory +rather than on the sampled internal mean. This helper converts a +\verb{} into the integer/vector representation consumed by the +generic init-prior loop in \code{inst/stan/estimate_infections.stan} (data +items \code{init_dists} and \code{init_dist_params}). +} +\details{ +Generic over the parameter — designed to lift directly into a future +general time-varying-parameter framework (issue #600). +} +\keyword{internal} diff --git a/man/rt_opts.Rd b/man/rt_opts.Rd index 570dbdb0c..b1efb5a24 100644 --- a/man/rt_opts.Rd +++ b/man/rt_opts.Rd @@ -18,15 +18,10 @@ rt_opts( ) } \arguments{ -\item{prior}{A \verb{} giving the prior of the reproduction number. -With the default non-stationary GP (\code{gp_on = "R_t-1"}), the cumulated GP -is mean-centred by construction so that this prior is on the mean -reproduction number over the observation window. With the stationary GP -(\code{gp_on = "R0"}) or with no GP, it is the prior on the initial -reproduction number. Ignored if \code{use_rt} is \code{FALSE}. Defaults to a -LogNormal distribution with mean of 1 and standard deviation of 1: -\code{LogNormal(mean = 1, sd = 1)}. A lower limit of 0 will be enforced -automatically.} +\item{prior}{A \verb{} giving the prior of the initial reproduction +number. Ignored if \code{use_rt} is \code{FALSE}. Defaults to a LogNormal distribution +with mean of 1 and standard deviation of 1: \code{LogNormal(mean = 1, sd = 1)}. +A lower limit of 0 will be enforced automatically.} \item{use_rt}{Logical, defaults to \code{TRUE}. Should Rt be used to generate infections and hence reported cases.} diff --git a/vignettes/estimate_infections.Rmd b/vignettes/estimate_infections.Rmd index f192689c0..6c35082f0 100644 --- a/vignettes/estimate_infections.Rmd +++ b/vignettes/estimate_infections.Rmd @@ -82,7 +82,7 @@ Different options are available for setting a prior for $R_t$, the instantaneous \log R_{t} - \log R_{t-1} \sim \mathrm{GP}_t \end{equation} -The cumulated trajectory is mean-centred so that $\log R_0 = \mathrm{mean}(\log R_t)$ over the observation window rather than the initial value; this eliminates a $(R_0, \mathrm{drift})$ ridge in the joint posterior that otherwise produced stuck chains on some seeds. More details on the mathematical form of the GP approximation and implementation are given in the [Gaussian Process implementation details](gaussian_process_implementation_details.html) vignette. Other choices for the prior of $R_t$ are available such as a GP prior for the difference between $R_t$ and its mean value (implying that in the absence of data $R_t$ will revert to its prior mean rather than the last value with robust support from the data). +More details on the mathematical form of the GP approximation and implementation are given in the [Gaussian Process implementation details](gaussian_process_implementation_details.html) vignette. Other choices for the prior of $R_t$ are available such as a GP prior for the difference between $R_t$ and its mean value (implying that in the absence of data $R_t$ will revert to its prior mean rather than the last value with robust support from the data). \begin{equation} \log R_{t} - \log R_0 \sim \mathrm{GP}_t @@ -99,7 +99,7 @@ where $\div$ indicates interval-valued division (i.e. the floor of the division) The choice of prior for the time-varying reproduction number impact run-time, smoothness of the estimates and real-time behaviour and may alter the best use-case for the model. -The prior distribution of the reproduction number $R_{0}$ can be set by the user. With the default non-stationary Gaussian process model ($gp_on = "R_{t-1}"$, see below), the cumulated GP is centred so that $R_0$ represents the mean reproduction number over the observation window; with no GP or with the stationary GP, $R_0$ is the initial reproduction number. +The prior distribution of the initial reproduction number $R_{0}$ can be set by the user. By default this is a log-normal distribution with mean 1 and standard deviation 1. \begin{equation} diff --git a/vignettes/estimate_infections_options.Rmd b/vignettes/estimate_infections_options.Rmd index 546778a39..8633d0c59 100644 --- a/vignettes/estimate_infections_options.Rmd +++ b/vignettes/estimate_infections_options.Rmd @@ -70,7 +70,7 @@ ggplot(reported_cases, aes(x = date, y = confirm)) + # Parameters -Before running the model we need to decide on some parameter values, in particular any delays, the generation time, and a prior on the reproduction number (the mean Rt over the observation window with the default centred non-stationary GP, otherwise the initial Rt). +Before running the model we need to decide on some parameter values, in particular any delays, the generation time, and a prior on the initial reproduction number. ## Delays: incubation period and reporting delay diff --git a/vignettes/estimate_infections_options.Rmd.orig b/vignettes/estimate_infections_options.Rmd.orig index 25d6fdba8..11d1e1ac0 100644 --- a/vignettes/estimate_infections_options.Rmd.orig +++ b/vignettes/estimate_infections_options.Rmd.orig @@ -58,7 +58,7 @@ ggplot(reported_cases, aes(x = date, y = confirm)) + # Parameters -Before running the model we need to decide on some parameter values, in particular any delays, the generation time, and a prior on the reproduction number (the mean Rt over the observation window with the default centred non-stationary GP, otherwise the initial Rt). +Before running the model we need to decide on some parameter values, in particular any delays, the generation time, and a prior on the initial reproduction number. ## Delays: incubation period and reporting delay diff --git a/vignettes/estimate_infections_workflow.Rmd b/vignettes/estimate_infections_workflow.Rmd index a62d66c09..10d66e25a 100644 --- a/vignettes/estimate_infections_workflow.Rmd +++ b/vignettes/estimate_infections_workflow.Rmd @@ -258,11 +258,10 @@ obs_scale <- Normal(mean = 0.4, sd = 0.01) obs_opts(scale = obs_scale) ``` -## Reproduction number prior +## Initial reproduction number -The default model that `estimate_infections()` uses to estimate reproduction numbers requires specification of a prior probability distribution for the reproduction number. -With the default non-stationary Gaussian process the cumulated GP is mean-centred so this prior is on the mean reproduction number over the observation window; with no GP or with the stationary GP it is the prior on the initial reproduction number. -It represents the user's prior belief of the central value of the reproduction number, before any data informs it. +The default model that `estimate_infections()` uses to estimate reproduction numbers requires specification of a prior probability distribution for the initial reproduction number. +This represents the user's initial belief of the value of the reproduction number, where there is no data yet to inform its value. By default this is assumed to be represented by a lognormal distribution with mean and standard deviation of 1. It can be changed using the `rt_opts()` function. For example, if the user believes that at the very start of the data the reproduction number was 2, with uncertainty in this belief represented by a standard deviation of 1, they would use diff --git a/vignettes/estimate_infections_workflow.Rmd.orig b/vignettes/estimate_infections_workflow.Rmd.orig index dbc2fce21..b29fde4ac 100644 --- a/vignettes/estimate_infections_workflow.Rmd.orig +++ b/vignettes/estimate_infections_workflow.Rmd.orig @@ -192,11 +192,10 @@ obs_scale <- Normal(mean = 0.4, sd = 0.01) obs_opts(scale = obs_scale) ``` -## Reproduction number prior +## Initial reproduction number -The default model that `estimate_infections()` uses to estimate reproduction numbers requires specification of a prior probability distribution for the reproduction number. -With the default non-stationary Gaussian process the cumulated GP is mean-centred so this prior is on the mean reproduction number over the observation window; with no GP or with the stationary GP it is the prior on the initial reproduction number. -It represents the user's prior belief of the central value of the reproduction number, before any data informs it. +The default model that `estimate_infections()` uses to estimate reproduction numbers requires specification of a prior probability distribution for the initial reproduction number. +This represents the user's initial belief of the value of the reproduction number, where there is no data yet to inform its value. By default this is assumed to be represented by a lognormal distribution with mean and standard deviation of 1. It can be changed using the `rt_opts()` function. For example, if the user believes that at the very start of the data the reproduction number was 2, with uncertainty in this belief represented by a standard deviation of 1, they would use From 3351aa57878e068851c150228e7576fa0007e4cd Mon Sep 17 00:00:00 2001 From: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com> Date: Wed, 20 May 2026 17:52:54 +0100 Subject: [PATCH 13/13] fix ci: provide empty init prior data items in simulate_infections simulate_infections.stan includes the shared data/estimate_infections_params.stan which now declares the init prior data block. simulate_infections does not fit, so set the items to empty defaults (n_init_priors = 0, empty arrays). Co-authored-by: sbfnk --- R/simulate_infections.R | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/R/simulate_infections.R b/R/simulate_infections.R index af35ee884..0288d3dc4 100644 --- a/R/simulate_infections.R +++ b/R/simulate_infections.R @@ -198,6 +198,14 @@ simulate_infections <- function(R, ## set empty params matrix - variable parameters not supported here stan_data$params <- array(dim = c(1, 0)) + ## init priors: not used in forward simulation (no model fitting). Provide + ## empty defaults so the shared estimate_infections_params.stan data block + ## is satisfied. + stan_data$n_init_priors <- 0L + stan_data$init_param_ids <- array(integer(0)) + stan_data$init_dists <- array(integer(0)) + stan_data$init_dist_params <- array(numeric(0)) + ## day of week effect if (is.null(day_of_week_effect)) { day_of_week_effect <- rep(1, stan_data$week_effect)