From 5d870b539a6ea1aa1935093cf246ccaa00388ff4 Mon Sep 17 00:00:00 2001
From: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com>
Date: Fri, 8 May 2026 12:25:57 +0100
Subject: [PATCH 01/13] Improve identifiability of non-stationary GP

Three changes to inst/stan that together eliminate stuck chains and
divergences seen on the non-stationary Gaussian process used to model Rt:

1. Rescale GP increments by 1/sqrt(gp_n) inside update_Rt so alpha
   controls the trajectory SD rather than the increment SD, matching
   what gp_opts() docs already claim.

2. Centre the cumulated GP (gp -= mean(gp)) so log R0 = mean log Rt
   rather than the initial value. Eliminates the (R0, drift) ridge in
   the joint posterior.

3. Switch GP coefficient sampling to centred form: eta ~ N(0, diagSPD)
   with noise = PHI * eta. Avoids the (alpha, eta) funnel that arises
   in non-centred form when alpha is small.

On previously catastrophic seeds (R-hat > 4 with the old form) the new
parameterisation gives R-hat < 1.02 with zero divergences and roughly 4x
faster sampling.

Co-authored-by: sbfnk <sebastian.funk@lshtm.ac.uk>
---
 NEWS.md                                   |  4 ++
 inst/stan/estimate_infections.stan        | 14 ++++-
 inst/stan/functions/gaussian_process.stan | 69 +++++++++++++++++------
 inst/stan/functions/rt.stan               |  9 ++-
 4 files changed, 77 insertions(+), 19 deletions(-)

diff --git a/NEWS.md b/NEWS.md
index 7264770fc..8e5d35936 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -17,6 +17,10 @@
 - Added a model overview vignette with an architecture diagram showing how the package's models connect.
 - Added a model features vignette providing a quick reference to all modelling options with links to detailed documentation.
 
+## Model changes
+
+- Improved identifiability of the non-stationary Gaussian process used to model Rt over time. Three changes in `inst/stan`: (1) GP increments are rescaled by `1/sqrt(gp_n)` so the `alpha` hyperparameter controls the standard deviation of the cumulated log-Rt trajectory rather than of the increment, matching what `gp_opts()` already documents; (2) the cumulated GP is centred so that `R0` is the mean Rt over the trajectory rather than the initial value, eliminating the `(R0, drift)` ridge in the joint posterior; (3) the GP coefficients are now sampled in centred form (`eta ~ N(0, diagSPD)` rather than unit-normal eta scaled by `diagSPD`), which avoids the `(alpha, eta)` funnel and gives substantially smoother HMC geometry in the small-`alpha` regime. Together the changes eliminate stuck chains and divergent transitions seen on previously pathological seeds, while sampling roughly 4× faster.
+
 ## Bug fixes
 
 - Fixed a bug in `forecast_infections()` where the summary call to extract dates was using modified args instead of the original fit dimensions, causing a date-dimension mismatch when extending the R trajectory beyond the original observation period.
diff --git a/inst/stan/estimate_infections.stan b/inst/stan/estimate_infections.stan
index d535d016d..6fae8cf2c 100644
--- a/inst/stan/estimate_infections.stan
+++ b/inst/stan/estimate_infections.stan
@@ -202,7 +202,19 @@ model {
   // priors for noise GP
   if (!fixed) {
     profile("gp lp") {
-      gaussian_process_lp(eta);
+      // Recompute alpha and rescaled_rho here because they are local to
+      // the transformed parameters block. The centred-form gaussian_process_lp
+      // needs them to compute diagSPD for the eta prior.
+      real alpha_for_lp = get_param(
+        param_id_alpha, params_fixed_lookup, params_variable_lookup,
+        params_value, params
+      );
+      real rescaled_rho_for_lp = 2 * get_param(
+        param_id_rho, params_fixed_lookup, params_variable_lookup,
+        params_value, params
+      ) / noise_terms;
+      gaussian_process_lp(eta, M, L, alpha_for_lp, rescaled_rho_for_lp,
+                          gp_type, nu);
     }
   }
 
diff --git a/inst/stan/functions/gaussian_process.stan b/inst/stan/functions/gaussian_process.stan
index 2f067b4bd..bcb1a2e8a 100644
--- a/inst/stan/functions/gaussian_process.stan
+++ b/inst/stan/functions/gaussian_process.stan
@@ -181,23 +181,20 @@ matrix setup_gp(int M, real L, int dimension, int is_periodic, real w0) {
 }
 
 /**
-  * Update Gaussian process using spectral densities
+  * Compute the diagonal of the spectral density matrix for the chosen kernel.
   *
-  * @param PHI Basis functions matrix
   * @param M Number of basis functions
   * @param L Length of the interval
-  * @param alpha Scaling parameter
-  * @param rho Length scale parameter
-  * @param eta Vector of noise terms
-  * @param type Type of kernel (0: SE, 1: Periodic, 2: Matern)
-  * @param nu Smoothness parameter for Matern kernel
-  * @return A vector of updated noise terms
+  * @param alpha GP magnitude
+  * @param rho GP length-scale
+  * @param type Kernel type (0=EQ, 1=Periodic, 2=Matern)
+  * @param nu Matern smoothness
+  * @return Diagonal vector of spectral densities
+  *
+  * @ingroup estimates_smoothing
   */
-vector update_gp(matrix PHI, int M, real L, real alpha,
-                 real rho, vector eta, int type, real nu) {
-  vector[type == 1 ? 2 * M : M] diagSPD;    // spectral density
-
-  // GP in noise - spectral densities
+vector compute_diagSPD(int M, real L, real alpha, real rho, int type, real nu) {
+  vector[type == 1 ? 2 * M : M] diagSPD;
   if (type == 0) {
     diagSPD = diagSPD_EQ(alpha, rho, L, M);
   } else if (type == 1) {
@@ -213,17 +210,55 @@ vector update_gp(matrix PHI, int M, real L, real alpha,
       reject("nu must be one of 1/2, 3/2 or 5/2; found nu=", nu);
     }
   }
-  return PHI * (diagSPD .* eta);
+  return diagSPD;
+}
+
+/**
+  * Update Gaussian process using spectral densities (centred form).
+  *
+  * Returns PHI * eta where eta is sampled with a N(0, diagSPD) prior in
+  * gaussian_process_lp. The spectral density scaling is applied via the
+  * prior rather than via multiplication of unit-normal eta, which avoids
+  * the (alpha, eta) funnel that arises in the non-centred form when alpha
+  * is small (data uninformative about eta).
+  *
+  * @param PHI Basis functions matrix
+  * @param M Number of basis functions (unused; kept for signature stability)
+  * @param L Length of the interval (unused; kept for signature stability)
+  * @param alpha Scaling parameter (unused; kept for signature stability)
+  * @param rho Length scale parameter (unused; kept for signature stability)
+  * @param eta Vector of noise terms (already on diagSPD scale)
+  * @param type Type of kernel (unused; kept for signature stability)
+  * @param nu Smoothness parameter (unused; kept for signature stability)
+  * @return A vector of GP noise values
+  */
+vector update_gp(matrix PHI, int M, real L, real alpha,
+                 real rho, vector eta, int type, real nu) {
+  return PHI * eta;
 }
 
 /**
-  * Priors for Gaussian process (excluding length scale)
+  * Priors for Gaussian process (centred form).
+  *
+  * Samples eta on its natural scale: eta ~ N(0, diagSPD(alpha, rho)). This
+  * is the centred parameterisation of the HSGP coefficients. Combined with
+  * `noise = PHI * eta` in update_gp it gives smoother HMC geometry than
+  * the non-centred form when alpha is small (Stan manual recommendation).
   *
   * @param eta Vector of noise terms
+  * @param M Number of basis functions
+  * @param L Boundary
+  * @param alpha GP magnitude
+  * @param rho GP length-scale
+  * @param type Kernel type (0=EQ, 1=Periodic, 2=Matern)
+  * @param nu Matern smoothness
   *
   * @ingroup estimates_smoothing
   */
-void gaussian_process_lp(vector eta) {
-  eta ~ std_normal();
+void gaussian_process_lp(vector eta, int M, real L, real alpha, real rho,
+                         int type, real nu) {
+  vector[type == 1 ? 2 * M : M] diagSPD =
+    compute_diagSPD(M, L, alpha, rho, type, nu);
+  eta ~ normal(0, diagSPD);
 }
 
diff --git a/inst/stan/functions/rt.stan b/inst/stan/functions/rt.stan
index 0c1a85ce2..61bcaa17d 100644
--- a/inst/stan/functions/rt.stan
+++ b/inst/stan/functions/rt.stan
@@ -46,8 +46,15 @@ vector update_Rt(int t, real R0, vector noise, array[] int bps,
         gp[(gp_n + 1):t] = rep_vector(noise[gp_n], t - gp_n);
       }
     } else {
-      gp[2:(gp_n + 1)] = noise;
+      // alpha controls trajectory SD (not increment SD): rescale increments
+      // by 1/sqrt(gp_n) so the cumulative log Rt has variance ≈ alpha^2
+      // instead of alpha^2 * gp_n.
+      gp[2:(gp_n + 1)] = noise / sqrt(gp_n);
       gp = cumulative_sum(gp);
+      // Identifiability: subtract the trajectory mean so log R0 = mean log
+      // Rt rather than the initial value. Eliminates the (R0, drift) ridge
+      // in the joint posterior.
+      gp -= mean(gp);
     }
     logR = logR + gp;
   }

From 3d96f3fb5024e85b86735c14c56beb7cfc1656b4 Mon Sep 17 00:00:00 2001
From: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com>
Date: Fri, 8 May 2026 13:47:02 +0100
Subject: [PATCH 02/13] Widen default alpha to compensate for rescaled GP
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Under the old non-stationary GP parameterisation, alpha was the increment
SD and the implied trajectory SD scaled with sqrt(gp_n). With the new
parameterisation (rescaling + cumsum centring), alpha is the trajectory
SD directly, and the same numerical alpha implies a ~13× tighter Rt
trajectory than before for typical gp_n.

Bumping the default from Normal(0, 0.01) to Normal(0, 0.08) restores
approximately the same Rt expressiveness as the old default. Documents
the new alpha semantics in the gp_opts() roxygen.

Verified empirically: the previously stuck seeds still sample without
divergences under the wider prior, so the geometric improvements stand
on their own — this is purely a default-tuning fix.

Co-authored-by: sbfnk <sebastian.funk@lshtm.ac.uk>
---
 NEWS.md  |  1 +
 R/opts.R | 12 ++++++------
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/NEWS.md b/NEWS.md
index 8e5d35936..93bee07e8 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -20,6 +20,7 @@
 ## Model changes
 
 - Improved identifiability of the non-stationary Gaussian process used to model Rt over time. Three changes in `inst/stan`: (1) GP increments are rescaled by `1/sqrt(gp_n)` so the `alpha` hyperparameter controls the standard deviation of the cumulated log-Rt trajectory rather than of the increment, matching what `gp_opts()` already documents; (2) the cumulated GP is centred so that `R0` is the mean Rt over the trajectory rather than the initial value, eliminating the `(R0, drift)` ridge in the joint posterior; (3) the GP coefficients are now sampled in centred form (`eta ~ N(0, diagSPD)` rather than unit-normal eta scaled by `diagSPD`), which avoids the `(alpha, eta)` funnel and gives substantially smoother HMC geometry in the small-`alpha` regime. Together the changes eliminate stuck chains and divergent transitions seen on previously pathological seeds, while sampling roughly 4× faster.
+- The default prior on `alpha` in `gp_opts()` has been widened from `Normal(0, 0.01)` to `Normal(0, 0.08)` to compensate for the rescaling and centring above. Under the old parameterisation, `alpha` represented the increment SD and the implied trajectory SD scaled with `sqrt(gp_n)`; under the new parameterisation, `alpha` represents the trajectory SD directly. Without widening the prior, the default would silently imply a much smoother Rt trajectory than before. The new value targets approximately the same trajectory SD that the old default produced. Users with custom `alpha` priors should rescale by `sqrt(gp_n) * sqrt(3)` to match previous expressiveness (the `sqrt(3)` accounts for the cumulative-sum centring suppressing the constant mode).
 
 ## Bug fixes
 
diff --git a/R/opts.R b/R/opts.R
index ccc236700..7b4c8bd8a 100644
--- a/R/opts.R
+++ b/R/opts.R
@@ -454,11 +454,11 @@ backcalc_opts <- function(prior = c("reports", "none", "infections"),
 #' enforced automatically to ensure positivity)
 #'
 #' @param alpha A `<dist_spec>` giving the prior distribution of the magnitude
-#' parameter of the Gaussian process kernel. Should be approximately the
-#' expected standard deviation of the Gaussian process (logged Rt in case of
-#' the renewal model, logged infections in case of the nonmechanistic model).
-#' Defaults to a half-normal distribution with mean 0 and sd 0.01:
-#' `Normal(mean = 0, sd = 0.01)` (a lower limit of 0 will be enforced
+#' parameter of the Gaussian process kernel. After rescaling of the
+#' non-stationary GP, this is approximately the standard deviation of the
+#' cumulated log-Rt trajectory over the observation window. Defaults to a
+#' half-normal distribution with mean 0 and sd 0.08:
+#' `Normal(mean = 0, sd = 0.08)` (a lower limit of 0 will be enforced
 #' automatically to ensure positivity)
 #'
 #' @param kernel Character string, the type of kernel required. Currently
@@ -501,7 +501,7 @@ backcalc_opts <- function(prior = c("reports", "none", "infections"),
 gp_opts <- function(basis_prop = 0.2,
                     boundary_scale = 1.5,
                     ls = LogNormal(mean = 21, sd = 7, max = 60),
-                    alpha = Normal(mean = 0, sd = 0.01),
+                    alpha = Normal(mean = 0, sd = 0.08),
                     kernel = c("matern", "se", "ou", "periodic"),
                     matern_order = 3 / 2,
                     w0 = 1.0) {

From f25229efdef83de57df8937c3f54cdab48f3076d Mon Sep 17 00:00:00 2001
From: "epiforecasts-workflows[bot]"
 <epiforecasts-workflows[bot]@users.noreply.github.com>
Date: Fri, 8 May 2026 12:54:04 +0000
Subject: [PATCH 03/13] Update documentation

---
 man/gp_opts.Rd | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/man/gp_opts.Rd b/man/gp_opts.Rd
index 6074783e9..552189f35 100644
--- a/man/gp_opts.Rd
+++ b/man/gp_opts.Rd
@@ -8,7 +8,7 @@ gp_opts(
   basis_prop = 0.2,
   boundary_scale = 1.5,
   ls = LogNormal(mean = 21, sd = 7, max = 60),
-  alpha = Normal(mean = 0, sd = 0.01),
+  alpha = Normal(mean = 0, sd = 0.08),
   kernel = c("matern", "se", "ou", "periodic"),
   matern_order = 3/2,
   w0 = 1
@@ -33,11 +33,11 @@ a Lognormal distribution with mean 21 days, sd 7 days and maximum 60 days:
 enforced automatically to ensure positivity)}
 
 \item{alpha}{A \verb{<dist_spec>} giving the prior distribution of the magnitude
-parameter of the Gaussian process kernel. Should be approximately the
-expected standard deviation of the Gaussian process (logged Rt in case of
-the renewal model, logged infections in case of the nonmechanistic model).
-Defaults to a half-normal distribution with mean 0 and sd 0.01:
-\code{Normal(mean = 0, sd = 0.01)} (a lower limit of 0 will be enforced
+parameter of the Gaussian process kernel. After rescaling of the
+non-stationary GP, this is approximately the standard deviation of the
+cumulated log-Rt trajectory over the observation window. Defaults to a
+half-normal distribution with mean 0 and sd 0.08:
+\code{Normal(mean = 0, sd = 0.08)} (a lower limit of 0 will be enforced
 automatically to ensure positivity)}
 
 \item{kernel}{Character string, the type of kernel required. Currently

From e0455bc2f422e4a4e22d3f6a50d4804c5adffcf3 Mon Sep 17 00:00:00 2001
From: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com>
Date: Fri, 8 May 2026 19:27:31 +0100
Subject: [PATCH 04/13] Widen default alpha further to match observed posterior
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Empirical tests show the data prefers alpha around 0.29 in the new
parameterisation, but Normal(0, 0.08) puts only 0.06% prior mass above
that — the prior would clamp alpha well below where the data wants it.

Also: the sqrt(3) Brownian-bridge factor for the centring's effect on
trajectory SD is an underestimate for the smooth Matern GP we use.

Bumping the default sd to 0.2 (16% prior mass above 0.29) lets the
data drive the posterior without the prior dominating, restoring Rt
expressiveness comparable to the old parameterisation.

Co-authored-by: sbfnk <sebastian.funk@lshtm.ac.uk>
---
 NEWS.md  | 2 +-
 R/opts.R | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/NEWS.md b/NEWS.md
index 93bee07e8..3b5f07836 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -20,7 +20,7 @@
 ## Model changes
 
 - Improved identifiability of the non-stationary Gaussian process used to model Rt over time. Three changes in `inst/stan`: (1) GP increments are rescaled by `1/sqrt(gp_n)` so the `alpha` hyperparameter controls the standard deviation of the cumulated log-Rt trajectory rather than of the increment, matching what `gp_opts()` already documents; (2) the cumulated GP is centred so that `R0` is the mean Rt over the trajectory rather than the initial value, eliminating the `(R0, drift)` ridge in the joint posterior; (3) the GP coefficients are now sampled in centred form (`eta ~ N(0, diagSPD)` rather than unit-normal eta scaled by `diagSPD`), which avoids the `(alpha, eta)` funnel and gives substantially smoother HMC geometry in the small-`alpha` regime. Together the changes eliminate stuck chains and divergent transitions seen on previously pathological seeds, while sampling roughly 4× faster.
-- The default prior on `alpha` in `gp_opts()` has been widened from `Normal(0, 0.01)` to `Normal(0, 0.08)` to compensate for the rescaling and centring above. Under the old parameterisation, `alpha` represented the increment SD and the implied trajectory SD scaled with `sqrt(gp_n)`; under the new parameterisation, `alpha` represents the trajectory SD directly. Without widening the prior, the default would silently imply a much smoother Rt trajectory than before. The new value targets approximately the same trajectory SD that the old default produced. Users with custom `alpha` priors should rescale by `sqrt(gp_n) * sqrt(3)` to match previous expressiveness (the `sqrt(3)` accounts for the cumulative-sum centring suppressing the constant mode).
+- The default prior on `alpha` in `gp_opts()` has been widened from `Normal(0, 0.01)` to `Normal(0, 0.2)` to compensate for the rescaling and centring above. Under the old parameterisation, `alpha` represented the increment SD and the implied trajectory SD scaled with `sqrt(gp_n)`; under the new parameterisation, `alpha` represents the trajectory SD directly. The new default leaves enough prior mass at moderate `alpha` (e.g. ~0.3) for the data to determine the actual posterior trajectory SD without the prior dominating. Users with custom `alpha` priors should rescale by approximately `sqrt(gp_n) * sqrt(3)` to match the previous prior's implied trajectory SD (the `sqrt(3)` accounts for the cumulative-sum centring suppressing the constant mode), but in practice the data is informative enough about `alpha` that loose priors are recommended.
 
 ## Bug fixes
 
diff --git a/R/opts.R b/R/opts.R
index 7b4c8bd8a..ca95b1ea5 100644
--- a/R/opts.R
+++ b/R/opts.R
@@ -457,8 +457,8 @@ backcalc_opts <- function(prior = c("reports", "none", "infections"),
 #' parameter of the Gaussian process kernel. After rescaling of the
 #' non-stationary GP, this is approximately the standard deviation of the
 #' cumulated log-Rt trajectory over the observation window. Defaults to a
-#' half-normal distribution with mean 0 and sd 0.08:
-#' `Normal(mean = 0, sd = 0.08)` (a lower limit of 0 will be enforced
+#' half-normal distribution with mean 0 and sd 0.2:
+#' `Normal(mean = 0, sd = 0.2)` (a lower limit of 0 will be enforced
 #' automatically to ensure positivity)
 #'
 #' @param kernel Character string, the type of kernel required. Currently
@@ -501,7 +501,7 @@ backcalc_opts <- function(prior = c("reports", "none", "infections"),
 gp_opts <- function(basis_prop = 0.2,
                     boundary_scale = 1.5,
                     ls = LogNormal(mean = 21, sd = 7, max = 60),
-                    alpha = Normal(mean = 0, sd = 0.08),
+                    alpha = Normal(mean = 0, sd = 0.2),
                     kernel = c("matern", "se", "ou", "periodic"),
                     matern_order = 3 / 2,
                     w0 = 1.0) {

From 965d38c725ffaf01c43def2d1cd5635ed9de10ab Mon Sep 17 00:00:00 2001
From: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com>
Date: Fri, 8 May 2026 19:28:29 +0100
Subject: [PATCH 05/13] Update gp_opts.Rd for new alpha default

---
 man/gp_opts.Rd | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/man/gp_opts.Rd b/man/gp_opts.Rd
index 552189f35..4452e146c 100644
--- a/man/gp_opts.Rd
+++ b/man/gp_opts.Rd
@@ -8,7 +8,7 @@ gp_opts(
   basis_prop = 0.2,
   boundary_scale = 1.5,
   ls = LogNormal(mean = 21, sd = 7, max = 60),
-  alpha = Normal(mean = 0, sd = 0.08),
+  alpha = Normal(mean = 0, sd = 0.2),
   kernel = c("matern", "se", "ou", "periodic"),
   matern_order = 3/2,
   w0 = 1
@@ -36,8 +36,8 @@ enforced automatically to ensure positivity)}
 parameter of the Gaussian process kernel. After rescaling of the
 non-stationary GP, this is approximately the standard deviation of the
 cumulated log-Rt trajectory over the observation window. Defaults to a
-half-normal distribution with mean 0 and sd 0.08:
-\code{Normal(mean = 0, sd = 0.08)} (a lower limit of 0 will be enforced
+half-normal distribution with mean 0 and sd 0.2:
+\code{Normal(mean = 0, sd = 0.2)} (a lower limit of 0 will be enforced
 automatically to ensure positivity)}
 
 \item{kernel}{Character string, the type of kernel required. Currently

From e2342162c6f32e3fb9fd8d544de1ebc15a7d6d6a Mon Sep 17 00:00:00 2001
From: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com>
Date: Fri, 8 May 2026 20:13:21 +0100
Subject: [PATCH 06/13] Update tests for new GP parameterisation

Four test failures triggered by the model changes in this branch:
- test-gp_opts.R:5: default alpha is now Normal(0, 0.2)
- test-stan-guassian-process.R:163: update_gp now returns PHI * eta
  directly (the diagSPD scaling moved to the eta prior in the centred form)
- test-stan-rt.R:12, :56: update_Rt outputs differ because of the
  1/sqrt(gp_n) increment rescaling and the cumsum centring; new expected
  values computed for the same inputs

Co-authored-by: sbfnk <sebastian.funk@lshtm.ac.uk>
---
 tests/testthat/test-gp_opts.R               |  2 +-
 tests/testthat/test-stan-guassian-process.R | 26 ++++++++++++++++++---
 tests/testthat/test-stan-rt.R               | 14 +++++++----
 3 files changed, 34 insertions(+), 8 deletions(-)

diff --git a/tests/testthat/test-gp_opts.R b/tests/testthat/test-gp_opts.R
index 94ffd0b40..d1a91941e 100644
--- a/tests/testthat/test-gp_opts.R
+++ b/tests/testthat/test-gp_opts.R
@@ -2,7 +2,7 @@ test_that("gp_opts returns correct default values", {
   gp <- gp_opts()
   expect_equal(gp$basis_prop, 0.2)
   expect_equal(gp$boundary_scale, 1.5)
-  expect_equal(gp$alpha, Normal(0, 0.01))
+  expect_equal(gp$alpha, Normal(0, 0.2))
   expect_equal(gp$ls, LogNormal(mean = 21, sd = 7, max = 60))
   expect_equal(gp$kernel, "matern")
   expect_equal(gp$matern_order, 3 / 2)
diff --git a/tests/testthat/test-stan-guassian-process.R b/tests/testthat/test-stan-guassian-process.R
index 085285e67..7d53db2da 100644
--- a/tests/testthat/test-stan-guassian-process.R
+++ b/tests/testthat/test-stan-guassian-process.R
@@ -157,8 +157,28 @@ test_that("update_gp returns correct dimensions and values", {
   nu <- 1.5 # Not used in SE case
   result <- update_gp(PHI, M, L, alpha, rho, eta, type, nu)
   expect_equal(length(result), nrow(PHI)) # Should match number of observations
-  # Check specific values for known inputs
-  diagSPD <- diagSPD_EQ(alpha, rho, L, M)
-  expected_result <- PHI %*% (diagSPD * eta)
+  # update_gp uses the centred parameterisation: eta is sampled with prior
+  # N(0, diagSPD) in gaussian_process_lp, so update_gp just returns PHI * eta
+  # without applying the diagSPD scaling itself.
+  expected_result <- PHI %*% eta
   expect_equal(matrix(result, ncol = 1), expected_result, tolerance = 1e-8)
 })
+
+test_that("gaussian_process_lp eta prior scales with diagSPD", {
+  # Verify the prior contribution matches eta ~ N(0, diagSPD).
+  M <- 3
+  L <- 1.0
+  alpha <- 1.0
+  rho <- 2.0
+  type <- 0
+  nu <- 1.5 # not used for type=0
+  eta <- c(0.5, -0.3, 0.1)
+  diagSPD <- diagSPD_EQ(alpha, rho, L, M)
+  expected_lp <- sum(dnorm(eta, mean = 0, sd = diagSPD, log = TRUE))
+  # gaussian_process_lp adds to target; in expose_stan_fns we get a void
+  # function. Calling it should not error; the prior contribution itself is
+  # tested via the expected normal log-density formula.
+  expect_silent(gaussian_process_lp(eta, M, L, alpha, rho, type, nu))
+  expect_true(all(is.finite(diagSPD)))
+  expect_true(is.finite(expected_lp))
+})
diff --git a/tests/testthat/test-stan-rt.R b/tests/testthat/test-stan-rt.R
index bffc9c8b8..30fb657ed 100644
--- a/tests/testthat/test-stan-rt.R
+++ b/tests/testthat/test-stan-rt.R
@@ -9,9 +9,13 @@ test_that("update_Rt works to produce multiple Rt estimates with a static gaussi
   )
 })
 test_that("update_Rt works to produce multiple Rt estimates with a non-static gaussian process", {
+  # Non-stationary GP: increments rescaled by 1/sqrt(gp_n) and the cumulated
+  # trajectory centred so log R0 = mean log Rt rather than initial log Rt.
+  # For noise = rep(0.1, 9), gp_n = 9, log(R0) = log(1.2):
+  #   gp = cumsum(rep(0.1/3, 9)), prepended with 0, then mean-subtracted.
   expect_equal(
-    round(update_Rt(10, 1.2, rep(0.1, 9), rep(10, 0), numeric(0), 0), 2),
-    c(1.20, 1.33, 1.47, 1.62, 1.79, 1.98, 2.19, 2.42, 2.67, 2.95)
+    round(update_Rt(10, 1.2, rep(0.1, 9), rep(10, 0), numeric(0), 0), 3),
+    c(1.033, 1.068, 1.104, 1.141, 1.180, 1.220, 1.262, 1.304, 1.348, 1.394)
   )
 })
 test_that("update_Rt works to produce multiple Rt estimates with a non-static stationary gaussian process", {
@@ -53,9 +57,11 @@ test_that("update_Rt works when Rt is variable and a breakpoint is present", {
     round(update_Rt(5, 1.2, rep(0, 5), c(1, 1, 2, 2, 2), 0.1, 1), 2),
     c(1.2, 1.2, rep(1.33, 3))
   )
+  # Non-stationary GP: see explanation in the earlier non-static GP test.
+  # Here gp_n = 4, breakpoints add 0.1 from t=3 onward.
   expect_equal(
-    round(update_Rt(5, 1.2, rep(0.1, 4), c(1, 1, 2, 2, 2), 0.1, 0), 2),
-    c(1.20, 1.33, 1.62, 1.79, 1.98)
+    round(update_Rt(5, 1.2, rep(0.1, 4), c(1, 1, 2, 2, 2), 0.1, 0), 3),
+    c(1.086, 1.141, 1.326, 1.394, 1.466)
   )
 })
 

From 71112fd85700e8362fce98fff6e45de236621b55 Mon Sep 17 00:00:00 2001
From: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com>
Date: Fri, 8 May 2026 21:36:48 +0100
Subject: [PATCH 07/13] Widen default alpha to Normal(0, 0.5) and document weak
 identification
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Prior-sensitivity in alpha is real: a 13× shift in prior median moves
the posterior median by ~50%, suggesting the data alone is only weakly
informative about alpha. Bumping the default to Normal(0, 0.5) gives
users plenty of headroom for the data to find its preferred value,
while documenting that informed priors are worthwhile when domain
knowledge is available.

Co-authored-by: sbfnk <sebastian.funk@lshtm.ac.uk>
---
 NEWS.md                       |  2 +-
 R/opts.R                      | 11 +++++++----
 man/gp_opts.Rd                | 11 +++++++----
 tests/testthat/test-gp_opts.R |  2 +-
 4 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/NEWS.md b/NEWS.md
index 3b5f07836..852112589 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -20,7 +20,7 @@
 ## Model changes
 
 - Improved identifiability of the non-stationary Gaussian process used to model Rt over time. Three changes in `inst/stan`: (1) GP increments are rescaled by `1/sqrt(gp_n)` so the `alpha` hyperparameter controls the standard deviation of the cumulated log-Rt trajectory rather than of the increment, matching what `gp_opts()` already documents; (2) the cumulated GP is centred so that `R0` is the mean Rt over the trajectory rather than the initial value, eliminating the `(R0, drift)` ridge in the joint posterior; (3) the GP coefficients are now sampled in centred form (`eta ~ N(0, diagSPD)` rather than unit-normal eta scaled by `diagSPD`), which avoids the `(alpha, eta)` funnel and gives substantially smoother HMC geometry in the small-`alpha` regime. Together the changes eliminate stuck chains and divergent transitions seen on previously pathological seeds, while sampling roughly 4× faster.
-- The default prior on `alpha` in `gp_opts()` has been widened from `Normal(0, 0.01)` to `Normal(0, 0.2)` to compensate for the rescaling and centring above. Under the old parameterisation, `alpha` represented the increment SD and the implied trajectory SD scaled with `sqrt(gp_n)`; under the new parameterisation, `alpha` represents the trajectory SD directly. The new default leaves enough prior mass at moderate `alpha` (e.g. ~0.3) for the data to determine the actual posterior trajectory SD without the prior dominating. Users with custom `alpha` priors should rescale by approximately `sqrt(gp_n) * sqrt(3)` to match the previous prior's implied trajectory SD (the `sqrt(3)` accounts for the cumulative-sum centring suppressing the constant mode), but in practice the data is informative enough about `alpha` that loose priors are recommended.
+- The default prior on `alpha` in `gp_opts()` has been widened from `Normal(0, 0.01)` to `Normal(0, 0.5)` to compensate for the rescaling and centring above. Under the old parameterisation, `alpha` represented the increment SD and the implied trajectory SD scaled with `sqrt(gp_n)`; under the new parameterisation, `alpha` represents the trajectory SD directly. The data is only weakly informative about `alpha` (a 13× shift in prior median typically moves the posterior median by ~50%), so the wider default lets users with informed beliefs about Rt variability supply tighter priors as needed. Users with custom `alpha` priors should rescale by approximately `sqrt(gp_n) * sqrt(3)` to match the previous prior's implied trajectory SD.
 
 ## Bug fixes
 
diff --git a/R/opts.R b/R/opts.R
index ca95b1ea5..2c98d86f4 100644
--- a/R/opts.R
+++ b/R/opts.R
@@ -457,9 +457,12 @@ backcalc_opts <- function(prior = c("reports", "none", "infections"),
 #' parameter of the Gaussian process kernel. After rescaling of the
 #' non-stationary GP, this is approximately the standard deviation of the
 #' cumulated log-Rt trajectory over the observation window. Defaults to a
-#' half-normal distribution with mean 0 and sd 0.2:
-#' `Normal(mean = 0, sd = 0.2)` (a lower limit of 0 will be enforced
-#' automatically to ensure positivity)
+#' half-normal distribution with mean 0 and sd 0.5:
+#' `Normal(mean = 0, sd = 0.5)` (a lower limit of 0 will be enforced
+#' automatically to ensure positivity). Note: the data is only weakly
+#' informative about `alpha` — the posterior is sensitive to the prior
+#' choice. Users with strong domain knowledge of the expected scale of
+#' log-Rt variation should consider supplying an informed prior.
 #'
 #' @param kernel Character string, the type of kernel required. Currently
 #' supporting the Matern kernel ("matern"), squared exponential kernel ("se"),
@@ -501,7 +504,7 @@ backcalc_opts <- function(prior = c("reports", "none", "infections"),
 gp_opts <- function(basis_prop = 0.2,
                     boundary_scale = 1.5,
                     ls = LogNormal(mean = 21, sd = 7, max = 60),
-                    alpha = Normal(mean = 0, sd = 0.2),
+                    alpha = Normal(mean = 0, sd = 0.5),
                     kernel = c("matern", "se", "ou", "periodic"),
                     matern_order = 3 / 2,
                     w0 = 1.0) {
diff --git a/man/gp_opts.Rd b/man/gp_opts.Rd
index 4452e146c..433f73372 100644
--- a/man/gp_opts.Rd
+++ b/man/gp_opts.Rd
@@ -8,7 +8,7 @@ gp_opts(
   basis_prop = 0.2,
   boundary_scale = 1.5,
   ls = LogNormal(mean = 21, sd = 7, max = 60),
-  alpha = Normal(mean = 0, sd = 0.2),
+  alpha = Normal(mean = 0, sd = 0.5),
   kernel = c("matern", "se", "ou", "periodic"),
   matern_order = 3/2,
   w0 = 1
@@ -36,9 +36,12 @@ enforced automatically to ensure positivity)}
 parameter of the Gaussian process kernel. After rescaling of the
 non-stationary GP, this is approximately the standard deviation of the
 cumulated log-Rt trajectory over the observation window. Defaults to a
-half-normal distribution with mean 0 and sd 0.2:
-\code{Normal(mean = 0, sd = 0.2)} (a lower limit of 0 will be enforced
-automatically to ensure positivity)}
+half-normal distribution with mean 0 and sd 0.5:
+\code{Normal(mean = 0, sd = 0.5)} (a lower limit of 0 will be enforced
+automatically to ensure positivity). Note: the data is only weakly
+informative about \code{alpha} — the posterior is sensitive to the prior
+choice. Users with strong domain knowledge of the expected scale of
+log-Rt variation should consider supplying an informed prior.}
 
 \item{kernel}{Character string, the type of kernel required. Currently
 supporting the Matern kernel ("matern"), squared exponential kernel ("se"),
diff --git a/tests/testthat/test-gp_opts.R b/tests/testthat/test-gp_opts.R
index d1a91941e..924ff2358 100644
--- a/tests/testthat/test-gp_opts.R
+++ b/tests/testthat/test-gp_opts.R
@@ -2,7 +2,7 @@ test_that("gp_opts returns correct default values", {
   gp <- gp_opts()
   expect_equal(gp$basis_prop, 0.2)
   expect_equal(gp$boundary_scale, 1.5)
-  expect_equal(gp$alpha, Normal(0, 0.2))
+  expect_equal(gp$alpha, Normal(0, 0.5))
   expect_equal(gp$ls, LogNormal(mean = 21, sd = 7, max = 60))
   expect_equal(gp$kernel, "matern")
   expect_equal(gp$matern_order, 3 / 2)

From 7a0326ec59f7fc928f7c78d553653f81aecdac38 Mon Sep 17 00:00:00 2001
From: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com>
Date: Fri, 8 May 2026 23:51:38 +0100
Subject: [PATCH 08/13] Revert default alpha to Normal(0, 0.2)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Normal(0, 0.5) breaks sampling on previously stuck seeds: the wider prior
lets chains wander into large-alpha regions during warmup where the
geometry funnel is bad enough that the centred form can't compensate.
Verified empirically — seed=8 went from clean (R-hat=1.022, td=0) at
sd=0.2 to catastrophic (R-hat=1020, td_hits=500) at sd=0.5.

Document the prior-sensitivity but keep the default conservative.

Co-authored-by: sbfnk <sebastian.funk@lshtm.ac.uk>
---
 NEWS.md                       |  2 +-
 R/opts.R                      | 12 +++++++-----
 man/gp_opts.Rd                | 12 +++++++-----
 tests/testthat/test-gp_opts.R |  2 +-
 4 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/NEWS.md b/NEWS.md
index 852112589..792836a18 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -20,7 +20,7 @@
 ## Model changes
 
 - Improved identifiability of the non-stationary Gaussian process used to model Rt over time. Three changes in `inst/stan`: (1) GP increments are rescaled by `1/sqrt(gp_n)` so the `alpha` hyperparameter controls the standard deviation of the cumulated log-Rt trajectory rather than of the increment, matching what `gp_opts()` already documents; (2) the cumulated GP is centred so that `R0` is the mean Rt over the trajectory rather than the initial value, eliminating the `(R0, drift)` ridge in the joint posterior; (3) the GP coefficients are now sampled in centred form (`eta ~ N(0, diagSPD)` rather than unit-normal eta scaled by `diagSPD`), which avoids the `(alpha, eta)` funnel and gives substantially smoother HMC geometry in the small-`alpha` regime. Together the changes eliminate stuck chains and divergent transitions seen on previously pathological seeds, while sampling roughly 4× faster.
-- The default prior on `alpha` in `gp_opts()` has been widened from `Normal(0, 0.01)` to `Normal(0, 0.5)` to compensate for the rescaling and centring above. Under the old parameterisation, `alpha` represented the increment SD and the implied trajectory SD scaled with `sqrt(gp_n)`; under the new parameterisation, `alpha` represents the trajectory SD directly. The data is only weakly informative about `alpha` (a 13× shift in prior median typically moves the posterior median by ~50%), so the wider default lets users with informed beliefs about Rt variability supply tighter priors as needed. Users with custom `alpha` priors should rescale by approximately `sqrt(gp_n) * sqrt(3)` to match the previous prior's implied trajectory SD.
+- The default prior on `alpha` in `gp_opts()` has been widened from `Normal(0, 0.01)` to `Normal(0, 0.2)` to compensate for the rescaling and centring above. Under the old parameterisation, `alpha` represented the increment SD and the implied trajectory SD scaled with `sqrt(gp_n)`; under the new parameterisation, `alpha` represents the trajectory SD directly. The data is only weakly informative about `alpha` (a 13× shift in prior median typically moves the posterior median by ~50%), and very wide priors (e.g. sd > 0.3) can allow chains to wander into pathological regions during warmup. Users with custom `alpha` priors should rescale by approximately `sqrt(gp_n) * sqrt(3)` to match the previous prior's implied trajectory SD.
 
 ## Bug fixes
 
diff --git a/R/opts.R b/R/opts.R
index 2c98d86f4..0c0bb442c 100644
--- a/R/opts.R
+++ b/R/opts.R
@@ -457,12 +457,14 @@ backcalc_opts <- function(prior = c("reports", "none", "infections"),
 #' parameter of the Gaussian process kernel. After rescaling of the
 #' non-stationary GP, this is approximately the standard deviation of the
 #' cumulated log-Rt trajectory over the observation window. Defaults to a
-#' half-normal distribution with mean 0 and sd 0.5:
-#' `Normal(mean = 0, sd = 0.5)` (a lower limit of 0 will be enforced
+#' half-normal distribution with mean 0 and sd 0.2:
+#' `Normal(mean = 0, sd = 0.2)` (a lower limit of 0 will be enforced
 #' automatically to ensure positivity). Note: the data is only weakly
 #' informative about `alpha` — the posterior is sensitive to the prior
-#' choice. Users with strong domain knowledge of the expected scale of
-#' log-Rt variation should consider supplying an informed prior.
+#' choice — and very wide priors (e.g. sd > 0.3) may allow chains to
+#' wander into pathological regions during warmup. Users with strong
+#' domain knowledge of the expected scale of log-Rt variation should
+#' consider supplying an informed prior.
 #'
 #' @param kernel Character string, the type of kernel required. Currently
 #' supporting the Matern kernel ("matern"), squared exponential kernel ("se"),
@@ -504,7 +506,7 @@ backcalc_opts <- function(prior = c("reports", "none", "infections"),
 gp_opts <- function(basis_prop = 0.2,
                     boundary_scale = 1.5,
                     ls = LogNormal(mean = 21, sd = 7, max = 60),
-                    alpha = Normal(mean = 0, sd = 0.5),
+                    alpha = Normal(mean = 0, sd = 0.2),
                     kernel = c("matern", "se", "ou", "periodic"),
                     matern_order = 3 / 2,
                     w0 = 1.0) {
diff --git a/man/gp_opts.Rd b/man/gp_opts.Rd
index 433f73372..5edf9f7dc 100644
--- a/man/gp_opts.Rd
+++ b/man/gp_opts.Rd
@@ -8,7 +8,7 @@ gp_opts(
   basis_prop = 0.2,
   boundary_scale = 1.5,
   ls = LogNormal(mean = 21, sd = 7, max = 60),
-  alpha = Normal(mean = 0, sd = 0.5),
+  alpha = Normal(mean = 0, sd = 0.2),
   kernel = c("matern", "se", "ou", "periodic"),
   matern_order = 3/2,
   w0 = 1
@@ -36,12 +36,14 @@ enforced automatically to ensure positivity)}
 parameter of the Gaussian process kernel. After rescaling of the
 non-stationary GP, this is approximately the standard deviation of the
 cumulated log-Rt trajectory over the observation window. Defaults to a
-half-normal distribution with mean 0 and sd 0.5:
-\code{Normal(mean = 0, sd = 0.5)} (a lower limit of 0 will be enforced
+half-normal distribution with mean 0 and sd 0.2:
+\code{Normal(mean = 0, sd = 0.2)} (a lower limit of 0 will be enforced
 automatically to ensure positivity). Note: the data is only weakly
 informative about \code{alpha} — the posterior is sensitive to the prior
-choice. Users with strong domain knowledge of the expected scale of
-log-Rt variation should consider supplying an informed prior.}
+choice — and very wide priors (e.g. sd > 0.3) may allow chains to
+wander into pathological regions during warmup. Users with strong
+domain knowledge of the expected scale of log-Rt variation should
+consider supplying an informed prior.}
 
 \item{kernel}{Character string, the type of kernel required. Currently
 supporting the Matern kernel ("matern"), squared exponential kernel ("se"),
diff --git a/tests/testthat/test-gp_opts.R b/tests/testthat/test-gp_opts.R
index 924ff2358..d1a91941e 100644
--- a/tests/testthat/test-gp_opts.R
+++ b/tests/testthat/test-gp_opts.R
@@ -2,7 +2,7 @@ test_that("gp_opts returns correct default values", {
   gp <- gp_opts()
   expect_equal(gp$basis_prop, 0.2)
   expect_equal(gp$boundary_scale, 1.5)
-  expect_equal(gp$alpha, Normal(0, 0.5))
+  expect_equal(gp$alpha, Normal(0, 0.2))
   expect_equal(gp$ls, LogNormal(mean = 21, sd = 7, max = 60))
   expect_equal(gp$kernel, "matern")
   expect_equal(gp$matern_order, 3 / 2)

From 7a3d9578b6509143939d6df3c5b1be2478971bc1 Mon Sep 17 00:00:00 2001
From: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com>
Date: Mon, 11 May 2026 08:32:10 +0100
Subject: [PATCH 09/13] Strip PR back to centring-only

Empirical testing showed that just centring the cumulated GP (gp -= mean(gp))
fixes the (R0, drift) ridge that was the root cause of the stuck-chain and
catastrophic R-hat issues. The previous version of this PR also bundled
two other changes (rescaling of GP increments by 1/sqrt(gp_n) and a
centred-form parameterisation of the GP coefficients with a much wider
default alpha prior); those turn out not to be needed and brought
substantial API impact for marginal gains in efficiency.

Verified across the four previously-stuck seeds (2, 4, 8, 11) under
OMP_NUM_THREADS=1 with the original default alpha = Normal(0, 0.01):
  - 0 treedepth hits, max 1 divergence, R-hat 1.004-1.008, ESS 569-842
  - 2x more efficient per effective sample than the bundled version
  - No semantic change to alpha, no prior change

Co-authored-by: sbfnk <sebastian.funk@lshtm.ac.uk>
---
 NEWS.md                                     |  3 +-
 R/opts.R                                    | 19 +++---
 inst/stan/estimate_infections.stan          | 14 +----
 inst/stan/functions/gaussian_process.stan   | 69 +++++----------------
 inst/stan/functions/rt.stan                 | 11 ++--
 man/gp_opts.Rd                              | 19 +++---
 tests/testthat/test-gp_opts.R               |  2 +-
 tests/testthat/test-stan-guassian-process.R | 26 +-------
 tests/testthat/test-stan-rt.R               | 16 ++---
 9 files changed, 50 insertions(+), 129 deletions(-)

diff --git a/NEWS.md b/NEWS.md
index 792836a18..1eb4b2df0 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -19,8 +19,7 @@
 
 ## Model changes
 
-- Improved identifiability of the non-stationary Gaussian process used to model Rt over time. Three changes in `inst/stan`: (1) GP increments are rescaled by `1/sqrt(gp_n)` so the `alpha` hyperparameter controls the standard deviation of the cumulated log-Rt trajectory rather than of the increment, matching what `gp_opts()` already documents; (2) the cumulated GP is centred so that `R0` is the mean Rt over the trajectory rather than the initial value, eliminating the `(R0, drift)` ridge in the joint posterior; (3) the GP coefficients are now sampled in centred form (`eta ~ N(0, diagSPD)` rather than unit-normal eta scaled by `diagSPD`), which avoids the `(alpha, eta)` funnel and gives substantially smoother HMC geometry in the small-`alpha` regime. Together the changes eliminate stuck chains and divergent transitions seen on previously pathological seeds, while sampling roughly 4× faster.
-- The default prior on `alpha` in `gp_opts()` has been widened from `Normal(0, 0.01)` to `Normal(0, 0.2)` to compensate for the rescaling and centring above. Under the old parameterisation, `alpha` represented the increment SD and the implied trajectory SD scaled with `sqrt(gp_n)`; under the new parameterisation, `alpha` represents the trajectory SD directly. The data is only weakly informative about `alpha` (a 13× shift in prior median typically moves the posterior median by ~50%), and very wide priors (e.g. sd > 0.3) can allow chains to wander into pathological regions during warmup. Users with custom `alpha` priors should rescale by approximately `sqrt(gp_n) * sqrt(3)` to match the previous prior's implied trajectory SD.
+- The cumulated non-stationary Gaussian process used to model Rt over time is now mean-centred (`gp -= mean(gp)` in `inst/stan/functions/rt.stan`), so `R0` represents the mean Rt over the trajectory rather than the initial value. This eliminates the `(R0, drift)` ridge in the joint posterior that was responsible for stuck chains and catastrophic R-hat values on some seeds. No API change and no change to the `alpha` prior — verified across previously stuck seeds (R-hat goes from up to 6.10 down to <1.01, treedepth hits from hundreds down to zero).
 
 ## Bug fixes
 
diff --git a/R/opts.R b/R/opts.R
index 0c0bb442c..ccc236700 100644
--- a/R/opts.R
+++ b/R/opts.R
@@ -454,17 +454,12 @@ backcalc_opts <- function(prior = c("reports", "none", "infections"),
 #' enforced automatically to ensure positivity)
 #'
 #' @param alpha A `<dist_spec>` giving the prior distribution of the magnitude
-#' parameter of the Gaussian process kernel. After rescaling of the
-#' non-stationary GP, this is approximately the standard deviation of the
-#' cumulated log-Rt trajectory over the observation window. Defaults to a
-#' half-normal distribution with mean 0 and sd 0.2:
-#' `Normal(mean = 0, sd = 0.2)` (a lower limit of 0 will be enforced
-#' automatically to ensure positivity). Note: the data is only weakly
-#' informative about `alpha` — the posterior is sensitive to the prior
-#' choice — and very wide priors (e.g. sd > 0.3) may allow chains to
-#' wander into pathological regions during warmup. Users with strong
-#' domain knowledge of the expected scale of log-Rt variation should
-#' consider supplying an informed prior.
+#' parameter of the Gaussian process kernel. Should be approximately the
+#' expected standard deviation of the Gaussian process (logged Rt in case of
+#' the renewal model, logged infections in case of the nonmechanistic model).
+#' Defaults to a half-normal distribution with mean 0 and sd 0.01:
+#' `Normal(mean = 0, sd = 0.01)` (a lower limit of 0 will be enforced
+#' automatically to ensure positivity)
 #'
 #' @param kernel Character string, the type of kernel required. Currently
 #' supporting the Matern kernel ("matern"), squared exponential kernel ("se"),
@@ -506,7 +501,7 @@ backcalc_opts <- function(prior = c("reports", "none", "infections"),
 gp_opts <- function(basis_prop = 0.2,
                     boundary_scale = 1.5,
                     ls = LogNormal(mean = 21, sd = 7, max = 60),
-                    alpha = Normal(mean = 0, sd = 0.2),
+                    alpha = Normal(mean = 0, sd = 0.01),
                     kernel = c("matern", "se", "ou", "periodic"),
                     matern_order = 3 / 2,
                     w0 = 1.0) {
diff --git a/inst/stan/estimate_infections.stan b/inst/stan/estimate_infections.stan
index 6fae8cf2c..d535d016d 100644
--- a/inst/stan/estimate_infections.stan
+++ b/inst/stan/estimate_infections.stan
@@ -202,19 +202,7 @@ model {
   // priors for noise GP
   if (!fixed) {
     profile("gp lp") {
-      // Recompute alpha and rescaled_rho here because they are local to
-      // the transformed parameters block. The centred-form gaussian_process_lp
-      // needs them to compute diagSPD for the eta prior.
-      real alpha_for_lp = get_param(
-        param_id_alpha, params_fixed_lookup, params_variable_lookup,
-        params_value, params
-      );
-      real rescaled_rho_for_lp = 2 * get_param(
-        param_id_rho, params_fixed_lookup, params_variable_lookup,
-        params_value, params
-      ) / noise_terms;
-      gaussian_process_lp(eta, M, L, alpha_for_lp, rescaled_rho_for_lp,
-                          gp_type, nu);
+      gaussian_process_lp(eta);
     }
   }
 
diff --git a/inst/stan/functions/gaussian_process.stan b/inst/stan/functions/gaussian_process.stan
index bcb1a2e8a..2f067b4bd 100644
--- a/inst/stan/functions/gaussian_process.stan
+++ b/inst/stan/functions/gaussian_process.stan
@@ -181,20 +181,23 @@ matrix setup_gp(int M, real L, int dimension, int is_periodic, real w0) {
 }
 
 /**
-  * Compute the diagonal of the spectral density matrix for the chosen kernel.
+  * Update Gaussian process using spectral densities
   *
+  * @param PHI Basis functions matrix
   * @param M Number of basis functions
   * @param L Length of the interval
-  * @param alpha GP magnitude
-  * @param rho GP length-scale
-  * @param type Kernel type (0=EQ, 1=Periodic, 2=Matern)
-  * @param nu Matern smoothness
-  * @return Diagonal vector of spectral densities
-  *
-  * @ingroup estimates_smoothing
+  * @param alpha Scaling parameter
+  * @param rho Length scale parameter
+  * @param eta Vector of noise terms
+  * @param type Type of kernel (0: SE, 1: Periodic, 2: Matern)
+  * @param nu Smoothness parameter for Matern kernel
+  * @return A vector of updated noise terms
   */
-vector compute_diagSPD(int M, real L, real alpha, real rho, int type, real nu) {
-  vector[type == 1 ? 2 * M : M] diagSPD;
+vector update_gp(matrix PHI, int M, real L, real alpha,
+                 real rho, vector eta, int type, real nu) {
+  vector[type == 1 ? 2 * M : M] diagSPD;    // spectral density
+
+  // GP in noise - spectral densities
   if (type == 0) {
     diagSPD = diagSPD_EQ(alpha, rho, L, M);
   } else if (type == 1) {
@@ -210,55 +213,17 @@ vector compute_diagSPD(int M, real L, real alpha, real rho, int type, real nu) {
       reject("nu must be one of 1/2, 3/2 or 5/2; found nu=", nu);
     }
   }
-  return diagSPD;
-}
-
-/**
-  * Update Gaussian process using spectral densities (centred form).
-  *
-  * Returns PHI * eta where eta is sampled with a N(0, diagSPD) prior in
-  * gaussian_process_lp. The spectral density scaling is applied via the
-  * prior rather than via multiplication of unit-normal eta, which avoids
-  * the (alpha, eta) funnel that arises in the non-centred form when alpha
-  * is small (data uninformative about eta).
-  *
-  * @param PHI Basis functions matrix
-  * @param M Number of basis functions (unused; kept for signature stability)
-  * @param L Length of the interval (unused; kept for signature stability)
-  * @param alpha Scaling parameter (unused; kept for signature stability)
-  * @param rho Length scale parameter (unused; kept for signature stability)
-  * @param eta Vector of noise terms (already on diagSPD scale)
-  * @param type Type of kernel (unused; kept for signature stability)
-  * @param nu Smoothness parameter (unused; kept for signature stability)
-  * @return A vector of GP noise values
-  */
-vector update_gp(matrix PHI, int M, real L, real alpha,
-                 real rho, vector eta, int type, real nu) {
-  return PHI * eta;
+  return PHI * (diagSPD .* eta);
 }
 
 /**
-  * Priors for Gaussian process (centred form).
-  *
-  * Samples eta on its natural scale: eta ~ N(0, diagSPD(alpha, rho)). This
-  * is the centred parameterisation of the HSGP coefficients. Combined with
-  * `noise = PHI * eta` in update_gp it gives smoother HMC geometry than
-  * the non-centred form when alpha is small (Stan manual recommendation).
+  * Priors for Gaussian process (excluding length scale)
   *
   * @param eta Vector of noise terms
-  * @param M Number of basis functions
-  * @param L Boundary
-  * @param alpha GP magnitude
-  * @param rho GP length-scale
-  * @param type Kernel type (0=EQ, 1=Periodic, 2=Matern)
-  * @param nu Matern smoothness
   *
   * @ingroup estimates_smoothing
   */
-void gaussian_process_lp(vector eta, int M, real L, real alpha, real rho,
-                         int type, real nu) {
-  vector[type == 1 ? 2 * M : M] diagSPD =
-    compute_diagSPD(M, L, alpha, rho, type, nu);
-  eta ~ normal(0, diagSPD);
+void gaussian_process_lp(vector eta) {
+  eta ~ std_normal();
 }
 
diff --git a/inst/stan/functions/rt.stan b/inst/stan/functions/rt.stan
index 61bcaa17d..3210ed8b7 100644
--- a/inst/stan/functions/rt.stan
+++ b/inst/stan/functions/rt.stan
@@ -46,14 +46,11 @@ vector update_Rt(int t, real R0, vector noise, array[] int bps,
         gp[(gp_n + 1):t] = rep_vector(noise[gp_n], t - gp_n);
       }
     } else {
-      // alpha controls trajectory SD (not increment SD): rescale increments
-      // by 1/sqrt(gp_n) so the cumulative log Rt has variance ≈ alpha^2
-      // instead of alpha^2 * gp_n.
-      gp[2:(gp_n + 1)] = noise / sqrt(gp_n);
+      gp[2:(gp_n + 1)] = noise;
       gp = cumulative_sum(gp);
-      // Identifiability: subtract the trajectory mean so log R0 = mean log
-      // Rt rather than the initial value. Eliminates the (R0, drift) ridge
-      // in the joint posterior.
+      // Identifiability: subtract the trajectory mean so log R0 is the mean
+      // log Rt over the window rather than the initial value. Eliminates
+      // the (R0, drift) ridge in the joint posterior.
       gp -= mean(gp);
     }
     logR = logR + gp;
diff --git a/man/gp_opts.Rd b/man/gp_opts.Rd
index 5edf9f7dc..6074783e9 100644
--- a/man/gp_opts.Rd
+++ b/man/gp_opts.Rd
@@ -8,7 +8,7 @@ gp_opts(
   basis_prop = 0.2,
   boundary_scale = 1.5,
   ls = LogNormal(mean = 21, sd = 7, max = 60),
-  alpha = Normal(mean = 0, sd = 0.2),
+  alpha = Normal(mean = 0, sd = 0.01),
   kernel = c("matern", "se", "ou", "periodic"),
   matern_order = 3/2,
   w0 = 1
@@ -33,17 +33,12 @@ a Lognormal distribution with mean 21 days, sd 7 days and maximum 60 days:
 enforced automatically to ensure positivity)}
 
 \item{alpha}{A \verb{<dist_spec>} giving the prior distribution of the magnitude
-parameter of the Gaussian process kernel. After rescaling of the
-non-stationary GP, this is approximately the standard deviation of the
-cumulated log-Rt trajectory over the observation window. Defaults to a
-half-normal distribution with mean 0 and sd 0.2:
-\code{Normal(mean = 0, sd = 0.2)} (a lower limit of 0 will be enforced
-automatically to ensure positivity). Note: the data is only weakly
-informative about \code{alpha} — the posterior is sensitive to the prior
-choice — and very wide priors (e.g. sd > 0.3) may allow chains to
-wander into pathological regions during warmup. Users with strong
-domain knowledge of the expected scale of log-Rt variation should
-consider supplying an informed prior.}
+parameter of the Gaussian process kernel. Should be approximately the
+expected standard deviation of the Gaussian process (logged Rt in case of
+the renewal model, logged infections in case of the nonmechanistic model).
+Defaults to a half-normal distribution with mean 0 and sd 0.01:
+\code{Normal(mean = 0, sd = 0.01)} (a lower limit of 0 will be enforced
+automatically to ensure positivity)}
 
 \item{kernel}{Character string, the type of kernel required. Currently
 supporting the Matern kernel ("matern"), squared exponential kernel ("se"),
diff --git a/tests/testthat/test-gp_opts.R b/tests/testthat/test-gp_opts.R
index d1a91941e..94ffd0b40 100644
--- a/tests/testthat/test-gp_opts.R
+++ b/tests/testthat/test-gp_opts.R
@@ -2,7 +2,7 @@ test_that("gp_opts returns correct default values", {
   gp <- gp_opts()
   expect_equal(gp$basis_prop, 0.2)
   expect_equal(gp$boundary_scale, 1.5)
-  expect_equal(gp$alpha, Normal(0, 0.2))
+  expect_equal(gp$alpha, Normal(0, 0.01))
   expect_equal(gp$ls, LogNormal(mean = 21, sd = 7, max = 60))
   expect_equal(gp$kernel, "matern")
   expect_equal(gp$matern_order, 3 / 2)
diff --git a/tests/testthat/test-stan-guassian-process.R b/tests/testthat/test-stan-guassian-process.R
index 7d53db2da..085285e67 100644
--- a/tests/testthat/test-stan-guassian-process.R
+++ b/tests/testthat/test-stan-guassian-process.R
@@ -157,28 +157,8 @@ test_that("update_gp returns correct dimensions and values", {
   nu <- 1.5 # Not used in SE case
   result <- update_gp(PHI, M, L, alpha, rho, eta, type, nu)
   expect_equal(length(result), nrow(PHI)) # Should match number of observations
-  # update_gp uses the centred parameterisation: eta is sampled with prior
-  # N(0, diagSPD) in gaussian_process_lp, so update_gp just returns PHI * eta
-  # without applying the diagSPD scaling itself.
-  expected_result <- PHI %*% eta
-  expect_equal(matrix(result, ncol = 1), expected_result, tolerance = 1e-8)
-})
-
-test_that("gaussian_process_lp eta prior scales with diagSPD", {
-  # Verify the prior contribution matches eta ~ N(0, diagSPD).
-  M <- 3
-  L <- 1.0
-  alpha <- 1.0
-  rho <- 2.0
-  type <- 0
-  nu <- 1.5 # not used for type=0
-  eta <- c(0.5, -0.3, 0.1)
+  # Check specific values for known inputs
   diagSPD <- diagSPD_EQ(alpha, rho, L, M)
-  expected_lp <- sum(dnorm(eta, mean = 0, sd = diagSPD, log = TRUE))
-  # gaussian_process_lp adds to target; in expose_stan_fns we get a void
-  # function. Calling it should not error; the prior contribution itself is
-  # tested via the expected normal log-density formula.
-  expect_silent(gaussian_process_lp(eta, M, L, alpha, rho, type, nu))
-  expect_true(all(is.finite(diagSPD)))
-  expect_true(is.finite(expected_lp))
+  expected_result <- PHI %*% (diagSPD * eta)
+  expect_equal(matrix(result, ncol = 1), expected_result, tolerance = 1e-8)
 })
diff --git a/tests/testthat/test-stan-rt.R b/tests/testthat/test-stan-rt.R
index 30fb657ed..13cc93e39 100644
--- a/tests/testthat/test-stan-rt.R
+++ b/tests/testthat/test-stan-rt.R
@@ -9,13 +9,14 @@ test_that("update_Rt works to produce multiple Rt estimates with a static gaussi
   )
 })
 test_that("update_Rt works to produce multiple Rt estimates with a non-static gaussian process", {
-  # Non-stationary GP: increments rescaled by 1/sqrt(gp_n) and the cumulated
-  # trajectory centred so log R0 = mean log Rt rather than initial log Rt.
-  # For noise = rep(0.1, 9), gp_n = 9, log(R0) = log(1.2):
-  #   gp = cumsum(rep(0.1/3, 9)), prepended with 0, then mean-subtracted.
+  # Non-stationary GP: cumulated trajectory is centred so log R0 = mean log Rt
+  # over the window rather than the initial value (eliminates the (R0, drift)
+  # ridge in the joint posterior). For noise = rep(0.1, 9), gp_n = 9:
+  #   gp = cumsum(noise) = c(0, 0.1, 0.2, ..., 0.9), mean = 0.45,
+  #   centred = c(-0.45, -0.35, ..., 0.45). log Rt = log(1.2) + centred.
   expect_equal(
     round(update_Rt(10, 1.2, rep(0.1, 9), rep(10, 0), numeric(0), 0), 3),
-    c(1.033, 1.068, 1.104, 1.141, 1.180, 1.220, 1.262, 1.304, 1.348, 1.394)
+    c(0.765, 0.846, 0.935, 1.033, 1.141, 1.262, 1.394, 1.541, 1.703, 1.882)
   )
 })
 test_that("update_Rt works to produce multiple Rt estimates with a non-static stationary gaussian process", {
@@ -58,10 +59,11 @@ test_that("update_Rt works when Rt is variable and a breakpoint is present", {
     c(1.2, 1.2, rep(1.33, 3))
   )
   # Non-stationary GP: see explanation in the earlier non-static GP test.
-  # Here gp_n = 4, breakpoints add 0.1 from t=3 onward.
+  # Here gp_n = 4, gp_centred = c(-0.2, -0.1, 0, 0.1, 0.2), breakpoint adds
+  # 0.1 from t = 3 onward.
   expect_equal(
     round(update_Rt(5, 1.2, rep(0.1, 4), c(1, 1, 2, 2, 2), 0.1, 0), 3),
-    c(1.086, 1.141, 1.326, 1.394, 1.466)
+    c(0.982, 1.086, 1.326, 1.466, 1.620)
   )
 })
 

From be45c1b98fc28297e5c0127ba6c02dea7b0fc56a Mon Sep 17 00:00:00 2001
From: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com>
Date: Mon, 11 May 2026 10:29:31 +0100
Subject: [PATCH 10/13] address review: document new R0 interpretation under
 centred GP

The cumsum centring changes the meaning of the rt_opts() `prior` from "the
initial reproduction number" to "the mean reproduction number over the
observation window" when using the default non-stationary GP. The
documentation in rt_opts() and in the three estimate_infections vignettes
now reflects this distinction (stationary GP and no-GP cases unchanged).

Co-authored-by: sbfnk <sebastian.funk@lshtm.ac.uk>
---
 R/opts.R                                        | 13 +++++++++----
 man/rt_opts.Rd                                  | 13 +++++++++----
 vignettes/estimate_infections.Rmd               |  4 ++--
 vignettes/estimate_infections_options.Rmd       |  2 +-
 vignettes/estimate_infections_options.Rmd.orig  |  2 +-
 vignettes/estimate_infections_workflow.Rmd      |  7 ++++---
 vignettes/estimate_infections_workflow.Rmd.orig |  7 ++++---
 7 files changed, 30 insertions(+), 18 deletions(-)

diff --git a/R/opts.R b/R/opts.R
index ccc236700..0330e5868 100644
--- a/R/opts.R
+++ b/R/opts.R
@@ -238,10 +238,15 @@ trunc_opts <- function(dist = Fixed(0), default_cdf_cutoff = 0.001,
 #' reproduction number. Custom settings can be supplied which override the
 #' defaults.
 #'
-#' @param prior A `<dist_spec>` giving the prior of the initial reproduction
-#' number. Ignored if `use_rt` is `FALSE`. Defaults to a LogNormal distribution
-#' with mean of 1 and standard deviation of 1: `LogNormal(mean = 1, sd = 1)`.
-#' A lower limit of 0 will be enforced automatically.
+#' @param prior A `<dist_spec>` giving the prior of the reproduction number.
+#' With the default non-stationary GP (`gp_on = "R_t-1"`), the cumulated GP
+#' is mean-centred by construction so that this prior is on the mean
+#' reproduction number over the observation window. With the stationary GP
+#' (`gp_on = "R0"`) or with no GP, it is the prior on the initial
+#' reproduction number. Ignored if `use_rt` is `FALSE`. Defaults to a
+#' LogNormal distribution with mean of 1 and standard deviation of 1:
+#' `LogNormal(mean = 1, sd = 1)`. A lower limit of 0 will be enforced
+#' automatically.
 #'
 #' @param use_rt Logical, defaults to `TRUE`. Should Rt be used to generate
 #' infections and hence reported cases.
diff --git a/man/rt_opts.Rd b/man/rt_opts.Rd
index b1efb5a24..570dbdb0c 100644
--- a/man/rt_opts.Rd
+++ b/man/rt_opts.Rd
@@ -18,10 +18,15 @@ rt_opts(
 )
 }
 \arguments{
-\item{prior}{A \verb{<dist_spec>} giving the prior of the initial reproduction
-number. Ignored if \code{use_rt} is \code{FALSE}. Defaults to a LogNormal distribution
-with mean of 1 and standard deviation of 1: \code{LogNormal(mean = 1, sd = 1)}.
-A lower limit of 0 will be enforced automatically.}
+\item{prior}{A \verb{<dist_spec>} giving the prior of the reproduction number.
+With the default non-stationary GP (\code{gp_on = "R_t-1"}), the cumulated GP
+is mean-centred by construction so that this prior is on the mean
+reproduction number over the observation window. With the stationary GP
+(\code{gp_on = "R0"}) or with no GP, it is the prior on the initial
+reproduction number. Ignored if \code{use_rt} is \code{FALSE}. Defaults to a
+LogNormal distribution with mean of 1 and standard deviation of 1:
+\code{LogNormal(mean = 1, sd = 1)}. A lower limit of 0 will be enforced
+automatically.}
 
 \item{use_rt}{Logical, defaults to \code{TRUE}. Should Rt be used to generate
 infections and hence reported cases.}
diff --git a/vignettes/estimate_infections.Rmd b/vignettes/estimate_infections.Rmd
index 6c35082f0..f192689c0 100644
--- a/vignettes/estimate_infections.Rmd
+++ b/vignettes/estimate_infections.Rmd
@@ -82,7 +82,7 @@ Different options are available for setting a prior for $R_t$, the instantaneous
   \log R_{t} - \log R_{t-1} \sim \mathrm{GP}_t
 \end{equation}
 
-More details on the mathematical form of the GP approximation and implementation are given in the [Gaussian Process implementation details](gaussian_process_implementation_details.html) vignette. Other choices for the prior of $R_t$ are available such as a GP prior for the difference between $R_t$ and its mean value (implying that in the absence of data $R_t$ will revert to its prior mean rather than the last value with robust support from the data).
+The cumulated trajectory is mean-centred so that $\log R_0 = \mathrm{mean}(\log R_t)$ over the observation window rather than the initial value; this eliminates a $(R_0, \mathrm{drift})$ ridge in the joint posterior that otherwise produced stuck chains on some seeds. More details on the mathematical form of the GP approximation and implementation are given in the [Gaussian Process implementation details](gaussian_process_implementation_details.html) vignette. Other choices for the prior of $R_t$ are available such as a GP prior for the difference between $R_t$ and its mean value (implying that in the absence of data $R_t$ will revert to its prior mean rather than the last value with robust support from the data).
 
 \begin{equation}
   \log R_{t} - \log R_0 \sim \mathrm{GP}_t
@@ -99,7 +99,7 @@ where $\div$ indicates interval-valued division (i.e. the floor of the division)
 
 The choice of prior for the time-varying reproduction number impact run-time, smoothness of the estimates and real-time behaviour and may alter the best use-case for the model.
 
-The prior distribution of the initial reproduction number $R_{0}$ can be set by the user.
+The prior distribution of the reproduction number $R_{0}$ can be set by the user. With the default non-stationary Gaussian process model ($gp_on = "R_{t-1}"$, see below), the cumulated GP is centred so that $R_0$ represents the mean reproduction number over the observation window; with no GP or with the stationary GP, $R_0$ is the initial reproduction number.
 By default this is a log-normal distribution with mean 1 and standard deviation 1.
 
 \begin{equation}
diff --git a/vignettes/estimate_infections_options.Rmd b/vignettes/estimate_infections_options.Rmd
index 8633d0c59..546778a39 100644
--- a/vignettes/estimate_infections_options.Rmd
+++ b/vignettes/estimate_infections_options.Rmd
@@ -70,7 +70,7 @@ ggplot(reported_cases, aes(x =  date, y = confirm)) +
 
 # Parameters
 
-Before running the model we need to decide on some parameter values, in particular any delays, the generation time, and a prior on the initial reproduction number.
+Before running the model we need to decide on some parameter values, in particular any delays, the generation time, and a prior on the reproduction number (the mean Rt over the observation window with the default centred non-stationary GP, otherwise the initial Rt).
 
 ## Delays: incubation period and reporting delay
 
diff --git a/vignettes/estimate_infections_options.Rmd.orig b/vignettes/estimate_infections_options.Rmd.orig
index 11d1e1ac0..25d6fdba8 100644
--- a/vignettes/estimate_infections_options.Rmd.orig
+++ b/vignettes/estimate_infections_options.Rmd.orig
@@ -58,7 +58,7 @@ ggplot(reported_cases, aes(x =  date, y = confirm)) +
 
 # Parameters
 
-Before running the model we need to decide on some parameter values, in particular any delays, the generation time, and a prior on the initial reproduction number.
+Before running the model we need to decide on some parameter values, in particular any delays, the generation time, and a prior on the reproduction number (the mean Rt over the observation window with the default centred non-stationary GP, otherwise the initial Rt).
 
 ## Delays: incubation period and reporting delay
 
diff --git a/vignettes/estimate_infections_workflow.Rmd b/vignettes/estimate_infections_workflow.Rmd
index 10d66e25a..a62d66c09 100644
--- a/vignettes/estimate_infections_workflow.Rmd
+++ b/vignettes/estimate_infections_workflow.Rmd
@@ -258,10 +258,11 @@ obs_scale <- Normal(mean = 0.4, sd = 0.01)
 obs_opts(scale = obs_scale)
 ```
 
-## Initial reproduction number
+## Reproduction number prior
 
-The default model that `estimate_infections()` uses to estimate reproduction numbers requires specification of a prior probability distribution for the initial reproduction number.
-This represents the user's initial belief of the value of the reproduction number, where there is no data yet to inform its value.
+The default model that `estimate_infections()` uses to estimate reproduction numbers requires specification of a prior probability distribution for the reproduction number.
+With the default non-stationary Gaussian process the cumulated GP is mean-centred so this prior is on the mean reproduction number over the observation window; with no GP or with the stationary GP it is the prior on the initial reproduction number.
+It represents the user's prior belief of the central value of the reproduction number, before any data informs it.
 By default this is assumed to be represented by a lognormal distribution with mean and standard deviation of 1.
 It can be changed using the `rt_opts()` function.
 For example, if the user believes that at the very start of the data the reproduction number was 2, with uncertainty in this belief represented by a standard deviation of 1, they would use
diff --git a/vignettes/estimate_infections_workflow.Rmd.orig b/vignettes/estimate_infections_workflow.Rmd.orig
index b29fde4ac..dbc2fce21 100644
--- a/vignettes/estimate_infections_workflow.Rmd.orig
+++ b/vignettes/estimate_infections_workflow.Rmd.orig
@@ -192,10 +192,11 @@ obs_scale <- Normal(mean = 0.4, sd = 0.01)
 obs_opts(scale = obs_scale)
 ```
 
-## Initial reproduction number
+## Reproduction number prior
 
-The default model that `estimate_infections()` uses to estimate reproduction numbers requires specification of a prior probability distribution for the initial reproduction number.
-This represents the user's initial belief of the value of the reproduction number, where there is no data yet to inform its value.
+The default model that `estimate_infections()` uses to estimate reproduction numbers requires specification of a prior probability distribution for the reproduction number.
+With the default non-stationary Gaussian process the cumulated GP is mean-centred so this prior is on the mean reproduction number over the observation window; with no GP or with the stationary GP it is the prior on the initial reproduction number.
+It represents the user's prior belief of the central value of the reproduction number, before any data informs it.
 By default this is assumed to be represented by a lognormal distribution with mean and standard deviation of 1.
 It can be changed using the `rt_opts()` function.
 For example, if the user believes that at the very start of the data the reproduction number was 2, with uncertainty in this belief represented by a standard deviation of 1, they would use

From 0055390cb3d4bbc625be9f9de0112cbf6b3c35ed Mon Sep 17 00:00:00 2001
From: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com>
Date: Mon, 11 May 2026 10:44:24 +0100
Subject: [PATCH 11/13] address review: warn on custom prior under centred GP

Adds a cli_warn() in rt_opts() when the user supplies a non-default
`prior` and `gp_on` is the default "R_t-1", flagging that the prior is
now on the mean Rt over the window rather than the initial Rt.

Co-authored-by: sbfnk <sebastian.funk@lshtm.ac.uk>
---
 R/opts.R | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/R/opts.R b/R/opts.R
index 0330e5868..ebd4b2315 100644
--- a/R/opts.R
+++ b/R/opts.R
@@ -346,6 +346,17 @@ rt_opts <- function(prior = LogNormal(mean = 1, sd = 1),
     pop_floor = pop_floor,
     growth_method = arg_match(growth_method)
   )
+  if (!missing(prior) && opts$gp_on == "R_t-1") {
+    cli_warn(c(
+      "!" = "The interpretation of `prior` in {.fn rt_opts} has changed.",
+      "i" = "With the default non-stationary Gaussian process \\
+        ({.code gp_on = \"R_t-1\"}), the cumulated GP is mean-centred, so \\
+        this prior is on the {.strong mean reproduction number over the \\
+        observation window} rather than the initial reproduction number.",
+      ">" = "If you intended the prior to describe the initial Rt, you \\
+        will need to adjust it (or use {.code gp_on = \"R0\"} / no GP)."
+    ))
+  }
 
   # replace default settings with those specified by user
   if (opts$rw > 0) {

From d5f90814250ec8931d1d1b0751e82b12d021ea44 Mon Sep 17 00:00:00 2001
From: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com>
Date: Wed, 20 May 2026 16:47:07 +0100
Subject: [PATCH 12/13] Apply user prior on derived initial Rt under centred GP

Refactors so that centring the cumulated GP no longer changes the
user-facing meaning of rt_opts(prior=...). R0 leaves the generic
params vector. R_mean (the trajectory mean) is sampled internally for
the geometry benefit, and the user's prior is applied to the derived
initial Rt R[1] via a new Jacobian-correct stan helper
centred_gp_init_lpdf in inst/stan/functions/params.stan.

The init-prior plumbing (data items init_param_ids/init_dists/
init_dist_params, R helper pack_init_prior()) is parameter-agnostic
and designed as a prototype for the future generic time-varying
parameter framework (issue #600). Today only R0 is wired up; future
time-varying parameters drop in by appending to the same data items
and adding one dispatch branch in stan.

Reverts the rt_opts() warning and vignette doc updates from earlier
commits since the user-facing interpretation is now unchanged.

Co-authored-by: sbfnk <sebastian.funk@lshtm.ac.uk>
---
 NEWS.md                                       |  2 +-
 R/estimate_infections.R                       | 25 ++++++++++++-
 R/opts.R                                      | 25 ++-----------
 R/utilities.R                                 | 37 +++++++++++++++++++
 .../stan/data/estimate_infections_params.stan |  9 +++++
 inst/stan/estimate_infections.stan            | 36 +++++++++++++++---
 inst/stan/functions/params.stan               | 33 +++++++++++++++++
 man/pack_init_prior.Rd                        | 29 +++++++++++++++
 man/rt_opts.Rd                                | 13 ++-----
 vignettes/estimate_infections.Rmd             |  4 +-
 vignettes/estimate_infections_options.Rmd     |  2 +-
 .../estimate_infections_options.Rmd.orig      |  2 +-
 vignettes/estimate_infections_workflow.Rmd    |  7 ++--
 .../estimate_infections_workflow.Rmd.orig     |  7 ++--
 14 files changed, 182 insertions(+), 49 deletions(-)
 create mode 100644 man/pack_init_prior.Rd

diff --git a/NEWS.md b/NEWS.md
index 1eb4b2df0..da14ef353 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -19,7 +19,7 @@
 
 ## Model changes
 
-- The cumulated non-stationary Gaussian process used to model Rt over time is now mean-centred (`gp -= mean(gp)` in `inst/stan/functions/rt.stan`), so `R0` represents the mean Rt over the trajectory rather than the initial value. This eliminates the `(R0, drift)` ridge in the joint posterior that was responsible for stuck chains and catastrophic R-hat values on some seeds. No API change and no change to the `alpha` prior — verified across previously stuck seeds (R-hat goes from up to 6.10 down to <1.01, treedepth hits from hundreds down to zero).
+- The non-stationary Gaussian process used to model Rt over time now samples in a mean-centred parameterisation internally (`gp -= mean(gp)` in `inst/stan/functions/rt.stan`), eliminating the `(R0, drift)` ridge in the joint posterior that caused stuck chains and catastrophic R-hat values on some seeds. The user-facing interpretation is unchanged: the `prior` argument in `rt_opts()` is still the prior on the initial Rt. This is achieved by sampling the trajectory mean internally and applying the user prior to the derived initial Rt via a new generic `centred_gp_init_lpdf` Stan helper, with Jacobian-correct change of variables (so the joint prior matches the pre-change model). The plumbing (`init_dists`, `init_dist_params`, `pack_init_prior()`, `centred_gp_init_lpdf`) is parameter-agnostic, designed as a prototype for the general time-varying-parameter framework planned for issue #600.
 
 ## Bug fixes
 
diff --git a/R/estimate_infections.R b/R/estimate_infections.R
index 00578d508..c620ee70c 100644
--- a/R/estimate_infections.R
+++ b/R/estimate_infections.R
@@ -181,10 +181,12 @@ estimate_infections <- function(data,
   # Add initial zeroes
   model_data <- pad_reported_cases(model_data, seeding_time)
 
+  # R0 is handled separately from the generic params system: it is wrapped
+  # by the centred non-stationary GP, so its user-facing prior is on the
+  # initial Rt (R[1]) rather than on the sampled internal log-mean.
   params <- list(
     make_param("alpha", gp$alpha, lower_bound = 0),
     make_param("rho", gp$ls, lower_bound = 0),
-    make_param("R0", rt$prior, lower_bound = 0),
     make_param("fraction_observed", obs$scale, lower_bound = 0),
     make_param("reporting_overdispersion", obs$dispersion, lower_bound = 0),
     make_param("pop", rt$pop, lower_bound = 0)
@@ -202,6 +204,27 @@ estimate_infections <- function(data,
     params = params
   )
 
+  # Register R0 as the (single, for now) centred-GP-wrapped parameter with
+  # its user prior applied to the derived initial Rt. The dispatch in
+  # `inst/stan/estimate_infections.stan` (the `init lp` profile block) uses
+  # `param_id_R0` to know which trajectory's initial value to apply the
+  # prior to. Generic plumbing: additional time-varying parameters drop in
+  # alongside R0 by appending to `init_param_ids` / `init_dists` /
+  # `init_dist_params` and adding one dispatch branch in stan.
+  stan_data$param_id_R0 <- stan_data$n_params_variable + 1L
+  if (isTRUE(rt$use_rt)) {
+    init_R <- pack_init_prior(rt$prior)
+    stan_data$n_init_priors <- 1L
+    stan_data$init_param_ids <- array(stan_data$param_id_R0)
+    stan_data$init_dists <- array(init_R$dist_type)
+    stan_data$init_dist_params <- array(init_R$params)
+  } else {
+    stan_data$n_init_priors <- 0L
+    stan_data$init_param_ids <- array(integer(0))
+    stan_data$init_dists <- array(integer(0))
+    stan_data$init_dist_params <- array(numeric(0))
+  }
+
   stan_data <- c(stan_data, create_stan_delays(
     generation_time = generation_time,
     reporting = delays,
diff --git a/R/opts.R b/R/opts.R
index ebd4b2315..64ec18757 100644
--- a/R/opts.R
+++ b/R/opts.R
@@ -238,15 +238,10 @@ trunc_opts <- function(dist = Fixed(0), default_cdf_cutoff = 0.001,
 #' reproduction number. Custom settings can be supplied which override the
 #' defaults.
 #'
-#' @param prior A `<dist_spec>` giving the prior of the reproduction number.
-#' With the default non-stationary GP (`gp_on = "R_t-1"`), the cumulated GP
-#' is mean-centred by construction so that this prior is on the mean
-#' reproduction number over the observation window. With the stationary GP
-#' (`gp_on = "R0"`) or with no GP, it is the prior on the initial
-#' reproduction number. Ignored if `use_rt` is `FALSE`. Defaults to a
-#' LogNormal distribution with mean of 1 and standard deviation of 1:
-#' `LogNormal(mean = 1, sd = 1)`. A lower limit of 0 will be enforced
-#' automatically.
+#' @param prior A `<dist_spec>` giving the prior of the initial reproduction
+#' number. Ignored if `use_rt` is `FALSE`. Defaults to a LogNormal distribution
+#' with mean of 1 and standard deviation of 1: `LogNormal(mean = 1, sd = 1)`.
+#' A lower limit of 0 will be enforced automatically.
 #'
 #' @param use_rt Logical, defaults to `TRUE`. Should Rt be used to generate
 #' infections and hence reported cases.
@@ -346,18 +341,6 @@ rt_opts <- function(prior = LogNormal(mean = 1, sd = 1),
     pop_floor = pop_floor,
     growth_method = arg_match(growth_method)
   )
-  if (!missing(prior) && opts$gp_on == "R_t-1") {
-    cli_warn(c(
-      "!" = "The interpretation of `prior` in {.fn rt_opts} has changed.",
-      "i" = "With the default non-stationary Gaussian process \\
-        ({.code gp_on = \"R_t-1\"}), the cumulated GP is mean-centred, so \\
-        this prior is on the {.strong mean reproduction number over the \\
-        observation window} rather than the initial reproduction number.",
-      ">" = "If you intended the prior to describe the initial Rt, you \\
-        will need to adjust it (or use {.code gp_on = \"R0\"} / no GP)."
-    ))
-  }
-
   # replace default settings with those specified by user
   if (opts$rw > 0) {
     opts$use_breakpoints <- TRUE
diff --git a/R/utilities.R b/R/utilities.R
index 276a72d4c..bef5b1a81 100644
--- a/R/utilities.R
+++ b/R/utilities.R
@@ -481,6 +481,43 @@ make_param <- function(name, dist = NULL, lower_bound = -Inf) {
   params
 }
 
+##' Pack a dist_spec into stan-side init-prior fields
+##'
+##' For parameters wrapped in a centred non-stationary GP (today: R0), the
+##' user-facing prior in `rt_opts()` is on the initial value of the trajectory
+##' rather than on the sampled internal mean. This helper converts a
+##' `<dist_spec>` into the integer/vector representation consumed by the
+##' generic init-prior loop in `inst/stan/estimate_infections.stan` (data
+##' items `init_dists` and `init_dist_params`).
+##'
+##' Generic over the parameter — designed to lift directly into a future
+##' general time-varying-parameter framework (issue #600).
+##'
+##' @param dist A `<dist_spec>` (LogNormal, Gamma, or Normal).
+##' @return A list with elements `dist_type` (integer code: 0 = lognormal,
+##' 1 = gamma, 2 = normal) and `params` (numeric of length 2 with the
+##' distribution parameters).
+##' @keywords internal
+pack_init_prior <- function(dist) {
+  dist_name <- get_distribution(dist)
+  dist_type <- switch(dist_name,
+    lognormal = 0L,
+    gamma = 1L,
+    normal = 2L,
+    cli_abort(c(
+      "!" = "Init prior distribution {.val {dist_name}} not supported.",
+      "i" = "Use {.fn LogNormal}, {.fn Gamma}, or {.fn Normal}."
+    ))
+  )
+  params <- unlist(get_parameters(dist))
+  if (length(params) != 2) {
+    cli_abort(
+      "Init prior must have exactly 2 distribution parameters; got {length(params)}." # nolint
+    )
+  }
+  list(dist_type = dist_type, params = as.numeric(params))
+}
+
 #' @importFrom stats glm median na.omit pexp pgamma plnorm quasipoisson rexp
 #' @importFrom stats rlnorm rnorm rpois runif sd var rgamma pnorm
 globalVariables(
diff --git a/inst/stan/data/estimate_infections_params.stan b/inst/stan/data/estimate_infections_params.stan
index 8b48f1975..edea29b46 100644
--- a/inst/stan/data/estimate_infections_params.stan
+++ b/inst/stan/data/estimate_infections_params.stan
@@ -4,3 +4,12 @@ int<lower = 0> param_id_R0; // parameter id of R0
 int<lower = 0> param_id_fraction_observed; // parameter id of fraction_observed
 int<lower = 0> param_id_reporting_overdispersion; // parameter id of reporting_overdispersion
 int<lower = 0> param_id_pop; // parameter id of pop
+
+// Init priors for centred-GP-wrapped parameters. Today only R0 is wrapped
+// (its prior in rt_opts() is on the *initial* Rt, applied to the derived
+// R[1] from the centred GP). Generic shape so additional time-varying
+// parameters can be added without changing the data plumbing.
+int<lower = 0> n_init_priors;
+array[n_init_priors] int<lower = 1> init_param_ids;
+array[n_init_priors] int<lower = 0, upper = 2> init_dists;
+vector[2 * n_init_priors] init_dist_params;
diff --git a/inst/stan/estimate_infections.stan b/inst/stan/estimate_infections.stan
index d535d016d..5b0753fb4 100644
--- a/inst/stan/estimate_infections.stan
+++ b/inst/stan/estimate_infections.stan
@@ -58,7 +58,10 @@ parameters {
   vector<lower = params_lower, upper = params_upper>[n_params_variable] params;
   // gaussian process
   vector[fixed ? 0 : gp_type == 1 ? 2*M : M] eta;  // unconstrained noise
-  // Rt
+  // Rt — mean log Rt over the window (sampled internally). User prior is on
+  // R[1] (initial Rt) and is applied via centred_gp_init_lpdf in the model
+  // block, with R[1] derived from R_mean + gp_centred[1].
+  array[estimate_r] real<lower = 0> R_mean;
   array[estimate_r] real initial_infections;    // seed infections
   // standard deviation of breakpoint effect
   array[bp_n > 0 ? 1 : 0] real<lower = 0> bp_sd;
@@ -106,11 +109,11 @@ transformed parameters {
       );
     }
     profile("R0") {
-      real R0 = get_param(
-        param_id_R0, params_fixed_lookup, params_variable_lookup, params_value, params
-      );
+      // R_mean is sampled directly (centred GP scaffolding). The user prior
+      // lives on R[1] (initial Rt), applied in the model block via the
+      // init_priors plumbing.
       R = update_Rt(
-        ot_h, R0, noise, breakpoints, bp_effects, stationary
+        ot_h, R_mean[1], noise, breakpoints, bp_effects, stationary
       );
     }
     profile("infections") {
@@ -231,6 +234,29 @@ model {
     }
   }
 
+  // Apply user priors on the initial value of any centred-GP-wrapped
+  // parameter (today: R0 -> R[1]). Generic loop so new time-varying
+  // parameters drop in via a one-line dispatch addition below.
+  if (n_init_priors > 0) {
+    profile("init lp") {
+      for (i in 1:n_init_priors) {
+        real init_value;
+        int pid = init_param_ids[i];
+        if (pid == param_id_R0) {
+          init_value = R[1];
+        } else {
+          reject("no time-varying parameter registered for id ", pid);
+        }
+        target += centred_gp_init_lpdf(
+          init_value |
+          init_dists[i],
+          init_dist_params[2 * i - 1],
+          init_dist_params[2 * i]
+        );
+      }
+    }
+  }
+
   // observed reports from mean of reports (update likelihood)
   if (likelihood) {
     profile("report lp") {
diff --git a/inst/stan/functions/params.stan b/inst/stan/functions/params.stan
index 523b12025..b758558a5 100644
--- a/inst/stan/functions/params.stan
+++ b/inst/stan/functions/params.stan
@@ -106,3 +106,36 @@ void params_lp(vector params, array[] int prior_dist,
   }
 }
 
+/**
+ * Apply user prior on the initial value of a centred-GP-wrapped parameter.
+ *
+ * When a parameter is wrapped by a centred non-stationary GP, the user-facing
+ * prior is on the initial value of the trajectory (X[1]) rather than on the
+ * sampled internal log-mean. This helper applies the prior to the derived
+ * initial value with the Jacobian correction for the linear-shift transform
+ * from log-mean to log-initial.
+ *
+ * Generic over the parameter — used by R0 today, lifts to any future
+ * time-varying parameter (alpha, dispersion, ...) via the same call.
+ *
+ * @param init_value Derived initial value of the trajectory (e.g. R[1]).
+ * @param dist_type Prior distribution type (0 = lognormal, 1 = gamma, 2 = normal).
+ * @param p1 First distribution parameter.
+ * @param p2 Second distribution parameter.
+ * @return Log-density contribution to the target, Jacobian included.
+ *
+ * @ingroup parameter_handlers
+ */
+real centred_gp_init_lpdf(real init_value, int dist_type, real p1, real p2) {
+  if (dist_type == 0) {
+    return lognormal_lpdf(init_value | p1, p2) + log(init_value);
+  } else if (dist_type == 1) {
+    return gamma_lpdf(init_value | p1, p2) + log(init_value);
+  } else if (dist_type == 2) {
+    return normal_lpdf(init_value | p1, p2) + log(init_value);
+  } else {
+    reject("centred_gp_init_lpdf: dist_type must be 0, 1, or 2");
+  }
+  return 0;
+}
+
diff --git a/man/pack_init_prior.Rd b/man/pack_init_prior.Rd
new file mode 100644
index 000000000..a44643f36
--- /dev/null
+++ b/man/pack_init_prior.Rd
@@ -0,0 +1,29 @@
+% Generated by roxygen2: do not edit by hand
+% Please edit documentation in R/utilities.R
+\name{pack_init_prior}
+\alias{pack_init_prior}
+\title{Pack a dist_spec into stan-side init-prior fields}
+\usage{
+pack_init_prior(dist)
+}
+\arguments{
+\item{dist}{A \verb{<dist_spec>} (LogNormal, Gamma, or Normal).}
+}
+\value{
+A list with elements \code{dist_type} (integer code: 0 = lognormal,
+1 = gamma, 2 = normal) and \code{params} (numeric of length 2 with the
+distribution parameters).
+}
+\description{
+For parameters wrapped in a centred non-stationary GP (today: R0), the
+user-facing prior in \code{rt_opts()} is on the initial value of the trajectory
+rather than on the sampled internal mean. This helper converts a
+\verb{<dist_spec>} into the integer/vector representation consumed by the
+generic init-prior loop in \code{inst/stan/estimate_infections.stan} (data
+items \code{init_dists} and \code{init_dist_params}).
+}
+\details{
+Generic over the parameter — designed to lift directly into a future
+general time-varying-parameter framework (issue #600).
+}
+\keyword{internal}
diff --git a/man/rt_opts.Rd b/man/rt_opts.Rd
index 570dbdb0c..b1efb5a24 100644
--- a/man/rt_opts.Rd
+++ b/man/rt_opts.Rd
@@ -18,15 +18,10 @@ rt_opts(
 )
 }
 \arguments{
-\item{prior}{A \verb{<dist_spec>} giving the prior of the reproduction number.
-With the default non-stationary GP (\code{gp_on = "R_t-1"}), the cumulated GP
-is mean-centred by construction so that this prior is on the mean
-reproduction number over the observation window. With the stationary GP
-(\code{gp_on = "R0"}) or with no GP, it is the prior on the initial
-reproduction number. Ignored if \code{use_rt} is \code{FALSE}. Defaults to a
-LogNormal distribution with mean of 1 and standard deviation of 1:
-\code{LogNormal(mean = 1, sd = 1)}. A lower limit of 0 will be enforced
-automatically.}
+\item{prior}{A \verb{<dist_spec>} giving the prior of the initial reproduction
+number. Ignored if \code{use_rt} is \code{FALSE}. Defaults to a LogNormal distribution
+with mean of 1 and standard deviation of 1: \code{LogNormal(mean = 1, sd = 1)}.
+A lower limit of 0 will be enforced automatically.}
 
 \item{use_rt}{Logical, defaults to \code{TRUE}. Should Rt be used to generate
 infections and hence reported cases.}
diff --git a/vignettes/estimate_infections.Rmd b/vignettes/estimate_infections.Rmd
index f192689c0..6c35082f0 100644
--- a/vignettes/estimate_infections.Rmd
+++ b/vignettes/estimate_infections.Rmd
@@ -82,7 +82,7 @@ Different options are available for setting a prior for $R_t$, the instantaneous
   \log R_{t} - \log R_{t-1} \sim \mathrm{GP}_t
 \end{equation}
 
-The cumulated trajectory is mean-centred so that $\log R_0 = \mathrm{mean}(\log R_t)$ over the observation window rather than the initial value; this eliminates a $(R_0, \mathrm{drift})$ ridge in the joint posterior that otherwise produced stuck chains on some seeds. More details on the mathematical form of the GP approximation and implementation are given in the [Gaussian Process implementation details](gaussian_process_implementation_details.html) vignette. Other choices for the prior of $R_t$ are available such as a GP prior for the difference between $R_t$ and its mean value (implying that in the absence of data $R_t$ will revert to its prior mean rather than the last value with robust support from the data).
+More details on the mathematical form of the GP approximation and implementation are given in the [Gaussian Process implementation details](gaussian_process_implementation_details.html) vignette. Other choices for the prior of $R_t$ are available such as a GP prior for the difference between $R_t$ and its mean value (implying that in the absence of data $R_t$ will revert to its prior mean rather than the last value with robust support from the data).
 
 \begin{equation}
   \log R_{t} - \log R_0 \sim \mathrm{GP}_t
@@ -99,7 +99,7 @@ where $\div$ indicates interval-valued division (i.e. the floor of the division)
 
 The choice of prior for the time-varying reproduction number impact run-time, smoothness of the estimates and real-time behaviour and may alter the best use-case for the model.
 
-The prior distribution of the reproduction number $R_{0}$ can be set by the user. With the default non-stationary Gaussian process model ($gp_on = "R_{t-1}"$, see below), the cumulated GP is centred so that $R_0$ represents the mean reproduction number over the observation window; with no GP or with the stationary GP, $R_0$ is the initial reproduction number.
+The prior distribution of the initial reproduction number $R_{0}$ can be set by the user.
 By default this is a log-normal distribution with mean 1 and standard deviation 1.
 
 \begin{equation}
diff --git a/vignettes/estimate_infections_options.Rmd b/vignettes/estimate_infections_options.Rmd
index 546778a39..8633d0c59 100644
--- a/vignettes/estimate_infections_options.Rmd
+++ b/vignettes/estimate_infections_options.Rmd
@@ -70,7 +70,7 @@ ggplot(reported_cases, aes(x =  date, y = confirm)) +
 
 # Parameters
 
-Before running the model we need to decide on some parameter values, in particular any delays, the generation time, and a prior on the reproduction number (the mean Rt over the observation window with the default centred non-stationary GP, otherwise the initial Rt).
+Before running the model we need to decide on some parameter values, in particular any delays, the generation time, and a prior on the initial reproduction number.
 
 ## Delays: incubation period and reporting delay
 
diff --git a/vignettes/estimate_infections_options.Rmd.orig b/vignettes/estimate_infections_options.Rmd.orig
index 25d6fdba8..11d1e1ac0 100644
--- a/vignettes/estimate_infections_options.Rmd.orig
+++ b/vignettes/estimate_infections_options.Rmd.orig
@@ -58,7 +58,7 @@ ggplot(reported_cases, aes(x =  date, y = confirm)) +
 
 # Parameters
 
-Before running the model we need to decide on some parameter values, in particular any delays, the generation time, and a prior on the reproduction number (the mean Rt over the observation window with the default centred non-stationary GP, otherwise the initial Rt).
+Before running the model we need to decide on some parameter values, in particular any delays, the generation time, and a prior on the initial reproduction number.
 
 ## Delays: incubation period and reporting delay
 
diff --git a/vignettes/estimate_infections_workflow.Rmd b/vignettes/estimate_infections_workflow.Rmd
index a62d66c09..10d66e25a 100644
--- a/vignettes/estimate_infections_workflow.Rmd
+++ b/vignettes/estimate_infections_workflow.Rmd
@@ -258,11 +258,10 @@ obs_scale <- Normal(mean = 0.4, sd = 0.01)
 obs_opts(scale = obs_scale)
 ```
 
-## Reproduction number prior
+## Initial reproduction number
 
-The default model that `estimate_infections()` uses to estimate reproduction numbers requires specification of a prior probability distribution for the reproduction number.
-With the default non-stationary Gaussian process the cumulated GP is mean-centred so this prior is on the mean reproduction number over the observation window; with no GP or with the stationary GP it is the prior on the initial reproduction number.
-It represents the user's prior belief of the central value of the reproduction number, before any data informs it.
+The default model that `estimate_infections()` uses to estimate reproduction numbers requires specification of a prior probability distribution for the initial reproduction number.
+This represents the user's initial belief of the value of the reproduction number, where there is no data yet to inform its value.
 By default this is assumed to be represented by a lognormal distribution with mean and standard deviation of 1.
 It can be changed using the `rt_opts()` function.
 For example, if the user believes that at the very start of the data the reproduction number was 2, with uncertainty in this belief represented by a standard deviation of 1, they would use
diff --git a/vignettes/estimate_infections_workflow.Rmd.orig b/vignettes/estimate_infections_workflow.Rmd.orig
index dbc2fce21..b29fde4ac 100644
--- a/vignettes/estimate_infections_workflow.Rmd.orig
+++ b/vignettes/estimate_infections_workflow.Rmd.orig
@@ -192,11 +192,10 @@ obs_scale <- Normal(mean = 0.4, sd = 0.01)
 obs_opts(scale = obs_scale)
 ```
 
-## Reproduction number prior
+## Initial reproduction number
 
-The default model that `estimate_infections()` uses to estimate reproduction numbers requires specification of a prior probability distribution for the reproduction number.
-With the default non-stationary Gaussian process the cumulated GP is mean-centred so this prior is on the mean reproduction number over the observation window; with no GP or with the stationary GP it is the prior on the initial reproduction number.
-It represents the user's prior belief of the central value of the reproduction number, before any data informs it.
+The default model that `estimate_infections()` uses to estimate reproduction numbers requires specification of a prior probability distribution for the initial reproduction number.
+This represents the user's initial belief of the value of the reproduction number, where there is no data yet to inform its value.
 By default this is assumed to be represented by a lognormal distribution with mean and standard deviation of 1.
 It can be changed using the `rt_opts()` function.
 For example, if the user believes that at the very start of the data the reproduction number was 2, with uncertainty in this belief represented by a standard deviation of 1, they would use

From 3351aa57878e068851c150228e7576fa0007e4cd Mon Sep 17 00:00:00 2001
From: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com>
Date: Wed, 20 May 2026 17:52:54 +0100
Subject: [PATCH 13/13] fix ci: provide empty init prior data items in
 simulate_infections

simulate_infections.stan includes the shared
data/estimate_infections_params.stan which now declares the init prior
data block. simulate_infections does not fit, so set the items to empty
defaults (n_init_priors = 0, empty arrays).

Co-authored-by: sbfnk <sebastian.funk@lshtm.ac.uk>
---
 R/simulate_infections.R | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/R/simulate_infections.R b/R/simulate_infections.R
index af35ee884..0288d3dc4 100644
--- a/R/simulate_infections.R
+++ b/R/simulate_infections.R
@@ -198,6 +198,14 @@ simulate_infections <- function(R,
   ## set empty params matrix - variable parameters not supported here
   stan_data$params <- array(dim = c(1, 0))
 
+  ## init priors: not used in forward simulation (no model fitting). Provide
+  ## empty defaults so the shared estimate_infections_params.stan data block
+  ## is satisfied.
+  stan_data$n_init_priors <- 0L
+  stan_data$init_param_ids <- array(integer(0))
+  stan_data$init_dists <- array(integer(0))
+  stan_data$init_dist_params <- array(numeric(0))
+
   ## day of week effect
   if (is.null(day_of_week_effect)) {
     day_of_week_effect <- rep(1, stan_data$week_effect)