
Feature parity with Google's CausalImpact #758

@drbenvincent

Description


Context: Google's CausalImpact (R package) and its accompanying paper (Brodersen et al., 2015) define the gold standard for Bayesian time-series causal inference using structural time-series (BSTS) models. CausalPy's InterruptedTimeSeries and SyntheticControl experiments cover much of the same ground, but there are actionable gaps worth closing so that CausalPy can be a credible Python-native alternative. This issue tracks those gaps and cross-references existing issues/PRs where work is already underway.

Scope

CausalImpact focuses on a single use-case: estimating the causal impact of an intervention on a time series using BSTS with control covariates. CausalPy is broader (DiD, RD, IV, ...), so this issue is scoped specifically to time-series causal inference: ITS, Synthetic Control, and the BSTS model classes.

Already at parity (or better)

These CausalImpact features already have CausalPy equivalents -- no work needed:

| CausalImpact feature | CausalPy equivalent |
| --- | --- |
| Pre/post period specification (indices or dates) | `treatment_time` accepts `int`, `float`, or `pd.Timestamp` |
| Three-period design (intervention ends, effect persists) | `treatment_end_time` parameter, `analyze_persistence()` method (closed #559) |
| 3-panel plot (observed + counterfactual, pointwise, cumulative) | `result.plot()` produces the same 3 panels with HDI bands |
| Summary table (Average, Cumulative, CI, posterior prob) | `result.effect_summary()` returns `.table` DataFrame and `.text` prose |
| MCMC customization (`niter`) | `sample_kwargs` dict passed to PyMC (`draws`, `tune`, `chains`, etc.) |
| Verbal/prose report | `effect_summary().text` generates prose with average, cumulative, and relative effects |
| Custom model support | Inherit from `PyMCModel` or pass any sklearn `RegressorMixin` (implemented in #740, documented in PR #741) |
| Adjustable credible interval width (`alpha`) | `effect_summary(alpha=...)` and `get_plot_data_bayesian(hdi_prob=...)` |
| Multiple control time series as covariates | Via patsy formula (`"y ~ 1 + t + x1 + x2"`) or SC `control_units` |
| Relative (%) effects | `effect_summary(relative=True)` with HDI intervals on relative effects |

CausalPy also offers capabilities beyond CausalImpact: both Bayesian and OLS backends, piecewise ITS (#613), ROPE analysis, comparative ITS, windowed effect analysis, and three-period persistence analysis.


Outstanding gaps

Below are the concrete gaps, grouped by priority. Each item includes what CausalImpact does, what CausalPy currently has, what needs to be done, and links to related issues/PRs.

P0 -- Graduate the BSTS model to production-ready

What CausalImpact does: Its entire value proposition is a ready-to-use BSTS model (Section 2 of the paper) with:

  • Local linear trend (Eqs. 3-4 in the paper) with optional AR(1) slope dampening
  • Seasonal components (Eq. 5) with configurable period and duration, plus support for multiple seasonal components
  • Static regression with spike-and-slab priors for automatic covariate selection (Section 2.2)
  • Dynamic regression with time-varying coefficients (Eq. 6)
  • Diffuse empirical priors scaled by data variance (Section 2.2)
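
For reference, the generative process these components describe can be forward-simulated in a few lines. This is a hypothetical numpy sketch of the paper's equations, not CausalPy code:

```python
import numpy as np
from collections import deque

rng = np.random.default_rng(0)


def simulate_bsts(n, period=7, sigma_level=0.1, sigma_slope=0.01,
                  sigma_season=0.05, sigma_obs=0.5, rho=0.9):
    """Forward-simulate a local linear trend (Eqs. 3-4) with AR(1)
    slope dampening plus a dummy seasonal component (Eq. 5)."""
    level, slope = 0.0, 0.0
    # Last period-1 seasonal states; each new state pushes their sum
    # toward zero over a full cycle.
    seasons = deque(rng.normal(0, sigma_season, period - 1),
                    maxlen=period - 1)
    y = np.empty(n)
    for t in range(n):
        s_t = -sum(seasons) + rng.normal(0, sigma_season)
        y[t] = level + s_t + rng.normal(0, sigma_obs)
        seasons.append(s_t)
        level += slope + rng.normal(0, sigma_level)       # Eq. 3
        slope = rho * slope + rng.normal(0, sigma_slope)  # Eq. 4, dampened
    return y


y_sim = simulate_bsts(200)
print(y_sim.shape)  # (200,)
```

Whichever class graduates should be able to recover these components from data like this.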

What CausalPy currently has: Two experimental model classes, both carrying FutureWarning:

  • BayesianBasisExpansionTimeSeries -- Fourier seasonality + trend changepoints (depends on pymc-marketing)
  • StateSpaceTimeSeries -- state-space model via pymc-extras (depends on pymc-extras.statespace)

Both were merged via #473 (original BSTS issue: #52, tech-debt tracked in #570 which is now closed). The refactoring file (BSTS_REFACTORING_CONCERNS.md) has been cleaned up, but the models remain experimental.

What needs to be done:

  • Choose the graduation path. Decide whether to promote BayesianBasisExpansionTimeSeries, StateSpaceTimeSeries, or both. Key consideration: StateSpaceTimeSeries is architecturally closest to CausalImpact (a true state-space model), but BayesianBasisExpansionTimeSeries is more accessible. Both have outstanding API inconsistencies (items from #570, "Pay off BSTS tech debt"). Document the decision.
  • Fix remaining API inconsistencies. The BSTS models use np.ndarray signatures where all other PyMCModel subclasses use xr.DataArray. This breaks Liskov Substitution and forces isinstance() checks in InterruptedTimeSeries. Align the data interface.
  • Add spike-and-slab covariate selection. CausalImpact's defining feature is automatic selection from many candidate control series using spike-and-slab priors (Section 2.2 of the paper, expected model size M, Zellner's g-prior). CausalPy already has VariableSelectionPrior with spike-and-slab and horseshoe priors, but these are only wired to InstrumentalVariable experiments. Wire them to the BSTS model + ITS experiment so users can pass many candidate covariates and let the model select.
  • Remove the FutureWarning. Once the API is stable, remove the experimental warning.
  • Add documentation. This is tracked by #578 ("Update docs to demonstrate new BSTS functionality", open) and #696 ("Good introduction to BSTS", open). Produce at least one notebook demonstrating the BSTS model with ITS, including trend + seasonality + covariates.
  • Add tests for edge cases. Short series, no seasonality, many covariates, missing seasonality component, etc.
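
For intuition on the spike-and-slab task above: the prior is a two-component mixture. A minimal numpy sketch of draws from it (illustrative only, not the VariableSelectionPrior implementation):

```python
import numpy as np

rng = np.random.default_rng(42)


def spike_and_slab_draws(n_draws, pi=0.2, slab_sd=1.0):
    """Draw coefficients from a spike-and-slab prior: with probability
    pi the coefficient comes from the 'slab' (a wide normal), otherwise
    from the 'spike' (a point mass at exactly zero)."""
    included = rng.random(n_draws) < pi           # inclusion indicators
    slab = rng.normal(0.0, slab_sd, n_draws)      # slab component
    return np.where(included, slab, 0.0), included


draws, included = spike_and_slab_draws(10_000, pi=0.2)
print(round(included.mean(), 2))  # empirical inclusion rate, close to pi
```

In the fitted model, the posterior over the inclusion indicators is what drives automatic covariate selection.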

Related: #52 (closed), #473 (merged), #570 (closed), #578 (open), #696 (open), #565 (open)


P1 -- Simpler entry point for time-series causal inference

What CausalImpact does: One-liner API with sensible defaults:

```r
impact <- CausalImpact(data, pre.period, post.period)
```

The model (BSTS), prior specification, covariate selection, and MCMC configuration are all handled automatically. Users can override via model.args.

What CausalPy currently has: Users must separately choose a model class, write a formula, and configure sample_kwargs:

```python
result = cp.InterruptedTimeSeries(
    data,
    treatment_time,
    formula="y ~ 1 + t",
    model=cp.pymc_models.LinearRegression(sample_kwargs={"draws": 1000}),
)
```

What needs to be done:

  • Add sensible defaults to InterruptedTimeSeries. If no model is provided, default to the graduated BSTS model with reasonable priors. If no formula is provided, auto-detect the outcome column (first column or user-specified) and use remaining columns as covariates -- mirroring CausalImpact's convention.
  • Consider a convenience constructor or factory. Something like:

    ```python
    result = cp.InterruptedTimeSeries(data, treatment_time)
    # or
    result = cp.causal_impact(data, treatment_time)  # thin wrapper
    ```

    This doesn't replace the full API; it complements it for the common case.
  • Document the "quick start" path prominently in the ITS/BSTS notebook.
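
The auto-detection convention above could look roughly like this (default_formula is a hypothetical helper, not part of the CausalPy API):

```python
import pandas as pd


def default_formula(data, outcome=None):
    """Hypothetical helper: mirror CausalImpact's convention that the
    first column is the outcome unless specified, and all remaining
    columns become covariates."""
    outcome = outcome or data.columns[0]
    covariates = [c for c in data.columns if c != outcome]
    rhs = " + ".join(["1"] + covariates)
    return f"{outcome} ~ {rhs}"


df = pd.DataFrame({"y": [1.0, 2.0], "x1": [0.1, 0.2], "x2": [3.0, 4.0]})
print(default_formula(df))  # y ~ 1 + x1 + x2
```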

Related: #740 (closed), PR #741 (merged)


P2 -- Missing value handling in the response variable

What CausalImpact does: Explicitly supports NA in the response column (covariates must be complete). The Kalman filter in the BSTS model naturally handles missing observations by skipping the update step.

What CausalPy currently has: No explicit handling. Missing values in the response will cause errors or silent numerical issues.

What needs to be done:

  • For the state-space model (StateSpaceTimeSeries): Missing values should be handled natively by the Kalman filter. Verify this works and add a test.
  • For BayesianBasisExpansionTimeSeries and LinearRegression: Either mask missing observations or raise a clear error. Document the limitation.
  • Validate in the experiment class. InterruptedTimeSeries should check for NAs and dispatch appropriately (pass through for state-space models, error/warn for others).
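
The validation/dispatch step might look like this sketch (validate_response and model_handles_missing are hypothetical names, not CausalPy API):

```python
import numpy as np
import pandas as pd


def validate_response(y, model_handles_missing):
    """Hypothetical sketch: state-space models can pass NaNs through,
    because the Kalman filter simply skips the update step for missing
    observations; other models should fail loudly rather than produce
    silent numerical issues."""
    n_missing = int(y.isna().sum())
    if n_missing and not model_handles_missing:
        raise ValueError(
            f"{n_missing} missing value(s) in the response; this model "
            "cannot handle them. Drop or impute before fitting."
        )
    return y


y = pd.Series([1.0, np.nan, 3.0])
validate_response(y, model_handles_missing=True)  # passes through unchanged
```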

P3 -- Detailed narrative report

What CausalImpact does: summary(impact, "report") generates a multi-paragraph plain-language interpretation, e.g.:

"During the post-intervention period, the response variable had an average value of approx. 117. By contrast, in the absence of an intervention, we would have expected an average response of 107. The 95% interval of this counterfactual prediction is [106, 107]. Subtracting this prediction from the observed response yields an estimate of the causal effect the intervention had on the response variable. This effect is 11 with a 95% interval of [9.8, 11]. Summing up the individual data points during the post-intervention period, the response variable had an overall value of 3511. By contrast, had the intervention not taken place, we would have expected a sum of 3196. The 95% interval of this prediction is [3174, 3217]."

What CausalPy currently has: effect_summary().text generates 2-3 sentences with numbers and HDI intervals. Functional but terse. Some report-generation pipeline groundwork has landed via #747 / PR #749, but this does not yet provide a CausalImpact-style detailed narrative report.

What needs to be done:

  • Add a report_style="detailed" option (or a separate generate_report() method) to effect_summary() that produces multi-paragraph CausalImpact-style prose. Include:
    • Plain-language description of observed vs counterfactual
    • Interpretation of whether the effect is statistically credible (HDI excludes zero? posterior probability > 0.95?)
    • Explicit statement about assumptions (covariates not affected by treatment, stable pre-period relationship)
    • Guidance on when to be cautious about the results
  • Consider integrating with maketables for professional tabular output (#579, "Tabular output with maketables"; PR #600, "maketables plug in").
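
A detailed report is essentially templating over posterior summaries. A minimal sketch seeded with the numbers from the CausalImpact example above (detailed_report and its signature are hypothetical, not CausalPy API):

```python
def detailed_report(obs_mean, pred_mean, pred_lo, pred_hi,
                    obs_sum, pred_sum, sum_lo, sum_hi, hdi_prob=0.95):
    """Hypothetical sketch of a CausalImpact-style narrative built
    from posterior summaries."""
    pct = int(hdi_prob * 100)
    effect = obs_mean - pred_mean
    return (
        f"During the post-intervention period, the response variable had an "
        f"average value of approx. {obs_mean:.0f}. In the absence of an "
        f"intervention, we would have expected an average response of "
        f"{pred_mean:.0f} ({pct}% interval [{pred_lo:.0f}, {pred_hi:.0f}]).\n\n"
        f"Subtracting this prediction from the observed response yields an "
        f"estimated causal effect of {effect:.0f}. Summed over the "
        f"post-period, the observed total was {obs_sum:.0f} versus an "
        f"expected {pred_sum:.0f} ({pct}% interval "
        f"[{sum_lo:.0f}, {sum_hi:.0f}])."
    )


report = detailed_report(117, 107, 106, 107, 3511, 3196, 3174, 3217)
print(report)
```

The real version would also append the assumption statements and cautions listed above.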

Related: #579 (open), PR #600 (open), #747 (closed), PR #749 (merged)


P4 -- Component decomposition plots

What CausalImpact does: The underlying bsts package can decompose and plot individual model components (trend, seasonality, regression contribution) separately. This is essential for model diagnosis and understanding what drives the counterfactual.

What CausalPy currently has: The standard 3-panel plot. No component-level visualization.

What needs to be done:

  • Add a plot_components() method to the BSTS model classes that decomposes and plots each component (trend, seasonality, regression) with posterior uncertainty bands.
  • Ensure the BSTS models expose component-level posteriors. The model's InferenceData object should contain named posterior groups for each component, not just the combined mu.
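
To illustrate what component-level posteriors enable, here is a numpy sketch that summarizes synthetic per-component posterior draws the way a plot_components() panel would (the data are fake; plot_components() is the proposed, not an existing, method):

```python
import numpy as np

rng = np.random.default_rng(1)

# Fake posterior draws for three named components over 100 time steps.
# In a graduated BSTS model these would come from named groups in the
# InferenceData object rather than the combined mu.
draws, n = 500, 100
t = np.arange(n)
components = {
    "trend": 0.05 * t + rng.normal(0, 0.1, (draws, n)),
    "seasonality": np.sin(2 * np.pi * t / 7) + rng.normal(0, 0.1, (draws, n)),
    "regression": 0.3 + rng.normal(0, 0.05, (draws, n)),
}

# A plot_components() implementation would draw one panel per entry,
# each with a posterior mean line and an uncertainty band.
summaries = {
    name: (s.mean(axis=0), *np.percentile(s, [2.5, 97.5], axis=0))
    for name, s in components.items()
}
for name, (mean, lo, hi) in summaries.items():
    print(name, mean.shape)
```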

Related: #565 (time series decomposition plots for BSTS -- explicitly blocked on #52 which is now closed, so this is unblocked)


P5 -- Automatic data standardization

What CausalImpact does: Standardizes all columns before fitting (default standardize.data = TRUE), acting as an empirical Bayes approach to setting prior scales. The paper (Section 2.2) describes scaling priors by the sample variance of the target series.

What CausalPy currently has: No built-in standardization. Users must standardize manually.

What needs to be done:

  • Add a standardize=True option to the BSTS model or to InterruptedTimeSeries. When enabled, standardize covariates and outcome before fitting, then back-transform results for reporting. This improves numerical stability and makes default priors more robust.
  • Alternative: Scale the default BSTS priors by the empirical data variance (as CausalImpact does) rather than standardizing the data. Either approach achieves the same goal; the prior-scaling approach avoids user-visible data transformation.
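
The standardize/back-transform round trip is straightforward. A minimal numpy sketch (Standardizer is a hypothetical name, not CausalPy API):

```python
import numpy as np


class Standardizer:
    """Hypothetical sketch: fit on the observed series, transform
    before modelling, invert predictions for reporting on the
    original scale."""

    def fit(self, y):
        self.mean_, self.std_ = y.mean(), y.std()
        return self

    def transform(self, y):
        return (y - self.mean_) / self.std_

    def inverse_transform(self, z):
        return z * self.std_ + self.mean_


y = np.array([10.0, 12.0, 14.0, 16.0])
scaler = Standardizer().fit(y)
z = scaler.transform(y)
print(np.allclose(scaler.inverse_transform(z), y))  # True
```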

P6 -- Predictor inclusion probability visualization

What CausalImpact does: plot(impact$model$bsts.model, "coefficients") shows posterior inclusion probabilities for each covariate, helping users understand which controls drive the counterfactual.

What CausalPy currently has: VariableSelectionPrior with get_inclusion_probabilities() exists, but is only wired to InstrumentalVariable and has no built-in plotting.

What needs to be done:

  • Wire VariableSelectionPrior to the BSTS model + ITS (overlaps with the P0 spike-and-slab task).
  • Add a plot_inclusion_probabilities() helper that visualizes posterior inclusion probabilities as a horizontal bar chart (matching CausalImpact's coefficient plot).
  • For horseshoe priors: Add a plot_shrinkage_factors() helper using the existing get_shrinkage_factors() method.
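
Computing the quantity the proposed plot would display is simple once indicator samples are exposed. A numpy sketch using fake posterior indicator draws:

```python
import numpy as np

rng = np.random.default_rng(7)

# Fake posterior samples of spike-and-slab inclusion indicators
# (draws x covariates); a stand-in for real posterior output.
names = ["x1", "x2", "x3"]
indicator_draws = rng.random((4000, 3)) < np.array([0.9, 0.5, 0.1])

# Posterior inclusion probability = fraction of draws in which each
# covariate is included; this is what the bar chart would display.
pip = indicator_draws.mean(axis=0)
for name, p in sorted(zip(names, pip), key=lambda kv: -kv[1]):
    print(f"{name}: {p:.2f}")
```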

P7 -- Power analysis / experiment duration planning

What CausalImpact does (from the paper, Section 5): Recommends using the model on hypothetical interventions in the pre-period to estimate what effect size would be detectable. The paper (Section 3) includes power curve analysis.

What CausalPy currently has: Placebo-in-time analysis has landed in core (#744, closed), but there is no dedicated power-analysis or experiment-duration tooling yet (#569 and #721 remain open).

What needs to be done:

  • Add a power-analysis helper that runs the model on hypothetical interventions in the pre-period to estimate the smallest detectable effect size, building on the placebo-in-time machinery.
  • Add an experiment-duration planning tool that estimates how long the post-period must run to detect a given effect size (#721).

Related: #744 (closed), #569 (open), #721 (open), #722 (closed), PR #749 (merged)
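
The pre-period placebo idea can be sketched end-to-end with a crude detection rule standing in for a full model refit (placebo_power is hypothetical, not CausalPy API):

```python
import numpy as np

rng = np.random.default_rng(3)


def placebo_power(y_pre, effect_sizes, n_placebo=200, window=10):
    """Hypothetical power-curve sketch: inject a synthetic effect into
    random placebo windows of the pre-period and count how often a
    crude 5% two-sided z-test detects it. A real implementation would
    refit the causal model for each placebo."""
    sd = y_pre.std(ddof=1)
    curve = []
    for effect in effect_sizes:
        hits = 0
        for _ in range(n_placebo):
            start = rng.integers(0, len(y_pre) - window)
            segment = y_pre[start:start + window] + effect
            z = (segment.mean() - y_pre.mean()) / (sd / np.sqrt(window))
            hits += abs(z) > 1.96
        curve.append(hits / n_placebo)
    return curve


y_pre = rng.normal(100, 5, size=300)
curve = placebo_power(y_pre, effect_sizes=[0.0, 2.0, 5.0])
print(curve)  # detection rate rises with the injected effect size
```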


Non-goals

These are CausalImpact features we intentionally do not need to replicate:

  • R-specific API conventions (S3 methods, zoo objects) -- CausalPy uses Python/pandas/xarray.
  • bsts package integration -- CausalPy uses PyMC, which is more flexible.
  • Single-model-only design -- CausalPy's dual Bayesian/OLS support is a feature.
  • C++ MCMC implementation -- PyMC's sampling backends (NUTS, JAX/NumPyro) are competitive.

Related issues and PRs (cross-reference index)

| Issue/PR | Relevance |
| --- | --- |
| #52 (closed) | Original BSTS feature request |
| #473 (merged) | BSTS model implementation |
| #559 (closed) | Three-period ITS design |
| #565 (open) | Component decomposition plots |
| #569 (open) | Bayesian power analysis |
| #570 (closed) | BSTS tech debt |
| #578 (open) | BSTS documentation |
| #579 (open) | maketables integration |
| #613 (closed) | PiecewiseITS experiment |
| #696 (open) | BSTS introduction notebook |
| #697 (open) | Comparison with other causal libraries |
| #721 (open) | Experiment duration tool |
| #722 (closed) | Pipeline / workflow |
| #740 (closed) | Custom PyMC models |
| #744 (closed) | Placebo-in-time to core |
| #747 (closed) | GenerateReport pipeline step |
| PR #600 (open) | maketables plugin |
| PR #643 (open) | Response type (expectation vs prediction intervals) |
| PR #741 (merged) | Custom model guide |
| PR #749 (merged) | Pipeline implementation |
