Skip to content

nimble-dev/cppp

Repository files navigation

cppp

A package for Calibrated posterior predictive p-values.

It provides a general framework for a MCMC engine—to:

  1. Compute calibrated posterior p-values (cppp),
  2. Estimate their Monte Carlo variance using the idea of the transfer effective sample size (ESS)
  3. Can handle different MCMC engines. NIMBLE and R for now, other MCMC engines later.

Concept

Given data $y$, a model $p(\theta \mid y) \propto p(y \mid \theta) \pi (\theta)$, and a discrepancy function $D(y,\theta)$:

  1. Run an long MCMC chain to obtain draws from the posterior $p(\theta \mid y)$. With $M$ draws, we sample new datafrom the posterior predictive of the data $p(y^* \mid \theta_i)$ and compute

$$ \Delta_i = D(y^*_i, \theta_i) - D(y, \theta_i), $$

This chain of $\Delta = { \Delta_i }$ collects the observed discrepancies.

  1. Generate $r$ calibration replicates:

    • simulate new datasets $\tilde{y}_j$ from the model,
    • run short chains of length $\tilde{m} $ for $p(\theta \mid \tilde{y}_j)$,
    • compute short-run posterior-predictive p-values $\hat{p}_j$.
  2. Estimate the CPPP:

$$ \widehat{\text{cppp}} = \frac{1}{r}\sum_{j=1}^r \mathbf{1}{\hat{p}_j \le \hat{p}_{\text{obs}}}. $$

  1. Estimate the Monte Carlo variance using the transfer ESS idea: match each $\hat{p}_j$ to a quantile on the observed $\Delta$ -chain, compute the transfer autocorrelation, and estimate the cppp variance.

Package architecture

Main functions

Function Purpose
runCalibration() Generic function for calibration
runCalibrationNIMBLE() NIMBLE-specific setup that builds the generic inputs and calls runCalibration().

Helpers

Function Role
make_col_disc_fun() Returns a function that extracts an “online” discrepancy column.
make_offline_disc_fun() Returns a function that computes discrepancies “offline.”

Other possible helpers?

Function Role
make_data_sim_fun() Returns a function that simulates new datasets $\tilde y$.
make_MCMCfun() ??
compute_cppp() Computes cppp from $\hat p_{\text{obs}}$ and $\hat p_j$.
transfer_ess_variance() Estimates cppp variance via transfer ESS.

Data object

cpppResult (S3)

  • cppp estimate, se, confidence interval
  • observed ppp, replicated ppp
  • observed discrepancies, replicated discrepancies (optional?)
  • informations about the calibration procedure? number of replicates and mcmc per replicated
  • methods: print, summary, plot

Typical workflow

  1. Build your MCMC engine

    • For NIMBLE: configure a model and compiled MCMC object.
    • Otherwise: supply a function MCMC_fun(new_data, control) that runs an MCMC and returns samples.
  2. Provide a data simulator

    • new_data_fun() draws a dataset $\tilde{y}$ from the model posterior predictive.
  3. Provide a discrepancy handler

    • Online: the discrepancy or PPP is computed during MCMC; just extract the column.
    • Offline: compute $D(y,\theta)$ and $D(y^*,\theta)$ after sampling.
  4. Run calibration

    • Call runCalibration() with your functions and number of replicates $r$.
    • the function iteratively: simulate → run short chain → compute $\hat{p}_j$.
  5. Compute cppp and variance

    • compute_cppp(p_hat_obs, p_hat_cal)
    • transfer_ess_variance(delta_chain, p_hat_obs, p_hat_cal, m_tilde)

Example

# define simulator for new datasets
new_data_fun <- make_data_sim_fun(...)

# create a discrepancy extractor
disc_fun <- make_col_disc_fun("ppp_column")

# or an offline discrepancy
disc_fun <- make_offline_disc_fun(control = list(
  new_data_fun = new_data_fun,
  discrepancy  = discrepancy))


# orchestrate calibration
res <- runCalibration(MCMC_samples_obs, MCMC_fun, new_data_fun, disc_fun,
                      num_reps = 50, control = list(seed = 123))

# compute cppp and variance
cppp_val <- compute_cppp(res$p_hat_obs, res$p_hat_cal)
var_out  <- transfer_ess_variance(res$delta_chain, res$p_hat_obs, res$p_hat_cal,
                                  m_tilde = res$m_tilde, c = 1.3)

summary(var_out)

About

Calibrated posterior predictive p-values

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors

Languages