A package for Calibrated posterior predictive p-values.
It provides a general framework for a MCMC engine—to:
- Compute calibrated posterior p-values (cppp),
- Estimate their Monte Carlo variance using the idea of the transfer effective sample size (ESS)
- Can handle different MCMC engines. NIMBLE and R for now, other MCMC engines later.
Given data
- Run an long MCMC chain to obtain draws from the posterior
$p(\theta \mid y)$ . With$M$ draws, we sample new datafrom the posterior predictive of the data$p(y^* \mid \theta_i)$ and compute
This chain of
-
Generate
$r$ calibration replicates:- simulate new datasets
$\tilde{y}_j$ from the model, - run short chains of length $\tilde{m} $ for
$p(\theta \mid \tilde{y}_j)$ , - compute short-run posterior-predictive p-values
$\hat{p}_j$ .
- simulate new datasets
-
Estimate the CPPP:
- Estimate the Monte Carlo variance using the transfer ESS idea:
match each
$\hat{p}_j$ to a quantile on the observed$\Delta$ -chain, compute the transfer autocorrelation, and estimate the cppp variance.
| Function | Purpose |
|---|---|
runCalibration() |
Generic function for calibration |
runCalibrationNIMBLE() |
NIMBLE-specific setup that builds the generic inputs and calls runCalibration(). |
| Function | Role |
|---|---|
make_col_disc_fun() |
Returns a function that extracts an “online” discrepancy column. |
make_offline_disc_fun() |
Returns a function that computes discrepancies “offline.” |
Other possible helpers?
| Function | Role |
|---|---|
make_data_sim_fun() |
Returns a function that simulates new datasets |
make_MCMCfun() |
?? |
compute_cppp() |
Computes cppp from |
transfer_ess_variance() |
Estimates cppp variance via transfer ESS. |
cpppResult (S3)
- cppp estimate, se, confidence interval
- observed ppp, replicated ppp
- observed discrepancies, replicated discrepancies (optional?)
- informations about the calibration procedure? number of replicates and mcmc per replicated
- methods:
print,summary,plot
-
Build your MCMC engine
- For NIMBLE: configure a model and compiled MCMC object.
- Otherwise: supply a function
MCMC_fun(new_data, control)that runs an MCMC and returns samples.
-
Provide a data simulator
-
new_data_fun()draws a dataset$\tilde{y}$ from the model posterior predictive.
-
-
Provide a discrepancy handler
- Online: the discrepancy or PPP is computed during MCMC; just extract the column.
-
Offline: compute
$D(y,\theta)$ and$D(y^*,\theta)$ after sampling.
-
Run calibration
- Call
runCalibration()with your functions and number of replicates$r$ . - the function iteratively: simulate → run short chain → compute
$\hat{p}_j$ .
- Call
-
Compute cppp and variance
compute_cppp(p_hat_obs, p_hat_cal)transfer_ess_variance(delta_chain, p_hat_obs, p_hat_cal, m_tilde)
# define simulator for new datasets
new_data_fun <- make_data_sim_fun(...)
# create a discrepancy extractor
disc_fun <- make_col_disc_fun("ppp_column")
# or an offline discrepancy
disc_fun <- make_offline_disc_fun(control = list(
new_data_fun = new_data_fun,
discrepancy = discrepancy))
# orchestrate calibration
res <- runCalibration(MCMC_samples_obs, MCMC_fun, new_data_fun, disc_fun,
num_reps = 50, control = list(seed = 123))
# compute cppp and variance
cppp_val <- compute_cppp(res$p_hat_obs, res$p_hat_cal)
var_out <- transfer_ess_variance(res$delta_chain, res$p_hat_obs, res$p_hat_cal,
m_tilde = res$m_tilde, c = 1.3)
summary(var_out)