Model details and identification

This vignette describes the statistical model underlying meta_did(), how baseline normalisation works, and what each study design can and cannot identify.

The latent DiD model

metadid assumes that every study, regardless of design, arises from a common latent difference-in-differences (DiD) data-generating process. For study $i$, outcomes in the control group follow

$\begin{pmatrix} Y_{i,c,\mathrm{pre}} \\ Y_{i,c,\mathrm{post}} \end{pmatrix} \sim \mathcal{N} \left[ \begin{pmatrix} \alpha_i \\ \alpha_i + \beta_i \end{pmatrix} , \begin{pmatrix} \sigma^2_{i,c,\mathrm{pre}} & \rho_{i,c}\,\sigma_{i,c,\mathrm{pre}}\,\sigma_{i,c,\mathrm{post}} \\ \rho_{i,c}\,\sigma_{i,c,\mathrm{pre}}\,\sigma_{i,c,\mathrm{post}} & \sigma^2_{i,c,\mathrm{post}} \end{pmatrix} \right],$

and outcomes in the treatment group follow

$\begin{pmatrix} Y_{i,t,\mathrm{pre}} \\ Y_{i,t,\mathrm{post}} \end{pmatrix} \sim \mathcal{N} \left[ \begin{pmatrix} \alpha_i + \gamma_i \\ \alpha_i + \gamma_i + \beta_i + \theta_i \end{pmatrix} , \begin{pmatrix} \sigma^2_{i,t,\mathrm{pre}} & \rho_{i,t}\,\sigma_{i,t,\mathrm{pre}}\,\sigma_{i,t,\mathrm{post}} \\ \rho_{i,t}\,\sigma_{i,t,\mathrm{pre}}\,\sigma_{i,t,\mathrm{post}} & \sigma^2_{i,t,\mathrm{post}} \end{pmatrix} \right].$

The parameters are:

$\alpha_i$: baseline mean in the control group
$\beta_i$: time trend shared across groups
$\gamma_i$: baseline difference between treatment and control groups
$\theta_i$: study-specific treatment effect (the DiD estimand)
$\rho_{i,c}$, $\rho_{i,t}$: pre/post correlations within each group
$\sigma_{i,g,s}$: marginal standard deviations for group $g \in \{c, t\}$ at time $s \in \{\mathrm{pre}, \mathrm{post}\}$

The key identifying assumption is that, absent treatment, the treatment group would have followed the same time trend $\beta_i$ as the control group (i.e., the parallel trends assumption).
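As a minimal sketch of this data-generating process, the base R code below builds the mean vector and covariance matrix for one arm and draws (pre, post) pairs from it. The helper names arm_moments() and draw_arm() and all parameter values are hypothetical, chosen only for illustration; they are not part of metadid.

```r
# Sketch only: moments of one arm of the latent DiD model (hypothetical helper).
arm_moments <- function(alpha, beta, gamma, theta, sd_pre, sd_post, rho,
                        treated = FALSE) {
  # Control arm: (alpha, alpha + beta).
  # Treatment arm: gamma shifts both time points, theta shifts post only.
  shift  <- if (treated) gamma else 0
  effect <- if (treated) theta else 0
  mu <- c(pre = alpha + shift, post = alpha + shift + beta + effect)
  cov_pp <- rho * sd_pre * sd_post
  Sigma <- matrix(c(sd_pre^2, cov_pp, cov_pp, sd_post^2), nrow = 2,
                  dimnames = list(names(mu), names(mu)))
  list(mu = mu, Sigma = Sigma)
}

# Draw n (pre, post) pairs from a bivariate normal using base R only.
draw_arm <- function(n, moments) {
  # chol() returns upper-triangular R with t(R) %*% R = Sigma, so the rows
  # of z %*% R have covariance Sigma when z holds independent N(0, 1) draws.
  z <- matrix(rnorm(2 * n), ncol = 2)
  out <- sweep(z %*% chol(moments$Sigma), 2, moments$mu, "+")
  colnames(out) <- names(moments$mu)
  out
}

# Hypothetical parameter values.
ctrl <- arm_moments(10, 2, 1, -3, sd_pre = 2, sd_post = 2.5, rho = 0.6)
trt  <- arm_moments(10, 2, 1, -3, sd_pre = 2, sd_post = 2.5, rho = 0.6,
                    treated = TRUE)

set.seed(1)
y_ctrl <- draw_arm(200, ctrl)
y_trt  <- draw_arm(200, trt)

# Sample analogue of the double difference (post minus pre, treated minus control).
did_hat <- (mean(y_trt[, "post"]) - mean(y_trt[, "pre"])) -
  (mean(y_ctrl[, "post"]) - mean(y_ctrl[, "pre"]))
```

With these values the expected cell means are (10, 12) in the control arm and (11, 10) in the treatment arm, so did_hat should lie near $\theta_i = -3$ up to sampling noise.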

What each design observes

Different study designs observe different subsets of this latent bivariate structure:

Design	Groups observed	Time points observed	Cells of the 2×2 table
DiD	Control + Treatment	Pre + Post	All 4
DiD (change only)	Control + Treatment	Change scores only	Differences of 2 pairs
RCT	Control + Treatment	Post only	2
Pre-post	Treatment only	Pre + Post	2

Designs that observe fewer cells have less ability to separate the parameters $\alpha$, $\beta$, $\gamma$, and $\theta$.

Identification

DiD studies: fully identified

DiD studies observe all four cells of the 2×2 table. From the mean structure above, the four expected cell means are:

	Pre	Post
Control	$\alpha_i$	$\alpha_i + \beta_i$
Treatment	$\alpha_i + \gamma_i$	$\alpha_i + \gamma_i + \beta_i + \theta_i$

Taking the double difference of these expected values gives

$(\mu_{i,t,\mathrm{post}} - \mu_{i,t,\mathrm{pre}}) - (\mu_{i,c,\mathrm{post}} - \mu_{i,c,\mathrm{pre}}) = (\beta_i + \theta_i) - \beta_i = \theta_i.$

The time trend $\beta_i$ cancels, and the baseline difference $\gamma_i$ cancels, leaving $\theta_i$ cleanly identified. DiD studies are the anchor for the entire model.

DiD (change only) studies: identified but without levels

Some studies report only the mean change over time (post minus pre) in each arm, without reporting the level means separately. The expected change scores are $\beta_i$ (control) and $\beta_i + \theta_i$ (treatment), so the difference in change scores still identifies $\theta_i$. However, because the pre-treatment levels are not observed, these studies cannot anchor their own normalisation baseline (see below).

RCT studies: treatment effect confounded with baseline differences

RCT studies observe only post-treatment outcomes for both arms. The expected difference between arms is

$\mu_{i,t,\mathrm{post}} - \mu_{i,c,\mathrm{post}} = (\alpha_i + \gamma_i + \beta_i + \theta_i) - (\alpha_i + \beta_i) = \gamma_i + \theta_i.$

Without pre-treatment data, $\theta_i$ is confounded with the baseline difference $\gamma_i$. Randomisation makes $\gamma_i$ zero in expectation, although chance imbalance can remain in any single trial, and the model cannot separate the two from the data of a single RCT alone.

Pre-post studies: treatment effect confounded with time trends

Pre-post studies observe only the treatment arm at both time points. The expected change over time is

$\mu_{i,t,\mathrm{post}} - \mu_{i,t,\mathrm{pre}} = (\alpha_i + \gamma_i + \beta_i + \theta_i) - (\alpha_i + \gamma_i) = \beta_i + \theta_i.$

Without a control arm, $\theta_i$ is confounded with the time trend $\beta_i$.

Summary

Design	Identifies $\theta_i$ ?	Confound
DiD	Yes	—
DiD (change only)	Yes	—
RCT	No	Baseline group difference $\gamma_i$
Pre-post	No	Time trend $\beta_i$
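The summary table can be checked numerically. The sketch below computes, from one set of hypothetical parameter values, the contrast of expected cell means that each design has access to; design_contrasts() is an illustrative helper, not a metadid function.

```r
# Sketch only: what each design's mean contrast equals under the latent model.
design_contrasts <- function(alpha, beta, gamma, theta) {
  mu <- c(c_pre = alpha,          c_post = alpha + beta,
          t_pre = alpha + gamma,  t_post = alpha + gamma + beta + theta)
  c(
    # Full DiD: double difference of the four cell means.
    did        = (mu[["t_post"]] - mu[["t_pre"]]) -
                 (mu[["c_post"]] - mu[["c_pre"]]),
    # Change-only DiD: difference of the two expected change scores.
    did_change = (beta + theta) - beta,
    # RCT: post-treatment difference between arms.
    rct        = mu[["t_post"]] - mu[["c_post"]],
    # Pre-post: change over time in the treatment arm.
    pre_post   = mu[["t_post"]] - mu[["t_pre"]]
  )
}

# Hypothetical values: alpha = 10, beta = 2, gamma = 1, theta = -3.
contrasts <- design_contrasts(alpha = 10, beta = 2, gamma = 1, theta = -3)
contrasts
```

Here both DiD contrasts equal $\theta_i = -3$, the RCT contrast equals $\gamma_i + \theta_i = -2$, and the pre-post contrast equals $\beta_i + \theta_i = -1$.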

Without DiD studies, the treatment effect is not identified from the data. meta_did() will stop with an error if no DiD studies are present. This check can be overridden with allow_no_did = TRUE, but the resulting posterior will be driven primarily by the priors rather than the data.

When DiD studies are present alongside RCT or pre-post studies, the hierarchical model propagates information: the time trend distribution estimated from DiD studies informs the pre-post decomposition, and the baseline structure from DiD studies informs the RCT decomposition. This cross-design borrowing is the core value of the metadid approach.

Differenced-form likelihoods

For RCT and pre-post designs, the model uses differenced-form likelihoods that eliminate nuisance parameters algebraically:

RCT: the likelihood is based on the treatment–control difference in post-treatment means, so the time trend $\beta_i$ cancels. The remaining parameters are the treatment effect $\theta_i$ and the baseline difference $\gamma_i$ (borrowed from DiD studies).
Pre-post (default): the likelihood is based on the within-subject post–pre difference, so the baseline $\alpha_i + \gamma_i$ cancels. The remaining parameters are the treatment effect $\theta_i$ and the time trend $\beta_i$ (borrowed from DiD studies).

For pre-post studies, a non-differenced (bivariate normal) form is available via meta_did_general(pp_likelihood = "bivariate"). This retains the pre/post correlation $\rho_i$ as an estimable parameter, contributing to the hierarchical $\rho$ model, at the cost of estimating additional nuisance parameters.
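One way to see why the differenced form cannot estimate the correlation separately is to look at the moments of the post-minus-pre difference implied by the bivariate normal above. The sketch below is an illustration under that model only, not the package's likelihood code; diff_moments() is a hypothetical helper.

```r
# Sketch only: mean and variance of (post - pre) under a bivariate normal.
diff_moments <- function(mu_pre, mu_post, sd_pre, sd_post, rho) {
  cov_pp <- rho * sd_pre * sd_post
  Sigma <- matrix(c(sd_pre^2, cov_pp, cov_pp, sd_post^2), nrow = 2)
  a <- c(-1, 1)  # contrast vector: post minus pre
  c(mean = sum(a * c(mu_pre, mu_post)),
    var  = drop(t(a) %*% Sigma %*% a))
}

# Hypothetical treatment-arm values: means (11, 10), SDs (2, 2.5), rho = 0.6.
m1 <- diff_moments(11, 10, sd_pre = 2, sd_post = 2.5, rho = 0.6)

# Closed form: sd_pre^2 + sd_post^2 - 2 * rho * sd_pre * sd_post.
closed_form <- 2^2 + 2.5^2 - 2 * 0.6 * 2 * 2.5

# A different (SD, rho) combination with exactly the same difference variance.
s  <- sqrt(closed_form / 2)
m2 <- diff_moments(11, 10, sd_pre = s, sd_post = s, rho = 0)
```

Both parameter sets give a difference with mean $-1$ and variance $4.25$, so data on the difference alone cannot tell them apart; the bivariate form uses the separate pre and post observations to do so.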

Baseline normalisation

By default (normalise_by_baseline = TRUE), meta_did() normalises all outcome data by a design-appropriate baseline mean before fitting. This places all studies on a common relative scale, making the treatment effect interpretable as a proportional change.

How normalisation works per design

Design	Normalisation denominator	Effect on parameters
DiD	Pre-treatment control mean	$\alpha_i = 1$
DiD (change only)	Grand mean of DiD pre-control means	Shared rescaling (see below)
Pre-post	Pre-treatment treatment mean	$\alpha_i + \gamma_i = 1$
RCT	Post-treatment control mean	$\alpha_i + \beta_i = 1$

For DiD and pre-post studies, normalisation divides by a pre-treatment quantity, so the relevant baseline parameter is known to be exactly 1 and is fixed rather than estimated. The time trend and treatment effect remain as free parameters.

DiD change-only studies report change scores (post minus pre) rather than level means, so they lack a study-specific pre-treatment level to normalise by. Instead, normalisation divides the change scores by the grand mean of the pre-control means across the full DiD studies in the meta-analysis. This places the change scores on the same relative scale as the other designs, but requires that at least some full DiD studies are present.
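The choice of denominator in the table above can be sketched in a few lines of base R. The data frame layout, column names, and values below are hypothetical and are not the input format expected by meta_did(); the point is only to show which quantity each design is divided by.

```r
# Hypothetical cell means for four studies, one per design (NA = not observed).
studies <- data.frame(
  study  = c("A", "B", "C", "D"),
  design = c("did", "did_change", "pre_post", "rct"),
  c_pre  = c(20, NA, NA, NA),
  c_post = c(22, NA, NA, 30),
  t_pre  = c(21, NA, 50, NA),
  t_post = c(19, NA, 45, 27),
  stringsAsFactors = FALSE
)

# Change-only studies borrow the grand mean of the pre-treatment control
# means across the full DiD studies.
grand_did_baseline <- mean(studies$c_pre[studies$design == "did"])

studies$denominator <- with(studies, ifelse(
  design == "did", c_pre,                          # pre-treatment control mean
  ifelse(design == "did_change", grand_did_baseline,
  ifelse(design == "pre_post", t_pre,              # pre-treatment treatment mean
         c_post))))                                # RCT: post-treatment control mean

# Normalised post-treatment treatment means (NA where levels are unobserved).
studies$t_post_norm <- studies$t_post / studies$denominator
studies[, c("study", "design", "denominator", "t_post_norm")]
```

In this toy example the denominators are 20, 20, 50, and 30, so the normalised treatment post-means are 0.95 for the DiD study and 0.9 for both the pre-post and RCT studies.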

For RCT studies, there is no pre-treatment data, so normalisation divides by the post-treatment control mean ($\alpha_i + \beta_i$). After normalisation, the control post-mean is fixed at 1. The normalised treatment post-mean is

$\frac{\alpha_i + \gamma_i + \beta_i + \theta_i}{\alpha_i + \beta_i} = 1 + \frac{\gamma_i + \theta_i}{\alpha_i + \beta_i}.$

When the model assumes equal baselines ($\gamma_i = 0$), the normalised treatment–control difference reduces to $\phi_i = \theta_i / (\alpha_i + \beta_i)$, the apparent effect. This is what the normalised data directly measure.

To recover the treatment effect on the same scale as DiD studies (normalised by $\alpha_i$), we need to undo the RCT-specific normalisation. Writing $\tilde\theta_i = \theta_i / \alpha_i$ and $\tilde\beta_i = \beta_i / \alpha_i$ for the DiD-normalised quantities:

$\phi_i = \frac{\theta_i}{\alpha_i + \beta_i} = \frac{\theta_i / \alpha_i}{1 + \beta_i / \alpha_i} = \frac{\tilde\theta_i}{1 + \tilde\beta_i},$

so the normalised treatment effect is

$\tilde\theta_i = \phi_i \cdot (1 + \tilde\beta_i).$

The model is reparameterised so that $\phi_i$ (apparent effect) and $\tilde\beta_i$ (normalised time trend) are the sampled parameters, and $\tilde\theta_i$ is derived via the formula above. Because the hierarchical prior is placed on the derived quantity $\tilde\theta_i$, a Jacobian correction is needed for this change of variables: $\log|1 + \tilde\beta_i|$ is added to the log-posterior. The hierarchical prior on $\tilde\beta_i$, informed primarily by DiD studies, which directly identify time trends, provides the regularisation needed to separate $\tilde\theta_i$ from $\tilde\beta_i$.
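The algebra of this reparameterisation can be verified numerically. The sketch below assumes $\gamma_i = 0$ and uses hypothetical raw-scale values; rct_to_did_scale() is an illustrative helper and does not reproduce the package's internal code.

```r
# Sketch only: map the apparent RCT effect back to the DiD-normalised scale.
rct_to_did_scale <- function(phi, beta_tilde) {
  list(
    theta_tilde  = phi * (1 + beta_tilde),     # derived treatment effect
    log_jacobian = log(abs(1 + beta_tilde))    # log |d theta_tilde / d phi|
  )
}

# Hypothetical raw-scale values with gamma = 0.
alpha <- 10; beta <- 2; theta <- -3

phi        <- theta / (alpha + beta)  # apparent effect after RCT normalisation
beta_tilde <- beta / alpha            # time trend on the DiD-normalised scale

out <- rct_to_did_scale(phi, beta_tilde)

# The derived value should match direct normalisation by alpha.
all.equal(out$theta_tilde, theta / alpha)
```

With these numbers the apparent effect is $\phi_i = -0.25$ while the DiD-scale effect is $\tilde\theta_i = -0.3$: ignoring a positive time trend would understate the magnitude of the effect.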

In the naive model (meta_did_naive()), the time trend is forced to zero for RCT studies, so $\phi_i = \tilde\theta_i$ and the reparameterisation is bypassed. This is equivalent to the standard assumption that the post-treatment control–treatment difference is an unbiased estimate of the treatment effect.

When data are not normalised, the RCT baselines $\alpha_i$ and $\alpha_i + \gamma_i$ are free parameters drawn from the same hierarchical distributions as in DiD studies, allowing the baseline difference $\gamma_i$ to be informed by DiD evidence. By default, the model allows $\gamma_i \neq 0$; this can be constrained to zero if randomisation is assumed to eliminate baseline imbalances.

Interpreting normalised treatment effects

After normalisation, the population treatment effect $\mu_\theta$ is expressed in units relative to the baseline. For example, $\mu_\theta = -0.33$ means a 33% reduction relative to the baseline level.

Hierarchical structure

Study-specific treatment effects are drawn from a population distribution:

$\theta_i \sim \mathcal{N}(\mu_\theta, \tau_\theta^2),$

where $\mu_\theta$ is the overall treatment effect (the primary quantity of interest) and $\tau_\theta$ captures between-study heterogeneity.

Other study-level parameters (time trends $\beta_i$, baseline differences $\gamma_i$) similarly share population-level priors. Pre-post correlations $\rho$ can be modelled hierarchically via a Fisher $z$-transform when hierarchical_rho = TRUE.

Design effects

When design_effects = TRUE, the model allows the population treatment effect mean to differ systematically by design:

$\mu_{\theta,\text{RCT}} = \mu_\theta + \delta_{\text{RCT}}, \quad \mu_{\theta,\text{PP}} = \mu_\theta + \delta_{\text{PP}}$

where $\delta_{\text{RCT}}$ and $\delta_{\text{PP}}$ are estimated offsets. This relaxes the assumption that all designs estimate exactly the same estimand, which may be appropriate when selection effects or time trends differ systematically across designs.

Robust heterogeneity

When robust_heterogeneity = TRUE, the treatment effect distribution uses a Student-$t$ distribution instead of a normal:

$\theta_i \sim t_\nu(\mu_\theta, \tau_\theta^2)$

where $\nu$ (the degrees of freedom) is estimated. This accommodates outlier studies that would otherwise inflate $\tau_\theta$ .
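To illustrate why heavier tails help with outliers, the sketch below compares central 99% intervals for study-level effects under the normal and Student-$t$ forms. It assumes $\tau_\theta$ acts as the scale of a location-scale $t$ distribution, and the values of $\mu_\theta$, $\tau_\theta$, and $\nu$ are hypothetical.

```r
# Hypothetical population values on the normalised scale.
mu_theta  <- -0.3
tau_theta <- 0.1
nu        <- 3

p <- c(0.005, 0.995)

# Central 99% interval for theta_i under each heterogeneity distribution.
normal_interval  <- mu_theta + tau_theta * qnorm(p)
student_interval <- mu_theta + tau_theta * qt(p, df = nu)

# Draws from the location-scale Student-t, for comparison with rnorm().
set.seed(1)
theta_draws <- mu_theta + tau_theta * rt(1000, df = nu)

rbind(normal = normal_interval, student = student_interval)
```

For the same $\tau_\theta$, the Student-$t$ interval is considerably wider, so a single extreme study is less surprising under the robust model and exerts less upward pressure on $\tau_\theta$.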

Controlling assumptions with `meta_did_general()`

The meta_did_general() function provides explicit control over how nuisance parameters are handled for non-DiD designs, via three arguments:

time_trend: Controls the time trend $\beta_i$ for RCT and pre-post studies.
- "pooled" (default): hierarchical prior shared across designs, informed by DiD studies.
- "fixed_zero": $\beta_i = 0$ for RCT and pre-post studies. For pre-post studies, this attributes all pre-post change to treatment. For RCTs, it bypasses the time trend reparameterisation described above.
baseline_imbalance: Controls the baseline difference $\gamma_i$ for RCT studies.
- "estimated" (default): $\gamma_i$ is estimated, with information borrowed from DiD studies when normalised.
- "fixed_zero": $\gamma_i = 0$, assuming randomisation eliminates baseline imbalances.
pp_likelihood: Controls the likelihood form for pre-post studies.
- "differenced" (default): uses the post-minus-pre difference, eliminating the baseline algebraically. The pre/post correlation $\rho_i$ is not separately estimable.
- "bivariate": uses the full bivariate normal likelihood for the (pre, post) pair. This retains $\rho_i$ as an estimable parameter, contributing to the hierarchical $\rho$ model, at the cost of estimating additional nuisance parameters.

These settings can be combined independently. For example, one might trust the randomisation assumption (baseline_imbalance = "fixed_zero") while still borrowing time trend information from DiD studies (time_trend = "pooled").

The default settings of meta_did_general() are identical to meta_did(). Setting time_trend and baseline_imbalance to "fixed_zero" reproduces the behaviour of the deprecated meta_did_naive().

Meta-regression with covariates

When study-level covariates are available (e.g., intervention dose, year of publication), the treatment effect mean can be modelled as a linear function of those covariates. If $\mathbf{x}_i$ is a $K$-vector of covariate values for study $i$, the hierarchical prior becomes

$\theta_i \sim \mathcal{N}\!\left( \mu_\theta + \mathbf{x}_i^\top \boldsymbol{\beta},\; \tau_\theta^2 \right),$

where $\boldsymbol{\beta}$ is a vector of meta-regression coefficients (distinct from the study-level time trends $\beta_i$) estimated jointly with all other parameters. When robust_heterogeneity = TRUE, the normal is replaced by a Student-$t$ as before. The same covariate adjustment applies across all study designs (DiD, RCT, and pre-post), with design-specific offsets ($\delta_{\text{RCT}}$, $\delta_{\text{PP}}$) added when design_effects = TRUE.

Covariate centering

By default (center_covariates = TRUE), covariates are mean-centered across all studies in the meta-analysis before fitting. This has a useful interpretive consequence: $\mu_\theta$ represents the population treatment effect at the average covariate values, rather than at $\mathbf{x} = 0$ (which may not be a meaningful reference point).

The centering values are stored in the fitted object (fit$cov_centers) and are needed to reconstruct predictions on the original covariate scale.
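The effect of centering on interpretation, and the role of the stored centres when predicting, can be sketched in base R. All values below are hypothetical; the vector centres stands in for what a fitted object would hold in fit$cov_centers, and predict_mean() is an illustrative helper rather than a metadid function.

```r
# Hypothetical study-level covariates.
covs <- data.frame(dose = c(10, 20, 30, 40),
                   year = c(2001, 2005, 2010, 2012))

# Centres computed across all studies (stand-in for fit$cov_centers).
centres   <- colMeans(covs)
x_centred <- sweep(as.matrix(covs), 2, centres, "-")

# Hypothetical population mean and meta-regression coefficients.
mu_theta <- -0.30
beta_cov <- c(dose = -0.01, year = 0.002)

# Mean of the hierarchical prior for each study: mu_theta + x_i' beta.
prior_mean <- drop(mu_theta + x_centred %*% beta_cov)

# Prediction on the original covariate scale: subtract the stored centres first.
predict_mean <- function(newdata) {
  x <- sweep(as.matrix(newdata[names(centres)]), 2, centres, "-")
  drop(mu_theta + x %*% beta_cov[names(centres)])
}

at_average <- predict_mean(data.frame(dose = 25, year = 2007))
at_dose_35 <- predict_mean(data.frame(dose = 35, year = 2007))
```

Because the covariate averages here are dose = 25 and year = 2007, at_average equals $\mu_\theta = -0.30$, while raising the dose by 10 units shifts the predicted mean to $-0.40$.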

Specifying covariates in `meta_did()`

Covariates are passed as a one-sided formula:

```r
fit <- meta_did(
  summary_data = my_data,
  covariates   = ~ dose + year
)
```

The covariate columns must be present in summary_data (and/or individual_data), must be numeric, and must be constant within each study. The prior on $\boldsymbol{\beta}$ defaults to $\mathcal{N}(0, 10)$ per coefficient and can be adjusted via set_priors(beta_cov = normal(0, sd)).

For a worked example including simulation and recovery, see vignette("covariates").

Prior specification

Priors can be customised via set_priors(). See ?set_priors for the full list of parameters and their defaults.