
This vignette describes the statistical model underlying meta_did(), how baseline normalisation works, and what each study design can and cannot identify.

The latent DiD model

metadid assumes that every study, regardless of design, arises from a common latent difference-in-differences (DiD) data-generating process. For study $i$, outcomes in the control group follow

$$
\begin{pmatrix} Y_{i,c,\mathrm{pre}} \\ Y_{i,c,\mathrm{post}} \end{pmatrix} \sim \mathcal{N} \left[ \begin{pmatrix} \alpha_i \\ \alpha_i + \beta_i \end{pmatrix} , \begin{pmatrix} \sigma^2_{i,c,\mathrm{pre}} & \rho_{i,c}\,\sigma_{i,c,\mathrm{pre}}\,\sigma_{i,c,\mathrm{post}} \\ \rho_{i,c}\,\sigma_{i,c,\mathrm{pre}}\,\sigma_{i,c,\mathrm{post}} & \sigma^2_{i,c,\mathrm{post}} \end{pmatrix} \right],
$$

and outcomes in the treatment group follow

$$
\begin{pmatrix} Y_{i,t,\mathrm{pre}} \\ Y_{i,t,\mathrm{post}} \end{pmatrix} \sim \mathcal{N} \left[ \begin{pmatrix} \alpha_i + \gamma_i \\ \alpha_i + \gamma_i + \beta_i + \theta_i \end{pmatrix} , \begin{pmatrix} \sigma^2_{i,t,\mathrm{pre}} & \rho_{i,t}\,\sigma_{i,t,\mathrm{pre}}\,\sigma_{i,t,\mathrm{post}} \\ \rho_{i,t}\,\sigma_{i,t,\mathrm{pre}}\,\sigma_{i,t,\mathrm{post}} & \sigma^2_{i,t,\mathrm{post}} \end{pmatrix} \right].
$$

The parameters are:

  • $\alpha_i$: baseline mean in the control group
  • $\beta_i$: time trend shared across groups
  • $\gamma_i$: baseline difference between treatment and control groups
  • $\theta_i$: study-specific treatment effect (the DiD estimand)
  • $\rho_{i,c}$, $\rho_{i,t}$: pre/post correlations within each group
  • $\sigma_{i,g,t}$: marginal standard deviations, indexed by group $g$ (control or treatment) and time point $t$ (pre or post)

The key identifying assumption is that, absent treatment, the treatment group would have followed the same time trend $\beta_i$ as the control group (i.e., the parallel trends assumption).
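To make the data-generating process concrete, the following base-R sketch simulates one study from the two bivariate normals above and forms the sample double difference. All parameter values and the helper `draw_arm()` are hypothetical and chosen only for illustration; for simplicity the same standard deviations and correlation are used in both arms. This is not how metadid fits the model.

```r
# Hypothetical parameter values for a single study (not taken from the package).
set.seed(1)
alpha <- 10; beta <- 2; gamma <- 1; theta <- -3
sigma_pre <- 2; sigma_post <- 2.5; rho <- 0.6
n <- 5000  # participants per arm

# Draw correlated (pre, post) pairs from a bivariate normal via a Cholesky factor.
draw_arm <- function(n, mean_pre, mean_post) {
  Sigma <- matrix(c(sigma_pre^2, rho * sigma_pre * sigma_post,
                    rho * sigma_pre * sigma_post, sigma_post^2), nrow = 2)
  z <- matrix(rnorm(2 * n), ncol = 2) %*% chol(Sigma)
  data.frame(pre = mean_pre + z[, 1], post = mean_post + z[, 2])
}

control   <- draw_arm(n, alpha, alpha + beta)
treatment <- draw_arm(n, alpha + gamma, alpha + gamma + beta + theta)

# Sample double difference: (treated change) minus (control change).
did_hat <- mean(treatment$post - treatment$pre) - mean(control$post - control$pre)
did_hat
```

With a large sample per arm, `did_hat` should land close to the simulated `theta`, even though the arms start from different baselines and share a non-zero time trend.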

What each design observes

Different study designs observe different subsets of this latent bivariate structure:

| Design | Groups observed | Time points observed | Cells of the 2×2 table |
|---|---|---|---|
| DiD | Control + Treatment | Pre + Post | All 4 |
| DiD (change only) | Control + Treatment | Change scores only | Differences of 2 pairs |
| RCT | Control + Treatment | Post only | 2 |
| Pre-post | Treatment only | Pre + Post | 2 |

Because every design other than full DiD sees fewer cells (or only differences of cells), it has less ability to separate the parameters $\alpha$, $\beta$, $\gamma$, and $\theta$.

Identification

DiD studies: fully identified

DiD studies observe all four cells of the 2×2 table. From the mean structure above, the four expected cell means are:

| | Pre | Post |
|---|---|---|
| Control | $\alpha_i$ | $\alpha_i + \beta_i$ |
| Treatment | $\alpha_i + \gamma_i$ | $\alpha_i + \gamma_i + \beta_i + \theta_i$ |

Taking the double difference of these expected values gives

$$
(\mu_{i,t,\mathrm{post}} - \mu_{i,t,\mathrm{pre}}) - (\mu_{i,c,\mathrm{post}} - \mu_{i,c,\mathrm{pre}}) = (\beta_i + \theta_i) - \beta_i = \theta_i.
$$

Both the time trend $\beta_i$ and the baseline difference $\gamma_i$ cancel, leaving $\theta_i$ cleanly identified. DiD studies are the anchor for the entire model.

DiD (change only) studies: identified but without levels

Some studies report only the mean change over time (post minus pre) in each arm, without reporting the level means separately. The expected change scores are $\beta_i$ (control) and $\beta_i + \theta_i$ (treatment), so the difference in change scores still identifies $\theta_i$. However, because the pre-treatment levels are not observed, these studies cannot anchor their own normalisation baseline (see below).

RCT studies: treatment effect confounded with baseline differences

RCT studies observe only post-treatment outcomes for both arms. The expected difference between arms is

$$
\mu_{i,t,\mathrm{post}} - \mu_{i,c,\mathrm{post}} = (\alpha_i + \gamma_i + \beta_i + \theta_i) - (\alpha_i + \beta_i) = \gamma_i + \theta_i.
$$

Without pre-treatment data, $\theta_i$ is confounded with the baseline difference $\gamma_i$. Randomisation makes $\gamma_i$ small in expectation, but the model cannot separate the two from the data of a single RCT alone.

Pre-post studies: treatment effect confounded with time trends

Pre-post studies observe only the treatment arm at both time points. The expected change over time is

$$
\mu_{i,t,\mathrm{post}} - \mu_{i,t,\mathrm{pre}} = (\alpha_i + \gamma_i + \beta_i + \theta_i) - (\alpha_i + \gamma_i) = \beta_i + \theta_i.
$$

Without a control arm, $\theta_i$ is confounded with the time trend $\beta_i$.

Summary

| Design | Identifies $\theta_i$? | Confound |
|---|---|---|
| DiD | Yes | None |
| DiD (change only) | Yes | None |
| RCT | No | Baseline group difference $\gamma_i$ |
| Pre-post | No | Time trend $\beta_i$ |
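The identification results can be checked numerically. The sketch below plugs hypothetical parameter values into the expected cell means and computes the contrast each design is able to form; the numbers are illustrative and nothing here calls the metadid package.

```r
# Hypothetical study-level parameters (illustrative values only).
alpha <- 10; beta <- 2; gamma <- 1; theta <- -3

# Expected cell means of the latent 2x2 table.
mu <- c(c_pre  = alpha,
        c_post = alpha + beta,
        t_pre  = alpha + gamma,
        t_post = alpha + gamma + beta + theta)

# Contrast each design can form from the cells it observes.
contrasts <- c(
  did         = unname((mu["t_post"] - mu["t_pre"]) - (mu["c_post"] - mu["c_pre"])),
  change_only = (beta + theta) - beta,   # difference of the reported change scores
  rct         = unname(mu["t_post"] - mu["c_post"]),
  pre_post    = unname(mu["t_post"] - mu["t_pre"])
)
contrasts
```

Only the two DiD contrasts return `theta` itself. The RCT contrast returns `gamma + theta` and the pre-post contrast returns `beta + theta`, matching the confounds listed in the table.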

Without DiD studies, the treatment effect is not identified from the data. meta_did() will stop with an error if no DiD studies are present. This check can be overridden with allow_no_did = TRUE, but the resulting posterior will be driven primarily by the priors rather than the data.

When DiD studies are present alongside RCT or pre-post studies, the hierarchical model propagates information: the time trend distribution estimated from DiD studies informs the pre-post decomposition, and the baseline structure from DiD studies informs the RCT decomposition. This cross-design borrowing is the core value of the metadid approach.

Differenced-form likelihoods

For RCT and pre-post designs, the model uses differenced-form likelihoods that eliminate nuisance parameters algebraically:

  • RCT: the likelihood is based on the treatment–control difference in post-treatment means, so the time trend $\beta_i$ cancels. The remaining parameters are the treatment effect $\theta_i$ and the baseline difference $\gamma_i$ (borrowed from DiD studies).
  • Pre-post (default): the likelihood is based on the within-subject post–pre difference, so the baseline $\alpha_i + \gamma_i$ cancels. The remaining parameters are the treatment effect $\theta_i$ and the time trend $\beta_i$ (borrowed from DiD studies).

For pre-post studies, a non-differenced (bivariate normal) form is available via meta_did_general(pp_likelihood = "bivariate"). This retains the pre/post correlation $\rho_i$ as an estimable parameter, contributing to the hierarchical $\rho$ model, at the cost of estimating additional nuisance parameters.
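One way to see why the differenced form gives up the correlation: under the bivariate normal model, the post-minus-pre difference is a single normal variable whose variance mixes the two standard deviations and $\rho$ into one number. The sketch below computes that variance for hypothetical values. It illustrates the algebra only and makes no claim about how the package implements its likelihood.

```r
# Hypothetical marginal SDs and pre/post correlation for one treatment arm.
sigma_pre <- 2; sigma_post <- 2.5; rho <- 0.6

# Covariance matrix of the (pre, post) pair.
Sigma <- matrix(c(sigma_pre^2, rho * sigma_pre * sigma_post,
                  rho * sigma_pre * sigma_post, sigma_post^2), nrow = 2)

# Variance of (post - pre) is a' Sigma a with a = (-1, 1).
a <- c(-1, 1)
var_diff <- drop(t(a) %*% Sigma %*% a)

# Same quantity written out term by term.
var_diff_closed <- sigma_pre^2 + sigma_post^2 - 2 * rho * sigma_pre * sigma_post
c(var_diff = var_diff, var_diff_closed = var_diff_closed)
```

Because only `var_diff` enters the differenced likelihood, different combinations of standard deviations and correlation that give the same value cannot be told apart from the difference alone.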

Baseline normalisation

By default (normalise_by_baseline = TRUE), meta_did() normalises all outcome data by a design-appropriate baseline mean before fitting. This places all studies on a common relative scale, making the treatment effect interpretable as a proportional change.

How normalisation works per design

| Design | Normalisation denominator | Effect on parameters |
|---|---|---|
| DiD | Pre-treatment control mean | $\alpha_i = 1$ |
| DiD (change only) | Grand mean of DiD pre-control means | Shared rescaling (see below) |
| Pre-post | Pre-treatment treatment mean | $\alpha_i + \gamma_i = 1$ |
| RCT | Post-treatment control mean | $\alpha_i + \beta_i = 1$ |

For DiD and pre-post studies, normalisation divides by a pre-treatment quantity, so the relevant baseline parameter is known to be exactly 1 and is fixed rather than estimated. The time trend and treatment effect remain as free parameters.

DiD change-only studies report change scores (post minus pre) rather than level means, so they lack a study-specific pre-treatment level to normalise by. Instead, normalisation divides the change scores by the grand mean of the pre-control means across the full DiD studies in the meta-analysis. This places the change scores on the same relative scale as the other designs, but requires that at least some full DiD studies are present.
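The following sketch applies these two rules to made-up summary means. The column names, the numbers, and the use of an unweighted mean for the grand mean are all assumptions for illustration; the package's internal data layout may differ.

```r
# Hypothetical raw cell means for two full DiD studies (illustrative numbers).
did <- data.frame(
  study  = c("A", "B"),
  c_pre  = c(10, 20), c_post = c(12, 23),
  t_pre  = c(11, 19), t_post = c(10, 18)
)

# Full DiD: divide every cell by the study's own pre-treatment control mean.
cells <- c("c_pre", "c_post", "t_pre", "t_post")
did_norm <- did
did_norm[cells] <- did[cells] / did$c_pre

# Change-only DiD: no level means are reported, so divide the change scores
# by the grand mean of the DiD pre-control means (unweighted here, by assumption).
change_only <- data.frame(study = "C", c_change = 1.5, t_change = -0.6)
grand_pre_control <- mean(did$c_pre)
change_norm <- transform(change_only,
                         c_change = c_change / grand_pre_control,
                         t_change = t_change / grand_pre_control)

did_norm
change_norm
```

After normalisation the pre-treatment control mean equals 1 in every full DiD study, so the remaining cells read directly as proportions of that baseline.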

For RCT studies, there is no pre-treatment data, so normalisation divides by the post-treatment control mean ($\alpha_i + \beta_i$). After normalisation, the control post-mean is fixed at 1. The normalised treatment post-mean is

$$
\frac{\alpha_i + \gamma_i + \beta_i + \theta_i}{\alpha_i + \beta_i} = 1 + \frac{\gamma_i + \theta_i}{\alpha_i + \beta_i}.
$$

When the model assumes equal baselines ($\gamma_i = 0$), the normalised treatment–control difference reduces to $\phi_i = \theta_i / (\alpha_i + \beta_i)$, the apparent effect. This is what the normalised data directly measures.

To recover the treatment effect on the same scale as DiD studies (normalised by $\alpha_i$), we need to undo the RCT-specific normalisation. Writing $\tilde\theta_i = \theta_i / \alpha_i$ and $\tilde\beta_i = \beta_i / \alpha_i$ for the DiD-normalised quantities:

$$
\phi_i = \frac{\theta_i}{\alpha_i + \beta_i} = \frac{\theta_i / \alpha_i}{1 + \beta_i / \alpha_i} = \frac{\tilde\theta_i}{1 + \tilde\beta_i},
$$

so the normalised treatment effect is

$$
\tilde\theta_i = \phi_i \cdot (1 + \tilde\beta_i).
$$

The model is reparameterised so that $\phi_i$ (apparent effect) and $\tilde\beta_i$ (normalised time trend) are the sampled parameters, and $\tilde\theta_i$ is derived via the formula above. A Jacobian correction $|1 + \tilde\beta_i|$ is applied to the log-posterior to account for this change of variables. The hierarchical prior on $\tilde\beta_i$, informed primarily by DiD studies (which directly identify time trends), provides the regularisation needed to separate $\tilde\theta_i$ from $\tilde\beta_i$.
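A small numeric check of this reparameterisation, using hypothetical raw-scale values and assuming $\gamma_i = 0$, is sketched below. The variable names are illustrative, and the log-Jacobian line simply takes the logarithm of the factor quoted above rather than reproducing the package's code.

```r
# Hypothetical raw-scale parameters for one RCT (gamma assumed to be 0).
alpha <- 10; beta <- 2; theta <- -3

# Apparent effect: what the RCT-normalised data measure directly.
phi <- theta / (alpha + beta)

# DiD-normalised time trend, then the derived DiD-scale treatment effect.
beta_tilde  <- beta / alpha
theta_tilde <- phi * (1 + beta_tilde)

# Log of the Jacobian factor for the map phi -> theta_tilde at fixed beta_tilde.
log_jacobian <- log(abs(1 + beta_tilde))

c(phi = phi, theta_tilde = theta_tilde, direct = theta / alpha)
```

Here the apparent effect is -0.25 while the DiD-scale effect is -0.30: the positive time trend inflates the RCT denominator, so the apparent effect understates the magnitude until it is rescaled by `1 + beta_tilde`.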

In the naive model (meta_did_naive()), the time trend is forced to zero for RCT studies, so $\phi_i = \tilde\theta_i$ and the reparameterisation is bypassed. This is equivalent to the standard assumption that the post-treatment control–treatment difference is an unbiased estimate of the treatment effect.

When data are not normalised, the RCT baselines $\alpha_i$ and $\alpha_i + \gamma_i$ are free parameters drawn from the same hierarchical distributions as in DiD studies, allowing the baseline difference $\gamma_i$ to be informed by DiD evidence. By default, the model allows $\gamma_i \neq 0$; this can be constrained to zero if randomisation is assumed to eliminate baseline imbalances.

Interpreting normalised treatment effects

After normalisation, the population treatment effect $\mu_\theta$ is expressed in units relative to the baseline. For example, $\mu_\theta = -0.33$ means a 33% reduction relative to the baseline level.

Hierarchical structure

Study-specific treatment effects are drawn from a population distribution:

$$
\theta_i \sim \mathcal{N}(\mu_\theta, \tau_\theta^2),
$$

where $\mu_\theta$ is the overall treatment effect (the primary quantity of interest) and $\tau_\theta$ captures between-study heterogeneity.

Other study-level parameters (time trends $\beta_i$, baseline differences $\gamma_i$) similarly share population-level priors. Pre-post correlations $\rho$ can be modelled hierarchically via a Fisher-$z$ transform when hierarchical_rho = TRUE.
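The Fisher-$z$ transform is `atanh()` in R, with inverse `tanh()`. The sketch below shows why it is convenient for a hierarchical correlation model: values drawn on the unbounded $z$ scale always map back into $(-1, 1)$. The normal distribution on the $z$ scale and the numbers used are assumptions for illustration; the exact form used by the package is not specified here.

```r
# Hypothetical population-level values on the Fisher-z scale.
set.seed(2)
mu_z  <- atanh(0.6)  # a typical correlation of 0.6, expressed on the z scale
tau_z <- 0.2         # between-study SD on the z scale

# Draw study-level z values, then map back to correlations with tanh().
z_i   <- rnorm(8, mean = mu_z, sd = tau_z)
rho_i <- tanh(z_i)
round(rho_i, 3)
```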

Design effects

When design_effects = TRUE, the model allows the population treatment effect mean to differ systematically by design:

$$
\mu_{\theta,\text{RCT}} = \mu_\theta + \delta_{\text{RCT}}, \quad \mu_{\theta,\text{PP}} = \mu_\theta + \delta_{\text{PP}}
$$

where $\delta_{\text{RCT}}$ and $\delta_{\text{PP}}$ are estimated offsets. This relaxes the assumption that all designs estimate exactly the same estimand, which may be appropriate when selection effects or time trends differ systematically across designs.

Robust heterogeneity

When robust_heterogeneity = TRUE, the treatment effect distribution uses a Student-$t$ instead of a normal:

$$
\theta_i \sim t_\nu(\mu_\theta, \tau_\theta^2)
$$

where $\nu$ (the degrees of freedom) is estimated. This accommodates outlier studies that would otherwise inflate $\tau_\theta$.
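To illustrate the heavier tails, the sketch below draws study effects from a normal and from a location-scale Student-$t$ with the same location and scale, then compares how often each produces an extreme study. The values of `mu_theta`, `tau_theta` and `nu` are hypothetical; note that in the $t$ case `tau_theta` acts as a scale parameter rather than a standard deviation.

```r
set.seed(3)
mu_theta <- -0.3; tau_theta <- 0.1; nu <- 3   # hypothetical values
n_studies <- 2000

# Study effects under the normal and the location-scale Student-t.
theta_normal <- rnorm(n_studies, mean = mu_theta, sd = tau_theta)
theta_t      <- mu_theta + tau_theta * rt(n_studies, df = nu)

# Share of studies more than 3 scale units away from the population mean.
tail_normal <- mean(abs(theta_normal - mu_theta) > 3 * tau_theta)
tail_t      <- mean(abs(theta_t - mu_theta) > 3 * tau_theta)
c(normal = tail_normal, student_t = tail_t)
```

Under the $t$ distribution far more draws fall beyond three scale units, so a handful of outlying studies can be absorbed by the tails instead of by a larger $\tau_\theta$.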

Controlling assumptions with meta_did_general()

The meta_did_general() function provides explicit control over how nuisance parameters are handled for non-DiD designs, via three arguments:

  • time_trend: Controls the time trend $\beta_i$ for RCT and pre-post studies.
    • "pooled" (default): hierarchical prior shared across designs, informed by DiD studies.
    • "fixed_zero": $\beta_i = 0$ for RCT and pre-post studies. For pre-post studies, this attributes all pre-post change to treatment. For RCTs, it bypasses the time trend reparameterisation described above.
  • baseline_imbalance: Controls the baseline difference $\gamma_i$ for RCT studies.
    • "estimated" (default): $\gamma_i$ is estimated, with information borrowed from DiD studies when normalised.
    • "fixed_zero": $\gamma_i = 0$, assuming randomisation eliminates baseline imbalances.
  • pp_likelihood: Controls the likelihood form for pre-post studies.
    • "differenced" (default): uses the post-minus-pre difference, eliminating the baseline algebraically. The pre/post correlation $\rho_i$ is not separately estimable.
    • "bivariate": uses the full bivariate normal likelihood for the (pre, post) pair. This retains $\rho_i$ as an estimable parameter, contributing to the hierarchical $\rho$ model, at the cost of estimating additional nuisance parameters.

These settings can be combined independently. For example, one might trust the randomisation assumption (baseline_imbalance = "fixed_zero") while still borrowing time trend information from DiD studies (time_trend = "pooled").

The default settings of meta_did_general() are identical to meta_did(). Setting time_trend and baseline_imbalance to "fixed_zero" reproduces the behaviour of the deprecated meta_did_naive().

Meta-regression with covariates

When study-level covariates are available (e.g., intervention dose, year of publication), the treatment effect mean can be modelled as a linear function of those covariates. If $\mathbf{x}_i$ is a $K$-vector of covariate values for study $i$, the hierarchical prior becomes

$$
\theta_i \sim \mathcal{N}\!\left( \mu_\theta + \mathbf{x}_i^\top \boldsymbol{\beta},\; \tau_\theta^2 \right),
$$

where $\boldsymbol{\beta}$ is a vector of meta-regression coefficients (not to be confused with the study-level time trends $\beta_i$) estimated jointly with all other parameters. When robust_heterogeneity = TRUE, the normal is replaced by a Student-$t$ as before. The same covariate adjustment applies across all study designs (DiD, RCT, and pre-post), with design-specific offsets ($\delta_{\text{RCT}}$, $\delta_{\text{PP}}$) added when design_effects = TRUE.

Covariate centering

By default (center_covariates = TRUE), covariates are mean-centered across all studies in the meta-analysis before fitting. This has a useful interpretive consequence: $\mu_\theta$ represents the population treatment effect at the average covariate values, rather than at $\mathbf{x} = 0$ (which may not be a meaningful reference point).

The centering values are stored in the fitted object (fit$cov_centers) and are needed to reconstruct predictions on the original covariate scale.
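As a rough sketch of what centering and back-transformation involve, the base-R example below centers two hypothetical covariates and then predicts the mean effect for a new study on the original scale. The covariate values, `mu_theta` and `beta_cov` are made up, and `centers` is computed by hand here; the exact structure of fit$cov_centers is not documented in this vignette and is assumed to be a named numeric vector.

```r
# Hypothetical study-level covariates (names follow the formula ~ dose + year).
covs <- data.frame(dose = c(10, 20, 40, 50),
                   year = c(2005, 2010, 2015, 2020))

# Mean-centre each covariate across studies and keep the centring values.
centers  <- colMeans(covs)
centered <- sweep(as.matrix(covs), 2, centers)

# Hypothetical posterior means for the intercept and the coefficients.
mu_theta <- -0.30
beta_cov <- c(dose = -0.004, year = 0.002)

# Predicted mean effect for a new study given on the original covariate scale:
# subtract the stored centres before applying the coefficients.
new_x <- c(dose = 40, year = 2020)
pred  <- mu_theta + sum((new_x - centers) * beta_cov)

# At the average covariate values the prediction is mu_theta itself.
pred_at_mean <- mu_theta + sum((centers - centers) * beta_cov)
c(pred = pred, pred_at_mean = pred_at_mean)
```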

Specifying covariates in meta_did()

Covariates are passed as a one-sided formula:

fit <- meta_did(
  summary_data = my_data,
  covariates   = ~ dose + year
)

The covariate columns must be present in summary_data (and/or individual_data), must be numeric, and must be constant within each study. The prior on $\boldsymbol{\beta}$ defaults to $\mathcal{N}(0, 10)$ per coefficient and can be adjusted via set_priors(beta_cov = normal(0, sd)).

For a worked example including simulation and recovery, see vignette("covariates").

Prior specification

Priors can be customised via set_priors(). See ?set_priors for the full list of parameters and their defaults.