RTMB Internals and Inference Algorithms

This article explains how BayesRTMB represents models internally, how those models are converted into RTMB automatic differentiation objects, and how the same model object is routed to MCMC, MAP estimation, variational inference, or classical estimation.

Most users do not need to know these details in order to fit ordinary models. Wrapper functions such as rtmb_lm() and rtmb_glmer(), together with the basic rtmb_code() and rtmb_model() workflow, are usually enough.

The internal design becomes useful when you want to:

1. Overall Flow

At a high level, BayesRTMB proceeds as follows.

rtmb_code()
  -> rtmb_model()
    -> RTMB_Model
      -> build_ad_obj()
        -> RTMB::MakeADFun()
          -> sample() / optimize() / variational() / classic()
            -> MCMC_Fit / MAP_Fit / VB_Fit / Classic_Fit

Users usually write only rtmb_code() and call rtmb_model().

rtmb_model() checks the model code and data, then creates an RTMB_Model R6 object. This object stores the model definition, data, parameter declarations, transformation blocks, generated-quantity blocks, and wrapper metadata.

When an inference method is called, BayesRTMB builds the automatic differentiation object needed by RTMB::MakeADFun(). The resulting fit is returned as a method-specific object.

BayesRTMB is organized around the functional priority MCMC, MAP, VB, and then classical estimation. MCMC is the central method when the full posterior distribution matters. MAP provides fast point estimation and approximate checks. VB provides approximate inference for larger models. Classical estimation is a supplementary route for flat-prior wrapper models.

Method Main use Return value
sample() MCMC with NUTS MCMC_Fit
optimize() MAP / ML / marginal MAP MAP_Fit
variational() Approximate posterior inference with ADVI VB_Fit
classic() Classical estimation for supported flat-prior wrapper models Classic_Fit

2. Blocks in rtmb_code()

rtmb_code() describes a model as a set of named blocks.

code <- rtmb_code(
  data = {
    # Declare data names used by the model
  },
  setup = {
    # Preprocess data
  },
  parameters = {
    # Declare estimable parameters
  },
  transform = {
    # Build linear predictors and derived quantities
  },
  model = {
    # Write likelihood and prior distributions
  },
  generate = {
    # Compute generated quantities after fitting
  }
)

Not every block is required. The minimal model usually needs only parameters and model.

data

The data block declares data names used by the model. rtmb_model() checks whether these names exist in the supplied data list.

setup

The setup block is evaluated once before model evaluation. It is the right place to compute data dimensions, design matrices, centered predictors, group indices, and constants used later in the model.

Wrapper functions often place preprocessing in setup so that print_code() shows enough information to reproduce the model.

parameters

The parameters block declares estimable parameters with Dim().

parameters = {
  alpha <- Dim()
  beta  <- Dim(P)
  sigma <- Dim(lower = 0)
  r     <- Dim(G, random = TRUE)
}

Dim() stores dimension, constraints, initial values, and whether a parameter should be treated as a random effect.

transform

The transform block creates derived quantities from parameters and data.

transform = {
  mu <- alpha + X %*% beta
}

Quantities created in transform can be used in the model block and can also appear in summaries. For MAP and classical estimation, standard errors and intervals can be calculated by the delta method or simulation-based propagation.

Only primary parameters declared in parameters can be used as Laplace-random targets. Derived quantities created in transform cannot be specified directly as marginal variables.

model

The model block contains likelihood and prior distributions.

model = {
  beta ~ normal(0, 10)
  sigma ~ exponential(1)
  Y ~ normal(mu, sigma)
}

BayesRTMB encourages Stan-like sampling syntax. Internally, this is converted into additions to the log probability.

Conceptually, the following two lines describe the same contribution.

Y ~ normal(mu, sigma)
lp <- lp + normal_lpdf(Y, mu, sigma)

In ordinary rtmb_code(model = { ... }), you do not need to initialize lp. BayesRTMB prepares it internally when constructing the executable log-probability function.

generate

The generate block computes quantities after fitting.

generate = {
  y_rep <- rnorm(length(Y), mu, sigma)
}

Generated quantities are not part of likelihood evaluation or optimization. They are useful for posterior predictive checks, predictions, simulations, residuals, item information, factor scores, and other post-estimation quantities.

3. Parameter Representation

Users usually think about parameters on their natural constrained scale:

Optimization and HMC are easier on unconstrained real-valued spaces. BayesRTMB therefore maintains two representations.

Representation Meaning
constrained scale Natural user-facing scale
unconstrained scale Scale used by optimizers and samplers

For example, sigma <- Dim(lower = 0) is displayed as a positive sigma, but internally it is transformed from an unconstrained value.

Important helper functions include:

These transformations are central to fixed, map, random effects, Laplace approximation, and printed summaries.

4. Jacobian Correction

When constrained parameters are transformed to an unconstrained space, density calculations can require a Jacobian adjustment.

If y is the natural constrained parameter and x is the unconstrained parameter, the relation is:

\[ \log p_X(x) = \log p_Y(y) + \log \left| \frac{dy}{dx} \right| \]

BayesRTMB users write models on the natural scale. Internally, functions such as to_unconstrained(), to_constrained(), and calc_log_jacobian() add the required correction depending on the inference route.

The target is controlled by jacobian_target.

jacobian_target Meaning
"all" Apply correction to all transformed parameters
"random" Apply correction to the random side integrated by Laplace approximation
"none" Do not add the correction

For MCMC, BayesRTMB samples in the unconstrained space, so the posterior density generally uses jacobian_target = "all".

For optimize() with Laplace approximation, the density transformation needed for the integrated random side uses jacobian_target = "random".

For ordinary optimization without Laplace approximation, internal objective construction usually uses jacobian_target = "none".

Most users do not need to set this option directly. The distinction matters when understanding profile likelihoods, Laplace approximation, or low-level internals.

5. RTMB Automatic Differentiation Objects

RTMB provides an R interface to the TMB automatic differentiation engine.

BayesRTMB calls RTMB::MakeADFun() through RTMB_Model$build_ad_obj(). This creates an object that can evaluate function values, gradients, and Hessian-related quantities.

The model code is not compiled from a separate Stan-like language. Instead, the R model function is evaluated with AD objects. Arithmetic, matrix operations, distribution functions, and transformations are recorded as an automatic differentiation graph.

rtmb_model() performs pre-checks before fitting:

These checks aim to catch structural problems before they become harder-to-read RTMB/TMB errors.

6. Inference Methods

sample

sample() runs MCMC using NUTS.

fit <- mdl$sample()

MCMC is the central method when you want the full posterior distribution, not only a point estimate. It supports posterior means, posterior medians, intervals, diagnostics, posterior predictive checks, and bridge sampling.

MCMC samples are drawn on the unconstrained scale and returned on the natural constrained scale.

optimize

optimize() performs MAP, ML, or marginal MAP estimation.

fit <- mdl$optimize()

With priors, the result is MAP estimation. With flat-prior models, it is interpreted as maximum likelihood or a closely related estimate.

optimize() is useful for checking model behavior before MCMC, and for obtaining fast estimates for hierarchical models with Laplace approximation. Confidence intervals and log marginal likelihood approximations should be interpreted as approximations on the unconstrained parameter space.

Common options include:

variational

variational() approximates the posterior distribution with ADVI.

fit <- mdl$variational()

VB can be useful for large models or as an initial approximate check before MCMC. It is faster than MCMC, but it is approximate and can underestimate uncertainty. When possible, compare final uncertainty assessments with MCMC.

classic

classic() treats supported wrapper models with prior_flat() as classical frequentist analyses.

fit <- mdl$classic()

It is available only for wrapper-derived models that can be interpreted as classical analyses under flat priors.

Classic_Fit supports frequentist post-processing where appropriate, such as degrees of freedom, p values, AIC, BIC, ANOVA, likelihood-ratio tests, correlation tests, and contingency-table tests.

Functionally, classic() is lower priority than MCMC, MAP, and VB. It is a supplementary path for checking standard analyses and comparing Bayesian and frequentist workflows.

7. optimize() vs classic()

optimize() and classic() share low-level RTMB machinery, but their responsibilities differ.

optimize() is the API for MAP / ML / marginal MAP. It can be used with informative priors.

classic() is the API for frequentist interpretation of supported wrapper models with flat priors.

Internally, .fit_rtmb() builds the MakeADFun() object, runs optimization, and returns raw components such as sdreport() output. It does not itself construct user-facing fit objects.

Layer Role
.fit_rtmb() Returns raw RTMB optimization components
.inference_pipeline() Constructs MAP_Fit for optimize()
.classic_pipeline() Constructs Classic_Fit for classic()

This separation lets the same optimization engine support different summaries, degrees of freedom, prior correction, and post-processing behavior.

8. Laplace Approximation and random

In hierarchical, latent-variable, and mixed models, it is often better to integrate over some parameters rather than optimize all parameters as fixed effects.

Parameters declared with random = TRUE are eligible for Laplace approximation.

parameters = {
  alpha <- Dim()
  tau   <- Dim(lower = 0)
  r     <- Dim(G, random = TRUE)
}

With optimize(laplace = TRUE), BayesRTMB passes random-effect parameters to RTMB::MakeADFun(random = ...). In optimize(), laplace = TRUE is the default.

optimize(marginal = ...) can temporarily treat primary parameters as random so that marginal MAP / empirical Bayes style estimation can be performed. In flat-prior models, marginalizing fixed effects can correspond to restricted maximum likelihood (REML)-like estimation.

fit <- mdl$optimize(marginal = c("mean", "delta"))

Two internal sets are important:

Name Meaning
marginal_vars Primary parameter names requested by the user or wrapper through optimize(marginal = ...)
laplace_random_vars All variables actually passed to MakeADFun(random = ...)

In mixed models, both original random effects and primary parameters selected by marginal may be integrated by Laplace approximation. Therefore, laplace_random_vars can be broader than marginal_vars.

9. Standard Errors and Intervals

optimize() and classic() can report standard errors and intervals, not only point estimates.

Important se_method options include:

se_method Meaning
"wald" Wald standard errors based on sdreport() and the delta method
"sampling" Simulation-based error propagation from approximate normal samples
"none" Do not compute standard errors or intervals

se_method = "none" is a lightweight mode. It does not mean that standard errors are zero; it means they are not calculated.

Derived quantities created in transform can be handled by the delta method or by simulation-based propagation. Generated quantities are naturally handled from MCMC draws or by post-estimation generation.

If sdreport() cannot provide reliable standard errors, BayesRTMB emits diagnostic messages. Examples include non-positive-definite Hessians, unavailable covariance matrices, Hessian fallbacks, and generalized-inverse fallbacks.

10. Writing AD-Friendly Code

RTMB model code looks like ordinary R code, but internally it is evaluated with automatic differentiation objects. Some ordinary R patterns can therefore be problematic.

Avoid parameter-dependent if statements

Avoid branches where the computation path changes based on a parameter value.

# Avoid this pattern
if (sigma > 1) {
  lp <- lp + normal_lpdf(y, 0, sigma)
} else {
  lp <- lp + normal_lpdf(y, 0, 1)
}

Prefer smooth functions, constrained parameterizations, or explicit AD-compatible alternatives.

Be careful with apply-style functions

Functions such as apply(), lapply(), sapply(), and ifelse() may not always behave as expected with AD objects. BayesRTMB catches some cases early, but explicit loops or vectorized arithmetic are often safer.

Use blocks for their intended purposes

Data-only preprocessing belongs in setup.

Derived quantities based on parameters belong in transform.

Likelihood and priors belong in model.

Generated quantities belong in generate.

Keeping this separation makes print_code() readable and improves the stability of pre-checks and automatic differentiation.

11. Relationship to Stan, TMB, and RTMB

BayesRTMB uses RTMB as its computational backend. RTMB is an R interface to TMB’s automatic differentiation engine.

BayesRTMB is similar to Stan in that users write probabilistic models with constrained parameters and priors. It differs in that model code is written as R expressions through rtmb_code(), rather than in a separate Stan language file.

Topic BayesRTMB Stan
Model language R through rtmb_code() Stan language
Automatic differentiation RTMB / TMB Stan Math
Constraints Dim() and internal transformations Stan parameter constraints
Inference MAP / classic / NUTS / ADVI MAP / NUTS / ADVI and others
Wrappers rtmb_lm(), rtmb_glmer(), and others Usually hand-written models
Generated model inspection print_code() Stan code itself

12. Summary

BayesRTMB’s internal structure can be summarized as follows.

For practical model-writing examples, see Writing Model Codes. To understand wrapper-generated models, see Wrapper Functions and print_code(). For a detailed mixed-model and GLMM workflow, see Hierarchical Models and GLMMs.