This article explains how BayesRTMB represents models internally, how those models are converted into RTMB automatic differentiation objects, and how the same model object is routed to MCMC, MAP estimation, variational inference, or classical estimation.

Most users do not need to know these details in order to fit ordinary models. Wrapper functions such as rtmb_lm() and rtmb_glmer(), together with the basic rtmb_code() and rtmb_model() workflow, are usually enough.

1. Overall Flow

At a high level, BayesRTMB proceeds as follows.

rtmb_code()
  -> rtmb_model()
    -> RTMB_Model
      -> build_ad_obj()
        -> RTMB::MakeADFun()
          -> sample() / optimize() / variational() / classic()
            -> MCMC_Fit / MAP_Fit / VB_Fit / Classic_Fit

Users usually write only rtmb_code() and call rtmb_model().

rtmb_model() checks the model code and data, then creates an RTMB_Model R6 object. This object stores the model definition, data, parameter declarations, transformation blocks, generated-quantity blocks, and wrapper metadata.

When an inference method is called, BayesRTMB builds the automatic differentiation object needed by RTMB::MakeADFun(). The resulting fit is returned as a method-specific object.

BayesRTMB is organized around the functional priority MCMC, MAP, VB, and then classical estimation. MCMC is the central method when the full posterior distribution matters. MAP provides fast point estimation and approximate checks. VB provides approximate inference for larger models. Classical estimation is a supplementary route for flat-prior wrapper models.

Method	Main use	Return value
`sample()`	MCMC with NUTS	`MCMC_Fit`
`optimize()`	MAP / ML / marginal MAP	`MAP_Fit`
`variational()`	Approximate posterior inference with ADVI	`VB_Fit`
`classic()`	Classical estimation for supported flat-prior wrapper models	`Classic_Fit`

2. Blocks in `rtmb_code()`

rtmb_code() describes a model as a set of named blocks.

code <- rtmb_code(
  data = {
    # Declare data names used by the model
  },
  setup = {
    # Preprocess data
  },
  parameters = {
    # Declare estimable parameters
  },
  transform = {
    # Build linear predictors and derived quantities
  },
  model = {
    # Write likelihood and prior distributions
  },
  generate = {
    # Compute generated quantities after fitting
  }
)

Not every block is required. The minimal model usually needs only parameters and model.

data

The data block declares data names used by the model. rtmb_model() checks whether these names exist in the supplied data list.

setup

The setup block is evaluated once before model evaluation. It is the right place to compute data dimensions, design matrices, centered predictors, group indices, and constants used later in the model.

Wrapper functions often place preprocessing in setup so that print_code() shows enough information to reproduce the model.

parameters

The parameters block declares estimable parameters with Dim().

parameters = {
  alpha <- Dim()
  beta  <- Dim(P)
  sigma <- Dim(lower = 0)
  r     <- Dim(G, random = TRUE)
}

Dim() stores dimension, constraints, initial values, and whether a parameter should be treated as a random effect.

transform

The transform block creates derived quantities from parameters and data.

transform = {
  mu <- alpha + X %*% beta
}

Quantities created in transform can be used in the model block and can also appear in summaries. For MAP and classical estimation, standard errors and intervals can be calculated by the delta method or simulation-based propagation.

Only primary parameters declared in parameters can be used as Laplace-random targets. Derived quantities created in transform cannot be specified directly as marginal variables.

model

The model block contains likelihood and prior distributions.

model = {
  beta ~ normal(0, 10)
  sigma ~ exponential(1)
  Y ~ normal(mu, sigma)
}

BayesRTMB encourages Stan-like sampling syntax. Internally, this is converted into additions to the log probability.

Conceptually, the following two lines describe the same contribution.

Y ~ normal(mu, sigma)

lp <- lp + normal_lpdf(Y, mu, sigma)

In ordinary rtmb_code(model = { ... }), you do not need to initialize lp. BayesRTMB prepares it internally when constructing the executable log-probability function.

generate

The generate block computes quantities after fitting.

generate = {
  y_rep <- rnorm(length(Y), mu, sigma)
}

Generated quantities are not part of likelihood evaluation or optimization. They are useful for posterior predictive checks, predictions, simulations, residuals, item information, factor scores, and other post-estimation quantities.

3. Parameter Representation

Users usually think about parameters on their natural constrained scale:

sigma is positive;
theta is between 0 and 1;
ordered cutpoints must be increasing;
simplex vectors must sum to 1.

Optimization and HMC are easier on unconstrained real-valued spaces. BayesRTMB therefore maintains two representations.

Representation	Meaning
constrained scale	Natural user-facing scale
unconstrained scale	Scale used by optimizers and samplers

For example, sigma <- Dim(lower = 0) is displayed as a positive sigma, but internally it is transformed from an unconstrained value.

Important helper functions include:

to_unconstrained();
to_constrained();
constrained_vector_to_list();
unconstrained_vector_to_list();
generate_flat_names().

These transformations are central to fixed, map, random effects, Laplace approximation, and printed summaries.

4. Jacobian Correction

When constrained parameters are transformed to an unconstrained space, density calculations can require a Jacobian adjustment.

If y is the natural constrained parameter and x is the unconstrained parameter, the relation is:

\[ \log p_X(x) = \log p_Y(y) + \log \left| \frac{dy}{dx} \right| \]

BayesRTMB users write models on the natural scale. Internally, functions such as to_unconstrained(), to_constrained(), and calc_log_jacobian() add the required correction depending on the inference route.

The target is controlled by jacobian_target.

`jacobian_target`	Meaning
`"all"`	Apply correction to all transformed parameters
`"random"`	Apply correction to the random side integrated by Laplace approximation
`"none"`	Do not add the correction

For MCMC, BayesRTMB samples in the unconstrained space, so the posterior density generally uses jacobian_target = "all".

For optimize() with Laplace approximation, the density transformation needed for the integrated random side uses jacobian_target = "random".

For ordinary optimization without Laplace approximation, internal objective construction usually uses jacobian_target = "none".

Most users do not need to set this option directly. The distinction matters when understanding profile likelihoods, Laplace approximation, or low-level internals.

5. RTMB Automatic Differentiation Objects

RTMB provides an R interface to the TMB automatic differentiation engine.

BayesRTMB calls RTMB::MakeADFun() through RTMB_Model$build_ad_obj(). This creates an object that can evaluate function values, gradients, and Hessian-related quantities.

The model code is not compiled from a separate Stan-like language. Instead, the R model function is evaluated with AD objects. Arithmetic, matrix operations, distribution functions, and transformations are recorded as an automatic differentiation graph.

rtmb_model() performs pre-checks before fitting:

undefined variables in setup or parameters;
whether model and transform can run with initial values;
whether log_prob returns a scalar;
whether MakeADFun() can compute gradients.

These checks aim to catch structural problems before they become harder-to-read RTMB/TMB errors.

6. Inference Methods

sample

sample() runs MCMC using NUTS.

fit <- mdl$sample()

MCMC is the central method when you want the full posterior distribution, not only a point estimate. It supports posterior means, posterior medians, intervals, diagnostics, posterior predictive checks, and bridge sampling.

MCMC samples are drawn on the unconstrained scale and returned on the natural constrained scale.

BayesRTMB’s NUTS implementation has two proposal-selection variants. The default, nuts_variant = "multinomial", uses Stan-style Hamiltonian-weighted proposal selection over the built trajectory. The alternative, nuts_variant = "slice", uses slice-based proposal selection. For ordinary use, the default is intended to be the main path.

The mass matrix can be controlled with metric. The available choices are "diag", "dense", "hybrid", and "auto". A diagonal metric is cheaper and often sufficient, while a dense metric can improve sampling when parameters are strongly correlated. The hybrid metric uses a dense block for non-random parameters and a diagonal block for random parameters. The default "auto" starts from the hybrid strategy and can fall back to a simpler metric during warmup when the dense block is not useful or is numerically unstable.

Warmup learns both step size and metric information. metric_adaptation = "stan_window" uses a Stan-style windowed adaptation schedule and is the default. The diagnose() method reports the effective metric, warmup summaries, divergences, tree depth, leapfrog counts, and other information useful for checking whether the sampler is behaving well.

optimize

optimize() performs MAP, ML, or marginal MAP estimation.

fit <- mdl$optimize()

With priors, the result is MAP estimation. With flat-prior models, it is interpreted as maximum likelihood or a closely related estimate.

optimize() is useful for checking model behavior before MCMC, and for obtaining fast estimates for hierarchical models with Laplace approximation. Confidence intervals and log marginal likelihood approximations should be interpreted as approximations on the unconstrained parameter space.

Common options include:

laplace;
marginal;
se_method;
df_method;
map;
fixed.

variational

variational() approximates the posterior distribution with ADVI.

fit <- mdl$variational()

VB can be useful for large models or as an initial approximate check before MCMC. It is faster than MCMC, but it is approximate and can underestimate uncertainty. When possible, compare final uncertainty assessments with MCMC.

classic

classic() treats supported wrapper models with prior_flat() as classical frequentist analyses.

fit <- mdl$classic()

It is available only for wrapper-derived models that can be interpreted as classical analyses under flat priors.

Classic_Fit supports frequentist post-processing where appropriate, such as degrees of freedom, p values, AIC, BIC, ANOVA, likelihood-ratio tests, correlation tests, and contingency-table tests.

Functionally, classic() is lower priority than MCMC, MAP, and VB. It is a supplementary path for checking standard analyses and comparing Bayesian and frequentist workflows.

7. `optimize()` vs `classic()`

optimize() and classic() share low-level RTMB machinery, but their responsibilities differ.

optimize() is the API for MAP / ML / marginal MAP. It can be used with informative priors.

classic() is the API for frequentist interpretation of supported wrapper models with flat priors.

Internally, .fit_rtmb() builds the MakeADFun() object, runs optimization, and returns raw components such as sdreport() output. It does not itself construct user-facing fit objects.

Layer	Role
`.fit_rtmb()`	Returns raw RTMB optimization components
`.inference_pipeline()`	Constructs `MAP_Fit` for `optimize()`
`.classic_pipeline()`	Constructs `Classic_Fit` for `classic()`

This separation lets the same optimization engine support different summaries, degrees of freedom, prior correction, and post-processing behavior.

8. Laplace Approximation and `random`

In hierarchical, latent-variable, and mixed models, it is often better to integrate over some parameters rather than optimize all parameters as fixed effects.

Parameters declared with random = TRUE are eligible for Laplace approximation.

parameters = {
  alpha <- Dim()
  tau   <- Dim(lower = 0)
  r     <- Dim(G, random = TRUE)
}

With optimize(laplace = TRUE), BayesRTMB passes random-effect parameters to RTMB::MakeADFun(random = ...). In optimize(), laplace = TRUE is the default.

optimize(marginal = ...) can temporarily treat primary parameters as random so that marginal MAP / empirical Bayes style estimation can be performed. In flat-prior models, marginalizing fixed effects can correspond to restricted maximum likelihood (REML)-like estimation.

fit <- mdl$optimize(marginal = c("mean", "delta"))

Two internal sets are important:

Name	Meaning
`marginal_vars`	Primary parameter names requested by the user or wrapper through `optimize(marginal = ...)`
`laplace_random_vars`	All variables actually passed to `MakeADFun(random = ...)`

In mixed models, both original random effects and primary parameters selected by marginal may be integrated by Laplace approximation. Therefore, laplace_random_vars can be broader than marginal_vars.

9. Standard Errors and Intervals

optimize() and classic() can report standard errors and intervals, not only point estimates.

Important se_method options include:

`se_method`	Meaning
`"wald"`	Wald standard errors based on `sdreport()` and the delta method
`"sampling"`	Simulation-based error propagation from approximate normal samples
`"none"`	Do not compute standard errors or intervals

se_method = "none" is a lightweight mode. It does not mean that standard errors are zero; it means they are not calculated.

Derived quantities created in transform can be handled by the delta method or by simulation-based propagation. Generated quantities are naturally handled from MCMC draws or by post-estimation generation.

If sdreport() cannot provide reliable standard errors, BayesRTMB emits diagnostic messages. Examples include non-positive-definite Hessians, unavailable covariance matrices, Hessian fallbacks, and generalized-inverse fallbacks.

10. Writing AD-Friendly Code

RTMB model code looks like ordinary R code, but internally it is evaluated with automatic differentiation objects. Some ordinary R patterns can therefore be problematic.

Avoid parameter-dependent `if` statements

Avoid branches where the computation path changes based on a parameter value.

# Avoid this pattern
if (sigma > 1) {
  lp <- lp + normal_lpdf(y, 0, sigma)
} else {
  lp <- lp + normal_lpdf(y, 0, 1)
}

Prefer smooth functions, constrained parameterizations, or explicit AD-compatible alternatives.

Be careful with apply-style functions

Functions such as apply(), lapply(), sapply(), and ifelse() may not always behave as expected with AD objects. BayesRTMB catches some cases early, but explicit loops or vectorized arithmetic are often safer.

Use blocks for their intended purposes

Data-only preprocessing belongs in setup.

Derived quantities based on parameters belong in transform.

Likelihood and priors belong in model.

Generated quantities belong in generate.

Keeping this separation makes print_code() readable and improves the stability of pre-checks and automatic differentiation.

11. Relationship to Stan, TMB, and RTMB

BayesRTMB uses RTMB as its computational backend. RTMB is an R interface to TMB’s automatic differentiation engine.

BayesRTMB is similar to Stan in that users write probabilistic models with constrained parameters and priors. It differs in that model code is written as R expressions through rtmb_code(), rather than in a separate Stan language file.

Topic	BayesRTMB	Stan
Model language	R through `rtmb_code()`	Stan language
Automatic differentiation	RTMB / TMB	Stan Math
Constraints	`Dim()` and internal transformations	Stan parameter constraints
Inference	MAP / classic / NUTS / ADVI	MAP / NUTS / ADVI and others
Wrappers	`rtmb_lm()`, `rtmb_glmer()`, and others	Usually hand-written models
Generated model inspection	`print_code()`	Stan code itself

12. Summary

BayesRTMB’s internal structure can be summarized as follows.

rtmb_code() describes a model as a block-structured object.
rtmb_model() checks the data and code, then creates an RTMB_Model.
RTMB_Model stores model code, data, parameter declarations, transformations, generated quantities, and wrapper metadata.
build_ad_obj() creates the automatic differentiation object used by RTMB::MakeADFun().
sample(), optimize(), variational(), and classic() run different inference methods from the same model representation.
optimize() and classic() share a low-level engine, but their responsibilities and return objects are distinct.
For Laplace approximation, the distinction between random = TRUE, marginal_vars, and laplace_random_vars is important.

For practical model-writing examples, see Writing Model Codes. To understand wrapper-generated models, see Wrapper Functions and print_code(). For a detailed mixed-model and GLMM workflow, see Hierarchical Models and GLMMs. For a compact user-facing reference, see Analysis Reference.

RTMB Internals and Inference Algorithms