The *car() functions (car(), rcar(), mcar(), and mstcar()) build
the features needed to run RSTr and place them in the
specified directory. In this vignette, we discuss each model in detail,
the arguments used in the *car() functions, and how to
use them.
*car() functions
The current version of RSTr features four models to choose from: the Besag-York-Mollié (BYM) CAR model (also known as just the CAR model), the Restricted CAR (RCAR) model, the Multivariate CAR (MCAR) model, and the Multivariate Spatiotemporal CAR (MSTCAR) model. We will now compare these models and their use cases.
The CAR model (function call car()) is the basis of all
models used in RSTr. The premise of the CAR is to spatially smooth
estimates across spatial regions using a random effects estimator
Z. The intensity of its smoothing in a region is based on
the event and population counts of that region and the counts in its
neighboring regions.
The CAR model only smooths across spatial regions and not across
sociodemographic groups or time periods. While datasets that include
these stratifications can be run with car(), the user would
effectively be running several concurrent CAR models. The CAR model is
recommended if the user only has data for one sociodemographic group
of interest and one time period.
The RCAR model (function call rcar()) is the most recent
BYM implementation in RSTr. The RCAR model follows the same general
paradigm as the CAR model, but prevents oversmoothing by capping the
spatial and non-spatial variance. Even though the RCAR model only smooths
across spatial regions, the estimates generated by rcar()
strike a balance between crude rate estimates and the oversmoothing
of the standard CAR model.
To learn more about restricted CAR models, read
vignette("RSTr-informativeness").
The MCAR model (function call mcar()) is an extension of
the CAR model: whereas the CAR model can only smooth over spatial
regions, the MCAR model can smooth over spatial regions and
sociodemographic groups. The MCAR model is ideal for datasets that
include multiple sociodemographic groups. A restricted MCAR (RMCAR)
model is currently under development and will be implemented in RSTr
once its methodology is finalized.
The MSTCAR model (function call mstcar()) is an
extension of the MCAR model, allowing for smoothing over spatial
regions, sociodemographic groups, and time periods. The MSTCAR model is
ideal for investigating trends in rate estimates over a specified time
period.
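
The guidance above can be condensed into a small decision rule. The helper below is a hypothetical illustration and not part of RSTr; it only maps the number of sociodemographic groups and time periods in a dataset to the function name this vignette recommends, and it assumes that any dataset with more than one time period is routed to mstcar().

```r
# Hypothetical helper (not an RSTr function): suggests a *car() function
# from the number of groups and time periods in the data.
suggest_car_function <- function(n_group = 1, n_time = 1, restricted = FALSE) {
  stopifnot(n_group >= 1, n_time >= 1)
  if (n_time > 1) return("mstcar")   # regions, groups, and time periods
  if (n_group > 1) return("mcar")    # regions and groups
  if (restricted) "rcar" else "car"  # regions only
}

suggest_car_function()                              # "car"
suggest_car_function(restricted = TRUE)             # "rcar"
suggest_car_function(n_group = 4)                   # "mcar"
suggest_car_function(n_group = 4, n_time = 10)      # "mstcar"
```

The restricted flag only affects region-only data here, because the text above states that a restricted MCAR model is still under development.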
*car() function
All *car() functions provide several arguments:
name: The name of the folder your model information
lives in;
data: The list object containing the
event Y and population n data. For more
information on data setup, read
vignette("RSTr-event");
adjacency: The adjacency structure for your event
and population data. For more information on adjacency structure setup,
read vignette("RSTr-adjacency");
dir: The directory where the model folder lives. By
default, this saves into your temporary directory, so the model
information will be lost after the R session ends. Should you want to
save your model to be analyzed at a later date or ensure that your
samples are intact if R crashes during runtime, specify a different
directory;
seed: Allows the user to specify the random seed
used for replication purposes;
perc_ci: A number between 0 and 1 which specifies
the desired credible interval to use when calculating the relative
precision of estimates. By default, set to 0.95;
iterations: The number of iterations to run the
model for;
show_plots: If set to FALSE, hides
traceplots during model execution;
verbose: If set to FALSE, hides the
progress bar and messages in the console;
ignore_checks: If set to TRUE, skips
model validation;
method: Specifies whether the event data are treated as
Binomial ("binomial") or Poisson ("poisson")
distributed. By default, RSTr uses Binomial updates for event
data;
impute_lb: Specifies a lower bound for imputed data
for event information that is missing or suppressed;
impute_ub: Specifies an upper bound for imputed data
for event information that is missing or suppressed;
inits: This is a list of initial values
for each parameter. This can be specified by the user or generated by
default. For more information on specification of inits,
see vignette("RSTr-initialvalues"); and
priors: This is a list of all prior
information for each parameter. This can be specified by the user or
generated by default. For more information on specification of
priors, see vignette("RSTr-priors").
Most of these arguments are optional, as the functions supply defaults for
many of them. rcar() and mstcar() have
additional arguments used only by them:
A: In the RCAR model, describes the limit of the
smoothing intensity between regions;
m0: In the RCAR model, specifies the baseline
neighbor count; and
update_rho: In the MSTCAR model, allows for updates
of the temporal correlation parameter rho. By default, RSTr
does not update rho.
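
As a rough sketch of how these arguments fit together, the snippet below collects a few of them in a plain list. The folder name, directory, and seed are illustrative assumptions, and my_data and my_adjacency are hypothetical placeholders for inputs prepared as described in vignette("RSTr-event") and vignette("RSTr-adjacency"). The call itself is left commented out because it needs those inputs.

```r
# Illustrative values only; swap the temporary directory for a persistent
# path if the model should survive the end of the R session.
model_dir <- file.path(tempdir(), "rstr_models")
dir.create(model_dir, showWarnings = FALSE, recursive = TRUE)

model_args <- list(
  name       = "my_car_model",  # hypothetical folder name for this model
  dir        = model_dir,       # directory where the model folder lives
  seed       = 1234,            # arbitrary seed, chosen for replication
  perc_ci    = 0.95,            # credible interval (documented default)
  method     = "binomial",      # event data distribution (documented default)
  show_plots = FALSE,           # hide traceplots during execution
  verbose    = FALSE            # hide progress bar and console messages
)

# With RSTr installed and inputs prepared (my_data and my_adjacency are
# hypothetical objects), the call might look like:
# do.call(RSTr::car, c(list(data = my_data, adjacency = my_adjacency), model_args))
str(model_args)
```

Keeping the arguments in a named list makes it easy to reuse the same settings across car(), rcar(), mcar(), and mstcar() calls.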
If you run into errors when trying to initialize your model, read
vignette("RSTr-troubleshoot"). Below, we will go into
detail regarding what each argument does specifically and what to keep
in mind when setting these values.
inits argument
inits is a list specifying the starting
values for parameters in the model. Details around the initial value
parameters can be found in
vignette("RSTr-initialvalues").
priors argument
priors behaves similarly to inits, except
that it contains all information related to parameter priors. Details
around the prior parameters can be found in
vignette("RSTr-priors").
method argument
method offers two values: "binomial" and
"poisson". These values determine how the data are
transformed and how the lambda Metropolis update is
performed: "binomial" treats the event data as
Binomial-distributed and "poisson" treats the event data as
Poisson-distributed. Which to choose depends on your use case:
if you are working with very small mortality rates, "poisson"
will work well, but if you are working with birth rates,
"binomial" will work better. Note that "binomial" works in most
general use cases, whereas "poisson" only works well for datasets
with small rates, under approximately 1%.
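
The reason small rates matter can be seen with base R alone. The sketch below is not RSTr's sampler; it only compares a Binomial distribution with its Poisson counterpart at two illustrative rates (0.5% and 30%) and an assumed population of 1,000, using the total variation distance as a measure of disagreement.

```r
# Total variation distance between Binomial(n, p) and Poisson(n * p),
# summed over the counts 0..n (Poisson mass above n is ignored here).
tv_distance <- function(n, p) {
  k <- 0:n
  0.5 * sum(abs(dbinom(k, size = n, prob = p) - dpois(k, lambda = n * p)))
}

n <- 1000                                    # illustrative population size
tv_small_rate <- tv_distance(n, p = 0.005)   # rate below 1%
tv_large_rate <- tv_distance(n, p = 0.30)    # rate well above 1%

c(small_rate = tv_small_rate, large_rate = tv_large_rate)
```

The distance is much smaller at the low rate, which is consistent with treating "poisson" as suitable only when rates are small.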
m0 and A
m0 and A are two components that determine
the intensity of the smoothing of Restricted CAR models. m0
should be a positive scalar, and the size of A is dependent
on the group/time structure of your data: A will be a
positive scalar for region-only models, a vector of size
n_group for region-group models, and a matrix of size
n_group x n_time for region-group-time models.
Note, however, that these informativeness restriction measures are
currently only developed for the CAR model, and restrictions for more
complex models will be added to the RSTr package as their respective
methods are developed.
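
The expected shapes of A can be sketched in base R. The dimensions and numeric values below are placeholders chosen for illustration, not recommended settings.

```r
n_group <- 3   # illustrative number of sociodemographic groups
n_time  <- 5   # illustrative number of time periods

m0 <- 3                                    # positive scalar (placeholder value)
A_region     <- 6                          # region-only model: positive scalar
A_group      <- rep(6, n_group)            # region-group model: vector of size n_group
A_group_time <- matrix(6, nrow = n_group,  # region-group-time model:
                       ncol = n_time)      # n_group x n_time matrix

length(A_group)    # 3
dim(A_group_time)  # 3 5
```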
update_rho argument
In the MSTCAR model, update_rho is a
logical that specifies whether to calculate estimates for
the temporal correlation rho. By default, it is set to
FALSE. In empirical testing, this estimate was found not to
be very sensitive to changes when specified prudently, and updating it
increases runtime by an order of magnitude due to its complexity.
seed argument
Because of the stochastic nature of Bayesian inference and the
inherent instability of the MSTCAR model, replicability is extremely
important. seed allows the user to specify a random seed so that
estimates can be replicated across runs.
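
The general principle can be illustrated with base R's set.seed(). This is only an analogy under the assumption that seeding works the usual way; it does not show how RSTr uses seed internally.

```r
# Draw a few random values after fixing the seed.
draw_with_seed <- function(seed, n = 5) {
  set.seed(seed)
  rnorm(n)
}

first_run  <- draw_with_seed(1234)
second_run <- draw_with_seed(1234)
other_seed <- draw_with_seed(4321)

identical(first_run, second_run)  # TRUE: same seed, same draws
identical(first_run, other_seed)  # FALSE: different seed, different draws
```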
ignore_checks argument
As development continues on RSTr, there are occasions where the
checks performed on the inputs of *car() throw an error,
even though you may be certain that all of your inputs are behaving as
expected. To override the checks, you can use the
ignore_checks argument. Set this to TRUE to
skip this step.
Initialization is one of the most important steps of running the
model, as it’s where virtually all choices regarding the model are made.
In this vignette, we explored each available type of CAR model in RSTr,
the arguments of the *car() functions, and how to
appropriately choose values for each argument.