| Type: | Package |
| Title: | Methods for Optimizing Scales of Effect |
| Version: | 0.7.0 |
| Description: | A tool for optimizing scales of effect when modeling ecological processes in space. Specifically, the scale parameter of a distance-weighted kernel distribution is identified for all environmental layers included in the model. Includes functions to assist in model selection, model evaluation, efficient transformation of raster surfaces using fast Fourier transformation, and projecting models. For more details see Peterman (2026) <doi:10.1007/s10980-025-02267-x>. |
| License: | GPL-3 |
| Encoding: | UTF-8 |
| LazyData: | true |
| Imports: | Rcpp, Matrix, cowplot, dplyr, fields, ggplot2, insight, stats, utils, unmarked, exactextractr, crayon, parallel, optimParallel, AICcmodavg, methods, pscl |
| LinkingTo: | Rcpp, RcppArmadillo |
| Depends: | R (≥ 4.3), terra, sf |
| Suggests: | knitr, landscapemetrics, geodiv, rmarkdown, MASS, nlme, pkgload, survival, testthat (≥ 3.0.0) |
| Config/testthat/edition: | 3 |
| URL: | https://github.com/wpeterman/multiScaleR |
| BugReports: | https://github.com/wpeterman/multiScaleR/issues |
| BuildVignettes: | true |
| VignetteBuilder: | knitr |
| Author: | Bill Peterman |
| Maintainer: | Bill Peterman <Peterman.73@osu.edu> |
| Config/roxygen2/version: | 8.0.0 |
| NeedsCompilation: | yes |
| Packaged: | 2026-06-30 20:39:51 UTC; peterman.73 |
| Repository: | CRAN |
| Date/Publication: | 2026-07-01 22:30:14 UTC |
multiScaleR
Description
This package is for optimizing scales of effect when modeling ecological processes in space. Specifically, the scale parameter of a distance-weighted kernel distribution is identified for all environmental layers included in the model.
Details
Author(s)
Maintainer: Bill Peterman Peterman.73@osu.edu (ORCID)
Authors:
Bill Peterman Peterman.73@osu.edu (ORCID)
See Also
Useful links:
Report bugs at https://github.com/wpeterman/multiScaleR/issues
multiScaleR model selection
Description
Creates a model selection table ranked by AIC or AICc for a
list of fitted models. Mixed lists containing both multiScaleR
objects and plain model objects (e.g., glm) are supported, allowing
comparison of scale-optimized models against fixed-scale alternatives.
Usage
aic_tab(mod_list,
AICc = TRUE,
mod_names = NULL,
verbose = FALSE,
...)
Arguments
mod_list |
List containing fitted |
AICc |
Logical. If |
mod_names |
Optional character vector of model names. Length must equal
|
verbose |
Logical. If |
... |
Additional arguments (not used). |
Details
The table is constructed using aictabCustom from
the AICcmodavg package. For multiScaleR objects, the optimized
(refitted) model stored in opt_mod is used for log-likelihood and
parameter count. The sigma parameter (and shape for "expow") is added
to K automatically. For plain model objects in a mixed list, sigma is not
added to K.
Value
A data frame of class "aictab" (from the AICcmodavg
package) sorted by ascending AIC(c), containing:
ModnamesModel name (from
mod_namesor auto-generated).KNumber of estimated parameters (regression coefficients + sigma, + shape for
"expow"kernel).AICcorAICInformation criterion value.
Delta_AICcorDelta_AICDifference from the top model.
ModelLikRelative likelihood (
exp(-0.5 * Delta)).AICcWtorAICWtAkaike weight.
LLLog-likelihood.
Cum.WtCumulative Akaike weight.
Author(s)
Bill Peterman
Examples
## Simulate data
set.seed(555)
points <- vect(cbind(c(5,7,9,11,13),
c(13,11,9,7,5)))
mat_list <- list(r1 = rast(matrix(rnorm(20^2),
nrow = 20)),
r2 = rast(matrix(rnorm(20^2),
nrow = 20)))
rast_stack <- rast(mat_list)
kernel_inputs <- kernel_prep(pts = points,
raster_stack = rast_stack,
max_D = 5,
kernel = 'gaussian',
sigma = NULL)
## Example response data
y <- rnorm(5)
## Create data frame with raster variables
dat <- data.frame(y = y,
kernel_inputs$kernel_dat)
mod1 <- glm(y ~ r1,
data = dat)
mod2 <- glm(y ~ r2,
data = dat)
mod3 <- glm(y ~ r1 + r2,
data = dat)
## NOTE: This code is only for demonstration
## Optimization results will have no meaning
opt_mod1 <- multiScale_optim(fitted_mod = mod1,
kernel_inputs = kernel_inputs,
par = NULL,
n_cores = NULL)
opt_mod2 <- multiScale_optim(fitted_mod = mod2,
kernel_inputs = kernel_inputs,
par = NULL,
n_cores = NULL)
opt_mod3 <- multiScale_optim(fitted_mod = mod3,
kernel_inputs = kernel_inputs,
par = NULL,
n_cores = NULL)
## AIC table
mod_list <- list(opt_mod1, opt_mod2, opt_mod3)
aic_tab(mod_list = mod_list,
AICc = FALSE)
## AICc table with specified names
aic_tab(mod_list = mod_list,
AICc = TRUE,
mod_names = c('mod1', 'mod2', 'mod3'))
multiScaleR model selection
Description
Creates a model selection table ranked by BIC for a list of
fitted models. Mixed lists containing both multiScaleR objects and
plain model objects are supported.
Usage
bic_tab(mod_list,
mod_names = NULL,
verbose = FALSE,
...)
Arguments
mod_list |
List containing fitted |
mod_names |
Optional character vector of model names. Length must equal
|
verbose |
Logical. If |
... |
Additional arguments (not used). |
Details
The table is constructed using bictabCustom from
the AICcmodavg package. BIC penalizes model complexity more heavily
than AIC as sample size grows, and is often preferred when the goal is
identifying the single best-supported model rather than model averaging.
Sigma (and shape for "expow") is counted in K.
Value
A data frame of class "bictab" (from the AICcmodavg
package) sorted by ascending BIC, containing:
ModnamesModel name.
KNumber of estimated parameters (regression coefficients + sigma, + shape for
"expow"kernel).BICBayesian information criterion value.
Delta_BICDifference from the top model.
BICWtBIC weight.
Cum.WtCumulative BIC weight.
LLLog-likelihood.
Author(s)
Bill Peterman
Examples
## Simulate data
set.seed(555)
points <- vect(cbind(c(5,7,9,11,13),
c(13,11,9,7,5)))
mat_list <- list(r1 = rast(matrix(rnorm(20^2),
nrow = 20)),
r2 = rast(matrix(rnorm(20^2),
nrow = 20)))
rast_stack <- rast(mat_list)
kernel_inputs <- kernel_prep(pts = points,
raster_stack = rast_stack,
max_D = 5,
kernel = 'gaussian',
sigma = NULL)
## Example response data
y <- rnorm(5)
## Create data frame with raster variables
dat <- data.frame(y = y,
kernel_inputs$kernel_dat)
mod1 <- glm(y ~ r1,
data = dat)
mod2 <- glm(y ~ r2,
data = dat)
mod3 <- glm(y ~ r1 + r2,
data = dat)
## NOTE: This code is only for demonstration
## Optimization results will have no meaning
opt_mod1 <- multiScale_optim(fitted_mod = mod1,
kernel_inputs = kernel_inputs,
par = NULL,
n_cores = NULL)
opt_mod2 <- multiScale_optim(fitted_mod = mod2,
kernel_inputs = kernel_inputs,
par = NULL,
n_cores = NULL)
opt_mod3 <- multiScale_optim(fitted_mod = mod3,
kernel_inputs = kernel_inputs,
par = NULL,
n_cores = NULL)
## BIC table
mod_list <- list(opt_mod1, opt_mod2, opt_mod3)
bic_tab(mod_list = mod_list)
## BIC table with specified names
bic_tab(mod_list = mod_list,
mod_names = c('mod1', 'mod2', 'mod3'))
CI Function
Description
Internal function for calculating confidence intervals
Usage
ci_func(x, df, min_D = NULL, names = NULL, as_dist = FALSE, ...)
Arguments
x |
Estimates |
df |
Degrees of Freedom |
min_D |
minimum distance, Default: NULL |
names |
Names of fitted variables, Default: NULL |
as_dist |
Calculate distance confidence interval; Default = FALSE |
... |
Not used |
Value
data frame with calculated confidence intervals
CI Function
Description
Internal function for calculating confidence intervals
Usage
ci_func_r(x, df, min_D = NULL, names = NULL, as_dist = FALSE, ...)
Arguments
x |
Estimates |
df |
Degrees of Freedom |
min_D |
minimum distance, Default: NULL |
names |
Names of fitted variables, Default: NULL |
as_dist |
Calculate distance confidence interval; Default = FALSE |
... |
Not used |
Value
data frame with calculated confidence intervals
Identify the originating package of an S3 or S4 model object and load it on a PSOCK cluster
Description
Identify the originating package of an S3 or S4 model object and load it on a PSOCK cluster
Usage
cluster_prep(model, cl)
Arguments
model |
An R model object (S3 or S4) |
cl |
A PSOCK cluster (created with parallel::makeCluster) |
Details
For internal use
Value
Invisibly returns the package name loaded
Example data frame
Description
Example count data to be used for optimizing scales of effect
Usage
data(count_data)
Format
A data frame with 75 rows and 2 columns. Data were simulated from a Poisson distribution with an intercept of 0.5, a 'hab' effect of 0.75, and scale of effect (sigma) of 75.
- y
–> Simulated counts at spatial locations
- hab
–> Scaled and centered weighted mean values from the 'hab' raster at each of the 'pts'
Retrieve diagnostics from multiScaleR objects
Description
Returns structured warning/diagnostic information stored on fitted
multiScaleR objects. Diagnostics are populated automatically during
multiScale_optim and flag potential issues with the
optimization result.
Usage
diagnostics(object, ...)
## S3 method for class 'multiScaleR'
diagnostics(object, ...)
Arguments
object |
An object to inspect. Must be of class |
... |
Additional arguments passed to methods. |
Details
All three diagnostics are evaluated automatically at the end of
multiScale_optim. Console warnings are printed when any
diagnostic is triggered. diagnostics() provides programmatic access
to the same information without re-running the model.
When max_distance is triggered, consider re-running
kernel_prep with a larger max_D value. The suggested
minimum is stored in suggested_max_D.
When sigma_precision is triggered, the variable may not have a
meaningful scale of effect, or the data may be insufficient to estimate it
precisely. Interpret affected covariates with caution.
Value
A named list with up to three elements:
max_distanceA list describing whether the estimated scale of effect approaches or exceeds
max_D. Fields include:triggered(logical),variables(names of affected covariates),effective_distance(estimated 90% kernel distance per covariate),max_D(the limit used),ratio(max_D / effective_distance), andsuggested_max_D(a recommended minimummax_Dvalue). Triggered whenmax_D / effective_distance < 2.sigma_precisionA list describing whether sigma estimates are imprecise. Fields include:
triggered(logical),variables(names of affected covariates),estimate(sigma means),se(standard errors), andse_to_mean(ratio of SE to mean). Triggered whenSE / Mean >= 0.5for any covariate.NULLwhen no precision concern was flagged.shape_precisionIdentical structure to
sigma_precisionbut for the shape parameter of the exponential power kernel.NULLunlesskernel = "expow"was used and a precision concern arose.
See Also
Examples
data('pts')
data('count_data')
hab <- terra::rast(system.file('extdata', 'hab.tif', package = 'multiScaleR'))
kernel_inputs <- kernel_prep(pts = pts,
raster_stack = hab,
max_D = 250,
kernel = 'gaussian')
mod <- glm(y ~ hab, family = poisson, data = count_data)
opt <- multiScale_optim(fitted_mod = mod, kernel_inputs = kernel_inputs)
## Access diagnostics
diag <- diagnostics(opt)
diag$max_distance$triggered
diag$sigma_precision
Estimate Memory Requirements for Multiscale Optimization
Description
Estimate the memory footprint of a kernel_prep object and
provide a conservative recommendation for parallel worker counts. The helper
inspects the stored kernel inputs, optionally builds the same optimization
context used by multiScale_optim, detects the number of
physical CPU cores, and estimates a maximum n_cores value while
leaving a user-defined number of cores free.
Usage
estimate_multiscale_ram(
kernel_inputs,
fitted_mod = NULL,
join_by = NULL,
refit_fn = NULL,
n_cores = NULL,
PSOCK = (.Platform$OS.type != "unix"),
safety_factor = 1.5,
ram_fraction = 0.75,
leave_free = 2L
)
Arguments
kernel_inputs |
A |
fitted_mod |
Optional fitted model object. When supplied, the helper
also measures the optimization context that would be serialized to
parallel workers during |
join_by |
Optional data frame passed through to
|
refit_fn |
Optional custom refit function passed through to
|
n_cores |
Optional positive integer. When supplied, the helper also reports the estimated peak memory for that specific worker count. When omitted, the reported peak estimate uses the recommended maximum worker count. |
PSOCK |
Logical. Should parallel runs be budgeted as PSOCK workers.
Default: |
safety_factor |
Numeric scalar |
ram_fraction |
Numeric scalar in |
leave_free |
Integer scalar |
Details
The estimate is advisory rather than exact. It is based on
utils::object.size() and therefore reflects
the size of stored R objects, not all transient allocations made by
exactextractr, model refits, or BLAS/LAPACK.
The recommendation is intentionally conservative. Even on Unix-like systems where forking can share memory copy-on-write, worker counts are budgeted as though each worker may require its own copy of the optimization payload.
If fitted_mod is omitted, the RAM recommendation is based on
kernel_inputs alone and may understate the true optimization payload.
Value
A list of class "multiScaleR_ram" with:
requested_n_coresWorker count used for the
peak_parallel_estimate.recommended_n_coresConservative recommended maximum worker count after applying both CPU and RAM limits.
max_cores_by_cpuMaximum worker count allowed by detected physical CPU cores after leaving
leave_freecores unused.max_cores_by_ramMaximum worker count allowed by the RAM budget, or
NAif total system RAM could not be detected.backendCharacter string identifying the assumed parallel backend budget:
"PSOCK"or"fork".systemNamed list describing detected physical cores, logical cores, total RAM, usable RAM, and the reservation settings.
point_summaryOne-row data frame summarizing the number of points, extracted cells per point, source raster layer count, whether cell-level data were retained (
cell_data_stored), and the number of distance bins (nbins).component_bytesData frame reporting estimated byte sizes for the major stored objects and worker bundles.
notesCharacter vector with interpretation notes and any detection warnings.
Examples
library(terra)
pts <- vect(cbind(c(3, 5, 7),
c(7, 5, 3)))
r <- rast(matrix(rnorm(100), nrow = 10))
names(r) <- "hab"
kernel_inputs <- kernel_prep(
pts = pts,
raster_stack = r,
max_D = 2,
kernel = "gaussian",
verbose = FALSE
)
estimate_multiscale_ram(kernel_inputs)
Extract the Namespace from a Model Call
Description
This function attempts to extract the package namespace from the call used to fit a model object, assuming the function was called using the 'pkg::fun()' syntax.
Usage
extract_namespace(x)
Arguments
x |
A fitted model object. |
Details
For internal use
Value
A character string with the namespace (package name), or 'NULL' if the namespace cannot be determined.
Fast Fourier Transformation
Description
Kernel smoothing with fft
Usage
fft_convolution(x, kernel, fun = "mean", na.rm = TRUE)
Arguments
x |
A matrix of the raster layer |
kernel |
A weight matrix for the smoothing |
fun |
Either "mean" or "sum" |
na.rm |
Logical; should NA values be removed from mean calculation? |
Value
A matrix
Example raster
Description
Example habitat raster for optimizing scales of effect
Format
A binary SpatRaster object
- hab
–> A binary raster
Examples
hab <- terra::rast(system.file("extdata",
"hab.tif", package = 'multiScaleR'))
Distance at Cumulative Kernel Proportion
Description
Compute the distance at which a given cumulative density is reached for several kernel types.
Usage
k_dist(
sigma,
prob = 0.95,
kernel = c("gaussian", "exp", "expow", "fixed"),
beta = NULL
)
Arguments
sigma |
Numeric. Scale parameter. For Gaussian and exponential, this is standard deviation or decay rate. For expow, this is the kernel bandwidth. |
prob |
Numeric. Desired cumulative proportion (e.g., 0.95). |
kernel |
Character. One of "gaussian", "exp", "expow", or "fixed". |
beta |
Numeric. Shape parameter for exponential power kernel. Ignored unless kernel = "expow". |
Value
Numeric. Distance at which the cumulative kernel density reaches the specified proportion.
Scale Distance
Description
Estimates the distance at which a specified cumulative proportion
of the kernel density function is reached. Can be used with a fitted
multiScaleR object to report optimized scale distances, or with
manually supplied kernel parameters to explore kernel behavior.
Usage
kernel_dist(model, prob = 0.9, ...)
Arguments
model |
A fitted |
prob |
Numeric between 0 and 1 (exclusive). Cumulative kernel density
threshold used to define the effective distance. Default: |
... |
Additional parameters used when
|
Details
The effective distance depends on both the kernel type and the scale parameter
sigma:
-
Gaussian: uses the inverse normal CDF, so the 90% distance is approximately 1.65 sigma.
-
Negative exponential: uses
-sigma * log(1 - prob). -
Fixed buffer: returns
sigma * prob(the fraction of the buffer radius). -
Exponential power: integrates the density numerically; both
sigmaandbeta(shape) must be specified.
Confidence intervals for the fitted model case are derived from the
Hessian-based standard errors (or profile-likelihood intervals when
profile_sigma has been run and stored on the object).
Value
When model is provided: a data frame with one row per optimized
covariate and three columns: Mean (distance at the estimated sigma),
Lower (distance at the lower 95% CI of sigma), and Upper
(distance at the upper 95% CI of sigma). Values are rounded to two decimal
places.
When kernel parameters are supplied directly via ...: a single numeric
value giving the distance at which the cumulative kernel density first reaches
prob.
See Also
Examples
## Using package data
data('pts')
data('count_data')
hab <- terra::rast(system.file('extdata',
'hab.tif', package = 'multiScaleR'))
kernel_inputs <- kernel_prep(pts = pts,
raster_stack = hab,
max_D = 250,
kernel = 'gaussian')
mod <- glm(y ~ hab,
family = poisson,
data = count_data)
## Optimize scale
opt <- multiScale_optim(fitted_mod = mod,
kernel_inputs = kernel_inputs)
## Uses of `kernel_dist`
kernel_dist(model = opt)
kernel_dist(model = opt, prob = 0.95)
kernel_dist(sigma = 500, kernel = 'gaussian', prob = 0.95)
kernel_dist(sigma = 100, prob = 0.975, kernel = "exp")
kernel_dist(sigma = 100, prob = 0.95, kernel = "expow", beta = 1.5)
kernel_dist(sigma = 100, kernel = "fixed")
Kernel Scale Preparation
Description
Prepares the data inputs required for multiscale kernel
optimization. Extracts raster values within a buffer around each point,
computes pairwise distances, and calculates initial kernel-weighted covariate
values. The result is passed directly to multiScale_optim.
Usage
kernel_prep(
pts,
raster_stack,
max_D,
kernel = c("gaussian", "exp", "expow", "fixed"),
scale_vars = NULL,
sigma = NULL,
shape = NULL,
projected = TRUE,
progress = FALSE,
verbose = TRUE,
bin = TRUE,
nbins = 256L,
store_cell_data = TRUE
)
Arguments
pts |
Spatial point locations as a |
raster_stack |
One or more raster layers of class |
max_D |
Positive numeric. The maximum radius (in the same units as the
projection of |
kernel |
Kernel function to be used for weighting raster values by distance. One of:
|
scale_vars |
Optional variable specifications created with
|
sigma |
Optional numeric vector of initial sigma values for
optimization. Length must equal the number of covariates with
|
shape |
Optional numeric vector of initial shape values when
|
projected |
Logical. Whether |
progress |
Logical. Print progress bars to the console during extraction
and distance calculations. Default: |
verbose |
Logical. Print status messages during preparation. Default:
|
bin |
Logical. If |
nbins |
Positive integer. Number of equal-width distance bins used when
|
store_cell_data |
Logical. If |
Details
Point locations and raster layers must share a defined CRS and both must be
projected. All units (including max_D and any user-supplied
sigma) must be in the same linear unit as the projection (typically
metres or feet).
The max_D buffer should be large enough to encompass the plausible
range of the true scale of effect. A common rule of thumb is to set
max_D to at least 2–3 times the largest expected sigma. After running
multiScale_optim, the max_distance diagnostic will warn
if the estimated scale approaches max_D.
Initial sigma values do not need to be precise. The optimizer will
refine them. Provide explicit starting values only if the default
(max_D / 2) leads to convergence problems.
Row names from pts are preserved throughout the returned object so
that downstream model data frames can be joined back to the original point
order.
Value
A list of class "multiScaleR_data" containing:
kernel_datData frame of scaled (mean-centered, unit-variance) initial kernel-weighted covariate values, one row per point and one column per covariate. Row names match
ptsrow names or sequential integers. Use this directly to fit the initial model passed tomultiScale_optim.d_listNamed list (one element per point) of numeric distance vectors from the point to every raster cell within the buffer.
raw_covNamed list (one element per point) of matrices containing raw raster cell values within the buffer, aligned with
d_list. Each point is stored as a dense or sparse matrix, whichever is smaller.NULLin lean mode.kernelCharacter string identifying the kernel used.
sigmaNumeric vector of initial sigma values on the internal (scaled) parameter space.
shapeNumeric vector of initial shape values, or
NULL.min_DNumeric. Approximate minimum inter-cell distance, used as the lower bound for sigma during optimization.
max_DNumeric. The
max_Dvalue supplied by the user.n_covsInteger. Number of covariates with scale being optimized.
unit_convNumeric. Internal distance scaling factor (equals
max_D).scale_varsData frame of class
"multiScaleR_vars"describing all covariate specifications.resolutionNumeric. Raster cell resolution.
n_colsInteger. Number of raster columns (used for adjacency landscape metrics).
scl_paramsNamed list with elements
meanandsdthe centering and scaling parameters applied tokernel_dat. Stored for use bykernel_scale.rasterwhenscale_center = TRUE.binnedWhen
bin = TRUE, a list of precomputed distance-binned summaries (representative bin distances and per-covariate value/count matrices) used to accelerate optimization.NULLwhenbin = FALSE.cell_data_storedLogical flag indicating whether the per-cell
d_list/raw_covdata were retained.FALSEonly in lean mode (store_cell_data = FALSE).
Examples
library(terra)
pts <- vect(cbind(c(3,5,7),
c(7,5,3)))
mat_list <- list(r1 = rast(matrix(rnorm(100),
nrow = 10)),
r2 = rast(matrix(rnorm(100),
nrow = 10)))
rast_stack <- rast(mat_list)
kernel_inputs <- kernel_prep(pts = pts,
raster_stack = rast_stack,
max_D = 2,
kernel = 'gaussian',
sigma = NULL)
Apply kernel smoothing to raster layers
Description
Applies a kernel smoothing function to one or more raster
layers, producing a spatially weighted mean (or landscape metric) at the
scale identified by multiScale_optim. Primarily used to
generate prediction rasters for spatial model projection.
Usage
kernel_scale.raster(
raster_stack,
sigma = NULL,
multiScaleR = NULL,
shape = NULL,
kernel = c("gaussian", "exp", "expow", "fixed"),
scale_vars = NULL,
pct_wt = 0.975,
fft = TRUE,
scale_center = FALSE,
clamp = FALSE,
pct_mx = 0,
na.rm = TRUE,
verbose = TRUE,
...
)
Arguments
raster_stack |
A |
sigma |
Numeric vector of kernel scale parameter(s) in the same units
as the raster projection (e.g., metres). One value per raster layer.
Ignored when |
multiScaleR |
A fitted object of class |
shape |
Numeric vector of shape parameters for the exponential power
kernel ( |
kernel |
Character. Kernel function used for smoothing. One of
|
scale_vars |
Optional variable specifications created with
|
pct_wt |
Numeric between 0 and 1 (exclusive). Cumulative kernel density
cutoff used to determine the focal window size for smoothing. A larger
value (e.g., |
fft |
Logical. If |
scale_center |
Logical. If |
clamp |
Logical. If |
pct_mx |
Numeric between -0.99 and 0.99. When |
na.rm |
Logical. If |
verbose |
Logical. Print layer-level progress messages. Default:
|
... |
Reserved for deprecated arguments. Currently only |
Details
Typical usage after running multiScale_optim:
opt_hab <- kernel_scale.raster(raster_stack = hab, multiScaleR = opt)
plot(c(hab, opt_hab))
## Scale and center for prediction
opt_hab_sc <- kernel_scale.raster(raster_stack = hab,
multiScaleR = opt,
scale_center = TRUE)
preds <- terra::predict(opt_hab_sc, opt$opt_mod, type = "response")
FFT vs. focal smoothing
The FFT convolution (fft = TRUE) is substantially faster for large
rasters or wide kernels. It produces minor edge artefacts near raster
boundaries, typically within one kernel-width of the edge. For analyses
focused on interior areas this is usually negligible. Use
fft = FALSE for exact focal smoothing when edge accuracy matters.
Dummy rasters for site-level covariates
When a fitted multiScaleR object is supplied, the function inspects
the model frame to find any predictors that are not represented by raster
layers (e.g., survey effort, habitat type measured in the field). These
cannot be projected spatially and are therefore assigned constant zero
rasters, which correspond to the reference value for centered and scaled
covariates. These dummy layers are required for terra::predict to
work but do not represent real spatial variation. Replace them manually for
non-zero projection scenarios. Categorical predictors are skipped with a
warning.
Value
A SpatRaster with one layer per covariate defined in
scale_vars (or per raster layer when scale_vars is not used).
Layer names match the covariate names from the model. When a fitted
multiScaleR object is provided and the model contains site-level
predictors (i.e., predictors not derived from raster layers), constant
("dummy") raster layers filled with zeros are appended to make the output
compatible with terra::predict.
Examples
## Not Run
r1 <- rast(matrix(rnorm(25^2),
nrow = 25))
r1_s <- kernel_scale.raster(r1,
sigma = 4,
kernel = 'gaussian')
plot(c(r1, r1_s))
Kernel scaling function
Description
Function for internal use with optim
Usage
kernel_scale_fn(
par,
d_list,
cov_df,
kernel,
fitted_mod,
join_by = NULL,
mod_return = NULL,
opt_context = NULL,
cov_w = NULL
)
Arguments
par |
list of parameters |
d_list |
List of distance vectors |
cov_df |
List of data frames with values extracted from rasters |
kernel |
Kernel used |
fitted_mod |
fitted model object |
join_by |
Data frame to join unmarked frame during optimization |
mod_return |
Default: NULL |
opt_context |
Cached optimization context created internally by 'multiScale_optim()' |
cov_w |
Optional precomputed (unscaled) kernel-weighted covariate matrix. When supplied, the per-covariate weighting step is skipped and this matrix is scaled and refit directly. Used by 'profile_sigma()' to reuse the columns of covariates that are held at their optimum. Default: NULL (compute internally). |
Details
For internal use
Value
Either estimated parameters or the fitted model using provided parameters
Simulated raster
Description
Raster data for use with vignette example
Format
'landscape_rast'
A spatRaster object with three surfaces:
- land1
–> A binary landscape surface with low autocorrelation
- land2
–> A continuous landscape surface with low autocorrelation
- land3
–> A continuous landscape surface with high autocorrelation
Examples
land_rast <- terra::rast(system.file("extdata",
"landscape.tif", package = 'multiScaleR'))
Example data frame
Description
Example count data to be used vignette document example
Usage
data(landscape_counts)
Format
A data frame with 100 rows and 2 columns. Data were simulated from a Poisson distribution with an intercept of 0.25; land1 effect = -0.5; site effect = 0.3; land2 effect = 0.7. True simulated Gaussian scale effects (sigma): land1 = 250; land2 = 500. For use with package vignette.
- counts
–> Simulated counts at spatial locations
- site
–> A habitat variable measured at the site
Define multiScaleR covariate transformations
Description
These helpers define named model covariates as transformations of one or
more source raster layers. They are optional; if omitted,
kernel_prep preserves the default behavior where each raster
layer becomes one optimized kernel-weighted covariate with the same name.
Use msr_vars() to collect one or more kernel_var(),
landscape_var(), or surface_var() specifications into a single
object that is passed to the scale_vars argument of
kernel_prep and kernel_scale.raster.
Usage
msr_vars(...)
kernel_var(source)
landscape_var(
source,
metric,
radius = NULL,
base = exp(1),
classes_max = NULL,
class = NULL
)
surface_var(source, metric, radius = NULL, weighted = FALSE)
Arguments
... |
Named |
source |
Character scalar. Name of the source raster layer in
|
metric |
Character scalar. The metric to compute within the circular
buffer. For |
radius |
Optional positive numeric. Fixed buffer radius in the same
units as the projection. When |
base |
Positive numeric (not equal to 1). Logarithm base used by every
entropy-based metric: the diversity metrics ( |
classes_max |
Optional positive numeric. Maximum number of possible
patch types in the landscape, used only for relative patch richness
( |
class |
Optional integer-like scalar. The focal class for the class-level
metrics ( |
weighted |
Logical, for |
Details
Kernel vs. landscape covariates
kernel_var(source) defines a kernel-weighted mean of the continuous
raster values within a circular neighborhood. The neighborhood radius (sigma)
is optimized by multiScale_optim.
landscape_var(source, metric) derives a landscape ecology metric from
a categorical (or thresholded) raster layer within a circular buffer. Raster
classes must be encoded as finite integer-like values. The buffer radius can
be fixed or optimized.
surface_var(source, metric) derives a continuous surface texture
metric from a continuous raster layer (elevation, NDVI, canopy height,
temperature, moisture, and similar surfaces) within a circular buffer. These
metrics summarize within-neighborhood heterogeneity rather than an average
value, so they complement kernel_var() (which summarizes the mean).
The buffer radius can be fixed or optimized. Note that these covariates
measure variability, not central tendency: changing the radius changes both
the ecological scale and the statistical quantity being summarized.
By default a surface metric uses a hard-edged neighborhood. Setting
weighted = TRUE instead weights each cell by the model's distance
kernel and optimizes the kernel scale (sigma), which estimates the scale of
effect of heterogeneity with a smooth likelihood (see the weighted
argument). This applies to the distribution metrics ("sa",
"sq", "ssk", "sku").
Supported landscape metrics
Composition metrics (require a categorical raster):
"shdi"Shannon diversity index.
"shei"Shannon evenness index.
"sidi"Simpson diversity index.
"siei"Simpson evenness index. Can be uninformative for optimized scales when local evenness is uniform across sample points.
"msidi"Modified Simpson diversity index.
"msiei"Modified Simpson evenness index. Can be uninformative for optimized scales when local evenness is uniform across sample points.
"pr"Patch richness (number of distinct classes).
"prd"Patch richness density (pr per 100 ha).
"rpr"Relative patch richness (pr / classes_max * 100).
"ta"Total area of the buffer (ha).
Edge metrics (require adjacency information; cell IDs are cached internally):
"ed"Edge density (m/ha).
"te"Total edge length (m).
"lsi"Landscape shape index.
Adjacency metrics (configuration; built from cell-pair counts):
"ai"Aggregation index (%). Area-weighted percentage of like adjacencies relative to the maximum possible for each class. Higher values mean classes are more spatially aggregated.
"pladj"Proportion of like adjacencies (%). Percentage of all cell-pair boundaries that fall between cells of the same class.
"contag"Contagion index (%). Ranges from near 0 (maximum interspersion of classes) to 100 (a single class). Requires at least 2 classes within the buffer.
"iji"Interspersion and juxtaposition index (%). The evenness of the adjacencies among the different classes, ignoring like adjacencies. Ranges from near 0 (one class pair dominates the between-class boundaries) to 100 (every class is equally adjacent to every other). Undefined, and returned as
NA, for fewer than 3 classes. For optimized scales, prescreen for usable variation because local windows with too few classes or between-class adjacencies can produce allNAvalues.
Information theory metrics (configuration and composition; built from the
class-pair co-occurrence matrix following Nowosad and Stepinski, 2019). The
base argument sets the logarithm base; set base = 2 for bits, as
in landscapemetrics:
"ent"Marginal entropy, H(x). The diversity of the cell values entering adjacencies; it increases with the number and evenness of classes.
"joinent"Joint entropy, H(x, y). The diversity of the class-pair adjacencies themselves; the total complexity of the pattern.
"condent"Conditional entropy, H(y | x). The configurational complexity that remains once composition is accounted for: the uncertainty about a neighbor's class given the focal cell's class.
"mutinf"Mutual information, I(y, x). How much knowing a cell's class tells you about its neighbor; it rises as like classes cluster together. This is the configuration signal separated from composition.
"relmutinf"Relative mutual information, I(y, x) / H(x). Mutual information rescaled by marginal entropy to the range 0 to 1, so that configuration can be compared across landscapes of differing composition. It is 1 when mutual information is 0 (the
landscapemetricsconvention) and is unaffected bybase.
Class-level metrics (describe a single focal class; require the class
argument):
"clumpy"Clumpiness index for the focal class. Ranges from -1 (maximally disaggregated) through 0 (randomly distributed) to 1 (maximally clumped), correcting the proportion of like adjacencies for the class's own abundance. Returned as
NAwhere the class fills or is absent from the buffer."pland"Percentage of landscape (%). The focal class's share of the valid buffer cells.
"ca"Class area (hectares) of the focal class within the buffer.
Adjacency and information-theory projections are class-count limited to avoid
accidental use of continuous rasters and to keep FFT projection times bounded.
The current ceilings are 200 classes for the like-adjacency metrics
("ai", "pladj", "clumpy") and 50 classes for the metrics
built from the full class-pair matrix ("contag", "iji", and the
information-theory family), whose cost scales with the square of the class
count.
Before optimizing the scale of "iji", "siei", or "msiei",
inspect a pilot kernel_prep() result or a few fixed radii. These
metrics can be undefined or nearly constant when buffers contain too few
classes, too few between-class adjacencies, or nearly identical evenness at
all sample points. In that case, use a larger radius, increase max_D,
or choose a more stable diversity, entropy, or edge metric.
Supported surface texture metrics
Surface metrics (require a continuous raster):
"sa"Average roughness: the mean absolute deviation of the neighborhood values from their mean.
"sq"Root mean square (RMS) roughness: the sample standard deviation of the neighborhood values. More sensitive to large deviations than
"sa"."ssk"Skewness: the asymmetry of the neighborhood value distribution. Positive when occasional high values dominate, negative when occasional low values do.
"sku"Kurtosis (excess): the peakedness and tail weight of the neighborhood value distribution relative to a normal distribution (which has excess kurtosis 0).
"sdq"Root mean square slope: the RMS of the local gradient magnitude, a measure of how steeply the surface changes (terrain or structural ruggedness).
"sdr"Surface area ratio: the percentage by which the true 3D surface area exceeds the flat projected area. Behaves as a continuous edge-density analog (more relief gives a larger ratio).
The dispersion and shape metrics ("sa", "sq", "ssk",
"sku") match the geodiv package (sq uses an N - 1
denominator as stats::sd() does; ssk uses the bias-adjusted
skewness and sku the excess kurtosis, matching geodiv's
defaults). "sdq" is reported as a true slope (the gradient is divided
by the cell resolution); geodiv::sdq() instead uses a cell spacing of
one, so the two agree after multiplying by the resolution. "sdr" is
computed in physical units (real elevation values and the real cell size),
so a tilted plane of slope s returns (sqrt(1 + s^2) - 1) * 100;
geodiv::sdr() instead rescales the surface to a unit range and so is
dimensionless. The physical definition is the interpretable one for terrain
and, unlike the rescaled version, can be projected by FFT.
Raster projection of "sq", "ssk", "sku", "sdq",
and "sdr" uses fast Fourier transform (FFT) convolution; projection of
"sa" uses an exact compiled masked focal pass, because average
roughness has no closed-form FFT decomposition. "sdq" and "sdr"
use each cell's neighbors, so (like the landscape edge metrics) they cache
raster cell IDs internally, and their projection agrees with point extraction
to within a small boundary tolerance rather than bit-for-bit.
Mixing covariate types
Multiple covariate types can be combined in one msr_vars() call. For
example, a kernel-weighted mean of forest cover and a fixed-radius edge
density metric can both be defined and passed together to
kernel_prep:
vars <- msr_vars(
forest_mean = kernel_var("forest"),
forest_ed500 = landscape_var("forest", metric = "ed", radius = 500)
)
kernel_inputs <- kernel_prep(pts, rasters, max_D = 1000,
scale_vars = vars)
Covariates with a fixed radius are computed once and not re-evaluated
during optimization, which can meaningfully reduce computation time.
Value
msr_vars()A data frame of class
"multiScaleR_vars"with one row per covariate. Columns arecovariate(the name assigned in...),source(source raster layer name),type("kernel","landscape", or"surface"),metric(landscape or surface metric, orNA),radius(fixed radius orNAwhen optimized),optimize(logical indicating whether the scale is estimated),weighted(logical, the kernel-weighted form of a surface metric),base(logarithm base for entropy metrics),classes_max(richness denominator for"rpr", orNA), andclass(focal class for the class-level metrics, orNA).kernel_var()An internal list of class
"multiScaleR_var"representing a kernel-weighted mean covariate. Pass one or more of these insidemsr_vars().landscape_var()An internal list of class
"multiScaleR_var"representing a landscape metric covariate. Pass one or more of these insidemsr_vars().surface_var()An internal list of class
"multiScaleR_var"representing a continuous surface texture metric covariate. Pass one or more of these insidemsr_vars().
References
Nowosad, J., & Stepinski, T. F. (2019). Information theory as a consistent framework for quantification and classification of landscape patterns. Landscape Ecology, 34(9), 2091-2101. doi:10.1007/s10980-019-00830-x
See Also
kernel_prep, multiScale_optim,
kernel_scale.raster. For a worked walkthrough of the landscape
metrics see vignette("landscape_metric_covariates", package = "multiScaleR"),
and for the continuous-surface metrics see
vignette("surface_metric_covariates", package = "multiScaleR").
Examples
## Kernel-weighted mean only (equivalent to default behavior)
vars <- msr_vars(
forest_prop = kernel_var("forest")
)
## Combining kernel, landscape, and surface covariates
vars <- msr_vars(
forest_prop = kernel_var("forest"),
forest_ed = landscape_var("forest", metric = "ed"),
cover_shdi_500 = landscape_var("landcover", metric = "shdi", radius = 500),
elev_sd = surface_var("elevation", metric = "sq")
)
## landscape_var with natural-log Shannon diversity (the default)
landscape_var("landcover", metric = "shdi")
## landscape_var with log2 Shannon diversity and a fixed 250 m radius
landscape_var("landcover", metric = "shdi", radius = 250, base = 2)
## Configuration: interspersion (landscape-level), no class argument
landscape_var("landcover", metric = "iji", radius = 500)
## Information theory in bits (set base = 2 to match landscapemetrics)
landscape_var("landcover", metric = "mutinf", radius = 500, base = 2)
## Class-level metrics require `class`: clumpiness and cover of class 3
landscape_var("landcover", metric = "clumpy", radius = 500, class = 3)
landscape_var("landcover", metric = "pland", radius = 500, class = 3)
## surface_var: optimized RMS roughness, and fixed-radius average roughness
surface_var("elevation", metric = "sq")
surface_var("ndvi", metric = "sa", radius = 300)
## Kernel-weighted roughness: optimize the scale of effect of heterogeneity
surface_var("elevation", metric = "sq", weighted = TRUE)
## Define a single optimized kernel-weighted mean covariate
kv <- kernel_var("forest")
Multiscale optimization
Description
Identifies the kernel scale of effect (sigma) for each raster
covariate by maximizing the log-likelihood of a fitted statistical model.
Repeatedly replaces kernel-weighted covariate values at different scales and
refits the model, using optim (or optimParallel for parallel
execution) with the L-BFGS-B algorithm.
Usage
multiScale_optim(
fitted_mod,
kernel_inputs,
join_by = NULL,
par = NULL,
start_strategy = c("single", "screen"),
screen_n_sigma = 5,
screen_n_jitter = 6,
screen_maxit = 8,
screen_jitter_sd = 0.5,
n_cores = NULL,
PSOCK = FALSE,
verbose = TRUE,
refit_fn = NULL
)
Arguments
fitted_mod |
A fitted model object whose covariates include the
kernel-weighted variables defined in |
kernel_inputs |
A list of class |
join_by |
Default: |
par |
Optional numeric vector of starting values for the optimizer.
Values must be divided by |
start_strategy |
Character. How to choose starting values when
|
screen_n_sigma |
Positive integer. Number of log-spaced sigma values
evaluated per covariate during the prescreen when
|
screen_n_jitter |
Non-negative integer. Number of jittered candidate
starts evaluated with short screening optimizations after the marginal
prescreen. Default: |
screen_maxit |
Positive integer. Maximum iterations used for each short
screening optimization. Default: |
screen_jitter_sd |
Non-negative numeric scalar. Standard deviation of
multiplicative sigma jitter on the log scale during screening. Default:
|
n_cores |
Positive integer. Number of cores for parallel optimization
via |
PSOCK |
Logical. On Windows, a PSOCK cluster is always used. On Unix,
a FORK cluster is used by default (faster). Set |
verbose |
Logical. Print optimization status and warnings to the
console. Default: |
refit_fn |
Optional function for refitting |
Details
Optimization approach
The optimizer uses the L-BFGS-B algorithm (bounded quasi-Newton) to
maximize the log-likelihood over sigma (and shape for "expow") while
holding all regression coefficients at their fitted values for each candidate
scale. Standard errors are Hessian-based approximations from the outer
optimization. Summary methods report profile-likelihood confidence intervals
for sigma when profile_sigma has been run on the object;
otherwise they fall back to Hessian-based intervals.
Binned acceleration
When kernel_inputs was created by kernel_prep with
bin = TRUE (the default), kernel-weighted covariates are evaluated
from precomputed distance-binned summaries rather than by iterating every
buffer cell on each optimizer evaluation. This makes the per-evaluation cost
independent of max_D and typically speeds up optimization by an order
of magnitude or more for large buffers, with a negligible binning
approximation. Inputs created with store_cell_data = FALSE ("lean"
mode) carry only the binned summaries; these still optimize, profile, and
refit normally. Landscape-metric covariates always use the exact per-cell
path and therefore require cell-level data.
Parallel optimization
To ensure that fitted model function calls are properly serialized for
parallel workers, fit models using fully namespace-qualified functions. For
example, fit a negative binomial model as
fitted_mod <- MASS::glm.nb(y ~ x, data = df) rather than loading the
namespace implicitly.
Joining unmarked multi-year data
When using unmarked models where sites are surveyed across multiple
years but the spatial covariates are constant across years, provide a
join_by data frame to match each site's kernel-weighted covariate
value to its observations. The column name in join_by must match a
column in the data used to fit the unmarked model.
Custom refit functions
By default, multiScale_optim() refits the model using
stats::update() (standard models) or the unmarked equivalent.
When the default path is insufficient, supply refit_fn. The function
must accept arguments model, data, and context, and
return a fitted model whose log-likelihood can be extracted by
stats::logLik() or insight::get_loglikelihood().
A minimal custom refit function:
refit_fn <- function(model, data, context) {
stats::update(model, data = data)
}
For models requiring explicit call reconstruction:
refit_fn <- function(model, data, context) {
call <- model$call
call$data <- quote(data)
eval(call, envir = list(data = data), enclos = parent.frame())
}
With PSOCK clusters, refit_fn must serialize cleanly. Prefer
namespace-qualified calls (e.g., stats::update()) and avoid
closures over large local objects.
Wrapper model objects
Some packages return wrapper objects around another fitted model (e.g.,
amt::fit_clogit() wraps a survival::clogit() fit in its
model component). When possible, multiScale_optim() unwraps
these automatically. For amt::fit_clogit(), pass model = TRUE
when fitting so the nested clogit model retains its model frame.
When 'start_strategy = "screen"' and 'par' is left as 'NULL', 'multiScale_optim()' first scouts the sigma space with one-dimensional log-spaced scans, then optionally tests a few jittered starts using short, Hessian-free optimizations. The marginal sigma scans are serial, but the short screening attempts use 'n_cores' when parallel optimization is requested. These screening steps are used only to choose one starting vector for the single full optimization. On Windows, those parallel screening attempts use PSOCK workers, so the same PSOCK serialization guidance applies as in the full optimization. For reproducible screened starts, call 'set.seed()' before 'multiScale_optim()'.
Value
A list of class "multiScaleR" containing:
scale_estData frame with one row per optimized covariate and two columns:
Mean(optimized sigma on the original projection scale) andSE(Hessian-based standard error). Row names are covariate names.shape_estData frame of the same structure as
scale_estfor the shape parameter, orNULLwhenkernel != "expow".optim_resultsThe raw list returned by
stats::optimoroptimParallel::optimParallel, includingpar,value(negative log-likelihood),hessian, and convergence codes. Also containspar_unscale(sigma values back on the projection scale) andhessian_unscale.opt_modThe refitted model at the optimized sigma values. This is the model object to use for inference, prediction, and
plot_marginal_effects.fitted_mod_originalThe original
fitted_modpassed by the user, stored for reference.min_DNumeric. Lower bound for sigma during optimization.
max_DNumeric. Upper bound for sigma during optimization.
kernel_inputsThe
kernel_inputslist (minusmin_Dandmax_D, which are stored separately).scl_paramsNamed list with
meanandsdvectors for each covariate: the centering and scaling parameters from the optimized kernel data. Used bykernel_scale.rasterwhenscale_center = TRUE.join_byThe
join_bydata frame, orNULL.opt_contextInternal optimization context object storing the model-class-specific refit logic. Retained for use by
profile_sigma.profile_scale_estProfile-likelihood CI data frame, or
NULLuntilprofile_sigmahas been run.diagnosticsList of diagnostic objects; see
diagnostics.next_runList of recommended values for a follow-up fit. Includes
max_Dfor the nextkernel_prepcall, optimizedstart_sigmavalues in map units, optionalstart_shapevalues forkernel = "expow", andstart_paron the internal scaled parameter space expected bymultiScale_optim.n_coresisNULL; a conservative parallel worker-count suggestion is available on demand fromestimate_multiscale_ram(it is no longer computed automatically, as that query can be slow and shells out to an external process).warn_messageInteger vector of triggered warning codes: 1 = max-distance, 2 = sigma precision, 3 = shape precision.
callThe matched call.
See Also
Examples
set.seed(555)
points <- vect(cbind(c(5,7,9,11,13),
c(13,11,9,7,5)))
mat_list <- list(r1 = rast(matrix(rnorm(20^2),
nrow = 20)),
r2 = rast(matrix(rnorm(20^2),
nrow = 20)))
rast_stack <- rast(mat_list)
kernel_inputs <- kernel_prep(pts = points,
raster_stack = rast_stack,
max_D = 5,
kernel = 'gaussian',
sigma = NULL)
## Example response data
y <- rnorm(5)
## Create data frame with raster variables
dat <- data.frame(y = y,
kernel_inputs$kernel_dat)
mod1 <- glm(y ~ r1 + r2,
data = dat)
## NOTE: This code is only for demonstration
## Optimization results will have no meaning
opt_mod <- multiScale_optim(fitted_mod = mod1,
kernel_inputs = kernel_inputs,
par = NULL,
n_cores = NULL)
## Using package data
data('pts')
data('count_data')
hab <- terra::rast(system.file('extdata',
'hab.tif', package = 'multiScaleR'))
kernel_inputs <- kernel_prep(pts = pts,
raster_stack = hab,
max_D = 250,
kernel = 'gaussian')
mod <- glm(y ~ hab,
family = poisson,
data = count_data)
## Optimize scale
opt <- multiScale_optim(fitted_mod = mod,
kernel_inputs = kernel_inputs)
## Summary of fitted model
summary(opt)
## 'True' parameter values data were simulated from:
# hab scale = 75
# Intercept = 0.5,
# hab slope estimate = 0.75
## Plot and visualize kernel density
plot(opt)
## Apply optimized kernel to the environmental raster
opt_hab <- kernel_scale.raster(hab, multiScaleR = opt)
plot(c(hab, opt_hab))
## Project model; scale & center
opt_hab.s_c <- kernel_scale.raster(raster_stack = hab,
multiScaleR = opt,
scale_center = TRUE)
mod_pred <- predict(opt_hab.s_c, opt$opt_mod, type = 'response')
plot(mod_pred)
## Custom refit hook for model classes that need explicit control.
## This example still uses glm(), but the same pattern can be used for
## classes whose default update path is not sufficient.
refit_glm <- function(model, data, context) {
stats::update(model, data = data)
}
opt_custom <- multiScale_optim(fitted_mod = mod,
kernel_inputs = kernel_inputs,
refit_fn = refit_glm)
Plot method for multiScaleR objects
Description
Plots the optimized kernel weight distribution for each spatial covariate in
a fitted multiScaleR object, with optional annotations showing the
effective scale of effect and its 95% confidence interval.
Usage
## S3 method for class 'multiScaleR'
plot(x, ...)
Arguments
x |
A fitted |
... |
Optional named arguments to customize the plot:
|
Value
A list of ggplot2 objects (one per covariate with a
non-NaN standard error), returned invisibly. Plots are printed as
a side effect.
See Also
Examples
## Using package data
data('pts')
data('count_data')
hab <- terra::rast(system.file('extdata',
'hab.tif', package = 'multiScaleR'))
kernel_inputs <- kernel_prep(pts = pts,
raster_stack = hab,
max_D = 250,
kernel = 'gaussian')
mod <- glm(y ~ hab,
family = poisson,
data = count_data)
## Optimize scale
opt <- multiScale_optim(fitted_mod = mod,
kernel_inputs = kernel_inputs)
plot(opt)
plot(opt, prob = 0.95)
plot(opt, scale_dist = FALSE)
plot(opt, scale_dist = TRUE, add_label = FALSE)
Plot Sigma Profile
Description
Plots the profiled log-likelihood or AICc across sigma values for each spatial covariate. The optimized sigma is marked with a vertical red line.
Usage
## S3 method for class 'sigma_profile'
plot(x, ...)
Arguments
x |
An object of class |
... |
Additional arguments (not currently used). |
Value
A list of ggplot objects (one per covariate), returned
invisibly. Plots are printed as a side effect.
See Also
Plot kernel densities
Description
Visualizes the weight (density) of a kernel function as a
function of distance, without requiring a fitted multiScaleR object.
Useful for exploring how different kernel types and sigma values translate
into spatial influence zones before running multiScale_optim.
To plot the kernel from a fitted model, use plot(opt) instead.
Usage
plot_kernel(
prob = 0.9,
sigma,
beta = NULL,
kernel,
scale_dist = TRUE,
add_label = TRUE,
...
)
Arguments
prob |
Numeric between 0 and 1 (exclusive). Cumulative density threshold
used to mark the effective distance on the plot. Default: |
sigma |
Positive numeric. The scale parameter of the kernel, in the
same units as the projection used with |
beta |
Positive numeric. Shape parameter for the exponential power
kernel ( |
kernel |
Character. The kernel function to visualize. One of:
|
scale_dist |
Logical. If |
add_label |
Logical. If |
... |
Not used. |
Details
The x-axis range is set to cover 99.9% of the cumulative density so that
the tails of the distribution are visible. The prob marker is rounded
to the nearest 10 distance units for display purposes.
For fitted-model kernel plots (with confidence intervals), use
plot(multiScaleR_object) instead.
Value
A ggplot2 object showing kernel weight (y-axis) as a function
of distance (x-axis). The object is printed as a side effect and returned
invisibly.
Examples
#' ## General use of plot method
plot_kernel(prob = 0.95,
sigma = 100,
kernel = 'gaussian')
plot_kernel(prob = 0.95,
sigma = 100,
kernel = 'exp')
plot_kernel(prob = 0.95,
sigma = 100,
kernel = 'fixed')
plot_kernel(prob = 0.95,
sigma = 100,
beta = 2.5,
kernel = 'expow')
Plot Marginal Effects from a Fitted Model
Description
Generates marginal effect plots with 95% confidence intervals for each
covariate in the optimized model stored within a multiScaleR object.
Each panel sweeps one covariate across its observed range while holding all
others at their sample mean.
Usage
plot_marginal_effects(
x,
ylab = "Estimated response",
length.out = 100,
type = "state",
link = FALSE
)
Arguments
x |
A fitted |
ylab |
Character scalar. Y-axis label for all marginal effect panels.
Default: |
length.out |
Positive integer. Number of equally spaced points at which
to evaluate the marginal effect curve for each covariate. Default: |
type |
Character. Prediction type for |
link |
Logical. If |
Details
Marginal effects are computed by constructing a newdata data frame
where the focal covariate sweeps from its minimum to maximum observed value
and all other predictors are held at their sample mean. For kernel-scaled
covariates, whose scaled mean is 0, this is equivalent to holding them at
their average landscape value.
For glm and most GLM-family models, predictions are transformed via
the inverse link function automatically (e.g., exp() for Poisson
log-link models). Confidence intervals are constructed from
predict(..., se.fit = TRUE) using \pm 1.96 \times SE.
For unmarked models, predict(mod, newdata, type = type) is
used, and lower/upper bounds come from the lower and upper
columns of the returned data frame.
For models with interaction terms, a message is printed explaining that each panel represents a conditional slice (at the mean of the interacting variable) rather than the full interaction surface.
HLfit (spaMM) and zeroinfl (pscl) models are handled with
class-specific prediction calls.
When x$opt_mod is a wrapper object around another fitted model,
plot_marginal_effects() uses the nested analysis model for predictor
discovery, data recovery, and prediction when possible.
Value
A named list of ggplot2 objects, one per covariate. Plots are
printed as a side effect and the list is returned invisibly. Each plot
shows:
The predicted response (solid line) across the observed covariate range, with x-axis values back-transformed to the original (unscaled) units.
A shaded ribbon for the 95% confidence interval (when available from the model's
predictmethod).A subtitle noting any polynomial (
I(x^2)) or interaction terms that involve the covariate.
Examples
data('pts')
data('count_data')
hab <- terra::rast(system.file('extdata', 'hab.tif', package = 'multiScaleR'))
kernel_inputs <- kernel_prep(pts = pts,
raster_stack = hab,
max_D = 250,
kernel = 'gaussian')
mod <- glm(y ~ hab, family = poisson, data = count_data)
opt <- multiScale_optim(fitted_mod = mod, kernel_inputs = kernel_inputs)
## Default marginal effect plot on the response scale
plot_marginal_effects(opt)
## Custom y-axis label
plot_marginal_effects(opt, ylab = "Predicted abundance")
## Finer evaluation grid
plot_marginal_effects(opt, length.out = 200)
Print method for multiScaleR
Description
Print method for objects of class multiScaleR.
Usage
## S3 method for class 'multiScaleR'
print(x, ...)
Arguments
x |
A |
... |
Ignored |
Value
Invisibly returns the input multiScaleR object
Print method for multiScaleR_data
Description
Print method for objects of class multiScaleR_data.
Usage
## S3 method for class 'multiScaleR_data'
print(x, ...)
Arguments
x |
A |
... |
Ignored |
Value
Invisibly returns the input multiScaleR_data object
Print method for summary_multiScaleR
Description
Print method for objects of class summary_multiScaleR.
Usage
## S3 method for class 'summary_multiScaleR'
print(x, ...)
Arguments
x |
A |
... |
Ignored |
Value
Invisibly returns the input summary_multiScaleR object
Profile Model Fit Across Sigma Parameter Space
Description
Evaluates the model log-likelihood and AICc at a series of sigma values spanning the optimization range for each spatial covariate. This provides a diagnostic view of the likelihood surface and helps assess whether the optimized sigma is at a clear minimum or on a flat plateau.
Usage
profile_sigma(
x,
n_pts = 10,
metric = c("AICc", "LL"),
verbose = TRUE,
n_cores = NULL,
spacing = c("log", "linear"),
sigma_values = NULL,
sigma_range = NULL
)
Arguments
x |
A fitted |
n_pts |
Positive integer (>= 3). Number of sigma values to evaluate
for each covariate along the profile grid. More points give a smoother
profile at the cost of additional model refits. Default: |
metric |
Character. Metric to display on the y-axis of the profile.
One of |
verbose |
Logical. Print per-covariate progress messages. Default:
|
n_cores |
Optional positive integer. Number of CPU cores to use when
profiling covariates in parallel. Parallel profiling is applied across
covariates with a PSOCK cluster. Default: |
spacing |
Character. Spacing of the automatically generated sigma grid.
|
sigma_values |
Optional positive numeric vector of sigma values (in the
original projection units) to evaluate directly. When supplied,
|
sigma_range |
Optional positive numeric vector of length 2 giving the
minimum and maximum sigma values (in projection units) for the generated
profile grid. Defaults to the optimization range stored in |
Details
For each spatial covariate, sigma is varied across a sequence of candidate
values, while all other sigma values are held at their optimized values. By
default this is a log-spaced sequence from the minimum to maximum distance
considered during optimization. Users can instead request linear spacing with
spacing = "linear" or supply exact values with sigma_values.
At each evaluation point the model is refit and the log-likelihood extracted.
AICc is computed from the log-likelihood, the number of regression parameters
(including sigma), and the number of observations.
Log-spacing concentrates evaluation points at smaller sigma values where the likelihood surface often changes more rapidly, and spaces them out at larger sigma values where the surface tends to be flatter.
Linear spacing can be useful when the profile needs equal resolution across a
specific sigma interval. User-supplied sigma_values are sorted and
duplicate values are removed before profiling to avoid redundant refits.
Value
A list of class sigma_profile containing:
- profiles
A data frame with columns
variable,sigma,LL, andAICc.- opt_sigma
A named numeric vector of the optimized sigma for each covariate.
- metric
The metric used for profiling.
- spacing
The profile grid type:
"log","linear", or"custom".- sigma_grid
The sigma values evaluated for each covariate.
See Also
plot.sigma_profile, multiScale_optim
Examples
data('pts')
data('count_data')
hab <- terra::rast(system.file('extdata',
'hab.tif', package = 'multiScaleR'))
kernel_inputs <- kernel_prep(pts = pts,
raster_stack = hab,
max_D = 250,
kernel = 'gaussian')
mod <- glm(y ~ hab,
family = poisson,
data = count_data)
opt <- multiScale_optim(fitted_mod = mod,
kernel_inputs = kernel_inputs)
## Profile sigma
prof <- profile_sigma(opt)
plot(prof)
## More evaluation points
prof <- profile_sigma(opt, n_pts = 20)
plot(prof)
## Linearly spaced values between explicit limits
prof <- profile_sigma(opt, n_pts = 8, spacing = "linear",
sigma_range = c(25, 250))
plot(prof)
## User-supplied sigma values
prof <- profile_sigma(opt, sigma_values = c(25, 50, 100, 150, 250))
plot(prof)
## Profile log-likelihood instead of AICc
prof <- profile_sigma(opt, metric = "LL")
plot(prof)
Spatial sample points
Description
Example point file for optimizing scales of effect
Usage
data(pts)
Format
An sf class point object:
- pts
–> spatial location of points
Scale and Center Raster Layers Using Model Parameters
Description
Standardizes raster covariates in a 'terra::SpatRaster' using mean and standard deviation values extracted from a fitted model 'multiScaleR' model.
Usage
scale_center_raster(r, multiScaleR, clamp = FALSE, pct_mx = 0)
Arguments
r |
A 'terra::SpatRaster' containing covariate layers to be scaled. All layer names must match those used in the 'multiScaleR' model. |
multiScaleR |
A 'multiScaleR' object created using 'kernel_prep' or 'multiScale_optim'. |
clamp |
Logical. If 'TRUE', scaled values are clamped to the covariate range in the model data. |
pct_mx |
Numeric. If 'clamp' is 'TRUE', this value specifies the amount (percentage; positive or negative) by which to expand/contract the min/max range when clamping. Default = 0. |
Value
A 'terra::SpatRaster' with each layer scaled and optionally clamped.
Scale Function
Description
Scaling function to be applied to rasters
Usage
scale_type(
d,
kernel = c("gaussian", "exp", "expow", "fixed"),
sigma,
shape = NULL,
r_stack.df = NULL,
output = NULL
)
Arguments
d |
Vector of distances |
kernel |
Kernel function to be used ('gaussian', 'exp', 'fixed', 'expow'; Default: 'gaussian') |
sigma |
Scaling parameter |
shape |
Shape parameter if using exponential power kernel |
r_stack.df |
Dataframe values extracted from rasters |
output |
If NULL, a vector of weights is returned, otherwise a weighted raster values are returned, Default: NULL |
Details
DETAILS
Value
A vector of weights or vector of weighted raster values
Examples
### TO BE COMPLETED ###
Scale Function
Description
Scaling function to be applied to rasters
Usage
scale_type_r(
d,
kernel = c("gaussian", "exp", "expow", "fixed"),
sigma,
shape = NULL,
r_stack.df = NULL,
output = NULL
)
Arguments
d |
Vector of distances |
kernel |
Kernel function to be used ('gaussian', 'exp', 'fixed', 'expow'; Default: 'gaussian') |
sigma |
Scaling parameter |
shape |
Shape parameter if using exponential power kernel |
r_stack.df |
Dataframe values extracted from rasters |
output |
If NULL, a vector of weights is returned, otherwise a weighted raster values are returned, Default: NULL |
Details
DETAILS
Value
A vector of weights or vector of weighted raster values
Examples
### TO BE COMPLETED ###
Simulate data for optimizing scales of effect
Description
Generates simulated response data with known kernel-weighted
landscape covariates at a controlled scale of effect. Useful for testing and
demonstrating multiScale_optim with data where the true
parameters are known.
Usage
sim_dat(
alpha = 1,
beta = NULL,
kernel = c("gaussian", "exp", "expow", "fixed"),
type = c("count", "count_nb", "occ", "gaussian"),
StDev = 0.5,
n_points = 50,
min_D = NULL,
raster_stack = NULL,
sigma = NULL,
shape = NULL,
max_D = NULL,
user_seed = NULL,
...
)
Arguments
alpha |
Numeric scalar. Intercept term for the linear predictor.
Default: |
beta |
Numeric vector of slope coefficients, one per raster layer in
|
kernel |
Character. Kernel function used to weight raster values by
distance when generating the true covariate values. One of
|
type |
Character. Distribution of the simulated response variable. One of:
|
StDev |
Positive numeric. Dispersion parameter for |
n_points |
Positive integer or a |
min_D |
Positive numeric. Minimum inter-point spacing on the hexagonal
grid. If |
raster_stack |
A |
sigma |
Positive numeric vector. True kernel scale parameters, one per
raster layer. These are the values that |
shape |
Positive numeric vector. Shape parameters for the exponential
power kernel ( |
max_D |
Positive numeric. Maximum buffer radius for
|
user_seed |
Optional integer seed for reproducibility. Default:
|
... |
Additional arguments. Not currently used. |
Details
Points are distributed on a hexagonal grid across the interior of the raster
extent (85% of the range in each direction to avoid edge effects), then
randomly subsampled to n_points. The min_D spacing controls
the grid resolution; if the grid produces fewer points than n_points,
min_D is reduced iteratively by 3% until enough points are generated.
Kernel-weighted covariate values are computed using
kernel_prep at the specified sigma (and shape)
values. These represent the "true" covariate values that the optimization
should recover.
Value
A named list with three elements:
obsNumeric vector of length
n_pointscontaining the simulated response values.dfData frame with
n_pointsrows and one column per raster layer plus aycolumn. The raster columns contain the true kernel-weighted mean values, scaled to zero mean and unit variance. This data frame can be used to fit a model formultiScale_optim.ptsAn
sfPOINT object withn_pointsrows. Anobscolumn is appended containing the simulated response values. Pass this tokernel_prepas theptsargument.
Examples
rs <- sim_rast()
rs <- terra::subset(rs, c(1,3))
s_dat <- sim_dat(alpha = 0.5,
beta = c(0.75,-0.75),
kernel = 'gaussian',
sigma = c(75, 150),
type = 'count',
raster_stack = rs,
max_D = 400)
plot(s_dat$df$y ~ s_dat$df$bin1)
plot(s_dat$df$y ~ s_dat$df$cont1)
Simulate data for optimizing scales of effect with 'unmarked'
Description
Generates simulated replicated detection/non-detection or count
data formatted for use with the unmarked package, with known
kernel-weighted landscape covariates at a controlled scale of effect.
Usage
sim_dat_unmarked(
alpha = 1,
beta = NULL,
kernel = c("gaussian", "exp", "expow", "fixed"),
type = c("count", "count_nb", "occ"),
StDev = 0.5,
n_points = 50,
n_surv = 3,
det = 0.5,
min_D = NULL,
raster_stack = NULL,
sigma = NULL,
shape = NULL,
max_D = NULL,
user_seed = NULL,
...
)
Arguments
alpha |
Numeric scalar. Intercept for the abundance/occupancy linear
predictor. Default: |
beta |
Numeric vector of slope coefficients, one per raster layer in
|
kernel |
Character. Kernel function used to weight raster values by
distance. One of |
type |
Character. Response type to simulate. One of:
|
StDev |
Positive numeric. Dispersion (size) parameter for
|
n_points |
Positive integer. Number of spatial sample sites. Points are
placed on a hexagonal grid and randomly subsampled. Default: |
n_surv |
Positive integer. Number of repeated surveys per site, forming
the columns of the returned observation matrix. Default: |
det |
Numeric between 0 and 1. Per-survey detection probability applied
via binomial thinning of the true abundance/occupancy. Default: |
min_D |
Positive numeric. Minimum inter-point spacing on the hexagonal
grid. If |
raster_stack |
A |
sigma |
Positive numeric vector. True kernel scale parameters, one per raster layer. Must be in the same units as the raster projection. |
shape |
Positive numeric vector. Shape parameters for
|
max_D |
Positive numeric. Maximum buffer radius for
|
user_seed |
Optional integer seed for reproducibility. Default:
|
... |
Additional arguments. Not currently used. |
Details
Sites are distributed on a hexagonal grid across the interior of the raster
extent (85% of the x/y range) and then randomly subsampled. The true
abundance or occupancy at each site is a function of the kernel-weighted
landscape covariates at the specified scale. Repeated surveys introduce
imperfect detection controlled by det.
Use the returned y matrix and df (as siteCovs) to
construct an unmarkedFrame, fit a model, and then pass both the
fitted model and fresh kernel_prep output to
multiScale_optim.
Value
A named list with three elements:
yInteger matrix of dimensions
n_points x n_survcontaining the simulated replicated observations. Pass as theyargument when constructing anunmarkedFrame.dfData frame with
n_pointsrows. Theycolumn contains the simulated true abundance/occupancy (before detection thinning). Remaining columns are the true kernel-weighted covariate values, scaled to zero mean and unit variance.ptsAn
sfPOINT object withn_pointsrows. Anobscolumn is appended containing the true abundance/occupancy values. Pass tokernel_prepaspts.
Examples
rs <- sim_rast(dim = 50, user_seed = 123)
rs <- terra::subset(rs, c(1, 3))
s_dat <- sim_dat_unmarked(alpha = 1,
beta = c(0.75, -0.75),
kernel = 'gaussian',
sigma = c(40, 80),
n_points = 20,
n_surv = 3,
det = 0.5,
type = 'count',
raster_stack = rs,
max_D = 220,
user_seed = 123)
plot(s_dat$df$y ~ s_dat$df$bin1)
plot(s_dat$df$y ~ s_dat$df$cont1)
## unmarked analysis
kernel_inputs <- kernel_prep(pts = s_dat$pts,
raster_stack = rs,
max_D = 220,
kernel = 'gaus',
verbose = FALSE)
umf <- unmarked::unmarkedFramePCount(y = s_dat$y,
siteCovs = kernel_inputs$kernel_dat)
## Base unmarked model
mod0 <- unmarked::pcount(~1 ~ bin1 + cont1,
data = umf,
K = 30)
## The unmarked optimization is slower than the simulation and base model
## fit, so it is skipped in automated example checks.
if (interactive()) {
opt1 <- multiScale_optim(fitted_mod = mod0,
kernel_inputs = kernel_inputs)
summary(opt1)
}
Simulate spatially autocorrelated raster surfaces
Description
Creates a SpatRaster stack of four simulated landscape surfaces for
use with sim_dat and sim_dat_unmarked. The stack
contains two binary (0/1) and two continuous (0–1) surfaces with differing
spatial autocorrelation ranges, allowing simulation of multiscale ecological
processes.
Usage
sim_rast(
dim = 100,
resolution = 10,
autocorr_range1 = NULL,
autocorr_range2 = NULL,
sill = 10,
plot = FALSE,
user_seed = NULL,
...
)
Arguments
dim |
Positive integer. Number of cells on each side of the square
raster. The output will be a |
resolution |
Positive numeric. Cell size (edge length) in map units.
The raster extent will be |
autocorr_range1 |
Optional positive numeric. Spatial autocorrelation
range (in map cells) for surfaces 1 and 3 ( |
autocorr_range2 |
Optional positive numeric. Autocorrelation range for
surfaces 2 and 4 ( |
sill |
Positive numeric. Partial sill (variance) of the Gaussian
random field used for all four surfaces. Default: |
plot |
Logical. If |
user_seed |
Optional integer seed for reproducibility. Different
transformations of |
... |
Additional arguments. Not currently used. |
Details
Each surface is generated as a Gaussian random field using a fast Fourier transform (FFT) circulant embedding approach with an exponential covariance function. The underlying continuous fields are normalized to [0, 1] before thresholding (binary surfaces) or returning directly (continuous surfaces).
The two autocorrelation ranges allow simulation of covariates that operate at different spatial scales, a common scenario in landscape ecology where some resources are patchily distributed at fine scales and others vary broadly across the study area.
When user_seed is provided, independent but reproducible seeds are
derived for each surface as multiples of user_seed.
Value
A SpatRaster with four named layers:
bin1Binary (0/1) surface with fine-scale autocorrelation (
autocorr_range1). Values of 1 where the underlying continuous field exceeds its 55th percentile.bin2Binary (0/1) surface with broad-scale autocorrelation (
autocorr_range2). Values of 1 where the underlying continuous field falls below its 40th percentile.cont1Continuous surface rescaled to [0, 1] with fine-scale autocorrelation (75% of
autocorr_range1).cont2Continuous surface rescaled to [0, 1] with broad-scale autocorrelation (125% of
autocorr_range2).
All layers span the same spatial extent:
[0, dim * resolution] x [0, dim * resolution].
Summarize multiScaleR objects
Description
Summarizes output from multiScale_optim.
Usage
## S3 method for class 'multiScaleR'
summary(object, profile = FALSE, ...)
Arguments
object |
An object of class |
profile |
Logical. If |
... |
Optional arguments passed to the method (e.g., |
Value
An object of class summary_multiScaleR. Confidence limits for ‘sigma' default to the package’s existing Wald-style limits. If profile = TRUE, profile likelihood is used when feasible; if profiling fails, the summary falls back to Wald-style limits.
Spatial sample points
Description
Example point file for use with vignette document example
Usage
data(surv_pts)
Format
An sf class point object:
- pts
–> 100 spatial point locations