% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/bootdht.R
\name{bootdht}
\alias{bootdht}
\title{Bootstrap uncertainty estimation for distance sampling models}
\usage{
bootdht(
  model,
  flatfile,
  resample_strata = FALSE,
  resample_obs = FALSE,
  resample_transects = TRUE,
  nboot = 100,
  summary_fun = bootdht_Nhat_summarize,
  convert_units = 1,
  select_adjustments = FALSE,
  sample_fraction = 1,
  multipliers = NULL,
  progress_bar = "base",
  cores = 1,
  convert.units = NULL
)
}
\arguments{
\item{model}{a model fitted by \code{\link{ds}} or a list of models}

\item{flatfile}{Data provided in the flatfile format. See \code{\link{flatfile}} for
details. Note that it is a current limitation of \code{bootdht} that all
\code{Sample.Label} identifiers must be unique across all strata, i.e., transect
ids must not be re-used from one stratum to another. An easy way to achieve
this is to paste together the stratum names and transect ids.}

\item{resample_strata}{should resampling happen at the stratum
(\code{Region.Label}) level? (Default \code{FALSE})}

\item{resample_obs}{should resampling happen at the observation (\code{object})
level? (Default \code{FALSE})}

\item{resample_transects}{should resampling happen at the transect
(\code{Sample.Label}) level? (Default \code{TRUE})}

\item{nboot}{number of bootstrap replicates}

\item{summary_fun}{function that is used to obtain summary statistics from
the bootstrap, see Summary Functions below. By default
\code{\link{bootdht_Nhat_summarize}} is used, which just extracts abundance estimates.}

\item{convert_units}{conversion between units for abundance estimation, see
"Units", below. (Defaults to 1, implying all of the units are "correct"
already.) This takes precedence over any unit conversion stored in \code{model}.}

\item{select_adjustments}{should the number of adjustment terms be selected
in each bootstrap replicate? When \code{FALSE}, the exact detection function
specified in \code{model} is fitted to each replicate. Setting this option to
\code{TRUE} can significantly increase the runtime of the bootstrap. Note that
for this to work \code{model} must have been fitted with \code{adjustment!=NULL}.}

\item{sample_fraction}{what proportion of the transects was covered (e.g.,
0.5 for one-sided line transects).}

\item{multipliers}{\code{list} of multipliers. See "Multipliers" below.}

\item{progress_bar}{which progress bar should be used? Default "base" uses
\code{txtProgressBar}, "none" suppresses output, "progress" uses the
\code{progress} package, if installed.}

\item{cores}{number of CPU cores to use to compute the estimates. See "Parallelization" below.}

\item{convert.units}{deprecated, see same argument with underscore, above.}
}
\description{
Performs a bootstrap for simple distance sampling models using the same data
structures as \code{\link[mrds:dht]{dht}}. Note that only geographical stratification
as supported in \code{dht} is allowed.
}
\section{Summary Functions}{

The function \code{summary_fun} allows the user to specify what summary
statistics should be recorded from each bootstrap. The function should take
two arguments, \code{ests} and \code{fit}. The former is the output from
\code{dht2}, giving tables of estimates. The latter is the fitted detection
function object. The function is called once fitting and estimation have been
performed and should return a \code{data.frame}. Those \code{data.frame}s
are then concatenated using \code{rbind}. One can make these functions
return any information within those objects, for example abundance or
density estimates or the AIC for each model. See Examples below.
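
As a minimal sketch of such a function, the following records both abundance
and density estimates for each stratum. The name \code{ND_summarize} is
hypothetical, and the sketch assumes that \code{ests$individuals$D} has the
same \code{Label} and \code{Estimate} columns, in the same row order, as
\code{ests$individuals$N}:
\preformatted{
# hypothetical summary function: one row per stratum label
ND_summarize <- function(ests, fit) {
  N <- ests$individuals$N
  D <- ests$individuals$D  # assumed to mirror the layout of the N table
  data.frame(Label = N$Label,
             Nhat = N$Estimate,
             Dhat = D$Estimate)
}
}
It could then be supplied as \code{summary_fun=ND_summarize} in the call to
\code{bootdht}.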
}

\section{Multipliers}{

It is often the case that we cannot measure distances to individuals or
groups directly, but instead need to estimate distances to something they
produce (e.g., for whales, their blows; for elephants, their dung). This is
referred to as indirect sampling. We may need to use estimates of production
rate and decay rate for these estimates (in the case of dung or nests) or
just production rates (in the case of songbird calls or whale blows). We
refer to these conversions between "number of cues" and "number of animals"
as "multipliers".

The \code{multipliers} argument is a \code{list} whose possible elements include
\code{creation} and \code{decay}. Each element is either:
\itemize{
\item a \code{data.frame} with at least a column named \code{rate}, by which abundance
estimates will be divided (the term "multiplier" is a misnomer, but
kept for compatibility with Distance for Windows). Additional columns can
be added to give the standard error and degrees of freedom for the rate
if known as \code{SE} and \code{df}, respectively. You can use a multirow
\code{data.frame} to have different rates for different geographical areas
(for example). In this case the rows need to have a column (or columns)
to \code{merge} with the data (for example \code{Region.Label}).
\item a \code{function} which will return a single estimate of the relevant
multiplier. See \code{\link{make_activity_fn}} for a helper function for use with the
\code{activity} package.
}
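
The \code{data.frame} form described above can be sketched as follows. All
rates, standard errors and stratum names here are made-up values for
illustration only, and \code{mult} is a hypothetical object name:
\preformatted{
# all numbers below are invented for illustration
mult <- list(
  # a single production (creation) rate with its standard error
  creation = data.frame(rate = 25, SE = 4),
  # a decay rate that differs by stratum; Region.Label is the merge column
  decay = data.frame(Region.Label = c("North", "South"),
                     rate = c(160, 180),
                     SE = c(14, 17))
)
}
The list would then be supplied as \code{multipliers=mult}.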
}

\section{Model selection}{

Model selection can be performed on a per-replicate basis within the
bootstrap. This has three variations:
\enumerate{
\item when \code{select_adjustments} is \code{TRUE}, adjustment terms are selected
by AIC within each bootstrap replicate (provided that \code{model} had the
\code{order} and \code{adjustment} options set to non-\code{NULL}).
\item if \code{model} is a list of fitted detection functions, each of these is
fitted to each replicate and results generated from the one with the
lowest AIC.
\item when \code{select_adjustments} is \code{TRUE} and \code{model} is a list of fitted
detection functions, each model is fitted to each replicate and the number of
adjustments is selected via AIC.
This last option can be extremely time consuming.
}
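
A minimal sketch of the second variation, using the \code{minke} data from the
Examples. The object names \code{hn}, \code{hr} and \code{boot_sel} are
hypothetical, and the choice of two key functions without adjustment terms is
only one possible candidate set:
\preformatted{
library(Distance)
data(minke)

# two candidate key functions, no adjustment terms
hn <- ds(minke, key = "hn", adjustment = NULL)
hr <- ds(minke, key = "hr", adjustment = NULL)

# every model in the list is refitted to each replicate
boot_sel <- bootdht(list(hn, hr), flatfile = minke, nboot = 5)
summary(boot_sel)
}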
}

\section{Parallelization}{

If \code{cores}>1 then the \code{parallel}/\code{doParallel}/\code{foreach}/\code{doRNG} packages
will be used to run the computation over multiple cores of the computer. To
use this component you need to install those packages using:
\code{install.packages(c("foreach", "doParallel", "doRNG"))}. It is advised that
you set \code{cores} to at most one less than the number of cores
on your machine. The \code{doRNG} package is required to make analyses
reproducible (\code{\link{set.seed}} can be used to ensure the same answers).

Issues in \code{summary_fun} are hard to debug when running in parallel, so it
is best to first run a small number of bootstraps in parallel to check that
things work. On Windows systems \code{summary_fun} does not have access to the
global environment when running in parallel, so all computations must be made
using only its \code{ests} and \code{fit} arguments (i.e., you cannot use R
objects from elsewhere in that function, even if they are available to you
from the console).

Another consequence of the global environment being unavailable inside
parallel bootstraps is that any starting values in the model object passed
in to \code{bootdht} must be hard coded (otherwise you get back 0 successful
bootstraps). For a worked example showing this, see the camera trap distance
sampling online example at
\url{https://examples.distancesampling.org/Distance-cameratraps/camera-distill.html}.
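
The advice on choosing \code{cores} and on reproducibility can be sketched as
follows. This assumes \code{mod1} and \code{minke} as defined in the Examples;
\code{avail} and \code{n_cores} are hypothetical names and the seed value is
arbitrary:
\preformatted{
# leave one core free; detectCores() can return NA on some platforms
avail <- parallel::detectCores()
n_cores <- if (is.na(avail)) 1 else max(1, avail - 1)

# set the seed first so that the bootstrap can be repeated
set.seed(1234)
# start with a small number of replicates to check that things work
bootout <- bootdht(mod1, flatfile = minke, nboot = 10, cores = n_cores)
}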
}

\examples{
\dontrun{
# fit a model to the minke data
data(minke)
mod1 <- ds(minke)

# summary function to save the abundance estimate
Nhat_summarize <- function(ests, fit) {
  return(data.frame(Nhat=ests$individuals$N$Estimate))
}

# perform 5 bootstraps
bootout <- bootdht(mod1, flatfile=minke, summary_fun=Nhat_summarize, nboot=5)

# obtain basic summary information
summary(bootout)
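
# Sample.Label values must be unique across strata. A minimal sketch of the
# paste approach described for the flatfile argument, using a made-up
# data.frame (the object toy is hypothetical, not the minke data):
toy <- data.frame(Region.Label = c("A", "A", "B", "B"),
                  Sample.Label = c(1, 2, 1, 2))
# prefix each transect id with its stratum name
toy$Sample.Label <- paste(toy$Region.Label, toy$Sample.Label, sep = "-")
toy$Sample.Label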
}
}
\seealso{
\code{\link{summary.dht_bootstrap}} for how to summarize the results,
\code{\link{bootdht_Nhat_summarize}} and \code{\link{bootdht_Dhat_summarize}} for examples of
summary functions.
}
