% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/dofuture_OP.R
\name{\%dofuture\%}
\alias{\%dofuture\%}
\title{Loop over a Foreach Expression using Futures}
\usage{
foreach \%dofuture\% expr
}
\arguments{
\item{foreach}{A \code{foreach} object created by \code{\link[foreach:foreach]{foreach::foreach()}}
or \code{\link[foreach:foreach]{foreach::times()}}.}

\item{expr}{An R expression.}
}
\value{
The value of the foreach call.
}
\description{
Loop over a Foreach Expression using Futures
}
\details{
This is a replacement for \verb{\%dopar\%} of the \pkg{foreach} package
that leverages the \pkg{future} framework.

When using \verb{\%dofuture\%}:
\itemize{
\item there is no need to use \code{registerDoFuture()}
\item there is no need to use \verb{\%dorng\%} of the \strong{doRNG} package
(but you need to specify \code{.options.future = list(seed = TRUE)}
whenever using random numbers in the \code{expr} expression)
\item global variables and packages are identified automatically by
the \pkg{future} framework
\item errors are relayed as-is (with \verb{\%dopar\%} they are captured and modified)
}
}
\section{Global variables and packages}{

When using \verb{\%dofuture\%}, the future framework identifies globals and
packages automatically (via static code inspection).  However, there
are cases where it fails to find some of the globals or packages. When
this happens, one can specify the \code{\link[future:future]{future::future()}} arguments \code{globals}
and \code{packages} via foreach argument \code{.options.future}.  For example,
if you specify argument
\code{.options.future = list(globals = structure(TRUE, ignore = "b", add = "a"))}
then globals are automatically identified (\code{TRUE}), but it ignores \code{b} and
always adds \code{a}.
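
As a minimal sketch of the \code{add} attribute, assume a hypothetical
variable \code{a} that is referenced only by name via \code{get()}, which
static code inspection cannot detect.  The variable name and the choice of
the sequential backend are illustrative assumptions:

\if{html}{\out{<div class="sourceCode r">}}\preformatted{library(doFuture)
plan(sequential)  # illustrative; any future backend can be used

a <- 2  # hypothetical global, referenced only by name below

y <- foreach(x = 1:3, .combine = c,
             .options.future = list(
               globals = structure(TRUE, add = "a")
             )) \%dofuture\% \{
  ## 'a' is looked up by name, so code inspection does not find it
  get("a") * x
\}
}\if{html}{\out{</div>}}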

An alternative to specifying the \code{globals} and the \code{packages} options via
\code{.options.future}, is to use the \code{\link[future:\%globals\%]{\%globals\%}}
and the \code{\link[future:\%packages\%]{\%packages\%}} operators.
See the examples for an illustration.

For further details and instructions, see \code{\link[future:future]{future::future()}}.
}

\section{Random Number Generation (RNG)}{

The \verb{\%dofuture\%} operator uses the future ecosystem to generate proper random
numbers in parallel in the same way they are generated in, for instance,
\pkg{future.apply}. For this to work, you need to specify
\code{.options.future = list(seed = TRUE)}.  For example,

\if{html}{\out{<div class="sourceCode r">}}\preformatted{y <- foreach(i = 1:3, .options.future = list(seed = TRUE)) \%dofuture\% \{
  rnorm(1)
\}
}\if{html}{\out{</div>}}

Unless \code{seed} is \code{FALSE} or \code{NULL}, this guarantees that the exact same
sequence of random numbers is generated \emph{given the same initial
seed / RNG state}, regardless of the type of future backend, the number of
workers, and the scheduling ("chunking") strategy.

RNG reproducibility is achieved by pregenerating the random seeds for all
iterations by using L'Ecuyer-CMRG RNG streams.  In each
iteration, these seeds are set before evaluating the foreach expression.
\emph{Note, for a large number of iterations this may introduce a large overhead.}

If \code{seed = TRUE}, then \code{\link[base:Random]{.Random.seed}}
is used if it holds a L'Ecuyer-CMRG RNG seed, otherwise one is created
randomly.

If \code{seed = FALSE}, it is expected that none of the foreach iterations
use random number generation.
If they do, then an informative warning or error is produced depending
on settings. See \link[future:future]{future::future} for more details.
Using \code{seed = NULL} is like \code{seed = FALSE} but without the check
of whether random numbers were generated or not.

As input, \code{seed} may also take a fixed initial seed (integer),
either as a full L'Ecuyer-CMRG RNG seed (vector of 1+6 integers), or
as a seed generating such a full L'Ecuyer-CMRG seed. This seed will
be used to generate one L'Ecuyer-CMRG RNG stream for each iteration.
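
As a rough sketch of using a fixed initial seed, the two calls below use
the same integer seed and are therefore expected to produce the same random
numbers.  The seed value \code{42L} and the sequential backend are
arbitrary, illustrative choices:

\if{html}{\out{<div class="sourceCode r">}}\preformatted{library(doFuture)
plan(sequential)  # illustrative; any future backend can be used

y1 <- foreach(i = 1:3, .combine = c,
              .options.future = list(seed = 42L)) \%dofuture\% \{
  rnorm(1)
\}

y2 <- foreach(i = 1:3, .combine = c,
              .options.future = list(seed = 42L)) \%dofuture\% \{
  rnorm(1)
\}

## Same initial seed, so the results are expected to be identical
identical(y1, y2)
}\if{html}{\out{</div>}}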

An alternative to specifying the \code{seed} option via \code{.options.future},
is to use the \code{\link[future:\%seed\%]{\%seed\%}} operator.  See
the examples for an illustration.

For further details and instructions, see
\code{\link[future.apply:future_lapply]{future.apply::future_lapply()}}.
}

\section{Load balancing ("chunking")}{

Whether load balancing ("chunking") should take place or not can be
controlled by specifying either argument
\verb{.options.future = list(scheduling = <ratio>)} or
\verb{.options.future = list(chunk.size = <count>)} to \code{foreach()}.

The value \code{chunk.size} specifies the average number of elements
processed per future ("chunks").
If \code{+Inf}, then all elements are processed in a single future (one worker).
If \code{NULL}, then option \code{scheduling} is used.

The value \code{scheduling} specifies the average number of futures
("chunks") that each worker processes.
If \code{0.0}, then a single future is used to process all iterations;
none of the other workers are used.
If \code{1.0} or \code{TRUE}, then one future per worker is used.
If \code{2.0}, then each worker will process two futures (if there are
enough iterations).
If \code{+Inf} or \code{FALSE}, then one future per iteration is used.
The default value is \code{scheduling = 1.0}.
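
The two options can be sketched as follows.  The number of workers, the
number of iterations, and the variable names are illustrative assumptions;
the returned values should not depend on how the iterations are chunked:

\if{html}{\out{<div class="sourceCode r">}}\preformatted{library(doFuture)
plan(multisession, workers = 2)  # illustrative backend with two workers

## One future per iteration
y_each <- foreach(x = 1:6, .combine = c,
                  .options.future = list(scheduling = FALSE)) \%dofuture\% \{
  sqrt(x)
\}

## On average three iterations per future
y_chunks <- foreach(x = 1:6, .combine = c,
                    .options.future = list(chunk.size = 3)) \%dofuture\% \{
  sqrt(x)
\}

plan(sequential)  # shut down the parallel workers
}\if{html}{\out{</div>}}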

For further details and instructions, see
\code{\link[future.apply:future_lapply]{future.apply::future_lapply()}}.
}

\section{Control processing order of iterations}{

Attribute \code{ordering} of \code{chunk.size} or \code{scheduling} can be used to
control the order in which the elements are iterated over, which only affects
the processing order and \emph{not} the order in which values are returned.
This attribute can take the following values:
\itemize{
\item index vector - a numeric vector of length \code{nX}.
\item function     - a function taking one argument which is called as
\code{ordering(nX)} and which must return an
index vector of length \code{nX}, e.g.
\code{function(n) rev(seq_len(n))} for reverse ordering.
\item \code{"random"}   - this will randomize the ordering via random index
vector \code{sample.int(nX)}.
}

where \code{nX} is the number of foreach iterations to be done.

For example,
\code{.options.future = list(scheduling = structure(2.0, ordering = "random"))}.

\emph{Note}, when elements are processed out of order, then captured standard
output and conditions are also relayed in that order, that is, out of order.
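
A small sketch of the function form of \code{ordering}, assuming a
sequential backend and an arbitrary \code{scheduling} value of \code{2.0}.
The iterations are processed in reverse, but the values are expected to be
returned in the original order:

\if{html}{\out{<div class="sourceCode r">}}\preformatted{library(doFuture)
plan(sequential)  # illustrative; any future backend can be used

## Process the iterations in reverse order
reverse <- function(n) rev(seq_len(n))

y <- foreach(x = 1:5, .combine = c,
             .options.future = list(
               scheduling = structure(2.0, ordering = reverse)
             )) \%dofuture\% \{
  x^2
\}
print(y)
}\if{html}{\out{</div>}}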

For further details and instructions, see
\code{\link[future.apply:future_lapply]{future.apply::future_lapply()}}.
}

\section{Reporting on progress}{

How to report on progress is a frequently asked question, especially
in long-running tasks and parallel processing.  The \strong{foreach}
framework does \emph{not} have a built-in mechanism for progress
reporting(*).

When using \strong{doFuture}, and the Futureverse in general, for
processing, the \strong{progressr} package can be used to signal progress
updates in a near-live fashion.  There is no special argument related to
\code{foreach()} or \strong{doFuture} to achieve this. Instead, one calls
a so-called "progressor" function within each iteration.  See
the \href{https://cran.r-project.org/package=progressr}{\strong{progressr}}
package and its \code{vignette(package = "progressr")} for examples.
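
The following is a minimal sketch of that pattern, modelled on the general
\strong{progressr} usage of \code{progressor()} and \code{with_progress()}.
The function name \code{my_fcn} and the input \code{xs} are hypothetical:

\if{html}{\out{<div class="sourceCode r">}}\preformatted{library(doFuture)
library(progressr)
plan(sequential)  # illustrative; any future backend can be used

## Hypothetical function that signals one progress update per iteration
my_fcn <- function(xs) \{
  p <- progressor(along = xs)
  foreach(x = xs, .combine = c) \%dofuture\% \{
    p()  # signal that one more iteration is done
    sqrt(x)
  \}
\}

## Progress is reported only while inside with_progress()
with_progress(y <- my_fcn(1:5))
}\if{html}{\out{</div>}}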

(*) The legacy \strong{doSNOW} package uses a special \code{foreach()} argument
\code{.options.doSNOW$progress} that can be used to make a progress update
each time results from a parallel worker are returned. This approach
is limited by how chunking works, requires the developer to set that
argument, and the code becomes incompatible with foreach adaptors
registered by other \strong{doNnn} packages.
}

\examples{
\donttest{
plan(multisession)  # parallelize futures on the local machine

y <- foreach(x = 1:10, .combine = rbind) \%dofuture\% {
  y <- sqrt(x)
  data.frame(x = x, y = y, pid = Sys.getpid())
}
print(y)


## Random number generation
y <- foreach(i = 1:3, .combine = rbind, .options.future = list(seed = TRUE)) \%dofuture\% {
  data.frame(i = i, random = runif(n = 1L)) 
}
print(y)

## Random number generation (alternative specification)
y <- foreach(i = 1:3, .combine = rbind) \%dofuture\% {
  data.frame(i = i, random = runif(n = 1L)) 
} \%seed\% TRUE
print(y)

## Random number generation with the foreach() \%:\% nested operator
y <- foreach(i = 1:3, .combine = rbind) \%:\%
       foreach(j = 3:5, .combine = rbind, .options.future = list(seed = TRUE)) \%dofuture\% {
  data.frame(i = i, j = j, random = runif(n = 1L)) 
}
print(y)

## Random number generation with nested foreach() calls
y <- foreach(i = 1:3, .combine = rbind, .options.future = list(seed = TRUE)) \%dofuture\% {
  foreach(j = 3:5, .combine = rbind, .options.future = list(seed = TRUE)) \%dofuture\% {
    data.frame(i = i, j = j, random = runif(n = 1L)) 
  }
}
print(y)

}

\dontshow{
## R CMD check: make sure any open connections are closed afterward
if (!inherits(plan(), "sequential")) plan(sequential)
}
}
