% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/arfima_function.R
\name{arfima}
\alias{arfima}
\title{Fit ARFIMA, ARIMA-FGN, and ARIMA-PLA (multi-start) models
 
Fits ARFIMA/ARIMA-FGN/ARIMA-PLA multi-start models to time series data.
Options include fixing parameters, whether or not to fit fractional noise,
what type of fractional noise (fractional Gaussian noise (FGN), fractionally
differenced white noise (FDWN), or the newly introduced power-law
autocovariance noise (PLA)), etc.  This function can fit regressions with
ARFIMA/ARIMA-FGN/ARIMA-PLA errors via the xreg argument, including dynamic
regression (transfer functions).}
\usage{
arfima(
  z,
  order = c(0, 0, 0),
  numeach = c(1, 1),
  dmean = TRUE,
  whichopt = 0,
  itmean = FALSE,
  fixed = list(phi = NA, theta = NA, frac = NA, seasonal = list(phi = NA, theta = NA,
    frac = NA), reg = NA),
  lmodel = c("d", "g", "h", "n"),
  seasonal = list(order = c(0, 0, 0), period = NA, lmodel = c("d", "g", "h", "n"),
    numeach = c(1, 1)),
  useC = 3,
  cpus = 1,
  rand = FALSE,
  numrand = NULL,
  seed = NA,
  eps3 = 0.01,
  xreg = NULL,
  reglist = list(regpar = NA, minn = -10, maxx = 10, numeach = 1),
  check = F,
  autoweed = TRUE,
  weedeps = 0.01,
  adapt = TRUE,
  weedtype = c("A", "P", "B"),
  weedp = 2,
  quiet = FALSE,
  startfit = NULL,
  back = FALSE
)
}
\arguments{
\item{z}{The data set (time series)}

\item{order}{The order of the ARIMA model to be fit: c(p, d, q).  We have
that p is the number of AR parameters (phi), d is the amount of integer
differencing, and q is the number of MA parameters (theta).  Note we use the
Box-Jenkins convention for the MA parameters, in that they are the negative
of \code{\link{arima}}: see "Details".}

\item{numeach}{The number of starts to fit for each parameter.  The first
argument in the vector is the number of starts for each AR/MA parameter,
while the second is the number of starts for the fractional parameter.  When
this is set to 0, no fractional noise is fit.  Note that the number of
starts in total is multiplicative: if we are fitting an ARFIMA(2, d, 2), and
use the older number of starts (c(2, 2)), we will have 2^2 * 2 * 2^2 = 32
starting values for the fits.  \strong{Note that the default has changed
from c(2, 2) to c(1, 1) since package version 1.4-0}}

\item{dmean}{Whether the mean should be fit dynamically with the optimizer.
Note that the likelihood surface will change if this is TRUE, but this
is usually not worrisome.
See the referenced thesis for details.}

\item{whichopt}{Which optimizer to use in the optimization: see "Details".}

\item{itmean}{This option is under investigation, and will be set to FALSE
automatically until it has been decided what to do.

Whether the mean should be fit iteratively using the function
\code{\link[ltsa]{TrenchMean}}.  Currently itmean, if set to TRUE, has higher
priority than dmean: if both are TRUE, dmean will be set to FALSE, with a
warning.}

\item{fixed}{A list of parameters to be fixed.  If we are to fix certain
elements of the AR process, for example, fixed$phi must have length equal to
p.  Any numeric value will fix the parameter at that value; for example, if
we are modelling an AR(2) process, and we wish to fix only the first
autoregressive parameter to 0, we would have fixed = list(phi = c(0, NA)).
NA corresponds to that parameter being allowed to change in the optimization
process.  We can fix the fractional parameters, and unlike
\code{\link{arima}}, can fix the seasonal parameters as well. Currently,
fixing regression/transfer function parameters is disabled.}

\item{lmodel}{The long memory model (noise type) to be used: "d" for FDWN,
"g" for FGN, "h" for PLA, and "n" for none (i.e. ARMA short memory models).
Default is "d".}

\item{seasonal}{The seasonal components of the model we wish to fit, with
the same components as above.  The period must be supplied.}

\item{useC}{How much interfaced C code to use: an integer between 0 and 3.
The value 3 is strongly recommended. See "Details".}

\item{cpus}{The number of CPUs used to perform the multi-start fits.  A
small number of fits and a high number of cpus (say both equal 4) with n not
large can actually be slower than when cpus = 1.  The number of CPUs should
not exceed the number of threads available to R.}

\item{rand}{Whether random starts are used in the multistart method.
Defaults to FALSE.}

\item{numrand}{The number of random starts to use.}

\item{seed}{The seed for the random starts.}

\item{eps3}{How far to start from the boundaries when using a grid for the
multi-starts (i.e. when rand is FALSE.)}

\item{xreg}{A matrix, data frame, or vector of regressors for regression or
transfer functions.}

\item{reglist}{A list with the following elements:
\itemize{
\item regpar -
either NA or a list, matrix, data frame, or vector with 3 columns.  If
regpar is a vector, the matrix xreg must have one row or column only.  In
order, the elements of regpar are: r, s, and b.  The values of r are the
orders of the delta parameters as in Box, Jenkins and Reinsel, the values of
s are the orders of omega parameters, and the values of b are the
backshifting to be done.

\item minn - the minimum value for the starting value of the search, if
reglist$numeach > 1.
\item maxx - the maximum value for the starting value
of the search, if reglist$numeach > 1.
\item numeach - the number of starts
to try for each regression parameter.

}}

\item{check}{If TRUE, checks at each optim iteration whether the model is
identifiable.  This makes the optimization much slower.}

\item{autoweed}{Whether to automatically (before the fit is returned) weed
out modes that are close together (usually the same point).}

\item{weedeps}{The maximum distance between modes that are close together
for the mode with the lower log-likelihood to be weeded out.  If adapt is
TRUE (default) this value changes.}

\item{adapt}{If TRUE, if dim is the dimensionality of the search, weedeps is
changed to \eqn{(1 + weedeps)^{dim} - 1}.}

\item{weedtype}{The type of weeding to be done.  See \code{\link{weed}}.}

\item{weedp}{The p in the p-norm to be used in the weeding.  p = 2 (default)
is Euclidean distance.}

\item{quiet}{If TRUE, no auxiliary output is generated. The default (FALSE)
prints information on the fits being performed.}

\item{startfit}{Meant primarily for debugging (for now), allows starting places
for the fitting process.  Overrides \code{numeach}.}

\item{back}{Setting this to TRUE will restore the defaults in numeach.}
}
\value{
An object of class "arfima".  In it, full information on the fit is
given, though not printed under the print.arfima method.  The phis are the
AR parameters, and the thetas are the MA parameters.  Residuals, regression
residuals, etc., are all available, along with the parameter values and
standard errors.  Note that the muHat returned in the arfima object is
of the \strong{differenced} series, if differencing is applied.

Note that if multiple modes are found, they are listed in order of
log-likelihood value.
}
\description{
Fits by direct optimization using optim.  The optimizer choices are: 0 -
BFGS; 1 - Nelder-Mead; 2 - SANN; otherwise CG.
}
\details{
A word of warning: it is generally better to use the default, and only use
Nelder-Mead to check for spurious modes.  SANN takes a long time (and may
only find one mode), and CG may not be stable.

If using Nelder-Mead, it must be stressed that Nelder-Mead can take out
non-spurious modes or add spurious modes: we have checked visually where we
could.  Therefore it is wise to use BFGS as the default and if there are
modes close to the boundaries, check using Nelder-Mead.

The moving average parameters are in the Box-Jenkins convention: they are
the negative of the parameters given by \code{\link{arima}}.  That is, the
model to be fit is, in the case of a non-seasonal ARIMA model, phi(B)
(1-B)^d z[t] = theta(B) a[t], where phi(B) = 1 - phi(1) B - ... - phi(p) B^p
and theta(B) = 1 - theta(1) B - ... - theta(q) B^q.
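
As a hedged illustration of this sign convention using only base R (the
object names are illustrative, and the remark about what \code{arfima} would
report is an assumption based on the convention stated above, not a verified
output):

\preformatted{
# Simulate an MA(1) series in the stats convention: z[t] = a[t] + 0.5 a[t-1]
set.seed(1)
x <- arima.sim(model = list(ma = 0.5), n = 500)

# stats::arima writes the MA polynomial as 1 + theta B
fit0 <- arima(x, order = c(0, 0, 1), include.mean = FALSE)
theta_stats <- coef(fit0)[["ma1"]]

# The Box-Jenkins convention writes it as 1 - theta B, so the sign flips;
# this is the sign one would expect arfima to report for the same series
theta_bj <- -theta_stats
c(stats = theta_stats, box_jenkins = theta_bj)
}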

For the useC parameter, a "0" means no C is used; a "1" means C is only used
to compute the log-likelihood, but not the theoretical autocovariance
function (tacvf); a "2" means that C is used to compute the tacvf and not
the log-likelihood; and a "3" means C is used to compute everything.
}
\examples{

\donttest{
set.seed(8564)
sim <- arfima.sim(1000, model = list(phi = c(0.2, 0.1),
dfrac = 0.4, theta = 0.9))
fit <- arfima(sim, order = c(2, 0, 1), back=TRUE)

fit
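
# Hedged sketch of the fixed argument: fix the first AR parameter of an
# AR(2) component at 0 and leave the second free (NA), as described under
# the fixed argument. The list mirrors the shape of the documented default;
# the object name fitFixed is illustrative only.
fitFixed <- arfima(sim, order = c(2, 0, 0),
  fixed = list(phi = c(0, NA), theta = NA, frac = NA,
    seasonal = list(phi = NA, theta = NA, frac = NA), reg = NA))
fitFixed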

data(tmpyr)

fit <- arfima(tmpyr, order = c(1, 0, 1), numeach = c(3, 3))
fit
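
# Hedged illustration: under the multiplicative rule described for numeach,
# the fit above (p = 1, q = 1, numeach = c(3, 3)) would use the number of
# starting values computed below. The variable names are illustrative only
# and are not part of the package.
p <- 1
q <- 1
ne <- c(3, 3)
nstarts <- ne[1]^p * ne[2] * ne[1]^q
nstarts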

plot(tacvf(fit), maxlag = 30, tacf = TRUE)

data(SeriesJ)
attach(SeriesJ)

fitTF <- arfima(YJ, order= c(2, 0, 0), xreg = XJ, reglist =
list(regpar = c(2, 2, 3)), lmodel = "n")
fitTF

detach(SeriesJ)
}
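
# Hedged illustration of the adapt rule: with adapt = TRUE, weedeps is
# documented to become (1 + weedeps)^dim - 1. The variable names are
# illustrative, and dim_search = 3 is an assumed search dimension.
weedeps <- 0.01
dim_search <- 3
adapted <- (1 + weedeps)^dim_search - 1
adapted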

}
\references{
McLeod, A. I., Yu, H. and Krougly, Z. L. (2007) Algorithms for
Linear Time Series Analysis: With R Package. Journal of Statistical Software,
Vol. 23, Issue 5.

Veenstra, J. Q. Persistence and Antipersistence: Theory and
Software (PhD Thesis).

Borwein, P. (1995) An efficient algorithm for Riemann Zeta function. Canadian
Math. Soc. Conf. Proc., 27, pp. 29-34.
}
\seealso{
\code{\link{arfima.sim}}, \code{\link{SeriesJ}},
\code{\link{arfima-package}}
}
\author{
JQ (Justin) Veenstra
}
\keyword{ts}
