% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/distr-fitting.R
\name{fit_two_distr}
\alias{fit_two_distr}
\alias{fit_two_distr.default}
\alias{fit_two_distr.count}
\alias{fit_two_distr.incidence}
\title{Maximum likelihood fitting of two distributions and goodness-of-fit
comparison.}
\usage{
fit_two_distr(data, ...)

\method{fit_two_distr}{default}(data, random, aggregated, ...)

\method{fit_two_distr}{count}(data, random = smle_pois,
  aggregated = smle_nbinom, n_est = c(random = 1, aggregated = 2), ...)

\method{fit_two_distr}{incidence}(data, random = smle_binom,
  aggregated = smle_betabinom, n_est = c(random = 1, aggregated = 2), ...)
}
\arguments{
\item{data}{An \code{intensity} object.}

\item{...}{Additional arguments to be passed to other methods.}

\item{random}{Distribution to describe random patterns.}

\item{aggregated}{Distribution to describe aggregated patterns.}

\item{n_est}{Named numeric vector giving the number of estimated parameters
for each distribution (\code{random} and \code{aggregated}).}
}
\value{
An object of class \code{fit_two_distr}, which is a list containing at least
the following components:
\tabular{ll}{
    \code{call}  \tab The function \code{\link[base]{call}}. \cr
    \code{name}  \tab The names of both distributions. \cr
    \code{model} \tab The outputs of the fitting process for both distributions. \cr
    \code{llr}   \tab The result of the log-likelihood ratio test. \cr
}
Other components can be present such as:
\tabular{ll}{
    \code{param} \tab A numeric matrix of estimated parameters (that can be
                      printed using \code{\link[stats]{printCoefmat}}). \cr
    \code{freq}  \tab A data frame or a matrix with the observed and expected
                      frequencies for both distributions for the different
                      categories. \cr
    \code{gof}   \tab Goodness-of-fit tests for both distributions (which are
                      typically chi-squared goodness-of-fit tests). \cr
}
}
\description{
Different distributions may be used depending on the kind of provided data.
By default, the Poisson and negative binomial distributions are fitted to
count data, whereas the binomial and beta-binomial distributions are used
with incidence data. The data are fitted under both a randomness assumption
(Poisson or binomial distribution) and an aggregation assumption (negative
binomial or beta-binomial distribution), and the goodness of fit of the two
distributions is then compared using a log-likelihood ratio test.
}
\details{
Under the hood, \code{fit_two_distr} relies on the \code{\link{smle}} utility,
which is a wrapper around the \code{\link[stats]{optim}} procedure.

Note that warnings about the chi-squared goodness-of-fit tests may be issued
if any expected count is less than 5 (Cochran's rule of thumb).
}
\examples{
# Simple workflow for count data:
my_data <- count(arthropods)
my_data <- split(my_data, by = "t")[[3]]
my_res  <- fit_two_distr(my_data)
summary(my_res)
plot(my_res)

# Simple workflow for incidence data:
my_data <- incidence(tobacco_viruses)
my_res  <- fit_two_distr(my_data)
summary(my_res)
plot(my_res)

# Note that there are other methods to fit some common distributions.
# For example for the Poisson distribution, one can use glm:
my_arthropods <- arthropods[arthropods$t == 3, ]
my_model <- glm(my_arthropods$i ~ 1, family = poisson)
lambda <- exp(coef(my_model)[[1]]) # unique(my_model$fitted.values) also works.
lambda
# ... or the fitdistr function in MASS package:
require(MASS)
fitdistr(my_arthropods$i, "poisson")

# For the binomial distribution, glm still works:
my_model <- with(tobacco_viruses, glm(i/n ~ 1, family = binomial, weights = n))
prob <- logit(coef(my_model)[[1]], rev = TRUE)
prob
# ... but the binomial distribution is not yet recognized by MASS::fitdistr.
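
# A minimal sketch (not the package's internal code) of the idea behind the
# log-likelihood ratio comparison, using only base R and simulated counts.
# The names sim_counts, nll_nbinom, fit_nbinom, etc. are illustrative
# assumptions and are not part of the package.
set.seed(1)
sim_counts <- rnbinom(200, size = 2, mu = 3)
# Poisson: the maximum likelihood estimate of lambda is the sample mean.
ll_pois <- sum(dpois(sim_counts, lambda = mean(sim_counts), log = TRUE))
# Negative binomial: minimize the negative log-likelihood with optim,
# working on the log scale so that both parameters stay positive.
nll_nbinom <- function(par) {
    -sum(dnbinom(sim_counts, size = exp(par[1]), mu = exp(par[2]), log = TRUE))
}
fit_nbinom <- optim(c(0, log(mean(sim_counts))), nll_nbinom)
ll_nbinom  <- -fit_nbinom$value
# Likelihood ratio statistic, commonly compared (as an approximation) to a
# chi-squared distribution with 1 degree of freedom, i.e. the difference in
# the number of estimated parameters (2 - 1).
lr_stat <- 2 * (ll_nbinom - ll_pois)
pchisq(lr_stat, df = 1, lower.tail = FALSE)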

# Examples featured in Madden et al. (2007).
# p. 242-243
my_data <- incidence(dogwood_anthracnose)
my_data <- split(my_data, by = "t")
my_fit_two_distr <- lapply(my_data, fit_two_distr)
lapply(my_fit_two_distr, function(x) x$param$aggregated[c("prob", "theta"), ])
lapply(my_fit_two_distr, plot)

my_agg_index <- lapply(my_data, agg_index)
lapply(my_agg_index, function(x) x$index)
lapply(my_agg_index, chisq.test)
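
# A rough sketch, on simulated data, of why warnings about chi-squared
# goodness-of-fit tests can appear: expected frequencies in sparse categories
# may fall below 5. The names sim_pois and expected_freq are illustrative
# assumptions and are not part of the package.
set.seed(1)
sim_pois <- rpois(50, lambda = 2)
expected_freq <- length(sim_pois) * dpois(0:max(sim_pois), lambda = mean(sim_pois))
round(expected_freq, 2)
any(expected_freq < 5) # TRUE means Cochran's rule of thumb is not met.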

}
\references{
Madden LV, Hughes G. 1995. Plant disease incidence: Distributions,
heterogeneity, and temporal analysis. Annual Review of Phytopathology 33(1):
529–564.
\href{https://doi.org/10.1146/annurev.py.33.090195.002525}{doi:10.1146/annurev.py.33.090195.002525}
}
