% Generated by roxygen2 (4.1.0): do not edit by hand
% Please edit documentation in R/statsMS.R
\name{statsMS}
\alias{statsMS}
\title{Obtain performance statistics of a series of linear models}
\usage{
statsMS(model, design.info, arrange.by, digits)
}
\arguments{
\item{model}{A list of linear models returned by \code{buildMS}.}

\item{design.info}{Extra information about the linear models in the series.}

\item{arrange.by}{Character string indicating whether the table of
performance statistics should be sorted and, if so, by which column.
Available options are \code{"candidates"}, \code{"df"}, \code{"aic"},
\code{"rmse"}, \code{"nrmse"}, \code{"r2"}, \code{"adj_r2"}, and
\code{"ADJ_r2"}. Descending order is used by default and cannot be changed
in the current implementation. See \sQuote{Value} for more information.}

\item{digits}{Integer or vector with six integers indicating the number of
decimal places to be used to round the performance statistics. If a vector
is passed to the function, the number of decimal places should be in the
following order:

\code{c("aic", "rmse", "nrmse", "r2", "adj_r2", "ADJ_r2")}.}
}
\value{
A data frame with several performance statistics:
\describe{
\item{id}{Identification of the model.}
\item{candidates}{Number of candidate predictor variables initially
offered to the model.}
\item{df}{Number of degrees of freedom of the final selected model.}
\item{aic}{Akaike's Information Criterion (AIC). Obtained using
\code{extractAIC}.}
\item{rmse}{Root mean squared error, calculated based on the number of
candidate predictor variables initially offered to the model.}
\item{nrmse}{Normalized root mean squared error, calculated as the ratio
between the RMSE and the standard deviation of the observed values of the
dependent variable.}
\item{r2}{Multiple coefficient of determination.}
\item{adj_r2}{Adjusted multiple coefficient of determination.}
\item{ADJ_r2}{Adjusted multiple coefficient of determination. Calculations
are done based on the number of candidate predictor variables initially
offered to the model.}
}
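
The normalization described for \code{nrmse} can be written as the ratio
below. The symbol \eqn{s_y} is introduced here only for illustration and
denotes the standard deviation of the observed values of the dependent
variable:

\deqn{\mathrm{NRMSE} = \frac{\mathrm{RMSE}}{s_y}}{NRMSE = RMSE / s_y}

Under this definition, values close to zero indicate prediction errors that
are small relative to the spread of the observed data.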
}
\description{
This function returns several statistics measuring the performance of a
series of linear models built using the function \code{buildMS}, with an
option to rank the models based on one of the returned performance
statistics.
}
\details{


This function was devised to deal with a list of linear models generated by
the function \code{buildMS}. The main objective is to compare the models
using a set of performance statistics. These statistics can then be used to
rank the linear models and identify, for example, the best performing model
according to the selected performance statistic.

An important feature of \code{statsMS} is that it uses the initial number of
candidate predictor variables offered to build the model to calculate
penalized or adjusted measures of model performance. This information is
recorded as an attribute of the final model selected by \code{buildMS}. The
feature was included because data-driven variable selection results in biased
(overly optimistic) linear models, and the effective number of degrees of
freedom is close to the number of candidate predictor variables initially
offered to the model (Harrell, 2001).
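
As a sketch of the kind of penalty involved, and assuming that \code{ADJ_r2}
follows the usual adjusted coefficient of determination (the exact expression
is not stated on this page), let \eqn{n} be the number of observations and
\eqn{p} the number of candidate predictor variables initially offered to the
model. Both symbols are introduced here only for illustration:

\deqn{R^2_{\mathrm{adj}} = 1 - (1 - R^2) \frac{n - 1}{n - p - 1}}{R2_adj = 1 - (1 - R2) * (n - 1) / (n - p - 1)}

Under this assumption, and for a fixed \eqn{R^2}, a larger \eqn{p} gives a
smaller adjusted value, so using the number of candidate predictors instead
of the number of predictors retained in the final model yields a more
conservative statistic.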
}
\section{TODO}{

\enumerate{
\item Include other performance statistics, such as PRESS, BIC, Mallows' Cp,
and max(VIF);
\item Add option to select which performance statistics should be returned.
}
}
\examples{
\dontrun{
# based on the second example of function stepAIC
library(MASS)  # provides the cpus data set and stepAIC
cpus1 <- cpus
for(v in names(cpus)[2:7])
  cpus1[[v]] <- cut(cpus[[v]], unique(quantile(cpus[[v]])),
                    include.lowest = TRUE)
cpus0 <- cpus1[, 2:8]  # excludes names, authors' predictions
cpus.samp <- sample(1:209, 100)
# the response (perf) is kept out of the right-hand side of each formula
cpus.form <- list(formula(log10(perf) ~ syct + mmin + mmax + cach + chmin +
                  chmax),
                  formula(log10(perf) ~ syct + mmin + cach + chmin + chmax),
                  formula(log10(perf) ~ mmax + cach + chmin + chmax))
data <- cpus1[cpus.samp,2:8]
cpus.ms <- buildMS(cpus.form, data, vif = TRUE, aic = TRUE)
cpus.des <- data.frame(a = c(0, 1, 0), b = c(1, 0, 1), c = c(1, 1, 0))
stats <- statsMS(cpus.ms, design.info = cpus.des, arrange.by = "aic")
}
}
\author{
Alessandro Samuel-Rosa \email{alessandrosamuelrosa@gmail.com}
}
\references{
Harrell, F. E. (2001) \emph{Regression modeling strategies: with
applications to linear models, logistic regression, and survival analysis.}
First edition. New York: Springer.

Venables, W. N. and Ripley, B. D. (2002) \emph{Modern applied statistics
with S.} Fourth edition. New York: Springer.
}
\seealso{
\code{\link[pedometrics]{buildMS}},
\code{\link[pedometrics]{plotMS}}.
}
\keyword{manip}
\keyword{models}

