% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/anlzMWRoutlier.R
\name{anlzMWRoutlier}
\alias{anlzMWRoutlier}
\title{Analyze outliers in results file}
\usage{
anlzMWRoutlier(
  res = NULL,
  param,
  acc = NULL,
  fset = NULL,
  type = c("box", "jitterbox", "jitter"),
  group,
  dtrng = NULL,
  repel = TRUE,
  outliers = FALSE,
  labsize = 3,
  fill = "lightgrey",
  alpha = 0.8,
  width = 0.8,
  yscl = "auto",
  ttlsize = 1.2,
  bssize = 11,
  runchk = TRUE,
  warn = TRUE
)
}
\arguments{
\item{res}{character string of path to the results file or \code{data.frame} for results returned by \code{\link{readMWRresults}}}

\item{param}{character string of the parameter to plot, must conform to entries in the \code{"Simple Parameter"} column of \code{\link{paramsMWR}}}

\item{acc}{character string of path to the data quality objectives file for accuracy or \code{data.frame} returned by \code{\link{readMWRacc}}}

\item{fset}{optional list of inputs with elements named \code{res}, \code{acc}, \code{frecom}, \code{sit}, or \code{wqx}; if provided, it overrides the other arguments}

\item{type}{character indicating \code{"box"}, \code{"jitterbox"}, or \code{"jitter"}, see details}

\item{group}{character indicating whether the summaries are grouped by \code{"month"}, \code{"site"}, or \code{"week"} (week of year)}

\item{dtrng}{character vector of length two giving the start and end of the date range as YYYY-MM-DD, optional}

\item{repel}{logical indicating if overlapping outlier labels are offset}

\item{outliers}{logical indicating if outliers are returned to the console instead of plotting}

\item{labsize}{numeric indicating font size for the outlier labels}

\item{fill}{character string indicating fill color for boxplots}

\item{alpha}{numeric from 0 to 1 indicating transparency of fill color}

\item{width}{numeric for width of boxplots}

\item{yscl}{character indicating one of \code{"auto"} (default), \code{"log"}, or \code{"linear"}, see details}

\item{ttlsize}{numeric value indicating font size of the title relative to other text in the plot}

\item{bssize}{numeric for overall plot text scaling, passed to \code{\link[ggplot2]{theme_minimal}}}

\item{runchk}{logical to run data checks with \code{\link{checkMWRresults}} or \code{\link{checkMWRacc}}, applies only if \code{res} or \code{acc} are file paths}

\item{warn}{logical to return warnings to the console (default)}
}
\value{
If \code{outliers = FALSE} (default), a \code{\link[ggplot2]{ggplot}} object that can be further modified; otherwise, a data frame of outliers.
}
\description{
Analyze outliers in results file
}
\details{
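Outliers are defined following the standard \code{\link[ggplot2]{ggplot}} boxplot definition as observations more than 1.5 times the inter-quartile range beyond the upper or lower quartile of each boxplot.  The data frame returned if \code{outliers = TRUE} may vary based on the boxplot groupings defined by \code{group}, because the inter-quartile range is calculated separately for each group.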
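
The outlier rule can be illustrated with a minimal base R sketch.  The values in \code{x} are hypothetical, and the use of \code{quantile} with its default settings is an assumption that approximates, but is not guaranteed to reproduce exactly, the calculation used for the plots.

\preformatted{
# hypothetical values for a single boxplot group
x <- c(2, 3, 4, 5, 6, 7, 30)

# lower and upper quartiles and the inter-quartile range
qrt <- quantile(x, probs = c(0.25, 0.75))
iqr <- diff(qrt)

# values more than 1.5 times the inter-quartile range beyond either quartile
x[x < qrt[1] - 1.5 * iqr | x > qrt[2] + 1.5 * iqr]
}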

Specifying \code{type = "box"} (default) will produce standard boxplots.  Specifying \code{type = "jitterbox"} will produce boxplots with non-outlier observations jittered on top.  Specifying \code{type = "jitter"} will suppress the boxplots and show only the jittered points and the outliers.

Specifying \code{group = "week"} will group the samples by week of year, using an integer to identify each week.  Because the month and day on which a given week starts differ between years, the integer week number is the only way to compare summaries when the results data span multiple years.
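
As a rough illustration of an integer week of year, the base R sketch below counts complete seven-day periods since January 1 and adds one.  Both the dates and this counting convention are assumptions for illustration; the function may derive the week differently.

\preformatted{
# hypothetical sample dates in two different years
dts <- as.Date(c('2021-06-15', '2022-06-15'))

# zero-based day of year for each date
doy <- as.POSIXlt(dts)$yday

# integer week of year: complete seven-day periods since January 1, plus one
floor(doy / 7) + 1
}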

The y-axis scaling as arithmetic (linear) or logarithmic can be set with the \code{yscl} argument.  If \code{yscl = "auto"} (default), the scaling is determined automatically from the data quality objective file for accuracy, i.e., parameters with "log" in any of the columns are plotted on a log10 scale, otherwise on an arithmetic scale. Setting \code{yscl = "linear"} or \code{yscl = "log"} will set the axis as linear or log10 scale, respectively, regardless of the information in the data quality objective file for accuracy.

Any entries in the \code{"Result Value"} column of the results data entered as \code{"BDL"} or \code{"AQL"} are replaced with the corresponding values in the \code{"Quantitation Limit"} column, if present; otherwise the \code{"MDL"} or \code{"UQL"} columns from the data quality objectives file for accuracy are used.  Values entered as \code{"BDL"} are set to one half of the appropriate limit.
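
The substitution can be sketched in base R as follows.  The vectors \code{resval} and \code{quantlim} are hypothetical stand-ins for the \code{"Result Value"} and \code{"Quantitation Limit"} columns, and using the full limit for \code{"AQL"} is an assumption; this is not the internal code of the function.

\preformatted{
# hypothetical result values and matching quantitation limits
resval <- c('4.2', 'BDL', 'AQL')
quantlim <- c(NA, 0.5, 20)

# convert to numeric, text entries become NA
num <- suppressWarnings(as.numeric(resval))

# BDL entries use one half of the limit
num[resval == 'BDL'] <- quantlim[resval == 'BDL'] / 2

# AQL entries use the limit itself (assumed)
num[resval == 'AQL'] <- quantlim[resval == 'AQL']

num
}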
}
\examples{
# results data path
respth <- system.file('extdata/ExampleResults.xlsx', package = 'MassWateR')

# results data
resdat <- readMWRresults(respth)

# accuracy path
accpth <- system.file('extdata/ExampleDQOAccuracy.xlsx', 
     package = 'MassWateR')

# accuracy data
accdat <- readMWRacc(accpth)

# outliers by month
anlzMWRoutlier(res = resdat, param = 'DO', acc = accdat, group = 'month')

# outliers by site
anlzMWRoutlier(res = resdat, param = 'DO', acc = accdat, group = 'site')

# outliers by site, May through July 2022 only
anlzMWRoutlier(res = resdat, param = 'DO', acc = accdat, group = 'site', 
     dtrng = c('2022-05-01', '2022-07-31'))

# outliers by month, type as jitterbox
anlzMWRoutlier(res = resdat, param = 'DO', acc = accdat, group = 'month', type = 'jitterbox')

# outliers by month, type as jitter
anlzMWRoutlier(res = resdat, param = 'DO', acc = accdat, group = 'month', type = 'jitter')

# data frame output
anlzMWRoutlier(res = resdat, param = 'DO', acc = accdat, group = 'month', outliers = TRUE)
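
# outliers by week of year, a sketch using the group = 'week' option 
# described in details
anlzMWRoutlier(res = resdat, param = 'DO', acc = accdat, group = 'week')

# outliers by month with a log10 y-axis, a sketch using the yscl = 'log' 
# option described in details
anlzMWRoutlier(res = resdat, param = 'DO', acc = accdat, group = 'month', 
     yscl = 'log')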

}
