% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/explore_dada.R
\name{explore_dada}
\alias{explore_dada}
\title{Explore variants output by DADA2 in the parameter space}
\usage{
explore_dada(
  fs,
  sample_locus = "(^[a-zA-Z0-9]*)_([a-zA-Z0-9]*)",
  value_na = 10,
  reduced = TRUE,
  omega_a = 0.9,
  band_size = 16,
  pool = FALSE,
  vline = NULL,
  hline_fr = NULL,
  p_titles = NULL
)
}
\arguments{
\item{fs}{Character vector with full paths to FASTQ files.}

\item{sample_locus}{Regular expression with two capture groups to extract
sample (group 1) and locus (group 2), e.g. \code{"(^[a-zA-Z0-9]*)_([a-zA-Z0-9]*)"}.}

\item{value_na}{Numeric to replace 'NA' or infinite values assigned to 'pval'
or 'birth_pval' in clustering element from 'dada-class'.}

\item{reduced}{If TRUE, a reduced number of columns is returned.
If FALSE, all columns from the 'dada-class' clustering element are returned.}

\item{omega_a}{"OMEGA_A" passed to \code{\link[dada2:dada]{dada2::dada()}}.}

\item{band_size}{"BAND_SIZE" passed to \code{\link[dada2:dada]{dada2::dada()}}.}

\item{pool}{Passed to \code{\link[dada2:dada]{dada2::dada()}}.}

\item{vline}{Numeric x-intercept of a vertical line to annotate in plots p1 and p3.}

\item{hline_fr}{Numeric y-intercept of a horizontal line to annotate 'frequency' in plots.}

\item{p_titles}{character(4) with plot names.
Passed to 'ggtitle()' in plots p1:p4.}
}
\value{
List with tidy 'dada-class' clustering element and plots.
\enumerate{
\item tidy_dada: tidy 'clustering' element from 'dada-class'
merged across loci.
\item p1: plot 1, frequency of variants (sample x locus) against
'log(-log(birth_pval))'.
\item p2: plot 2, read count of variants against their frequency.
\item p3: plot 3, p1 facetted by locus.
\item p4: plot 4, p2 facetted by locus.
}
}
\description{
DADA2 is run with a set of desired parameters on a set of FASTQ files. The
'clustering' element output by \code{\link[dada2:dada]{dada2::dada()}}, which contains all relevant
statistics from cluster formation, is returned in tidy format as the first
element. These tidy results are plotted in four different ways.
}
\details{
The 'OMEGA_A', 'BAND_SIZE' and 'pool' parameters can have a strong
effect on the variants called by \code{\link[dada2:dada]{dada2::dada()}}. This function explores the
relationship between read count, relative frequency of the variants and the
'birth_pval' assigned to the clusters for any given values of the starting
parameters. The critical parameters 'OMEGA_A', 'BAND_SIZE' and 'pool' can be
specified as arguments. Additional parameters can be set with \code{\link[dada2:setDadaOpt]{dada2::setDadaOpt()}}.
The 'log(-log(birth_pval))' proposed in Rosen et al. (2012) is computed and
represented in plots. For representation purposes, a virtually infinite value
of 'log(-log(birth_pval)) = 10' (see \code{value_na}) is assigned by default
to the first cluster and to clusters with 'birth_pval = 0'. Variants in each
sample/locus combination are ranked by abundance, and the ranks are shown in
the plot legend. Ranks >= 3 are labelled "3". For biallelic markers, a
rank >= 3 implies a likely false positive. This visual representation can
help decide how to tune
\code{\link[=variant_call]{variant_call()}}.
\itemize{
\item \emph{omega_a}: threshold on the abundance p-value for variants to be
considered significantly overabundant (see 'log(-log(birth_pval))' in
Rosen et al. 2012). For exploration, it is
recommended to run \code{\link[=explore_dada]{explore_dada()}} with a large \code{omega_a}.
\item \emph{band_size}: positive values set the band size in Needleman-Wunsch alignments.
In this context, ends-free alignment is performed.
Zero turns off banding, triggering full Needleman-Wunsch alignments,
in which gapless alignment is performed
(see \href{https://github.com/benjjneb/dada2/issues/1982}{issue}).
\item \emph{pool}: calling variants on pooled samples can increase sensitivity
(see \href{https://benjjneb.github.io/dada2/pseudo.html}{discussion}).
}
}
\examples{
# Forward-read FASTQ files shipped with the package
fq <-
  list.files(system.file("extdata", "truncated",
                         package = "tidyGenR"),
             pattern = "F_filt.fastq.gz",
             full.names = TRUE)
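
# Illustrative sketch (not part of explore_dada()): how the default
# 'sample_locus' pattern splits a name into sample (group 1) and locus
# (group 2). The file names below are hypothetical, and applying the
# pattern to base file names is an assumption about the internals.
fn_demo <- c("ind01_locA_F_filt.fastq.gz", "ind02_locB_F_filt.fastq.gz")
m_demo <- regmatches(fn_demo,
                     regexec("(^[a-zA-Z0-9]*)_([a-zA-Z0-9]*)", fn_demo))
# Element 1 of each match is the full match; elements 2 and 3 are the groups.
data.frame(sample = vapply(m_demo, `[`, character(1), 2),
           locus = vapply(m_demo, `[`, character(1), 3))
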
explore_dada(fq,
    value_na = 10,
    reduced = TRUE,
    pool = FALSE,
    vline = 2,
    hline_fr = 0.1,
    omega_a = 0.9,
    band_size = 16
)
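
# Illustrative sketch of the 'log(-log(birth_pval))' transform described in
# Details. The p-values are made up; replacing non-finite results with 10
# mirrors the documented default of 'value_na', but this is not the
# package's actual code.
birth_pval_demo <- c(NA, 0, 1e-40, 1e-5, 0.01)
llp_demo <- log(-log(birth_pval_demo))
# NA (first cluster) and Inf (birth_pval = 0) become the placeholder value.
llp_demo[!is.finite(llp_demo)] <- 10
round(llp_demo, 2)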
}
\references{
Rosen et al. (2012). \emph{Denoising PCR-amplified metagenome data}.
BMC Bioinformatics, 13(1).
}
