% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/read_varian_sms.R
\name{read_varian_sms}
\alias{read_varian_sms}
\title{Read 'Varian' SMS}
\usage{
read_varian_sms(
  path,
  what = c("MS1", "TIC", "BPC"),
  format_out = c("matrix", "data.frame", "data.table"),
  data_format = c("wide", "long"),
  read_metadata = TRUE,
  collapse = TRUE
)
}
\arguments{
\item{path}{Path to 'Varian' \code{.SMS} files.}

\item{what}{Which data to extract: \code{MS1} spectra, the total ion
chromatogram (\code{TIC}), and/or the base peak chromatogram (\code{BPC}).
Accepts multiple values.}

\item{format_out}{Class of the returned R object. Either \code{matrix},
\code{data.frame}, or \code{data.table}.}

\item{data_format}{Whether to return data in \code{wide} or \code{long} format.}

\item{read_metadata}{Whether to read metadata from the file. This is just a
placeholder for now, as there is not yet support for parsing metadata.}

\item{collapse}{Logical. Whether to collapse lists that only contain a single
element.}
}
\value{
A chromatogram or list of chromatograms from the file specified by
\code{path}, according to the value of \code{what}. Chromatograms are
returned in the format specified by \code{format_out}.
}
\description{
Reads 'Varian Workstation' SMS files.
}
\details{
Varian SMS files begin with a "DIRECTORY" with offsets for each section. The
first section (in all the files I've been able to inspect) is "MSData",
generally beginning at byte 3238. This MSData section is in turn divided into
two parts. The first part (after a short header) contains chromatogram
data. The information found in this part includes scan numbers,
retention times (as 64-bit floats), the total ion chromatogram (TIC), the
base peak chromatogram (BPC), and ion time (µsec), as well as some other
unidentified information. The scan numbers and the intensities for the TIC
and BPC are stored as 4-byte little-endian integers. Following this part,
there is a series of null bytes, followed by a series of segments containing
the mass spectra.
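
As a minimal sketch of how such fields can be decoded in R (this is not the
parser's actual code, and the values below are made up for illustration),
4-byte little-endian integers can be read from raw bytes with \code{readBin}:
\preformatted{
# Hypothetical values standing in for three TIC intensities
tic <- c(1200L, 56000L, 431L)

# Serialize them as 4-byte little-endian integers
bytes <- writeBin(tic, raw(), size = 4, endian = "little")

# Decode them again from the raw vector
decoded <- readBin(bytes, what = "integer", n = length(tic),
                   size = 4, endian = "little")
identical(decoded, tic)
}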

The encoding scheme for the mass spectra is somewhat more complicated. Each
scan is represented by a series of values of variable length separated from
the next scan by two null bytes. Within these segments, values are paired.
The first value in each pair represents the delta-encoded mass-to-charge ratio,
while the second value represents the intensity of the signal. Values in this
section are variable-length, big-endian integers that are encoded using
selective bit masking based on the leading hexadecimal digit (\code{d}) of
each value. The length of each integer seems to be determined as
\code{1 + (d \%/\% 4)}. Integers beginning with digits 0-3 are simple 2-byte
integers. If \code{d >= 4}, values are determined by masking to preserve the
lowest \code{n} bits according to the following scheme:
\itemize{
\item d = 4-5 -> preserve lowest 13 bits
\item d = 6-7 -> preserve lowest 14 bits
\item d = 8-9 -> preserve lowest 21 bits
\item d = 10-11 (A-B) -> preserve lowest 22 bits
\item d = 12-13 (C-D) -> preserve lowest 27 bits
\item d = 14-15 (E-F) -> preserve lowest 28 bits (?)
}
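
The masking scheme above can be sketched in R as follows. This is an
illustration under stated assumptions rather than the parser's actual
implementation: \code{sms_decode_value} is a hypothetical helper, it only
handles values whose leading digit is 4 or greater, it takes the length
formula to give a byte count, and it assumes that the bytes of a single value
have already been split out of the stream.
\preformatted{
# Hypothetical helper: decode one masked, big-endian value from its raw bytes
sms_decode_value <- function(bytes) {
  bytes <- as.integer(bytes)
  d <- bitwShiftR(bytes[1], 4L)        # leading hexadecimal digit
  n <- 1L + bitwShiftR(d, 2L)          # assumed byte count: 1 + d / 4 (integer division)
  stopifnot(d >= 4L, length(bytes) == n)
  # Number of low bits to preserve, taken from the list above
  keep <- c(13L, 14L, 21L, 22L, 27L, 28L)[bitwShiftR(d, 1L) - 1L]
  # Mask the first byte so that only `keep` bits survive overall
  bytes[1] <- bitwAnd(bytes[1], bitwShiftL(1L, keep - 8L * (n - 1L)) - 1L)
  # Combine the bytes in big-endian order
  Reduce(function(acc, b) acc * 256L + b, bytes, 0L)
}

sms_decode_value(as.raw(c(0x41, 0x02)))        # 0x4102, lowest 13 bits: 258
sms_decode_value(as.raw(c(0x81, 0x00, 0x05)))  # 0x810005, lowest 21 bits: 65541

# Delta-encoded values (hypothetical numbers) are recovered by a running sum
cumsum(c(500L, 12L, 3L))                       # 500 512 515
}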
}
\note{
There is still only limited support for the extraction of metadata from
this file format. Also, the timestamp conversions aren't quite right.
}
\examples{
\dontrun{
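path <- "path/to/file.SMS"  # hypothetical path to a 'Varian' SMS file
x <- read_varian_sms(path, what = "TIC", format_out = "data.frame")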
}
}
\seealso{
Other 'Varian' parsers: 
\code{\link{read_varian_peaklist}()}
}
\author{
Ethan Bass
}
\concept{'Varian' parsers}
