% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/crazyfy.R
\name{crazyfy}
\alias{crazyfy}
\title{Data preparation before detection of strangers}
\usage{
crazyfy(data, do = c("factor", "log", "impute", "range"), id = NULL,
  skewness.cutpoint = 2, NA.method = "mean", NA.value = 0,
  verbose = FALSE)
}
\arguments{
\item{data}{Source data (data.frame or data.table).}

\item{do}{character vector - processing steps to apply; see Details.}

\item{id}{(optional) character - name of a preexisting variable to be used as ID.}

\item{skewness.cutpoint}{numeric - skewness threshold used to decide whether
log recoding is applied to a variable.}

\item{NA.method}{character - method used to impute missing values;
one of "mean" or "value" (the latter uses the \code{NA.value} parameter).}

\item{NA.value}{numeric - value used to impute missing values when \code{NA.method}
is "value".}

\item{verbose}{logical - whether the function should display details about processing.}
}
\value{
Pre-processed data as a data.table that also carries the class crazy.data.table.
}
\description{
\code{crazyfy} preprocesses data for the anomaly detection computational
routines run with \code{strange}: missing values
treatment, variable standardisation, optional log recoding, and
treatment of character/factor variables.
}
\details{
The pre-treatment operations to apply are selected with \code{do}.
Factors/characters are transformed into numeric values using a term frequency-inverse
document frequency (tf-idf) approach. Note that the smooth IDF weighting is used,
i.e. we take the log of 1+N/nt, where N is the number of observations and nt is the
frequency of the specific term t.
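
The smooth IDF weight described above can be written out as follows. The
notation \eqn{idf(t)} is introduced here for illustration only, and the
natural logarithm is assumed:
\deqn{idf(t) = \log\left(1 + \frac{N}{n_t}\right)}{idf(t) = log(1 + N / n_t)}
For instance, with N = 150 observations and a level observed n_t = 50 times,
this gives log(1 + 3) = log(4), roughly 1.39.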
}
\examples{
library(stranger)
data(iris)
crazy <- crazyfy(iris[,1:4])
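
# A minimal base R sketch of the smooth IDF weight described in Details.
# This is an illustration only, not the internal code of the package.
# Assumptions: natural logarithm; the names x, idf and x_num are hypothetical.
x <- c("a", "a", "a", "b")          # toy character variable
N <- length(x)                      # number of observations
nt <- table(x)                      # frequency of each term
idf <- log(1 + N / as.numeric(nt))  # smooth IDF weight: log(1 + N / nt)
names(idf) <- names(nt)
x_num <- unname(idf[x])             # numeric recoding of each observation
x_num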
}
