% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/OneR.R
\name{maxlevels}
\alias{maxlevels}
\title{Remove factors with too many levels}
\usage{
maxlevels(data, maxlevels = 20, na.omit = TRUE)
}
\arguments{
\item{data}{dataframe containing the data.}

\item{maxlevels}{maximum number of factor levels.}

\item{na.omit}{logical value indicating whether missing values should be omitted before counting levels (\code{TRUE}, the default) or treated as a level of their own (\code{FALSE}).}
}
\value{
A dataframe.
}
\description{
Removes all columns of a dataframe in which a factor (or character string) has more than \code{maxlevels} levels.
}
\details{
Categorical variables with very many levels are often not useful for modelling OneR rules because they result in too many rules and tend to overfit.
Typical examples are IDs or names.

Character strings are treated as factors, although they keep their datatype. Numeric data is left untouched.
If \code{data} contains unused factor levels (e.g. due to subsetting), these are ignored and a warning is given.
}
\examples{
df <- data.frame(numeric = c(1:26), alphabet = letters)
str(df)
str(maxlevels(df))
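
# A further sketch, not part of the original examples: the threshold of 30
# is an illustrative choice. Because the alphabet column has only 26 distinct
# values, raising maxlevels above 26 should keep that column in the result.
str(maxlevels(df, maxlevels = 30))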
}
\author{
Holger von Jouanne-Diedrich
}
\references{
\url{https://github.com/vonjd/OneR}
}
\seealso{
\code{\link{OneR}}
}

