% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/logisregr.R
\name{logisregr}
\alias{logisregr}
\title{Logistic Regression Models for Binary Data}
\usage{
logisregr(
  data,
  rep = "",
  event = "event",
  covariates = "",
  freq = "",
  weight = "",
  offset = "",
  id = "",
  link = "logit",
  robust = FALSE,
  firth = FALSE,
  flic = FALSE,
  plci = FALSE,
  alpha = 0.05
)
}
\arguments{
\item{data}{The input data frame that contains the following variables:
\itemize{
\item \code{rep}: The replication for by-group processing.
\item \code{event}: The event indicator, 1=event, 0=no event.
\item \code{covariates}: The values of baseline covariates.
\item \code{freq}: The frequency for each observation.
\item \code{weight}: The weight for each observation.
\item \code{offset}: The offset for each observation.
\item \code{id}: The optional subject ID to group the score residuals
in computing the robust sandwich variance.
}}

\item{rep}{The name(s) of the replication variable(s) in the input data.}

\item{event}{The name of the event variable in the input data.}

\item{covariates}{The vector of names of baseline covariates
in the input data.}

\item{freq}{The name of the frequency variable in the input data.
The frequencies must be the same for all observations within each
cluster as indicated by the id. Thus freq is the cluster frequency.}

\item{weight}{The name of the weight variable in the input data.}

\item{offset}{The name of the offset variable in the input data.}

\item{id}{The name of the id variable in the input data.}

\item{link}{The link function linking the response probabilities to the
linear predictors. Options include "logit" (default), "probit", and
"cloglog" (complementary log-log).}

\item{robust}{Whether a robust sandwich variance estimate should be
computed. In the presence of the id variable, the score residuals
will be aggregated for each id when computing the robust sandwich
variance estimate.}

\item{firth}{Whether Firth's bias-reducing penalized likelihood
should be used. The default is \code{FALSE}.}

\item{flic}{Whether to apply intercept correction to obtain more
accurate predicted probabilities. The default is \code{FALSE}.}

\item{plci}{Whether to obtain profile likelihood confidence intervals.
The default is \code{FALSE}.}

\item{alpha}{The two-sided significance level. The default is 0.05.}
}
\value{
A list with the following components:
\itemize{
\item \code{sumstat}: The data frame of summary statistics of model fit
with the following variables:
\itemize{
\item \code{n}: The number of subjects.
\item \code{nevents}: The number of events.
\item \code{loglik0}: The (penalized) log-likelihood under null.
\item \code{loglik1}: The maximum (penalized) log-likelihood.
\item \code{niter}: The number of Newton-Raphson iterations.
\item \code{p}: The number of parameters, including the intercept,
and regression coefficients associated with the covariates.
\item \code{link}: The link function.
\item \code{robust}: Whether a robust sandwich variance estimate should
be computed.
\item \code{firth}: Whether Firth's penalized likelihood is used.
\item \code{flic}: Whether to apply intercept correction.
\item \code{loglik0_unpenalized}: The unpenalized log-likelihood under null.
\item \code{loglik1_unpenalized}: The maximum unpenalized log-likelihood.
\item \code{rep}: The replication.
}
\item \code{parest}: The data frame of parameter estimates with the
following variables:
\itemize{
\item \code{param}: The name of the covariate for the parameter estimate.
\item \code{beta}: The parameter estimate.
\item \code{sebeta}: The standard error of parameter estimate.
\item \code{z}: The Wald test statistic for the parameter.
\item \code{expbeta}: The exponentiated parameter estimate.
\item \code{vbeta}: The covariance matrix for parameter estimates.
\item \code{lower}: The lower limit of confidence interval.
\item \code{upper}: The upper limit of confidence interval.
\item \code{p}: The p-value from the chi-square test.
\item \code{method}: The method to compute the confidence interval and
p-value.
\item \code{sebeta_naive}: The naive standard error of parameter estimate.
\item \code{vbeta_naive}: The naive covariance matrix of parameter
estimates.
\item \code{rep}: The replication.
}
\item \code{fitted}: The data frame with the following variables:
\itemize{
\item \code{linear_predictors}: The linear predictors on the scale of
the link function.
\item \code{fitted_values}: The fitted probabilities of having an event,
obtained by applying the inverse of the link function to the
linear predictors.
\item \code{rep}: The replication.
}
\item \code{p}: The number of parameters.
\item \code{link}: The link function.
\item \code{param}: The parameter names.
\item \code{beta}: The parameter estimate.
\item \code{vbeta}: The covariance matrix for parameter estimates.
\item \code{vbeta_naive}: The naive covariance matrix for parameter estimates.
\item \code{linear_predictors}: The linear predictors on the scale of
the link function.
\item \code{fitted_values}: The fitted probabilities of having an event.
\item \code{terms}: The terms object.
\item \code{xlevels}: A record of the levels of the factors used in fitting.
\item \code{data}: The input data.
\item \code{rep}: The name(s) of the replication variable(s).
\item \code{event}: The name of the event variable.
\item \code{covariates}: The names of baseline covariates.
\item \code{freq}: The name of the freq variable.
\item \code{weight}: The name of the weight variable.
\item \code{offset}: The name of the offset variable.
\item \code{id}: The name of the id variable.
\item \code{robust}: Whether a robust sandwich variance estimate should be
computed.
\item \code{firth}: Whether to use Firth's bias-reducing penalized
likelihood.
\item \code{flic}: Whether to apply intercept correction.
\item \code{plci}: Whether to obtain profile likelihood confidence intervals.
\item \code{alpha}: The two-sided significance level.
}
}
\description{
Obtains the parameter estimates from logistic regression
models with binary data.
}
\details{
Fitting a logistic regression model using Firth's bias reduction method
is equivalent to penalization of the log-likelihood by the Jeffreys prior.
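Firth's penalized log-likelihood is given by
\deqn{l(\beta) + \frac{1}{2} \log(\mbox{det}(I(\beta)))}
where \eqn{l(\beta)} is the log-likelihood and \eqn{I(\beta)} is the
Fisher information matrix. The \eqn{j}th component of the penalized
gradient is computed as
\deqn{g_j(\beta) + \frac{1}{2} \mbox{trace}\left(I(\beta)^{-1}
\frac{\partial I(\beta)}{\partial \beta_j}\right)}
where \eqn{g_j(\beta) = \partial l(\beta) / \partial \beta_j} is the
corresponding component of the unpenalized gradient.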
The Hessian matrix is not modified by this penalty.
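
For the logit link, and ignoring frequencies, weights, and offsets for
simplicity, these expressions take a closed form (Heinze and Schemper,
2002). This is a sketch of the standard result, not a description of
the internal computations of this function, and the notation is
introduced here for illustration: \eqn{X} is the design matrix with
rows \eqn{x_i^T}, \eqn{y_i} is the event indicator,
\eqn{\pi_i = 1/(1 + \exp(-x_i^T \beta))}, and
\eqn{W = \mbox{diag}(\pi_i (1 - \pi_i))}. The information matrix is
\deqn{I(\beta) = X^T W X}
and the \eqn{j}th component of the penalized gradient becomes
\deqn{\sum_{i} \left(y_i - \pi_i + h_i \left(\frac{1}{2} - \pi_i
\right)\right) x_{ij}}
where \eqn{h_i} is the \eqn{i}th diagonal element of
\eqn{W^{1/2} X (X^T W X)^{-1} X^T W^{1/2}}.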

Firth's method reduces bias in maximum likelihood estimates of
coefficients, but it introduces a bias toward one-half in the
predicted probabilities.

A straightforward modification to Firth's logistic regression to
achieve unbiased average predicted probabilities involves a post hoc
adjustment of the intercept. This approach, known as Firth's logistic
regression with intercept correction (FLIC), preserves the
bias-corrected effect estimates. By excluding the intercept from
penalization, it ensures that we don't sacrifice the accuracy of
effect estimates to improve the predictions.
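
As a sketch of the intercept correction for the logit link (following
Puhr et al., 2017; the notation is introduced here for illustration
and the implementation in this function may differ in detail), let
\eqn{\hat{\beta}_1, \ldots, \hat{\beta}_k} denote the Firth estimates
of the covariate effects. FLIC keeps these fixed and re-estimates the
intercept \eqn{\beta_0} by ordinary maximum likelihood with the linear
predictor
\deqn{\eta_i = \beta_0 + \sum_{j=1}^{k} \hat{\beta}_j x_{ij}}
so that the covariate part acts as an offset. For unweighted data, the
score equation for \eqn{\beta_0} then gives
\deqn{\sum_{i} y_i = \sum_{i} \hat{\pi}_i}
where \eqn{y_i} is the event indicator and \eqn{\hat{\pi}_i} is the
fitted probability, that is, the average predicted probability equals
the observed event proportion.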
}
\examples{

(fit1 <- logisregr(
  ingots, event = "NotReady", covariates = "Heat*Soak", freq = "Freq"))

}
\references{
David Firth.
Bias Reduction of Maximum Likelihood Estimates.
Biometrika 1993; 80:27–38.

Georg Heinze and Michael Schemper.
A solution to the problem of separation in logistic regression.
Statistics in Medicine 2002; 21:2409–2419.

Rainer Puhr, Georg Heinze, Mariana Nold, Lara Lusa, and
Angelika Geroldinger.
Firth's logistic regression with rare events: accurate effect
estimates and predictions?
Statistics in Medicine 2017; 36:2302–2317.
}
\author{
Kaifeng Lu, \email{kaifenglu@gmail.com}
}
