% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/ipw.R
\name{ipw}
\alias{ipw}
\title{Time-smoothed inverse probability weighting}
\usage{
ipw(
  data,
  time_smoothed = TRUE,
  smoothing_method = "nonstacked",
  outcome_times,
  A_model,
  R_model_numerator = NULL,
  R_model_denominator,
  Y_model,
  truncation_percentile = NULL,
  include_baseline_outcome,
  return_model_fits = TRUE,
  return_weights = TRUE,
  trim_returned_models = FALSE
)
}
\arguments{
\item{data}{Data table (or data frame) containing the observed data. See "Details".}

\item{time_smoothed}{Logical scalar specifying whether the time-smoothed or non-smoothed IPW method is applied. The default is \code{TRUE}, i.e., the time-smoothed IPW method.}

\item{smoothing_method}{Character string specifying the time-smoothed IPW method when there are deaths present. The options include \code{"nonstacked"} and \code{"stacked"}. The default is \code{"nonstacked"}.}

\item{outcome_times}{Numeric vector specifying the follow-up time(s) of interest for the counterfactual outcome mean/probability.}

\item{A_model}{Model statement for the treatment variable.}

\item{R_model_numerator}{(Optional) Model statement for the indicator variable for the measurement of the outcome variable, used in the numerator of the IP weights. The default is \code{NULL}, i.e., a numerator of 1 is used in the IP weights.}

\item{R_model_denominator}{Model statement for the indicator variable for the measurement of the outcome variable, used in the denominator of the IP weights.}

\item{Y_model}{Model statement for the outcome variable.}

\item{truncation_percentile}{Numerical scalar specifying the percentile by which to truncate the IP weights. The default is \code{NULL}, i.e., no truncation.}

\item{include_baseline_outcome}{Logical scalar indicating whether to include the time interval indexed by 0 in fitting the time-smoothed outcome model and outcome measurement models. By default, this argument is set to \code{TRUE} if \code{data} has any non-missing outcome values in the time interval indexed by 0 and is otherwise set to \code{FALSE}.}

\item{return_model_fits}{Logical scalar specifying whether to include the fitted models in the output. The default is \code{TRUE}.}

\item{return_weights}{Logical scalar specifying whether to return the estimated inverse probability weights. The default is \code{TRUE}.}

\item{trim_returned_models}{Logical scalar specifying whether to only return the estimated coefficients (and corresponding standard errors, z scores, and p-values) of the fitted models (e.g., treatment model) rather than the full fitted model objects. This reduces the size of the object returned by the \code{ipw} function when \code{return_model_fits} is set to \code{TRUE}, especially when the observed data set is large. By default, this argument is set to \code{FALSE}.}
}
\value{
An object of class "ipw". This object is a list that includes the following components:
\item{est}{A data frame containing the counterfactual mean/probability estimates for each treatment at each time interval.}
\item{model_fits}{A list containing the fitted models for the treatment, outcome measurement, and outcome (if \code{return_model_fits} is set to \code{TRUE}).
If the nonstacked time-smoothed approach is used, the \eqn{i}th element in \code{model_fits} is a list of fitted models for the \eqn{i}th outcome time in \code{outcome_times}.
If the stacked time-smoothed approach is used, the \eqn{i}th element in \code{model_fits} is a list of fitted models for the outcome time \eqn{i+1} in the data set \code{data}. The last element in \code{model_fits} contains the fitted outcome model.}
\item{data_weights}{(A list containing) the artificially censored data set with columns for the estimated weights. The column \code{"weights"} contains the (final) inverse probability weight, and the columns \code{"weights_A"} and \code{"weights_R"} contain the inverse probability weights for treatment and outcome measurement, respectively.
If no deaths are present in the data, this object will be a data frame.
If deaths are present in the data and either the non-smoothed IPW method or the time-smoothed nonstacked IPW method is applied, this object will be a list of length \code{length(outcome_times)}, where each element is the artificially censored data set for the corresponding outcome time in \code{outcome_times}.
If deaths are present in the data and the time-smoothed stacked IPW method is applied, this object will be a data frame with the stacked, artificially censored data.}
\item{args}{A list containing the arguments supplied to \code{\link{ipw}}, except the observed data set.}
}
\description{
This function applies the time-smoothed inverse probability weighted (IPW) approach described by McGrath et al. (2025) to estimate effects of generalized time-varying treatment strategies on the mean of an outcome at one or more selected follow-up times of interest. Binary and continuous outcomes are supported.
}
\details{
\strong{Treatment strategies}

Users can estimate effects of treatment strategies with the following components:
\itemize{
\item Initiate treatment \eqn{z} at baseline
\item Follow a user-specified time-varying adherence protocol for treatment \eqn{z}
\item Ensure an outcome measurement at the follow-up time of interest.
}
The time-varying adherence protocol is specified by indicating in \code{data} when an individual deviates from their adherence protocol. The function \code{\link{prep_data}} facilitates this step. See also "Formatting \code{data}".

\strong{Formatting \code{data}}

The input data set \code{data} must be a data table (or data frame) in a "long" format, where each row represents one time interval for one individual. The data frame should contain the following columns:
\itemize{
\item \code{id}: A unique identifier for each participant.
\item \code{time}: The follow-up time index, starting from 0 and increasing in increments of 1 in consecutive rows.
\item Covariate columns: One or more columns for baseline and time-varying covariates.
\item \code{Z}: The treatment initiated at baseline.
\item \code{A}: An indicator for adherence to the treatment protocol at each time point.
\item \code{R}: An indicator of whether the outcome was measured at that time point (1 for measured, 0 for not measured/censored).
\item \code{Y}: The outcome variable, which can be binary or continuous.
}
To specify the intervention, the data set should additionally have the following columns:
\itemize{
\item \code{C_artificial}: An indicator specifying when an individual should be artificially censored from the data due to violating the adherence protocol.
\item \code{A_model_eligible}: An indicator specifying which records should be used for fitting the treatment adherence model.
}
The \code{\link{prep_data}} function facilitates adding these columns to the data set. Users may optionally include the following column for fitting the outcome measurement model:
\itemize{
\item \code{R_model_denominator_eligible}: An indicator specifying which records should be used for fitting the outcome measurement model \code{R_model_denominator}.
}
Otherwise, the model \code{R_model_denominator} is fit on all records in the artificially censored data set.
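
As a minimal sketch of this layout, a data set with two individuals followed over three time intervals could be constructed as shown below. All values are hypothetical and for illustration only. The covariate name \code{L}, the coding of \code{Z} as 0/1, and the use of \code{NA} for unmeasured outcomes are assumptions. The columns \code{C_artificial} and \code{A_model_eligible} are not created here, since \code{\link{prep_data}} facilitates adding them.
\preformatted{
# Hypothetical long-format data: one row per individual per time interval
dat <- data.frame(
  id   = rep(1:2, each = 3),               # participant identifier
  time = rep(0:2, times = 2),              # time index starting at 0, in steps of 1
  L    = c(0.3, 0.5, 0.4, 1.2, 1.0, 1.1),  # a time-varying covariate (assumed name)
  Z    = rep(c(1, 0), each = 3),           # treatment initiated at baseline
  A    = c(1, 1, 0, 1, 1, 1),              # adherence indicator
  R    = c(1, 0, 1, 1, 1, 0),              # 1 if the outcome was measured, else 0
  Y    = c(2.1, NA, 1.8, 3.0, 2.7, NA)     # outcome (assumed NA when R is 0)
)
}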

\strong{Specifying the models}

Users must specify model statements for the treatment (\code{A_model}), outcome measurement (\code{R_model_denominator} and, optionally, \code{R_model_numerator}), and outcome variable (\code{Y_model}). The package uses pooled-over-time generalized linear models that are fit over the relevant time points (see "Formatting \code{data}"), where logistic regression is used for binary variables and linear regression is used for continuous variables.

For stabilized weights, the outcome measurement model \code{R_model_numerator} should \strong{only} include baseline covariates, treatment initiated \code{Z}, and \code{time} as predictors. It must not include time-varying covariates as predictors. The outcome model \code{Y_model} should also only depend on baseline covariates, treatment initiated \code{Z}, and \code{time} (if using time smoothing).

\strong{A note on the outcome definition at baseline}

In some settings, the outcome may not be defined in the baseline time interval. The \code{ipw} function can accommodate such settings in two ways:
\enumerate{
\item Users can set a value of \code{NA} in the column \code{Y} in the input data set \code{data} in rows corresponding to time 0. In this case, users should ensure that \code{include_baseline_outcome} is set to \code{FALSE}.
\item Users can specify the value of \eqn{Y_{t+1}} (rather than \eqn{Y_t}) in the column \code{Y} in the input data set \code{data} in rows corresponding to time \eqn{t}. That is, the value supplied for \code{Y} in the input data set \code{data} at time 0 is \eqn{Y_1}. In this case, users should ensure that \code{include_baseline_outcome} is set to \code{TRUE}. Users should also set \code{outcome_times} accordingly.
}

Note that these two approaches involve different assumptions. For example, the first approach allows the outcome at time \eqn{t} to depend on time-varying covariates up to and including time \eqn{t}, whereas the second approach only allows the outcome at time \eqn{t} to depend on covariates up to and including time \eqn{t-1}.
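
The two options above might be prepared along the following lines in base R. This is only a sketch under stated assumptions: \code{dat} is a hypothetical data frame in the long format described in "Formatting \code{data}", with rows sorted by \code{time} within \code{id}. Only the column \code{Y} is modified, and any corresponding changes to other columns or to \code{outcome_times} are not shown.
\preformatted{
# Hypothetical long-format data with an outcome value in every row
dat <- data.frame(id = rep(1:2, each = 3), time = rep(0:2, times = 2),
                  Y = c(5, 6, 7, 1, 2, 3))

# Option 1: mark the outcome as undefined in the baseline interval
dat1 <- dat
dat1$Y[dat1$time == 0] <- NA

# Option 2: store the outcome at time t+1 in the row for time t,
# i.e., shift Y up by one row within each individual
dat2 <- dat
dat2$Y <- ave(dat$Y, dat$id, FUN = function(y) c(y[-1], NA))
}
With option 1, \code{include_baseline_outcome} would be set to \code{FALSE}. With option 2, it would be set to \code{TRUE}.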
}
\examples{

## Time-smoothed IPW without deaths (continuous outcome)
data_null_processed <- prep_data(data = data_null, grace_period_length = 2,
                                 baseline_vars = 'L')
res <- ipw(data = data_null_processed,
           time_smoothed = TRUE,
           outcome_times = c(6, 12, 18, 24),
           A_model = A ~ L + Z,
           R_model_numerator = R ~ L_baseline + Z,
           R_model_denominator = R ~ L + A + Z,
           Y_model = Y ~ L_baseline * (time + Z))
res
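
## Inspect the components of the returned object (see "Value").
## This is a sketch that assumes the defaults return_model_fits = TRUE and
## return_weights = TRUE, as in the call above. With no deaths in the data,
## data_weights is documented to be a data frame.
res$est                            # counterfactual mean estimates
summary(res$data_weights$weights)  # distribution of the final IP weights
names(res$args)                    # arguments supplied to ipw()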

## Time-smoothed IPW with deaths, nonstacked smoothing method (continuous outcome)
data_null_deaths_processed <- prep_data(data = data_null_deaths, grace_period_length = 2,
                                        baseline_vars = 'L')
res <- ipw(data = data_null_deaths_processed,
           time_smoothed = TRUE,
           smoothing_method = 'nonstacked',
           outcome_times = c(6, 12, 18, 24),
           A_model = A ~ L + Z,
           R_model_numerator = R ~ L_baseline + Z,
           R_model_denominator = R ~ L + A + Z,
           Y_model = Y ~ L_baseline * (time + Z))
res

## Time-smoothed IPW with deaths, stacked smoothing method (binary outcome)
\donttest{
data_null_deaths_binary_processed <- prep_data(data = data_null_deaths_binary,
                                               grace_period_length = 2,
                                               baseline_vars = 'L')
res <- ipw(data = data_null_deaths_binary_processed,
           time_smoothed = TRUE,
           smoothing_method = 'stacked',
           outcome_times = c(6, 12, 18, 24),
           A_model = A ~ L + Z,
           R_model_numerator = R ~ L_baseline + Z,
           R_model_denominator = R ~ L + A + Z,
           Y_model = Y ~ L_baseline * (time + Z))
res$est
}

}
\references{
McGrath S, Kawahara T, Petimar J, Rifas-Shiman SL, Díaz I, Block JP, Young JG. (2025). Time-smoothed inverse probability weighted estimation of effects of generalized time-varying treatment strategies on repeated outcomes truncated by death. arXiv e-prints arXiv:2509.13971.
}
