% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/create_list_from_scratch.R
\name{create_list_from_scratch}
\alias{create_list_from_scratch}
\title{Create a sparse list representation of a treatment-to-control distance
matrix with a caliper.}
\usage{
create_list_from_scratch(
  Z,
  X,
  exact = NULL,
  soft_exact = FALSE,
  p = NULL,
  caliper_low = NULL,
  caliper_high = NULL,
  k = NULL,
  alpha = 1,
  penalty = Inf,
  method = "maha",
  dist_func = NULL
)
}
\arguments{
\item{Z}{A length-n vector of treatment indicators.}

\item{X}{An n-by-p matrix of covariates.}

\item{exact}{A vector of strings indicating which variables need to be exactly matched.}

\item{soft_exact}{If set to TRUE, the exact constraint is enforced up to a large penalty.}

\item{p}{A length-n vector on which a caliper applies, e.g. a vector of propensity scores.}

\item{caliper_low}{Size of the lower caliper.}

\item{caliper_high}{Size of the upper caliper.}

\item{k}{Connect each treated unit to its k nearest controls. See the details section.}

\item{alpha}{Tuning parameter.}

\item{penalty}{Penalty for violating the caliper. Set to Inf by default.}

\item{method}{Method used to compute the treated-control distance.}

\item{dist_func}{A user-specified function that computes the treated-control distance. See
the details section.}
}
\value{
This function returns a list of three objects: start_n, end_n, and d.
See the documentation of function ``create_list_from_mat'' for more details.
}
\description{
This function takes in an n-by-p matrix of observed covariates,
a length-n vector of treatment indicators, and a caliper, and constructs
a possibly sparse list representation of the distance matrix.
}
\details{
Currently, four methods are implemented in this function: 'maha'
(Mahalanobis distance), 'robust maha' (robust Mahalanobis distance),
'0/1' (distance = 0 if and only if covariates are the same), and
'Hamming' (Hamming distance).

Users can also supply their own distance function by setting method = 'other' and
using the argument ``dist_func''. ``dist_func'' is a user-supplied distance
function in the following format:
dist_func(controls, treated), where treated is a length-p vector
of covariates and controls is an n_c-by-p matrix of covariates.
The output of function dist_func is a length-n_c vector of distances
between each control and the treated unit.
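
As a minimal sketch of this format, the hypothetical function euclid_dist below
(not part of the package) returns the Euclidean distance between one treated
unit and every control. It assumes controls is a numeric matrix and treated is
a numeric vector of matching length:
\preformatted{
euclid_dist = function(controls, treated) {
  # Subtract the treated covariates from every control row
  # (sweep() subtracts by default).
  diffs = sweep(controls, 2, treated)
  # One distance per control: a length-n_c vector.
  sqrt(rowSums(diffs^2))
}
}
Such a function would be passed as dist_func = euclid_dist together with
method = 'other'.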

There are two options for users to make a network sparse. First, a caliper
can be applied to the vector p so that a treated unit is not connected to controls
whose covariate or propensity score, as given by p, falls outside p +/- caliper.
Second, even within a specified caliper there may still be too many controls
connected to each treated unit; this number can be trimmed down to at most k
by keeping only the k controls nearest (in p) to each treated unit.
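
The two options above can be illustrated for a single treated unit as follows.
This is only a sketch of the idea with made-up values, not the package's
internal code; in particular, treating the caliper boundary as inclusive is an
assumption:
\preformatted{
p_treated  = 0.40
p_controls = c(0.10, 0.38, 0.43, 0.41, 0.90)
caliper = 0.05
k = 2

# Step 1: keep controls whose p lies within p_treated +/- caliper.
within_caliper = which(abs(p_controls - p_treated) <= caliper)

# Step 2: among those, keep at most the k nearest in p.
gap = abs(p_controls[within_caliper] - p_treated)
keep = within_caliper[order(gap)][seq_len(min(k, length(within_caliper)))]
# keep holds the control indices 4 and 2
}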

By default a hard caliper is applied, i.e., option penalty is set to Inf by default.
Users may make the caliper a soft one by setting penalty to a large yet finite number.
}
\examples{
# We first prepare the input X, Z, propensity score

attach(dt_Rouse)
X = cbind(female,black,bytest,dadeduc,momeduc,fincome)
Z = IV
propensity = glm(IV~female+black+bytest+dadeduc+momeduc+fincome,
                family=binomial)$fitted.values
detach(dt_Rouse)

# Create distance lists with built-in options.

# Mahalanobis distance with propensity score caliper = 0.05
# and k = 100.

dist_list_pscore_maha = create_list_from_scratch(Z, X, p = propensity,
                               caliper_low = 0.05, k = 100, method = 'maha')
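
# A soft-caliper variant of the call above (see the details section):
# a finite penalty replaces the default penalty = Inf. The value 1000 is
# an arbitrary illustration, not a recommended setting.

\dontrun{
dist_list_pscore_maha_soft = create_list_from_scratch(Z, X, p = propensity,
                               caliper_low = 0.05, k = 100, penalty = 1000,
                               method = 'maha')
}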


# More examples, including how to use a user-supplied
# distance function, can be found in the vignette.

}
