% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/gofstatistics.R
\name{gof-statistics}
\alias{gof-statistics}
\alias{gofstatistics}
\alias{dsp}
\alias{esp}
\alias{nsp}
\alias{deg}
\alias{b1deg}
\alias{b2deg}
\alias{odeg}
\alias{ideg}
\alias{kstar}
\alias{b1star}
\alias{b2star}
\alias{ostar}
\alias{istar}
\alias{kcycle}
\alias{geodesic}
\alias{triad.directed}
\alias{triad.undirected}
\alias{comemb}
\alias{walktrap.modularity}
\alias{walktrap.roc}
\alias{walktrap.pr}
\alias{fastgreedy.modularity}
\alias{fastgreedy.roc}
\alias{fastgreedy.pr}
\alias{louvain.modularity}
\alias{louvain.roc}
\alias{louvain.pr}
\alias{maxmod.modularity}
\alias{maxmod.roc}
\alias{maxmod.pr}
\alias{edgebetweenness.modularity}
\alias{edgebetweenness.roc}
\alias{edgebetweenness.pr}
\alias{spinglass.modularity}
\alias{spinglass.roc}
\alias{spinglass.pr}
\alias{rocpr}
\title{Statistics for goodness-of-fit assessment of network models}
\usage{
dsp(mat, ...)

esp(mat, ...)

nsp(mat, ...)

deg(mat, ...)

b1deg(mat, ...)

b2deg(mat, ...)

odeg(mat, ...)

ideg(mat, ...)

kstar(mat, ...)

b1star(mat, ...)

b2star(mat, ...)

ostar(mat, ...)

istar(mat, ...)

kcycle(mat, ...)

geodesic(mat, ...)

triad.directed(mat, ...)

triad.undirected(mat, ...)

comemb(vec)

walktrap.modularity(mat, ...)

walktrap.roc(sim, obs, ...)

walktrap.pr(sim, obs, ...)

fastgreedy.modularity(mat, ...)

fastgreedy.roc(sim, obs, ...)

fastgreedy.pr(sim, obs, ...)

louvain.modularity(mat, ...)

louvain.roc(sim, obs, ...)

louvain.pr(sim, obs, ...)

maxmod.modularity(mat, ...)

maxmod.roc(sim, obs, ...)

maxmod.pr(sim, obs, ...)

edgebetweenness.modularity(mat, ...)

edgebetweenness.roc(sim, obs, ...)

edgebetweenness.pr(sim, obs, ...)

spinglass.modularity(mat, ...)

spinglass.roc(sim, obs, ...)

spinglass.pr(sim, obs, ...)

rocpr(sim, obs, roc = TRUE, pr = TRUE, joint = TRUE, pr.impute = "poly4", ...)
}
\arguments{
\item{mat}{A sparse network matrix as created by the
\code{\link[Matrix]{Matrix}} function in the \pkg{Matrix} package.}

\item{...}{Additional arguments. This must be present in all auxiliary GOF
statistics.}

\item{vec}{A vector of community memberships from which a community
co-membership matrix is created.}

\item{sim}{A list of simulated networks. Each element in the list should be a
sparse matrix as created by the \code{\link[Matrix]{Matrix}} function in
the \pkg{Matrix} package.}

\item{obs}{A list of observed (= target) networks. Each element in the list
should be a sparse matrix as created by the \code{\link[Matrix]{Matrix}}
function in the \pkg{Matrix} package.}

\item{roc}{Compute receiver-operating characteristics (ROC)?}

\item{pr}{Compute precision-recall curve (PR)?}

\item{joint}{Merge all time steps into a single big prediction task and
compute predictive fit (instead of computing GOF for all time steps
separately)?}

\item{pr.impute}{In some cases, the first precision value of the
precision-recall curve is undefined. The \code{pr.impute} argument serves
to impute this missing value to ensure that the AUC-PR value is not
severely biased. Possible values are \code{"no"} for no imputation,
\code{"one"} for using a value of \code{1.0}, \code{"second"} for using the
next (= adjacent) precision value, \code{"poly1"} for fitting a straight
line through the remaining curve to predict the first value, \code{"poly2"}
for fitting a second-order polynomial curve, and so on up to \code{"poly9"}.
Warning: this is a pragmatic solution. Please double-check whether the
imputation makes sense. This can be checked by plotting the resulting
object and using the \code{pr.poly} argument to plot the predicted curve on
top of the actual PR curve.}
}
\description{
Statistics for goodness-of-fit assessment of network models.
}
\details{
These functions can be plugged into the \code{statistics} argument of the
\code{gof} methods in order to compare observed with simulated networks (see
the \link{gof-methods} help page). There are three types of statistics:
\enumerate{
  \item Univariate statistics, which aggregate a network into a single
    quantity, for example modularity measures or density. The distribution
    of these statistics can be displayed using histograms, density plots, and
    median bars. Univariate statistics take a sparse matrix (\code{mat})
    as an argument and return a single numeric value that summarizes a
    network matrix.
  \item Multivariate statistics, which aggregate a network into a vector of
    quantities, for example the distribution of geodesic distances, edgewise
    shared partners, or indegree. These statistics typically have multiple
    values, e.g., esp(1), esp(2), esp(3) etc. The results can be displayed
    using multiple boxplots for simulated networks and a black curve for the
    observed network(s). Multivariate statistics take a sparse matrix
    (\code{mat}) as an argument and return a vector of numeric values that
    summarize a network matrix.
  \item Tie prediction statistics, which predict dyad states in the observed
    network(s) from the dyad states in the simulated networks. Examples are
    receiver operating characteristics (ROC) or precision-recall (PR) curves
    of simulated networks based on the model, or ROC or PR predictions of
    community co-membership matrices of the simulated vs. the observed
    network(s). Tie prediction statistics take a list of simulated sparse
    network matrices and another list of observed sparse network matrices
    (possibly containing only a single sparse matrix) as arguments and return
    a \code{rocpr}, \code{roc}, or \code{pr} object (as created by the
    \link{rocpr} function).
}

Users can create their own statistics for use with the \code{gof} methods. To
do so, one needs to write a function that accepts and returns the respective
objects described in the enumeration above. Before writing a custom function,
it is advisable to look at the definitions of some existing functions. It is
also possible to add an attribute called \code{label} to the return object,
which describes what is being returned by the function. This label will be
used as a descriptive label in the plot and for verbose output during
computations. The examples section contains an example of a custom user
statistic. Note that all statistics \emph{must} contain the \code{...}
argument to ensure that custom arguments of other statistics do not cause an
error.

To aid the development of custom statistics, the helper function
\code{comemb} is available: it accepts a vector of community memberships and
converts it to a co-membership matrix. This function is also used internally
by statistics like \code{walktrap.roc} and others.
}
\section{Functions}{
\itemize{
\item \code{dsp}: Multivariate GOF statistic: dyad-wise shared
partner distribution

\item \code{esp}: Multivariate GOF statistic: edge-wise shared
partner distribution

\item \code{nsp}: Multivariate GOF statistic: non-edge-wise shared
partner distribution

\item \code{deg}: Multivariate GOF statistic: degree distribution

\item \code{b1deg}: Multivariate GOF statistic: degree distribution
for the first mode

\item \code{b2deg}: Multivariate GOF statistic: degree distribution
for the second mode

\item \code{odeg}: Multivariate GOF statistic: outdegree distribution

\item \code{ideg}: Multivariate GOF statistic: indegree distribution

\item \code{kstar}: Multivariate GOF statistic: k-star distribution

\item \code{b1star}: Multivariate GOF statistic: k-star distribution
for the first mode

\item \code{b2star}: Multivariate GOF statistic: k-star distribution
for the second mode

\item \code{ostar}: Multivariate GOF statistic: outgoing k-star
distribution

\item \code{istar}: Multivariate GOF statistic: incoming k-star
distribution

\item \code{kcycle}: Multivariate GOF statistic: k-cycle distribution

\item \code{geodesic}: Multivariate GOF statistic: geodesic distance
distribution

\item \code{triad.directed}: Multivariate GOF statistic: triad census in
directed networks

\item \code{triad.undirected}: Multivariate GOF statistic: triad census in
undirected networks

\item \code{comemb}: Helper function: create community co-membership
matrix

\item \code{walktrap.modularity}: Univariate GOF statistic: Walktrap modularity
distribution

\item \code{walktrap.roc}: Tie prediction GOF statistic: ROC of Walktrap
community detection. Receiver-operating characteristics of predicting the
community structure in the observed network(s) by the community structure
in the simulated networks, as computed by the Walktrap algorithm.

\item \code{walktrap.pr}: Tie prediction GOF statistic: PR of Walktrap
community detection. Precision-recall curve for predicting the community
structure in the observed network(s) by the community structure in the
simulated networks, as computed by the Walktrap algorithm.

\item \code{fastgreedy.modularity}: Univariate GOF statistic: fast and greedy
modularity distribution

\item \code{fastgreedy.roc}: Tie prediction GOF statistic: ROC of fast and
greedy community detection. Receiver-operating characteristics of
predicting the community structure in the observed network(s) by the
community structure in the simulated networks, as computed by the fast and
greedy algorithm. Only sensible with undirected networks.

\item \code{fastgreedy.pr}: Tie prediction GOF statistic: PR of fast and
greedy community detection. Precision-recall curve for predicting the
community structure in the observed network(s) by the community structure
in the simulated networks, as computed by the fast and greedy algorithm.
Only sensible with undirected networks.

\item \code{louvain.modularity}: Univariate GOF statistic: Louvain clustering
modularity distribution

\item \code{louvain.roc}: Tie prediction GOF statistic: ROC of Louvain
community detection. Receiver-operating characteristics of predicting the
community structure in the observed network(s) by the community structure
in the simulated networks, as computed by the Louvain algorithm.

\item \code{louvain.pr}: Tie prediction GOF statistic: PR of Louvain
community detection. Precision-recall curve for predicting the community
structure in the observed network(s) by the community structure in the
simulated networks, as computed by the Louvain algorithm.

\item \code{maxmod.modularity}: Univariate GOF statistic: maximal modularity
distribution

\item \code{maxmod.roc}: Tie prediction GOF statistic: ROC of maximal
modularity community detection. Receiver-operating characteristics of
predicting the community structure in the observed network(s) by the
community structure in the simulated networks, as computed by the
modularity maximization algorithm.

\item \code{maxmod.pr}: Tie prediction GOF statistic: PR of maximal
modularity community detection. Precision-recall curve for predicting the
community structure in the observed network(s) by the community structure
in the simulated networks, as computed by the modularity maximization
algorithm.

\item \code{edgebetweenness.modularity}: Univariate GOF statistic: edge betweenness
modularity distribution

\item \code{edgebetweenness.roc}: Tie prediction GOF statistic: ROC of edge
betweenness community detection. Receiver-operating characteristics of
predicting the community structure in the observed network(s) by the
community structure in the simulated networks, as computed by the
Girvan-Newman edge betweenness community detection method.

\item \code{edgebetweenness.pr}: Tie prediction GOF statistic: PR of edge
betweenness community detection. Precision-recall curve for predicting the
community structure in the observed network(s) by the community structure
in the simulated networks, as computed by the Girvan-Newman edge
betweenness community detection method.

\item \code{spinglass.modularity}: Univariate GOF statistic: spinglass modularity
distribution

\item \code{spinglass.roc}: Tie prediction GOF statistic: ROC of spinglass
community detection. Receiver-operating characteristics of predicting the
community structure in the observed network(s) by the community structure
in the simulated networks, as computed by the Spinglass algorithm.

\item \code{spinglass.pr}: Tie prediction GOF statistic: PR of spinglass
community detection. Precision-recall curve for predicting the community
structure in the observed network(s) by the community structure in the
simulated networks, as computed by the Spinglass algorithm.

\item \code{rocpr}: Tie prediction GOF statistic: ROC and PR curves.
Receiver-operating characteristics (ROC) and precision-recall curve (PR).
Prediction of the dyad states of the observed network(s) by the dyad states
of the simulated networks.
}}

\examples{
# To see how these statistics are used, look at the examples section of 
# ?"gof-methods". The following example illustrates how custom 
# statistics can be created. Suppose one is interested in the density 
# of a network. Then a univariate statistic can be created as follows.

dens <- function(mat, ...) {        # univariate: one argument
  mat <- as.matrix(mat)             # sparse matrix -> normal matrix
  d <- sna::gden(mat)               # compute the actual statistic
  attributes(d)$label <- "Density"  # add a descriptive label
  return(d)                         # return the statistic
}

# Note that the '...' argument must be present in all statistics. 
# Now the statistic can be used in the statistics argument of one of 
# the gof methods.
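
# Before passing a custom statistic to a gof method, it can help to try it
# on a small toy network. The following is a minimal sketch: the object
# 'toy' is an illustrative assumption, and the code only runs if the sna
# and Matrix packages are installed.

if (requireNamespace("sna", quietly = TRUE) &&
    requireNamespace("Matrix", quietly = TRUE)) {
  toy <- Matrix::Matrix(c(0, 1, 0,
                          0, 0, 1,
                          1, 0, 0), nrow = 3, sparse = TRUE)
  print(dens(toy))  # a single numeric value with a "label" attribute
}

# Assuming a fitted model object called 'fit' (hypothetical), the custom
# statistic could then be supplied alongside built-in statistics, e.g.:
# gof(fit, statistics = c(dens, esp))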

# For illustrative purposes, let us consider an existing statistic, the 
# indegree distribution, a multivariate statistic. It also accepts a 
# single argument. Note that the sparse matrix is converted to a 
# normal matrix object when it is used. First, statnet's summary 
# method is used to compute the statistic. Names are attached to the 
# resulting vector for the different indegree values. Then the vector 
# is returned.

ideg <- function(mat, ...) {
  mat <- as.matrix(mat)             # sparse matrix -> normal matrix
  d <- summary(mat ~ idegree(0:(nrow(mat) - 1)))
  names(d) <- 0:(length(d) - 1)
  attributes(d)$label <- "Indegree"
  return(d)
}
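
# A custom multivariate statistic can also be written with base R only.
# The following sketch returns the distribution of row sums (outdegrees 
# in a directed one-mode network). The function name 'rowsumdist' and the 
# toy matrix 'm' are illustrative assumptions, not part of the package.

rowsumdist <- function(mat, ...) {
  mat <- as.matrix(mat)                          # sparse -> normal matrix
  n <- nrow(mat)
  deg <- rowSums(mat != 0, na.rm = TRUE)         # ties sent by each node
  d <- tabulate(deg + 1, nbins = n)              # counts for 0, ..., n - 1
  names(d) <- 0:(n - 1)                          # one name per value
  attributes(d)$label <- "Row sum distribution"  # descriptive label
  return(d)
}

m <- matrix(c(0, 1, 1,
              0, 0, 1,
              0, 0, 0), nrow = 3, byrow = TRUE)
rowsumdist(m)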

# See the gofstatistics.R file in the package for more complex examples.
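
# The idea behind a community co-membership matrix, as produced by the 
# comemb helper, can be sketched in base R. This is an illustration of the 
# concept under the assumption that two nodes share a cell value of 1 
# whenever they belong to the same community. The function name 
# 'comembsketch' is hypothetical, and the package's own comemb function 
# may differ in details such as the treatment of the diagonal.

comembsketch <- function(vec) {
  outer(vec, vec, "==") * 1   # 1 if same community, 0 otherwise
}

memb <- c(1, 1, 2, 2, 3)      # community membership of five nodes
comat <- comembsketch(memb)   # 5 x 5 co-membership matrix
comat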

}
\references{
Leifeld, Philip, Skyler J. Cranmer and Bruce A. Desmarais (2018): Temporal
Exponential Random Graph Models with btergm: Estimation and Bootstrap
Confidence Intervals. \emph{Journal of Statistical Software} 83(6): 1--36.
\doi{10.18637/jss.v083.i06}.
}
