% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/RPSS.R
\name{RPSS}
\alias{RPSS}
\title{Compute the Ranked Probability Skill Score}
\usage{
RPSS(
  exp,
  obs,
  ref = NULL,
  time_dim = "sdate",
  memb_dim = "member",
  cat_dim = NULL,
  dat_dim = NULL,
  prob_thresholds = c(1/3, 2/3),
  indices_for_clim = NULL,
  Fair = FALSE,
  weights_exp = NULL,
  weights_ref = NULL,
  cross.val = FALSE,
  na.rm = FALSE,
  sig_method.type = "two.sided.approx",
  alpha = 0.05,
  N.eff = NA,
  ncores = NULL
)
}
\arguments{
\item{exp}{A named numerical array of either the forecast with at least time
and member dimensions, or the probabilities with at least time and category
dimensions. The probabilities can be generated by \code{s2dv::GetProbs}.}

\item{obs}{A named numerical array of either the observation with at least 
time dimension, or the probabilities with at least time and category 
dimensions. The probabilities can be generated by \code{s2dv::GetProbs}. The
dimensions must be the same as 'exp' except 'memb_dim' and 'dat_dim'.}

\item{ref}{A named numerical array of either the reference forecast with at 
least time and member dimensions, or the probabilities with at least time and
category dimensions. The probabilities can be generated by 
\code{s2dv::GetProbs}. The dimensions must be the same as 'exp' except 
'memb_dim' and 'dat_dim'. If there is only one reference dataset, it should
not have a dataset dimension. If there is a corresponding reference for each
experiment, the dataset dimension must have the same length as in 'exp'. If
'ref' is NULL, the climatological forecast is used as reference forecast.
The default value is NULL.}

\item{time_dim}{A character string indicating the name of the time dimension.
The default value is 'sdate'.}

\item{memb_dim}{A character string indicating the name of the member dimension
to compute the probabilities of the forecast and the reference forecast. The
default value is 'member'. If the data are probabilities, set memb_dim as 
NULL.}

\item{cat_dim}{A character string indicating the name of the category 
dimension that is needed when exp, obs, and ref are probabilities. The
default value is NULL, which means that the data are not probabilities.}

\item{dat_dim}{A character string indicating the name of the dataset dimension. 
The length of this dimension can be different between 'exp' and 'obs'. 
The default value is NULL.}

\item{prob_thresholds}{A numeric vector of the relative thresholds (from 0 to
1) between the categories. The default value is c(1/3, 2/3), which 
corresponds to tercile equiprobable categories.}

\item{indices_for_clim}{A vector of the indices to be taken along 'time_dim' 
for computing the thresholds between the probabilistic categories. If NULL,
the whole period is used. The default value is NULL.}

\item{Fair}{A logical indicating whether to compute the FairRPSS (the 
potential RPSS that the forecast would have with an infinite ensemble size).
The default value is FALSE.}

\item{weights_exp}{A named numerical array of the forecast ensemble weights
for probability calculation. The dimension should include 'memb_dim', 
'time_dim' and 'dat_dim' if there are multiple datasets. All dimension 
lengths must be equal to 'exp' dimension lengths. The default value is NULL,
which means no weighting is applied. The ensemble should have at least 70 
members or span at least 10 time steps and have more than 45 members if 
consistency between the weighted and unweighted methodologies is desired.}

\item{weights_ref}{Same as 'weights_exp' but for the reference forecast.}

\item{cross.val}{A logical indicating whether to compute the thresholds
between the probabilistic categories in cross-validation. The default value is
FALSE.}

\item{na.rm}{A logical or numeric value between 0 and 1. If it is numeric, it 
means the lower limit for the fraction of the non-NA values. 1 is equal to 
FALSE (no NA is acceptable), 0 is equal to TRUE (all NAs are acceptable). 
The function returns NA if the fraction of non-NA values in the data is less 
than na.rm. Otherwise, the RPSS is calculated. The default value is FALSE.}

\item{sig_method.type}{A character string indicating the test type of the
significance method. Check \code{RandomWalkTest()} parameter 
\code{test.type} for details. The default is 'two.sided.approx', which is 
the default of \code{RandomWalkTest()}.}

\item{alpha}{A numeric value indicating the significance level to be used in 
the statistical significance test. The default value is 0.05.}

\item{N.eff}{Effective sample size to be used in the statistical significance
test. It can be NA (it is then computed with \code{s2dv:::.Eno}), FALSE 
(the length of 'obs' along 'time_dim' is used, so the autocorrelation is 
not taken into account), a numeric value (which is used for all cases), or an 
array with the same dimensions as 'obs' except 'time_dim' (a particular 
N.eff is used for each case). The default value is NA.}

\item{ncores}{An integer indicating the number of cores to use for parallel 
computation. The default value is NULL.}
}
\value{
\item{$rpss}{
 A numerical array of RPSS with dimensions c(nexp, nobs, the remaining 
 dimensions of 'exp' except 'time_dim' and 'memb_dim'). nexp is the number of 
 experiments (i.e., dat_dim in exp), and nobs is the number of observations 
 (i.e., dat_dim in obs). If dat_dim is NULL, nexp and nobs are omitted.
}
\item{$sign}{
 A logical array of the statistical significance of the RPSS with the same 
 dimensions as $rpss.
}
}
\description{
The Ranked Probability Skill Score (RPSS; Wilks, 2011) is the skill score 
based on the Ranked Probability Score (RPS; Wilks, 2011). It can be used to 
assess whether a forecast presents an improvement or worsening with respect to
a reference forecast. The RPSS ranges between minus infinity and 1. If the 
RPSS is positive, it indicates that the forecast has higher skill than the 
reference forecast, while a negative value means that it has a lower skill.\cr 
Examples of reference forecasts are the climatological forecast (same 
probabilities for all categories for all time steps), persistence, a previous
model version, or another model. The RPSS is computed as 
\code{RPSS = 1 - RPS_exp / RPS_ref}. The statistical significance is obtained 
with a Random Walk test at the significance level set by \code{alpha} 
(DelSole and Tippett, 2016).\cr
The function accepts either the ensemble members or the probabilities of
each dataset as inputs. If there is more than one dataset, the RPSS is 
computed for each pair of exp and obs data. The fraction of non-NA values is 
checked before the calculation. If it is lower than the threshold set by 
parameter \code{na.rm}, NA is returned directly. NAs are counted pairwise, 
which means that only the time steps at which all the datasets have values 
count as non-NA values.
}
\examples{
set.seed(1)
exp <- array(rnorm(3000), dim = c(lat = 3, lon = 2, member = 10, sdate = 50))
set.seed(2)
obs <- array(rnorm(300), dim = c(lat = 3, lon = 2, sdate = 50))
set.seed(3)
ref <- array(rnorm(3000), dim = c(lat = 3, lon = 2, member = 10, sdate = 50))
weights <- sapply(1:dim(exp)['sdate'], function(i) {
            n <- abs(rnorm(10))
            n/sum(n)
          })
dim(weights) <- c(member = 10, sdate = 50)
# Use data as input
res <- RPSS(exp = exp, obs = obs) ## climatology as reference forecast
res <- RPSS(exp = exp, obs = obs, ref = ref) ## ref as reference forecast
res <- RPSS(exp = exp, obs = obs, ref = ref, weights_exp = weights, weights_ref = weights)
res <- RPSS(exp = exp, obs = obs, alpha = 0.01, sig_method.type = 'two.sided')

# Use probs as input
exp_probs <- GetProbs(exp, memb_dim = 'member')
obs_probs <- GetProbs(obs, memb_dim = NULL)
ref_probs <- GetProbs(ref, memb_dim = 'member')
res <- RPSS(exp = exp_probs, obs = obs_probs, ref = ref_probs, memb_dim = NULL, 
           N.eff = FALSE, cat_dim = 'bin')
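
# Illustrative base-R sketch (not the s2dv implementation) of the formula
# RPSS = 1 - RPS_exp / RPS_ref for one time series with three categories.
# Assumptions of this sketch: 'rps_sketch' is a hypothetical helper, the
# probabilities below are made up, and the RPS is averaged over time before
# taking the ratio.
rps_sketch <- function(fc, ob) {
  # fc, ob: matrices of category probabilities with dimensions [bin, sdate].
  # The RPS of each time step is the sum over categories of the squared
  # difference between the cumulative forecast and observed probabilities.
  colSums((apply(fc, 2, cumsum) - apply(ob, 2, cumsum))^2)
}
fc_probs <- cbind(c(0.2, 0.3, 0.5), c(0.6, 0.3, 0.1), c(0.1, 0.4, 0.5))
ob_probs <- cbind(c(0, 0, 1), c(1, 0, 0), c(0, 1, 0))
clim_probs <- matrix(1/3, nrow = 3, ncol = 3) # equiprobable climatology
rpss_sketch <- 1 - mean(rps_sketch(fc_probs, ob_probs)) /
                   mean(rps_sketch(clim_probs, ob_probs))
rpss_sketch # positive here: the forecast beats the climatological reference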

}
\references{
Wilks, 2011; https://doi.org/10.1016/B978-0-12-385022-5.00008-7\cr
DelSole and Tippett, 2016; https://doi.org/10.1175/MWR-D-15-0218.1
}
