% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/f_summarise.R
\name{f_summarise}
\alias{f_summarise}
\alias{f_summarize}
\title{Summarise each group down to one row}
\usage{
f_summarise(.data, ..., .by = NULL, .order = group_by_order_default(.data))

f_summarize(.data, ..., .by = NULL, .order = group_by_order_default(.data))
}
\arguments{
\item{.data}{A data frame.}

\item{...}{Name-value pairs of summary functions. Expressions with
\code{across()} are also accepted.}

\item{.by}{(Optional) A selection of columns to group by for this operation.
Columns are specified using tidy-select.}

\item{.order}{Should the groups be returned in sorted order?
If \code{FALSE}, the groups are returned in order of first appearance,
which in many cases is faster.}
}
\value{
An ungrouped data frame of summaries, with one row per group.
}
\description{
Like \code{dplyr::summarise()} but with some internal optimisations
for common statistical functions.
}
\section{Details}{


fastplyr data-masking functions like \code{f_mutate} and \code{f_summarise} operate
very similarly to their dplyr counterparts but with some crucial
differences.
Optimisations for by-group operations kick in for
the common statistical functions listed below.
When this happens, a message is printed, which can be disabled
by running \code{options(fastplyr.inform = FALSE)}.
Optimised expressions no longer obey the data-masking rules for
sequential, dependent expression execution: a column created by one
expression is not visible to later expressions in the same call.
For example, the pseudocode
\code{f_summarise(data, mean = mean(x), mean2 = round(mean), .by = g)}
will not work when optimised because the new column \code{mean} is not visible
to \code{round(mean)}.

fastplyr optimisations can be disabled
globally by running \code{options(fastplyr.optimise = FALSE)}.
\subsection{Optimised statistical functions}{

Some functions are internally replaced with fast statistical functions
from the 'collapse' package. This makes execution over many groups very fast.

For fast quantiles (percentiles) by group, see \link{tidy_quantiles}.

List of currently optimised functions:

\code{dplyr::n} -> <custom_expression> \cr
\code{dplyr::row_number} -> <custom_expression> (only for \code{f_mutate}) \cr
\code{dplyr::cur_group} -> <custom_expression> \cr
\code{dplyr::cur_group_id} -> <custom_expression> \cr
\code{dplyr::cur_group_rows} -> <custom_expression> (only for \code{f_mutate}) \cr
\code{dplyr::lag} -> <custom_expression> (only for \code{f_mutate}) \cr
\code{dplyr::lead} -> <custom_expression> (only for \code{f_mutate}) \cr
\code{base::sum} -> \code{collapse::fsum} \cr
\code{base::prod} -> \code{collapse::fprod} \cr
\code{base::min} -> \code{collapse::fmin} \cr
\code{base::max} -> \code{collapse::fmax} \cr
\code{base::mean} -> \code{collapse::fmean} \cr
\code{stats::median} -> \code{collapse::fmedian} \cr
\code{stats::sd} -> \code{collapse::fsd} \cr
\code{stats::var} -> \code{collapse::fvar} \cr
\code{dplyr::first} -> \code{collapse::ffirst} \cr
\code{dplyr::last} -> \code{collapse::flast} \cr
\code{dplyr::n_distinct} -> \code{collapse::fndistinct} \cr
}
}

\examples{
library(fastplyr)
library(nycflights13)
library(dplyr)
options(fastplyr.inform = FALSE)
# Number of flights per month, including first and last day
flights |>
  f_group_by(year, month) |>
  f_summarise(first_day = first(day),
              last_day = last(day),
              num_flights = n())
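
# A minimal sketch of the `.order` argument (an illustrative addition):
# with `.order = FALSE` groups are returned in order of first appearance
# rather than in sorted order
counts_by_origin <- flights |>
  f_summarise(num_flights = n(), .by = origin, .order = FALSE)
counts_by_origin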

## Fast mean summary using `across()`

flights |>
  f_summarise(
    across(where(is.numeric), mean),
    .by = tailnum
  )

flights |>
  f_group_by(.cols = "tailnum") |>
  f_summarise(
    across(where(is.numeric), mean)
  )
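
# Optimised expressions cannot refer to columns created earlier in the same
# call (see Details). One possible workaround, sketched here, is to compute
# the summary first and derive dependent columns in a separate step.
# The column names `mean_delay` and `mean_delay_rounded` are illustrative.
delay_summary <- flights |>
  f_summarise(mean_delay = mean(dep_delay, na.rm = TRUE), .by = origin) |>
  f_mutate(mean_delay_rounded = round(mean_delay))
delay_summary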
}
\seealso{
\link{tidy_quantiles}
}
