% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/seek.R
\name{seek}
\alias{seek}
\alias{seek_in}
\title{Extract Matching Lines from Files}
\usage{
seek(
  pattern,
  path = ".",
  ...,
  filter = NULL,
  negate = FALSE,
  recurse = FALSE,
  all = FALSE,
  relative_path = TRUE,
  matches = FALSE
)

seek_in(files, pattern, ..., matches = FALSE)
}
\arguments{
\item{pattern}{A regular expression pattern used to match lines.}

\item{path}{A character vector of one or more directories where files should be
discovered (only for \code{seek()}).}

\item{...}{Additional arguments passed to \code{\link[readr:read_lines]{readr::read_lines()}}, such as
\code{skip}, \code{n_max}, or \code{locale}.}

\item{filter}{Optional. A regular expression pattern used to filter file paths
before reading. If \code{NULL}, all text files are considered.}

\item{negate}{Logical. If \code{TRUE}, files matching the \code{filter} pattern are excluded
instead of included. Useful to skip files based on name or extension.}

\item{recurse}{If \code{TRUE}, recurse fully; if a positive number, the number
of levels to recurse.}

\item{all}{If \code{TRUE}, hidden files are also included.}

\item{relative_path}{Logical. If \code{TRUE}, file paths are made relative to the
\code{path} argument. If multiple root paths are provided, \code{relative_path} is
ignored and absolute paths are kept to avoid ambiguity.}

\item{matches}{Logical. If \code{TRUE}, all matches per line are also returned in a
\code{matches} list-column.}

\item{files}{A character vector of files to search (only for \code{seek_in()}).}
}
\value{
A tibble with one row per matched line, containing:
\itemize{
\item \code{path}: File path (relative or absolute).
\item \code{line_number}: Line number in the file.
\item \code{match}: The first matched substring.
\item \code{matches}: All matched substrings (if \code{matches = TRUE}).
\item \code{line}: Full content of the matching line.
}
}
\description{
These functions search through one or more text files, extract lines matching
a regular expression pattern, and return a tibble containing the results.
\itemize{
\item \code{seek()}: Discovers files inside one or more directories (recursively or not),
applies optional file name and text file filtering, and searches lines.
\item \code{seek_in()}: Searches inside a user-provided character vector of files.
}
}
\details{
\ifelse{html}{\href{https://lifecycle.r-lib.org/articles/stages.html#experimental}{\figure{lifecycle-experimental.svg}{options: alt='[Experimental]'}}}{\strong{[Experimental]}}

The overall process involves the following steps:
\itemize{
\item \strong{File Selection}
\itemize{
\item \code{seek()}: Files are discovered using \code{\link[fs:dir_ls]{fs::dir_ls()}}, starting from one or more directories.
\item \code{seek_in()}: Files are directly supplied by the user (no discovery phase).
}
\item \strong{File Filtering}
\itemize{
\item Files located inside \verb{.git/} folders are automatically excluded.
\item Files with known non-text extensions (e.g., \code{.png}, \code{.exe}, \code{.rds}) are excluded.
\item If a file's extension is unknown, the file is checked for embedded null bytes, which indicate binary content.
\item Optionally, an additional regex-based path filter (\code{filter}) can be applied.
}
\item \strong{Line Reading}
\itemize{
\item Files are read line-by-line using \code{\link[readr:read_lines]{readr::read_lines()}}.
\item Only lines matching the provided regular expression \code{pattern} are retained.
\item If a file cannot be read, it is skipped and the search continues with the remaining files.
}
\item \strong{Data Frame Construction}
\itemize{
\item A tibble is constructed with one row per matched line.
}
}
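
As a rough illustration, the steps above could be approximated as follows.
This is a simplified sketch, not the package's actual implementation:
\code{seek_sketch()} is a hypothetical helper, and it omits the extension and
null-byte checks, the \code{filter} argument, and the handling of unreadable files.

\preformatted{
# Hypothetical helper, for illustration only; not part of the package
seek_sketch <- function(pattern, path = ".") {
  # 1. File selection: list the files directly inside `path`
  files <- fs::dir_ls(path, type = "file")
  # 2. File filtering: drop anything located under a .git/ folder
  files <- files[!stringr::str_detect(files, stringr::fixed("/.git/"))]
  # 3. Line reading: keep only the lines matching `pattern`
  hits <- lapply(files, function(file) {
    lines <- readr::read_lines(file)
    keep <- which(stringr::str_detect(lines, pattern))
    # 4. Data frame construction: one row per matched line
    tibble::tibble(
      path = as.character(file),
      line_number = keep,
      match = stringr::str_extract(lines[keep], pattern),
      line = lines[keep]
    )
  })
  do.call(rbind, hits)
}
}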

These functions are particularly useful for analyzing source code,
configuration files, logs, and other structured text data.
}
\examples{
path <- system.file("extdata", package = "seekr")

# Search all function definitions in R files
seek("[^\\\\s]+(?= (=|<-) function\\\\()", path, filter = "\\\\.R$")

# Search for "TODO" comments in source code, case-insensitively
seek("(?i)TODO", path, filter = "\\\\.R$")

# Search for errors in log files, case-insensitively
seek("(?i)error", path, filter = "\\\\.log$")

# Search for config keys in YAML
seek("database:", path, filter = "\\\\.ya?ml$")

# Search for "length" in all types of text files
seek("(?i)length", path)
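
# Same "length" search, but skipping CSV files by combining `filter` with `negate`
# (a sketch; it assumes the example folder contains .csv files)
seek("(?i)length", path, filter = "\\\\.csv$", negate = TRUE)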

# Search for specific CSV headers with seek_in(), reading only the first line
csv_files <- list.files(path, "\\\\.csv$", full.names = TRUE)
seek_in(csv_files, "(?i)specie", n_max = 1)
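
# Also collect every match on each line in a `matches` list-column
# (a sketch; the rows returned depend on the bundled example files)
seek("(?i)length", path, matches = TRUE)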

}
\seealso{
\code{\link[fs:dir_ls]{fs::dir_ls()}}, \code{\link[readr:read_lines]{readr::read_lines()}}, \code{\link[stringr:str_detect]{stringr::str_detect()}}
}
