---
title: "Designing precise queries across disciplines"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Designing precise queries across disciplines}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
```

```{r setup}
library(scopusflow)
```

A retrieval is only as good as its query. This article shows how to compose
correct, field-tagged 'Scopus' queries with `scopus_query()` rather than pasting
fragments by hand, where a missing parenthesis or a mistyped tag quietly returns
the wrong records. Every evaluated chunk here only builds strings and plans, so
it runs offline; the one chunk that contacts the API is not evaluated. Each query
is shown as the literal string it produces.

## Field tags decide where to look

A field tag restricts a query to part of a record. `scopus_field_tags()` lists
the common ones.

```{r}
scopus_field_tags()
```

The most generally useful tag is `TITLE-ABS-KEY`, which searches the title,
abstract and keywords together. That is broad enough to catch a topic without
the noise of a full-text match.
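
To make the shape of a field-tagged query concrete, here is a minimal base R
sketch. The helper `sketch_tag()` is hypothetical and not part of 'scopusflow';
it assumes the usual 'Scopus' form of a tag followed by a quoted term in
parentheses. The string `scopus_query()` actually returns may differ in quoting
or spacing.

```{r}
# Hypothetical helper, for illustration only (not a 'scopusflow' function).
# Wraps a term in double quotes and a field tag: TAG("term").
sketch_tag <- function(term, tag = "TITLE-ABS-KEY") {
  sprintf('%s("%s")', tag, term)
}

# writeLines() prints the literal string without R's escaped quotes.
writeLines(sketch_tag("microplastics"))
writeLines(sketch_tag("digital humanities", tag = "AUTHKEY"))
```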

## One term, many disciplines

The same builder serves any discipline. Each call below returns the exact query
string that would be sent to 'Scopus'.

```{r}
scopus_query("CRISPR", .field = "TITLE-ABS-KEY")              # molecular biology
scopus_query("gravitational waves", .field = "TITLE-ABS-KEY") # physics
scopus_query("microplastics", .field = "TITLE-ABS-KEY")       # environmental science
scopus_query("blockchain", .field = "TITLE-ABS-KEY")          # computer science
scopus_query("digital humanities", .field = "AUTHKEY")        # humanities
```

The last example uses `AUTHKEY`, the author-supplied keywords. It isolates work
whose authors identify it with a discipline, and so cuts incidental mentions.

## Combining terms with boolean operators

Passing several terms joins them. The default operator is `AND`, and `OR` or
`AND NOT` are available through `.op`.

```{r}
# Two concepts that must co-occur (materials science).
scopus_query("perovskite", "solar cell", .field = "TITLE-ABS-KEY")

# Spelling variants, either of which will do (economics).
scopus_query("behavioral economics", "behavioural economics", .op = "OR")

# A family of related tools (molecular biology).
scopus_query("CRISPR", "Cas9", "Cas12", .op = "OR")
```
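
The joining step can be pictured with a few lines of base R. This is a sketch
under assumptions, not the package's implementation: `sketch_join()` is a
hypothetical helper, and whether `scopus_query()` wraps the whole group in one
tag or tags each term separately is not shown in this article.

```{r}
# Hypothetical helper (not a 'scopusflow' function): quote each term,
# join the terms with one boolean operator, and wrap the group in a tag.
sketch_join <- function(terms, op = "AND", tag = "TITLE-ABS-KEY") {
  stopifnot(op %in% c("AND", "OR", "AND NOT"))
  quoted <- paste0('"', terms, '"')
  joined <- paste(quoted, collapse = paste0(" ", op, " "))
  paste0(tag, "(", joined, ")")
}

writeLines(sketch_join(c("perovskite", "solar cell")))
writeLines(sketch_join(c("CRISPR", "Cas9", "Cas12"), op = "OR"))
```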

## From a query to a plan

A composed query drops straight into the rest of the workflow. Here it anchors a
year-partitioned plan: one cell per publication year, intended to keep each cell
under the API's 5000-record ceiling.

```{r}
q <- scopus_query("gut microbiome", "immunology", .field = "TITLE-ABS-KEY")
q
plan <- scopus_plan(q, years = 2015:2022, partition = "year")
plan
```
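
Conceptually, partitioning by year amounts to pairing the same query with each
year in turn. The sketch below illustrates that idea with base R and the
'Scopus' `PUBYEAR` field. How `scopus_plan()` encodes years internally is not
shown in this article, so treat the strings as an assumption about the general
approach rather than the plan's actual contents.

```{r}
# Illustration only: one query string per publication year.
# The base string is written by hand here, not produced by scopus_query().
sketch_base <- 'TITLE-ABS-KEY("gut microbiome" AND "immunology")'
sketch_years <- 2015:2022

# sprintf() recycles the base string across the vector of years.
sketch_cells <- sprintf("%s AND PUBYEAR = %d", sketch_base, sketch_years)

length(sketch_cells)
writeLines(sketch_cells[1:2])
```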

The plan is now ready to size and run. Both steps contact the API, so the chunk
below is not evaluated.

```{r, eval = FALSE}
scopus_count(q, years = 2015:2022)
records <- scopus_fetch_plan(plan)
```
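
Sizing matters because any cell above the ceiling would need a finer partition.
The check itself is simple, as the offline sketch below suggests. The counts
are invented purely for illustration, and the shape of what `scopus_count()`
returns is an assumption; only the 5000-record ceiling comes from the text above.

```{r}
# Hypothetical per-year counts, invented for illustration; real values
# would come from the API.
sketch_counts <- c(`2015` = 1200, `2016` = 1850, `2017` = 2600, `2018` = 5400)
sketch_ceiling <- 5000

# Years whose cell would exceed the ceiling and need a finer partition.
sketch_over <- names(sketch_counts)[sketch_counts > sketch_ceiling]
sketch_over
```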

## Searching by affiliation

Field tags reach beyond topics. `AFFILORG` searches the organization name in the
affiliation, which turns a query into an institution-level view of output.

```{r}
scopus_query("Max Planck", .field = "AFFILORG")
```
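
An institution-level view is often narrowed to a topic by joining two clauses
that carry different tags. The sketch below assembles such a string with base R
only. Whether `scopus_query()` can mix tags in a single call is not shown in this
article, so the hand-written clauses are an assumption about the target syntax.

```{r}
# Illustration only, assembled with base R rather than scopus_query().
sketch_affil <- 'AFFILORG("Max Planck")'
sketch_topic <- 'TITLE-ABS-KEY("graphene")'

sketch_combined <- paste(sketch_affil, "AND", sketch_topic)
writeLines(sketch_combined)
```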

## When a term is empty

The builder validates its input, so a stray empty term is caught early rather
than producing a malformed query.

```{r}
tryCatch(
  scopus_query("graphene", ""),
  scopus_error_bad_input = function(e) conditionMessage(e)
)
```
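
The pattern behind this, a classed condition caught by class name in
`tryCatch()`, can be reproduced in base R. The validator below is a hypothetical
stand-in, not the 'scopusflow' implementation; its name, its condition class
`sketch_error_bad_input` and its message are all made up for the example.

```{r}
# Hypothetical validator (not the 'scopusflow' implementation): signal a
# classed error when any term is empty or whitespace-only.
sketch_check_terms <- function(terms) {
  bad <- !nzchar(trimws(terms))
  if (any(bad)) {
    stop(structure(
      class = c("sketch_error_bad_input", "error", "condition"),
      list(
        message = sprintf("Term %d is empty.", which(bad)[1]),
        call = NULL
      )
    ))
  }
  invisible(terms)
}

# The handler is selected by the condition's class, as in the chunk above.
tryCatch(
  sketch_check_terms(c("graphene", "")),
  sketch_error_bad_input = function(e) conditionMessage(e)
)
```

Catching by class rather than by message text keeps the handler stable if the
wording of the message changes.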
