---
title: "Working with multiple APIs"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Working with multiple APIs}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
editor_options: 
  chunk_output_type: console
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = FALSE
)
```

One of pixieweb's strengths is its ability to connect to any PX-Web
instance with the same interface. This vignette shows how to compare
data across national statistics agencies.

**The honest truth about cross-country comparison:** The pixieweb functions
work identically across APIs, but the *data* is not harmonised. Table
IDs, variable names, and code systems differ between countries. The
workflow is: find a comparable table in each country (the hard part),
then use identical pixieweb code to fetch and combine the results.

> **Prerequisite:** This vignette assumes you are comfortable with the
> basics from `vignette("a-quickstart")`.

## Available APIs

```{r}
library(pixieweb)

px_api_catalogue()
```

pixieweb ships with a catalogue of known PX-Web instances. You can also
connect to any PX-Web API by providing a full URL.

## Connecting to multiple agencies

```{r}
scb <- px_api("scb", lang = "en")      # Sweden (v2)
ssb <- px_api("ssb", lang = "en")      # Norway (v2)
statfi <- px_api("statfi", lang = "en") # Finland (v1)
```

Each API object stores the base URL, language, API version, and
configuration (cell limits, rate limits):

```{r}
scb
ssb
```

## API version differences

PX-Web has two API versions:
- **v1**: Legacy, POST-only data queries, no search endpoint.
  Table discovery requires walking a folder hierarchy.
- **v2**: Modern, GET+POST data queries, full-text search, codelists
  endpoint, saved queries.

pixieweb handles both versions transparently. The user-facing functions
have the same signatures — only the internal request building differs.

Some selection helpers are v2-only: `px_bottom()`, `px_from()`,
`px_to()`, and `px_range()` will raise an informative error if used
against a v1 API.

## Cross-country comparison example

Suppose you want to compare population data across Sweden and Norway.
The table IDs and variable codes will differ, but the workflow is
identical:

```{r}
library(dplyr)
library(purrr)

# Find population tables in each country
scb_tables <- get_tables(scb, query = "population")
ssb_tables <- get_tables(ssb, query = "population")

# Explore a table from each
scb_tables |> table_describe(max_n = 3)
ssb_tables |> table_describe(max_n = 3)
```

Note that table IDs are completely different between countries, and
variable names may also differ ("Region" in SCB vs other names
elsewhere). Always run `variable_describe()` on each table before
building your query:

```{r}
# Fetch data using prepare_query() for quick exploration
scb_q <- prepare_query(scb, "TAB638",
  Region = "00",       # "Riket" (whole country)
  Tid = px_top(5),
  ContentsCode = "BE0101N1" # Population
)

# Norwegian table IDs are different — explore to find the right one
ssb_vars <- get_variables(ssb, "05803")
ssb_vars |> variable_describe()
```

## Combining results

Since `get_data()` returns standard tibbles with a `table_id` column,
you can bind results from different APIs:

```{r}
results <- list(
  sweden = get_data(scb, query = scb_q),
  norway = get_data(ssb, "05803",
    ContentsCode = "Personer",
    Tid = px_top(5)
  )
)

# .id = "country" adds a column tracking which list element each row
# came from — essential for traceability after binding
bind_rows(results, .id = "country")
# NOTE: column names may differ between countries. If so, you may need
# to rename() before bind_rows() to align them.
```

## Tips for cross-agency work

- **Language matters.** Codes are often language-dependent. `lang = "en"`
  gives the most consistent *labels* across countries, but codes and
  table IDs are language-independent.
- **Table structure varies.** Swedish tables may have "Region" while
  Finnish tables have "Alue". Run
  `get_variables() |> variable_describe()` on each table before writing
  queries.
- **API limits differ.** SCB allows ~100 000 cells per request; other
  agencies may allow less. Use `api$config$max_cells` to check.
  `prepare_query()` respects the limit automatically.
- **v1 vs v2.** Not all agencies have migrated to v2. Selection helpers
  `px_from()`, `px_range()` etc. raise an informative error if used
  against a v1 API. Check `api$version` and the catalogue's `versions`
  column.

## Next steps

- **Data model & advanced features** — `vignette("introduction-to-pixieweb")`
  covers codelists, wide output, and query composition.
- **Quick refresher** — `vignette("a-quickstart")` for the single-API
  basics.
