The Public Suffix List (PSL)
is a community-curated list of the domain suffixes under which Internet
users can directly register names. pslr bundles a pinned
snapshot of that list and implements the official
prevailing-rule algorithm to answer two core questions about a
hostname:
What is its public suffix? co.uk for example.co.uk.
What is its registrable domain? example.co.uk.

public_suffix("www.example.co.uk")
#> [1] "co.uk"
registrable_domain("www.example.co.uk")
#> [1] "example.co.uk"

The matcher is compiled with cpp11 and needs no external
system library. Hostname canonicalization (case folding and Unicode/IDNA
handling) is delegated to the punycoder
package.
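A rule takes one of three forms: com, *.ck, or !www.ck.
Normal rule: a literal suffix (com, co.uk).
Wildcard rule: *.ck means every label directly under ck is itself a public suffix.
Exception rule: !www.ck carves a single name back out of a wildcard.
Implicit default rule *: any unlisted TLD label is treated as a public suffix.
Each rule belongs to either the ICANN section or the private section (for example github.io).

The prevailing rule is chosen as follows: an exception beats a wildcard, the longest match beats shorter matches, and the implicit default applies only when nothing else does.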
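The selection order described above can be sketched in base R. This is only an illustration under stated assumptions: the six-rule table, `rule_matches()` and `toy_public_suffix()` are hypothetical names invented here, not part of pslr, and the real matcher is compiled code working on the full list.

```r
# Toy rule set (illustrative only; the real list is far larger).
rules <- c("com", "uk", "co.uk", "ck", "*.ck", "!www.ck")

# TRUE if the rule's labels match the rightmost labels of the host.
# A "*" label in the rule matches any single host label.
rule_matches <- function(rule_labels, host_labels) {
  n <- length(rule_labels)
  if (n > length(host_labels)) return(FALSE)
  tail_labels <- utils::tail(host_labels, n)
  all(rule_labels == "*" | rule_labels == tail_labels)
}

toy_public_suffix <- function(host, rules) {
  host_labels <- strsplit(host, ".", fixed = TRUE)[[1]]
  is_exception <- startsWith(rules, "!")
  rule_labels <- strsplit(sub("^!", "", rules), ".", fixed = TRUE)
  matched <- vapply(rule_labels, rule_matches, logical(1),
                    host_labels = host_labels)

  if (any(matched & is_exception)) {
    # An exception prevails; its suffix is the rule minus its leftmost label.
    n <- length(rule_labels[[which(matched & is_exception)[1]]]) - 1L
  } else if (any(matched)) {
    # Otherwise the matching rule with the most labels prevails.
    n <- max(lengths(rule_labels[matched]))
  } else {
    # Implicit default rule "*": the last label is the suffix.
    n <- 1L
  }
  paste(utils::tail(host_labels, n), collapse = ".")
}

toy_public_suffix("www.example.co.uk", rules)  # longest match: "co.uk"
toy_public_suffix("www.ck", rules)             # exception:     "ck"
toy_public_suffix("a.b.ck", rules)             # wildcard:      "b.ck"
toy_public_suffix("example.madeuptld", rules)  # default:       "madeuptld"
```

The sketch skips canonicalization and section filtering entirely; it only shows how the three precedence steps interact.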
The section argument selects which rules are eligible. Filtering
happens before prevailing-rule selection, so asking for one
section never silently borrows a rule from the other.
# github.io is a PRIVATE rule sitting under the ICANN suffix io.
public_suffix("user.github.io", section = "all") # default scope, both sections
#> [1] "github.io"
public_suffix("user.github.io", section = "icann") # the ICANN rule for io
#> [1] "io"
public_suffix("user.github.io", section = "private")
#> [1] "github.io"

section = "private" fall-through

When you restrict to a section and the host matches no explicit rule
there, the query falls through to the implicit default rule rather than
failing. A plain ICANN host queried under
section = "private" therefore resolves to its own last
label via the default rule.
To distinguish “no explicit rule matched” from a real match, combine
the section with unknown = "na" (below).
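A small sketch of the fall-through and of how to detect it. It assumes pslr is installed and that www.example.com matches no rule in the private section of the active list; no output is shown because it depends on that list.

```r
library(pslr)

# No private-section rule covers this host, so the implicit default rule
# applies and, per the description above, the last label is returned.
public_suffix("www.example.com", section = "private")

# Requiring an explicit rule turns the fall-through into NA, which makes
# "no private rule matched" distinguishable from a real match.
public_suffix("www.example.com", section = "private", unknown = "na")
```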
By default an unlisted suffix is handled by the implicit
* rule, so a made-up TLD still yields a public suffix. Pass
unknown = "na" to require an explicit rule and get
NA otherwise.
public_suffix("example.madeuptld") # default rule
#> [1] "madeuptld"
public_suffix("example.madeuptld", unknown = "na") # explicit-only
#> [1] NA

is_public_suffix() reports whether a host is itself a
public suffix. Under the default policy an unlisted single label is
TRUE via the implicit rule; use unknown = "na"
to test explicit list membership instead.
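As a hedged illustration of the two policies, assuming pslr is installed and that is_public_suffix() accepts the same unknown argument as public_suffix(). The value returned by the last call is deliberately not shown, since the text above does not say how a non-member is reported under unknown = "na".

```r
library(pslr)

is_public_suffix("co.uk")                      # an explicit rule in the list
is_public_suffix("madeuptld")                  # TRUE via the implicit default rule
is_public_suffix("madeuptld", unknown = "na")  # explicit list membership only
```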
Input may be ASCII, Unicode, or A-label (xn--)
hostnames; equivalent spellings canonicalize to the same answer. Output
is ASCII A-labels by default; pass output = "unicode" to
decode them.
A single terminal root dot is preserved on hostname-shaped output, so a fully-qualified name round-trips.
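A brief sketch of both behaviours, assuming pslr is installed and that registrable_domain() takes the output argument described above. The host bücher.de is just an illustrative internationalized name whose A-label form is xn--bcher-kva.de; outputs are omitted rather than guessed.

```r
library(pslr)

# The same host spelled in Unicode and as A-labels should canonicalize
# to the same answer (ASCII A-labels by default).
registrable_domain(c("www.b\u00fccher.de", "www.xn--bcher-kva.de"))

# Ask for decoded Unicode labels instead.
registrable_domain("www.xn--bcher-kva.de", output = "unicode")

# A fully-qualified name: the trailing root dot should survive on the result.
registrable_domain("www.example.co.uk.")
```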
suffix_extract() splits each host into subdomain,
registrant label, and suffix; public_suffix_rule() reports
which rule prevailed, useful for auditing.
suffix_extract("blog.user.github.io")
#>                 input                host subdomain domain    suffix
#> 1 blog.user.github.io blog.user.github.io      blog   user github.io
#>   registrable_domain
#> 1     user.github.io
public_suffix_rule(c("www.ck", "a.b.kobe.jp", "example.madeuptld"))
#>               input        host_ascii      rule      kind rule_section
#> 1            www.ck            www.ck   !www.ck exception        icann
#> 2       a.b.kobe.jp       a.b.kobe.jp *.kobe.jp  wildcard        icann
#> 3 example.madeuptld example.madeuptld         *   default         <NA>
#>   public_suffix_ascii
#> 1                  ck
#> 2           b.kobe.jp
#> 3           madeuptld

All query functions are vectorised, length- and name-preserving, and
NA-safe. Invalid input (URLs, IPv6, empty labels, dotted-decimal IPv4
literals, …) is NA by default; pass
invalid = "error" to abort on the first invalid
element.
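The vector contract can be illustrated as follows, assuming pslr is installed. The input names (site, url, missing) are arbitrary labels chosen for the example.

```r
library(pslr)

hosts <- c(
  site    = "www.example.co.uk",
  url     = "https://example.com/path",  # a URL, not a hostname: invalid
  missing = NA
)

# One result per input with names kept; per the text above, the URL and
# the NA element both come back as NA under the default policy.
registrable_domain(hosts)

# Strict mode: abort on the first invalid element instead of returning NA.
tryCatch(
  registrable_domain(hosts, invalid = "error"),
  error = function(e) conditionMessage(e)
)
```

Wrapping the strict call in tryCatch() keeps a script running while still surfacing the error message.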
The package ships with a pinned snapshot, so it works fully offline
and the bundled list is the default for every query.
psl_refresh() is the only function that touches
the network: an explicit, HTTPS-only, validated download into a user
cache. psl_use() chooses which list backs the session.
# Download and validate a fresh list into the user cache, then activate it:
psl_refresh(activate = TRUE)
# Switch the active list for this session:
psl_use("cache") # the latest refreshed snapshot
psl_use("bundled") # back to the shipped snapshot
psl_use("path", path = "my_list.dat") # a custom file

Activation is session-only and validated before any state changes; a failed refresh never replaces a working cache or active list.
A public-suffix result depends on both which list answered
and how hosts were normalized. psl_version()
reports both — the source-snapshot provenance and the runtime
normalization identifiers — so a result can be reproduced later. Record
this row alongside reproducibility-sensitive output.
psl_version()
#>    source path            retrieved_at            list_date
#> 1 bundled <NA> 2026-06-15 16:18:34 UTC 2026-06-13T21:47:08Z
#>                                     commit   size
#> 1 9186eeeda85cef35b1551d00731464939c765cab 332703
#>                                                                  checksum
#> 1 sha256:54fb5c65a1e21aad963acd74a204370b5f517071e8b8e140c48de40727f0171c
#>   normalizer normalizer_version         normalization_profile unicode_version
#> 1  punycoder              1.1.0 uts46-nontransitional-std3-v1          16.0.0

psl_rules() exposes the active rule table itself:
nrow(psl_rules("icann"))
#> [1] 6933
head(psl_rules("private"), 3)
#>      rule canonical_rule   kind section labels
#> 1  co.krd         co.krd normal private      2
#> 2 edu.krd        edu.krd normal private      2
#> 3  art.pl         art.pl normal private      2

If the shipped index was generated under a different normalization
profile or Unicode version than the installed punycoder,
the list is transparently rebuilt in memory from source on activation,
so an index is never mixed with hosts normalized under a different
profile.
Only psl_refresh() touches the network, and only when you call it. It is
HTTPS-only, rejects embedded credentials and downgrade redirects, and
enforces a source-size ceiling.