| Type: | Package |
| Title: | Geographically Weighted Panel Regression |
| Version: | 1.0.0 |
| Description: | A modern, first implementation of Geographically Weighted Panel Regression (GWPR) for spatial panel data. The package provides a unified public API supporting Gaussian and binomial family models, within/pooling/random panel effects, three bandwidth search strategies (grid, Stochastic Gradient Descent, random), five kernel functions, and optional parallel execution via the 'future' framework. Diagnostic tools include spatial Moran's I, local F-test, Hausman test, and Lagrange Multiplier test. |
| License: | AGPL (≥ 3) |
| Encoding: | UTF-8 |
| LazyData: | true |
| Imports: | fixest, glmmTMB, lmtest, plm, sf, stats, utils |
| Depends: | R (≥ 3.5.0) |
| Suggests: | future, future.apply, rmarkdown, knitr, testthat (≥ 3.0.0) |
| VignetteBuilder: | knitr |
| URL: | https://github.com/MichaelChaoLi-cpu/GWPR.light |
| BugReports: | https://github.com/MichaelChaoLi-cpu/GWPR.light/issues |
| Config/testthat/edition: | 3 |
| Config/roxygen2/version: | 8.0.0 |
| NeedsCompilation: | no |
| Packaged: | 2026-05-22 00:18:34 UTC; lichao |
| Author: | Chao Li |
| Maintainer: | Chao Li <chaoli0394@gmail.com> |
| Repository: | CRAN |
| Date/Publication: | 2026-05-29 09:20:08 UTC |
Create a placeholder NA history row for epochs after early stopping
Description
Create a placeholder NA history row for epochs after early stopping
Usage
.make_na_history_row(ep, bw, learning_rate, delta, batch_size)
Build a scorer function for bandwidth search
Description
Returns a closure that fits the full GWPR engine for a given bandwidth and returns the score (MSE for linear, log_loss for logistic), together with aggregate metrics.
Usage
.make_scorer(family = "gaussian", threshold = 0.5)
Arguments
family |
Character; '"gaussian"' or '"binomial"'. |
threshold |
Numeric; classification threshold (binomial only). |
Value
A function with signature 'scorer(context, bandwidth)'.
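The two scores named above can be sketched as follows. This is an illustration of the standard MSE and log-loss definitions, not the package source; the function names 'mse_sketch' and 'log_loss_sketch' and the clipping bound 'eps' are assumptions made for the example.

```r
# Hypothetical helpers illustrating the scores named in the Description.
# Mean squared error for the gaussian family.
mse_sketch <- function(y, yhat) {
  mean((y - yhat)^2)
}

# Log loss for the binomial family. Probabilities are clipped to
# [eps, 1 - eps] (assumed bound) so that log(0) never occurs.
log_loss_sketch <- function(y, p, eps = 1e-15) {
  p <- pmin(pmax(p, eps), 1 - eps)
  -mean(y * log(p) + (1 - y) * log(1 - p))
}

score_linear   <- mse_sketch(c(1, 2), c(1, 4))
score_logistic <- log_loss_sketch(c(1, 0), c(0.5, 0.5))
```

Lower values are better for both scores, which is why a bandwidth search can minimise either one.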
Convert a list of history row lists to a data.frame
Description
Convert a list of history row lists to a data.frame
Usage
.rows_to_df(rows)
California (sf)
Description
County boundaries of California.
Usage
data(California)
Format
An sf object with 58 rows (one per county) and two columns:
- GEOID
a numeric vector, fips IDs of the counties
- geometry
sfc_MULTIPOLYGON, county boundary polygons (CRS: NAD83 longlat)
Author(s)
Chao Li <chaoli0394@gmail.com> Shunsuke Managi <managi.s@gmail.com>
Examples
data(California)
class(California)
Panel Dataset for Testing GWPR
Description
Panel dataset for estimating the relationship between county-level PM2.5 concentration and on-road transportation in California.
Usage
data(TransAirPolCalif)
Format
A data.frame with 928 observations of 23 variables:
- GEOID
a numeric vector, fips IDs of the counties
- year
a numeric vector, year
- pm25
a numeric vector, annual average PM2.5 concentration in each county
- co2_mean
a numeric vector, geographic average CO2 emission from on-road transportation in each year, in million tons/km2
- Developed_Open_Space_perc
a numeric vector, percentage of developed open space of total area in each county
- Developed_Low_Intensity_perc
a numeric vector, percentage of low-intensity developed area of total area in each county
- Developed_Medium_Intensity_perc
a numeric vector, percentage of medium-intensity developed area of total area in each county
- Developed_High_Intensity_perc
a numeric vector, percentage of high-intensity developed area of total area in each county
- Open_Water_perc
a numeric vector, percentage of open water of total area in each county
- Woody_Wetlands_perc
a numeric vector, percentage of woody wetland of total area in each county
- Emergent_Herbaceous_Wetlands_perc
a numeric vector, percentage of emergent herbaceous wetland of total area in each county
- Deciduous_Forest_perc
a numeric vector, percentage of deciduous forest of total area in each county
- Evergreen_Forest_perc
a numeric vector, percentage of evergreen forest of total area in each county
- Mixed_Forest_perc
a numeric vector, percentage of mixed forest of total area in each county
- Shrub_perc
a numeric vector, percentage of shrub of total area in each county
- Grassland_perc
a numeric vector, percentage of grassland of total area in each county
- Pasture_perc
a numeric vector, percentage of pasture of total area in each county
- Cultivated_Crops_perc
a numeric vector, percentage of cultivated crops of total area in each county
- pop_density
a numeric vector, average population density in each county
- summer_tmmx
a numeric vector, average temperature in summer
- winter_tmmx
a numeric vector, average temperature in winter
- summer_rmax
a numeric vector, average humidity in summer
- winter_rmax
a numeric vector, average humidity in winter
Author(s)
Chao Li <chaoli0394@gmail.com> Shunsuke Managi <managi.s@gmail.com>
Examples
data(TransAirPolCalif)
head(TransAirPolCalif)
Reorder spatial rows to match panel ID order
Description
Given a named integer vector 'id_map' (names = unit IDs, values = row indices in 'spatial') produced by 'build_id_map()', this function reorders the rows of 'spatial' so that they align with the panel data ordering.
Usage
align_spatial_to_panel(spatial, id_map)
Arguments
spatial |
An 'sf' object containing at minimum the rows referenced in 'id_map'. |
id_map |
A named integer vector as produced by 'build_id_map()'. Names are unit IDs (character); values are 1-based row indices into 'spatial'. |
Value
An 'sf' object with rows reordered to match 'id_map'.
Public API for GWPR.light 1.0.0
Description
High-level user-facing functions for Geographically Weighted Panel Regression. These four functions form the complete public interface; all internal complexity is hidden behind them.
- gwpr: full pipeline (bandwidth search + fitting + optional diagnostics).
- select_bandwidth: standalone bandwidth search.
- fit_gwpr: fit with a known bandwidth.
- diagnose_gwpr: run diagnostics on a fitted model.
Assert that an object is an sf object
Description
Stops with an informative error when 'spatial' does not inherit from '"sf"'. sp objects are not supported.
Usage
assert_sf(spatial)
Arguments
spatial |
Any R object. |
Value
Invisibly returns 'TRUE' when the check passes.
Grid Search for Bandwidth Selection
Description
Functions implementing an exhaustive grid search over a user-specified range of bandwidth candidates. Each candidate is evaluated by a user-supplied scorer function; the full search history and the best bandwidth are returned as a 'gwpr_bandwidth' object.
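A minimal sketch of the grid search idea, assuming a simplified one-argument scorer (the package scorer is documented elsewhere with the signature 'scorer(context, bandwidth)'). The name 'grid_search_sketch' and the returned list layout are hypothetical and do not reproduce the 'gwpr_bandwidth' class.

```r
# Hypothetical illustration: score every candidate and keep the minimum.
grid_search_sketch <- function(candidates, scorer) {
  scores <- vapply(candidates, scorer, numeric(1))
  list(
    best_bandwidth = candidates[which.min(scores)],
    history = data.frame(bandwidth = candidates, score = scores)
  )
}

# Toy scorer with a known minimum at bandwidth = 3.
toy_scorer <- function(bw) (bw - 3)^2 + 1

grid_result <- grid_search_sketch(seq(1, 5, by = 0.5), toy_scorer)
```

The cost grows linearly with the number of candidates, since each one requires a full model fit in the real setting.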
Random Bandwidth Optimizer ('bandwidth_random.R')
Description
Implements a bounded random search for bandwidth selection. A user-specified number of candidate bandwidths are drawn uniformly at random from '[lower, upper]', scored by a user-supplied scorer function, and the candidate with the lowest score is returned as the best bandwidth.
The search boundaries ('lower', 'upper') **must** be set explicitly by the caller; automatic inference is intentionally not supported.
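The procedure described above can be sketched as below. The function name 'random_search_sketch', its argument names other than 'lower' and 'upper', and the one-argument scorer are assumptions made for illustration.

```r
# Hypothetical illustration of a bounded random search.
random_search_sketch <- function(scorer, lower, upper,
                                 n_candidates = 20, seed = NULL) {
  # Bounds must be given explicitly; nothing is inferred from the data.
  stopifnot(is.numeric(lower), is.numeric(upper), lower < upper)
  if (!is.null(seed)) set.seed(seed)
  candidates <- stats::runif(n_candidates, min = lower, max = upper)
  scores <- vapply(candidates, scorer, numeric(1))
  list(
    best_bandwidth = candidates[which.min(scores)],
    history = data.frame(bandwidth = candidates, score = scores)
  )
}

toy_scorer <- function(bw) (bw - 3)^2
random_result <- random_search_sketch(toy_scorer, lower = 1, upper = 5,
                                      n_candidates = 20, seed = 42)
```

Fixing the seed makes the drawn candidates, and therefore the selected bandwidth, reproducible.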
SGD Bandwidth Search ('bandwidth_sgd.R')
Description
Implements a stochastic gradient descent (SGD) based bandwidth search for GWPR models. A one-dimensional finite-difference gradient approximation is used to iteratively update the bandwidth over a fixed number of epochs. Mini-batch sampling and early stopping are supported.
The search does **not** require the user to specify 'lower', 'upper', or 'step'; SGD starts from a single initialised bandwidth and follows the (approximate) gradient. When 'lower' / 'upper' are supplied they are used as hard constraints.
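A rough sketch of a one-dimensional finite-difference descent, under stated assumptions: it uses a central difference, a deterministic one-argument scorer, and a simple step-size stopping rule, and it omits mini-batch sampling entirely. The name 'sgd_bandwidth_sketch' and the 'tol' argument are hypothetical; the package's actual update rule may differ.

```r
# Hypothetical illustration of gradient descent on the bandwidth.
sgd_bandwidth_sketch <- function(scorer, init, learning_rate = 0.1,
                                 delta = 0.01, epochs = 100,
                                 lower = -Inf, upper = Inf, tol = 1e-8) {
  bw <- init
  epochs_run <- 0L
  for (ep in seq_len(epochs)) {
    epochs_run <- ep
    # Central finite-difference approximation of d(score)/d(bandwidth).
    grad <- (scorer(bw + delta) - scorer(bw - delta)) / (2 * delta)
    # Gradient step, clamped to the hard constraints when supplied.
    bw_new <- min(max(bw - learning_rate * grad, lower), upper)
    converged <- abs(bw_new - bw) < tol
    bw <- bw_new
    if (converged) break  # early stopping
  }
  list(bandwidth = bw, epochs_run = epochs_run)
}

toy_scorer <- function(bw) (bw - 3)^2
sgd_free    <- sgd_bandwidth_sketch(toy_scorer, init = 1)
sgd_bounded <- sgd_bandwidth_sketch(toy_scorer, init = 1, upper = 2)
```

In the bounded run the iterate is pushed against 'upper' and stops there, which illustrates how the optional bounds act as hard constraints rather than as a search grid.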
Build a distance context object
Description
Wraps a distance matrix together with unit IDs into a list suitable for passing to 'get_local_distances()'. Optionally pre-computes the full distance matrix (recommended for small data) or stores only the coordinate matrix for on-the-fly computation (recommended for large data).
Usage
build_distance_context(coords, ids, longlat = FALSE, cache = TRUE)
Arguments
coords |
A numeric matrix with columns 'X' and 'Y'. |
ids |
Character or numeric vector of unit IDs (length = nrow(coords)). |
longlat |
Logical. Passed to 'compute_distance()'. |
cache |
Logical. If 'TRUE' (default), pre-computes and caches the full n x n distance matrix. Set 'FALSE' for very large datasets to avoid memory pressure; 'get_local_distances()' will then compute rows on demand. |
Value
A list with class '"gwpr_distance_context"' containing:
- 'ids'
Character vector of unit IDs.
- 'distance_matrix'
n x n matrix (or 'NULL' if 'cache = FALSE').
- 'coords'
The original coordinate matrix.
- 'longlat'
Logical flag.
Build a mapping from panel unit IDs to spatial row indices
Description
Returns a named integer vector where each name is a panel unit ID (as a character string) and each value is the corresponding row index in 'spatial_data' (1-based).
Usage
build_id_map(panel_data, spatial_data, id)
Arguments
panel_data |
A data frame with a column named 'id'. |
spatial_data |
An 'sf' data frame with a column named 'id'. |
id |
Character; name of the shared ID column. |
Details
Rules:
- Every panel ID must have a spatial match; missing IDs cause an error.
- Extra spatial rows (not in panel) are silently ignored.
Value
Named integer vector mapping unit ID to spatial row index.
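The mapping rules can be illustrated with plain ID vectors instead of data frames. This is a sketch only: 'build_id_map_sketch' is a hypothetical name, and it takes the ID columns directly rather than 'panel_data', 'spatial_data', and 'id'.

```r
# Hypothetical illustration of the ID mapping rules.
build_id_map_sketch <- function(panel_ids, spatial_ids) {
  unit_ids <- unique(as.character(panel_ids))
  idx <- match(unit_ids, as.character(spatial_ids))
  # Rule 1: every panel ID needs a spatial match.
  if (anyNA(idx)) {
    stop("panel IDs without a spatial match: ",
         paste(unit_ids[is.na(idx)], collapse = ", "))
  }
  # Rule 2: spatial rows not referenced by the panel are simply never indexed.
  stats::setNames(idx, unit_ids)
}

spatial_ids <- c(6003, 6005, 6001)
id_map <- build_id_map_sketch(panel_ids = c(6001, 6001, 6003),
                              spatial_ids = spatial_ids)
aligned_ids <- spatial_ids[id_map]  # spatial IDs reordered to panel order
```

Indexing the spatial rows by 'id_map' is the reordering step that 'align_spatial_to_panel()' is documented to perform.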
Build the model frame from the formula and panel data
Description
Build the model frame from the formula and panel data
Usage
build_model_frame(context)
Arguments
context |
A 'gwpr_context' with 'formula' and 'panel_data' populated. |
Value
Updated context with 'model_frame' populated.
Build the design matrix and response vector
Description
Extracts the response variable 'y' and the design matrix 'X' from the model frame. For 'binomial' family, the response is standardised to 0/1 via 'standardize_binary_response()'.
Usage
build_model_matrix(context)
Arguments
context |
A 'gwpr_context' with 'model_frame', 'formula', and 'family' populated. |
Value
Updated context with 'model_matrix' and 'response' populated.
Build a neighbour structure from an sf object
Description
Returns a different structure depending on 'type'; see Details.
Usage
build_neighbor_structure(spatial, type = c("distance", "contiguity"))
Arguments
spatial |
An 'sf' object. |
type |
Character scalar: '"distance"' (default) or '"contiguity"'. |
Details
- '"distance"'
A numeric coordinate matrix (columns 'X' and 'Y') suitable for pairwise distance computation.
- '"contiguity"'
A named list where each element is the integer vector of neighbour row indices (1-based, Queen contiguity via 'sf::st_relate()'). Used for spatial diagnostics such as Moran's I.
Value
- For '"distance"': a numeric matrix with columns 'X' and 'Y'.
- For '"contiguity"': a named list of integer vectors.
Classify memory risk level
Description
Maps an estimated byte count to a human-readable risk category.
Usage
classify_memory_risk(estimated_bytes)
Arguments
estimated_bytes |
Non-negative numeric; total estimated bytes. |
Details
Thresholds:
- low: < 500 MB
- medium: 500 MB – 2 GB
- high: > 2 GB
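The thresholds above can be sketched as a simple lookup. Two points are assumptions, since the source does not state them: 1 MB is taken as 1024^2 bytes, and a value of exactly 2 GB is classed as "medium". The name 'classify_memory_risk_sketch' is hypothetical.

```r
# Hypothetical illustration of the threshold lookup.
classify_memory_risk_sketch <- function(estimated_bytes, mb = 1024^2) {
  stopifnot(is.numeric(estimated_bytes), length(estimated_bytes) == 1,
            estimated_bytes >= 0)
  if (estimated_bytes < 500 * mb) {
    "low"
  } else if (estimated_bytes <= 2048 * mb) {
    "medium"       # 500 MB up to and including 2 GB (assumed boundary)
  } else {
    "high"
  }
}

risk_small  <- classify_memory_risk_sketch(175392)
risk_medium <- classify_memory_risk_sketch(1024^3)
risk_large  <- classify_memory_risk_sketch(3 * 1024^3)
```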
Value
A character string: "low", "medium", or "high".
Compute a full pairwise distance matrix from a coordinate matrix
Description
Compute a full pairwise distance matrix from a coordinate matrix
Usage
compute_distance(coords, longlat = FALSE)
Arguments
coords |
A numeric matrix with at least two columns ('X' and 'Y', or the first two columns if names are absent). Each row is one spatial unit. |
longlat |
Logical. If 'TRUE', great-circle distances (in kilometres) are calculated using the Haversine formula. If 'FALSE' (default), Euclidean distance is used. |
Value
An n x n symmetric numeric matrix where element [i, j] is the distance between unit i and unit j. Diagonal is 0.
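A sketch of both distance modes, for illustration only. The name 'distance_matrix_sketch' is hypothetical, and the Earth radius of 6371 km is an assumption; the radius used by the package is not stated here. Longitude is read from the first column and latitude from the second, both in degrees.

```r
# Hypothetical illustration of Euclidean and Haversine distance matrices.
distance_matrix_sketch <- function(coords, longlat = FALSE, radius_km = 6371) {
  coords <- as.matrix(coords)[, 1:2, drop = FALSE]
  if (!longlat) {
    # Euclidean distance between all pairs of rows.
    return(unname(as.matrix(stats::dist(coords))))
  }
  lon <- coords[, 1] * pi / 180
  lat <- coords[, 2] * pi / 180
  dlat <- outer(lat, lat, "-")
  dlon <- outer(lon, lon, "-")
  # Haversine formula; sqrt(a) is capped at 1 to guard against rounding.
  a <- sin(dlat / 2)^2 + outer(cos(lat), cos(lat)) * sin(dlon / 2)^2
  2 * radius_km * asin(pmin(sqrt(a), 1))
}

# Equator to pole along one meridian: a quarter of a great circle.
D_geo <- distance_matrix_sketch(rbind(c(0, 0), c(0, 90)), longlat = TRUE)
# A 3-4-5 right triangle in projected coordinates.
D_euc <- distance_matrix_sketch(rbind(c(0, 0), c(3, 4)))
```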
Compute geographically weighted kernel weights
Description
Given a numeric vector of distances from one focal unit to all others, returns a weight for each unit according to the chosen kernel and bandwidth.
Usage
compute_kernel_weights(distance, bandwidth, kernel, adaptive)
Arguments
distance |
Numeric vector of distances from the focal unit to all spatial units (length n). |
bandwidth |
Numeric scalar. For fixed bandwidth: distance scale. For adaptive bandwidth: positive integer number of neighbours. |
kernel |
Character scalar, one of '"bisquare"', '"gaussian"', '"exponential"', '"tricube"', '"boxcar"'. |
adaptive |
Logical. 'TRUE' for adaptive (kNN) bandwidth. |
Details
For **fixed** bandwidth ('adaptive = FALSE') the bandwidth parameter is a distance threshold or scale parameter used directly in the kernel formula.
For **adaptive** bandwidth ('adaptive = TRUE') the bandwidth parameter is the number of nearest neighbours k. The function first identifies the k-th smallest distance (among all units, including the focal unit itself at distance 0) and uses that distance as the effective bandwidth in the kernel formula.
Kernel formulae (d = distance, bw = effective bandwidth):
- 'bisquare': '(1 - (d/bw)^2)^2' for 'd <= bw', else 0.
- 'gaussian': 'exp(-0.5 * (d/bw)^2)'.
- 'exponential': 'exp(-d/bw)'.
- 'tricube': '(1 - (d/bw)^3)^3' for 'd <= bw', else 0.
- 'boxcar': '1' for 'd <= bw', else 0.
Value
A numeric vector of length n with non-negative kernel weights.
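The kernel formulae and the adaptive rule from Details can be written out directly. This is an illustrative re-implementation, not the package source; 'kernel_weights_sketch' is a hypothetical name and input validation is omitted.

```r
# Hypothetical illustration of the kernel formulae listed in Details.
kernel_weights_sketch <- function(distance, bandwidth,
                                  kernel = "bisquare", adaptive = FALSE) {
  # Adaptive case: the effective bandwidth is the k-th smallest distance,
  # counting the focal unit itself (distance 0).
  bw <- if (adaptive) sort(distance)[bandwidth] else bandwidth
  u <- distance / bw
  switch(kernel,
    bisquare    = ifelse(distance <= bw, (1 - u^2)^2, 0),
    gaussian    = exp(-0.5 * u^2),
    exponential = exp(-u),
    tricube     = ifelse(distance <= bw, (1 - u^3)^3, 0),
    boxcar      = as.numeric(distance <= bw),
    stop("unknown kernel: ", kernel)
  )
}

d <- c(0, 1, 2, 3)
w_fixed    <- kernel_weights_sketch(d, bandwidth = 2, kernel = "bisquare")
w_adaptive <- kernel_weights_sketch(d, bandwidth = 3, kernel = "boxcar",
                                    adaptive = TRUE)
```

With 'bandwidth = 3' in the adaptive call, the third smallest distance (2) becomes the effective bandwidth, so the three nearest units, including the focal unit, receive weight 1.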
Internal Context Object for GWPR.light 1.0.0
Description
These internal functions construct and validate the standardised 'gwpr_context' list that is passed between modules, eliminating repetitive argument passing.
Data Preparation Module for GWPR.light 1.0.0
Description
Internal functions that convert user inputs into the internal data structures required by the model engine: panel indices, spatial alignment, model frame, and model matrix.
Run diagnostic tests on a fitted GWPR model
Description
Top-level interface that dispatches to individual diagnostic sub-functions. Returns a 'gwpr_diagnostics' object containing all requested test results.
Usage
diagnose_gwpr(
object,
diagnostics = c("moran", "f_test", "hausman", "lm_test"),
spatial_weights = NULL,
panel_index = NULL,
...
)
Arguments
object |
A 'gwpr_fit' object returned by 'fit_gwpr()' or similar. |
diagnostics |
Character vector naming the tests to run. Any subset of 'c("moran", "f_test", "hausman", "lm_test")'. Default is all four. |
spatial_weights |
Required when '"moran"' is in 'diagnostics'. A row-standardised n x n spatial weights matrix. |
panel_index |
Required when '"moran"' is in 'diagnostics'. A data.frame with columns 'id' and 'time' identifying each element of 'object$residuals'. |
... |
Additional arguments passed to individual diagnostic functions. |
Details
Tests that are not applicable to the fitted model (e.g., Hausman test on a pooling model) return a list with 'status = "not_applicable"' and an explanatory 'message', rather than an error.
Value
A 'gwpr_diagnostics' object (list with class '"gwpr_diagnostics"') whose 'diagnostics' slot contains the result of each requested test.
Examples
library(sf)
pts <- sf::st_as_sf(
data.frame(id = 1:4, X = c(0,1,0,1), Y = c(0,0,1,1)),
coords = c("X", "Y"), crs = NA_integer_
)
dat <- data.frame(
id = rep(1:4, each = 5),
time = rep(1:5, 4),
y = rnorm(20),
x1 = rnorm(20)
)
fit <- fit_gwpr(y ~ x1, data = dat, spatial = pts, id = "id",
time = "time", bandwidth = 2, workers = 1)
diag_result <- diagnose_gwpr(fit, diagnostics = c("f_test", "hausman"))
print(diag_result)
Local Hausman test diagnostic on a gwpr_fit object
Description
Performs a local Hausman test (within vs. random) for each spatial unit using test statistics pre-computed during model fitting or stored in 'local_results'.
Usage
diagnose_hausman(object, ...)
Arguments
object |
A 'gwpr_fit' object. |
... |
Currently ignored. |
Details
**Applicable models**: gaussian with 'model = "random"'. For pooling models the Hausman test is not meaningful; the function returns a 'status = "not_applicable"' result. For logistic models the function also returns 'status = "not_applicable"'.
**Panel balance requirement**: No constraint at the unit level.
**Failure conditions**: Returns 'status = "missing_hausman_data"' for any unit where the required statistics are absent from 'local_results'.
**Logistic interpretation limit**: Not applicable.
Value
A named list with elements:
- 'local_hausman'
Data frame with columns 'unit_id', 'statistic', 'p_value', 'df', 'status'.
- 'n_tested'
Number of units tested.
- 'n_failed'
Number of units where the test could not be computed.
- 'status'
Overall status: '"ok"', '"not_applicable"', or '"no_local_results"'.
Examples
library(sf)
pts <- sf::st_as_sf(
data.frame(id = 1:4, X = c(0,1,0,1), Y = c(0,0,1,1)),
coords = c("X", "Y"), crs = NA_integer_
)
dat <- data.frame(
id = rep(1:4, each = 5),
time = rep(1:5, 4),
y = rnorm(20),
x1 = rnorm(20)
)
fit <- fit_gwpr(y ~ x1, data = dat, spatial = pts, id = "id",
time = "time", bandwidth = 2, workers = 1)
diagnose_hausman(fit)
Local Breusch-Pagan LM test diagnostic on a gwpr_fit object
Description
Performs a local Breusch-Pagan Lagrange Multiplier test for random effects for each spatial unit using test statistics stored in 'local_results'.
Usage
diagnose_lm(object, ...)
Arguments
object |
A 'gwpr_fit' object. |
... |
Currently ignored. |
Details
**Applicable models**: gaussian with 'model = "pooling"' or 'model = "random"'. For 'within' models the test is not directly applicable (it tests for random effects vs. OLS); the function returns 'status = "not_applicable"'. For logistic models also not applicable.
**Panel balance requirement**: No constraint at the unit level.
**Failure conditions**: Returns 'status = "missing_lm_data"' for units missing the required statistics.
**Logistic interpretation limit**: Not applicable.
Value
A named list with elements:
- 'local_lm'
Data frame with columns 'unit_id', 'statistic', 'p_value', 'df', 'status'.
- 'n_tested'
Number of units tested.
- 'n_failed'
Number of units where the test could not be computed.
- 'status'
Overall status.
Examples
library(sf)
pts <- sf::st_as_sf(
data.frame(id = 1:4, X = c(0,1,0,1), Y = c(0,0,1,1)),
coords = c("X", "Y"), crs = NA_integer_
)
dat <- data.frame(
id = rep(1:4, each = 5),
time = rep(1:5, 4),
y = rnorm(20),
x1 = rnorm(20)
)
fit <- fit_gwpr(y ~ x1, data = dat, spatial = pts, id = "id",
time = "time", bandwidth = 2, workers = 1)
diagnose_lm(fit)
Local F test diagnostic on a gwpr_fit object
Description
Performs a local F test (fixed effects vs. pooling) using per-unit local residuals stored in the fitted model object.
Usage
diagnose_local_f(object, ...)
Arguments
object |
A 'gwpr_fit' object. |
... |
Currently ignored. |
Details
**Applicable models**: gaussian (linear). Not applicable to logistic models; returns a 'status = "not_applicable"' result when 'family = "binomial"'.
**Panel balance requirement**: No constraint; the test uses per-unit local residuals already computed during fitting.
**Failure conditions**: If 'local_results' is empty or missing, all units are reported as failed. If a unit's local result does not contain the information needed (within and pooling residuals), that unit is reported as failed with an informative 'status'.
**Logistic interpretation limit**: Not applicable; see above.
Value
A named list with elements:
- 'local_f'
Data frame with columns 'unit_id', 'statistic', 'p_value', 'df1', 'df2', 'status'.
- 'n_tested'
Number of units tested.
- 'n_failed'
Number of units where the test could not be computed.
Examples
library(sf)
pts <- sf::st_as_sf(
data.frame(id = 1:4, X = c(0,1,0,1), Y = c(0,0,1,1)),
coords = c("X", "Y"), crs = NA_integer_
)
dat <- data.frame(
id = rep(1:4, each = 5),
time = rep(1:5, 4),
y = rnorm(20),
x1 = rnorm(20)
)
fit <- fit_gwpr(y ~ x1, data = dat, spatial = pts, id = "id",
time = "time", bandwidth = 2, workers = 1)
diagnose_local_f(fit)
Run Moran's I diagnostic on a gwpr_fit object
Description
Extracts residuals from a fitted GWPR model (Pearson residuals for logistic models, raw residuals for linear models) and computes the panel Moran's I statistic.
Usage
diagnose_moran(object, spatial_weights, panel_index, ...)
Arguments
object |
A 'gwpr_fit' object returned by 'fit_gwpr()' or similar. |
spatial_weights |
A row-standardised n x n spatial weights matrix. 'n' must equal the number of spatial individuals in the fitted model. |
panel_index |
A data.frame or list with columns/elements 'id' and 'time' that identify each element of 'object$residuals'. |
... |
Currently ignored. |
Details
**Applicable models**: gaussian (linear residuals) and binomial (Pearson residuals).
**Panel balance**: See 'compute_panel_moran()'.
**Failure conditions**: Fails if 'object' is not a 'gwpr_fit', if 'object$residuals' is 'NULL', or if 'spatial_weights' dimensions do not match the number of individuals.
**Logistic interpretation limit**: Moran's I computed on Pearson residuals is exploratory; the asymptotic distribution differs from the linear case.
Value
A named list compatible with the 'diagnostics' slot of a 'gwpr_diagnostics' object. Contains the elements returned by 'compute_panel_moran()' plus 'residual_type'.
Examples
library(sf)
pts <- sf::st_as_sf(
data.frame(id = 1:4, X = c(0,1,0,1), Y = c(0,0,1,1)),
coords = c("X", "Y"), crs = NA_integer_
)
dat <- data.frame(
id = rep(1:4, each = 5),
time = rep(1:5, 4),
y = rnorm(20),
x1 = rnorm(20)
)
fit <- fit_gwpr(y ~ x1, data = dat, spatial = pts, id = "id",
time = "time", bandwidth = 2, workers = 1)
W <- matrix(1/3, nrow = 4, ncol = 4); diag(W) <- 0
idx <- dat[, c("id", "time")]
diagnose_moran(fit, W, idx)
Diagnostics Module for GWPR.light 1.0.0
Description
Unified diagnostic interface for Geographically Weighted Panel Regression models. Provides Moran's I (spatial autocorrelation), local F test (fixed vs. pooling), local Hausman test (fixed vs. random), and local Breusch-Pagan LM test.
Details
**Model applicability**
| Diagnostic | Linear | Logistic | Notes |
|------------|--------|----------|-------|
| moran | yes | yes | Logistic uses Pearson residuals |
| f_test | yes | no | Requires within and pooling models |
| hausman | yes | no | Only meaningful for random-effect models |
| lm_test | yes | no | Pooling or random-effect models |
**Panel balance**
'compute_panel_moran()' is fully supported for balanced panels. For unbalanced panels a 'warning()' is issued and the function attempts computation using only the time periods present in every individual; results may be unreliable.
**Logistic interpretation**
Moran's I computed from Pearson residuals of a logistic model does not follow the same asymptotic distribution as for linear models. Treat the test result as an exploratory heuristic, not a formal test.
Build a fixest::feglm formula string from the user-facing effect string
Description
Build a fixest::feglm formula string from the user-facing effect string
Usage
effect_to_feglm_fml(base_formula, effect, id_col, time_col)
Arguments
base_formula |
A formula object (without fixed-effect terms). |
effect |
Character scalar: one of |
id_col |
Name of the individual ID column. |
time_col |
Name of the time column. |
Value
A formula suitable for fixest::feglm().
Append random-effect terms to a base formula for glmmTMB
Description
Append random-effect terms to a base formula for glmmTMB
Usage
effect_to_glmmtmb_fml(base_formula, effect, id_col, time_col)
Arguments
base_formula |
A formula object (without random-effect terms). |
effect |
Character scalar: one of |
id_col |
Name of the individual ID column. |
time_col |
Name of the time column. |
Value
A formula suitable for glmmTMB::glmmTMB().
Map user-facing effect string to plm effect parameter
Description
Map user-facing effect string to plm effect parameter
Usage
effect_to_plm(effect)
Arguments
effect |
Character scalar: one of '"individual"', '"time"', '"two-way"', '"nested"'. |
Value
Character scalar accepted by 'plm::plm()' for its 'effect' argument: '"individual"', '"time"', or '"twoways"'. For '"nested"', the function stops with an informative message because 'plm' does not support nested effects without additional data conventions.
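The mapping described in Value amounts to a small lookup. The sketch below is illustrative; 'effect_to_plm_sketch' is a hypothetical name and the error wording is invented for the example.

```r
# Hypothetical illustration of the effect-string mapping.
effect_to_plm_sketch <- function(effect) {
  switch(effect,
    "individual" = "individual",
    "time"       = "time",
    "two-way"    = "twoways",
    "nested"     = stop("'nested' effects are not supported for plm models"),
    stop("unknown effect: ", effect)
  )
}

plm_effect <- effect_to_plm_sketch("two-way")
nested_attempt <- try(effect_to_plm_sketch("nested"), silent = TRUE)
```

Only the user-facing spelling "two-way" changes; the other accepted values pass through unchanged.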
Estimate memory usage for a GWPR run
Description
Calculates an approximate memory requirement (in bytes) based on the data dimensions stored in a 'gwpr_context' object, the number of parallel workers, and whether the full distance matrix will be cached.
Usage
estimate_memory(context, workers = 1, cache_distance = NULL)
Arguments
context |
A 'gwpr_context' list containing at least
|
workers |
Positive integer. Number of parallel workers. |
cache_distance |
Logical or |
Details
The two main cost components are:
- Distance matrix (when cache_distance = TRUE): n_units^2 * 8 bytes.
- Local model working copies (one per worker): n_rows * n_vars * 8 * workers bytes.
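A worked example of the two cost components, using the dimensions of the bundled California panel (58 counties, 928 rows, hence 16 time periods). The values 'n_vars = 5' and 'workers = 4' are chosen only for illustration, and treating the total as the plain sum of the two components is an assumption.

```r
# Worked arithmetic for the two cost components (8 bytes per double).
n_units <- 58
n_time  <- 16
n_vars  <- 5   # illustrative value
workers <- 4   # illustrative value

n_rows         <- n_units * n_time                 # 928
distance_bytes <- n_units^2 * 8                    # 26912
model_bytes    <- n_rows * n_vars * 8 * workers    # 148480
total_bytes    <- distance_bytes + model_bytes     # 175392 (assumed to be the sum)
```

The distance matrix grows quadratically with the number of units, so it dominates for large spatial datasets; this is the case where disabling the cache is suggested.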
Value
A named list with class "gwpr_memory_estimate":
- n_units
Number of spatial units.
- n_time
Number of time periods.
- n_vars
Number of explanatory variables.
- n_rows
Total panel rows (n_units * n_time).
- workers
Workers used for the estimate.
- cache_distance
Whether distance caching was assumed.
- distance_bytes
Bytes for the distance matrix (0 if not cached).
- model_bytes
Bytes for local model copies across workers.
- total_bytes
Total estimated bytes.
- risk
Character risk level: "low", "medium", or "high".
Extract representative XY coordinates from an sf object
Description
For POINT geometries the point coordinates are returned directly. For all other geometry types (POLYGON, MULTIPOLYGON, etc.) 'sf::st_centroid()' is used to derive a representative point, with warnings suppressed (they are typically non-actionable geographic-CRS notes).
Usage
extract_coordinates(spatial)
Arguments
spatial |
An 'sf' object. |
Value
A numeric matrix with columns 'X' and 'Y', one row per feature.
Extract representative XY coordinates from an sf object
Description
Uses point coordinates for POINT geometries and centroids for other geometry types (POLYGON, MULTIPOLYGON, etc.).
Usage
extract_coords_from_sf(spatial)
Arguments
spatial |
An 'sf' object. |
Value
A numeric matrix with columns 'X' and 'Y'.
Extract the geometry column from an sf object
Description
Returns the 'sfc' geometry column of the supplied 'sf' object.
Usage
extract_geometry(spatial)
Arguments
spatial |
An 'sf' object. |
Value
An 'sfc' geometry column.
Extract coefficients and diagnostics from a local linear model
Description
Extract coefficients and diagnostics from a local linear model
Usage
extract_linear_local_result(local_result)
Arguments
local_result |
A list as returned by 'fit_linear_local_model()'. |
Value
A named list with elements:
- 'coefficients'
Named numeric vector of local coefficient estimates.
- 'se'
Named numeric vector of standard errors (same names).
- 'tvalues'
Named numeric vector of t-statistics.
- 'local_r2'
Numeric scalar or 'NA_real_'.
- 'local_aic'
Numeric scalar or 'NA_real_'.
- 'status'
'"ok"' or '"failed"'.
- 'error'
'NULL' or character error message.
Extract coefficients and diagnostics from a local logistic model
Description
Computes predicted probabilities, predicted classes, Pearson residuals,
and extracts coefficient estimates. Pearson residuals are defined as
(y - p) / sqrt(p * (1 - p)) with p clipped to
[eps, 1 - eps] to avoid division by zero.
Usage
extract_logistic_local_result(
local_result,
local_data,
formula,
threshold = 0.5,
eps = 1e-15
)
Arguments
local_result |
A list as returned by |
local_data |
The |
formula |
The model formula (used to extract the response name). |
threshold |
Numeric scalar; classification threshold (default 0.5). |
eps |
Numeric scalar; clipping bound for probability (default
|
Value
A named list with elements:
- coefficients
Named numeric vector of local coefficient estimates, or NA_real_ on failure.
- prob
Numeric vector of predicted probabilities for the local data rows, or NA_real_ on failure.
- class_pred
Integer vector (0/1) of predicted classes, or NA_real_ on failure.
- pearson_resid
Numeric vector of Pearson residuals, or NA_real_ on failure.
- status
"ok" or "failed".
- error
NULL or character error message.
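The Pearson residual definition given in the Description, (y - p) / sqrt(p * (1 - p)) with p clipped to [eps, 1 - eps], can be sketched as below. 'pearson_resid_sketch' is a hypothetical name used only for this illustration.

```r
# Hypothetical illustration of clipped Pearson residuals.
pearson_resid_sketch <- function(y, p, eps = 1e-15) {
  # Clip probabilities away from 0 and 1 to avoid division by zero.
  p <- pmin(pmax(p, eps), 1 - eps)
  (y - p) / sqrt(p * (1 - p))
}

y <- c(1, 0, 1)
p <- c(0.8, 0.2, 1)   # the last probability sits exactly on the boundary
r <- pearson_resid_sketch(y, p)
```

Without clipping, the third residual would be 0 / 0; with clipping it stays finite.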
Fit GWPR with a given bandwidth
Description
Validates inputs, prepares data, builds spatial weights, and fits the
Geographically Weighted Panel Regression for the specified bandwidth.
Returns a gwpr_fit object.
Usage
fit_gwpr(
formula,
data,
spatial,
id,
time,
bandwidth,
family = c("gaussian", "binomial"),
model = c("within", "pooling", "random"),
effect = c("individual", "time", "two-way", "nested"),
kernel = c("bisquare", "gaussian", "exponential", "tricube", "boxcar"),
adaptive = FALSE,
threshold = 0.5,
workers = 1L,
seed = NULL,
...
)
Arguments
formula |
A |
data |
A |
spatial |
An |
id |
Character scalar; unit ID column name. |
time |
Character scalar; time column name. |
bandwidth |
Numeric scalar. The bandwidth to use (fixed distance or
number of neighbours when |
family |
|
model |
|
effect |
|
kernel |
Kernel function name (default |
adaptive |
Logical; |
threshold |
Numeric; classification threshold (binomial only,
default |
workers |
Positive integer; number of parallel workers (default 1). |
seed |
Integer random seed, or |
... |
Currently unused. |
Value
A gwpr_fit object.
Examples
library(sf)
pts <- sf::st_as_sf(
data.frame(id = 1:4, X = c(0,1,0,1), Y = c(0,0,1,1)),
coords = c("X", "Y"), crs = NA_integer_
)
dat <- data.frame(
id = rep(1:4, each = 5),
time = rep(1:5, 4),
y = rnorm(20),
x1 = rnorm(20)
)
fit <- fit_gwpr(y ~ x1, data = dat, spatial = pts, id = "id",
time = "time", bandwidth = 2, workers = 1)
print(fit)
Fit a single local panel linear model
Description
Fits a geographically weighted panel linear model for one focal spatial unit. Errors are caught and returned as a structured failure result rather than propagating to the caller.
Usage
fit_linear_local_model(
local_data,
formula,
model,
effect,
weights,
index,
random_method = "swar"
)
Arguments
local_data |
A 'pdata.frame' or plain 'data.frame' with columns for all formula variables plus the panel indices. |
formula |
A formula object. |
model |
Character scalar: '"pooling"', '"within"', or '"random"'. |
effect |
Character scalar: '"individual"', '"time"', '"two-way"', or '"nested"'. |
weights |
Numeric vector of kernel weights aligned with the rows of 'local_data'. |
index |
Character vector of length 2 giving the panel index column names: 'c(id_col, time_col)'. |
random_method |
Character scalar; estimation method for variance components when 'model = "random"' (default '"swar"'). |
Value
A list with elements:
- 'fit'
The fitted model object, or 'NULL' on failure.
- 'status'
'"ok"' or '"failed"'.
- 'error'
'NULL' or character string with the error message.
- 'metadata'
Named list with additional model-fitting metadata, e.g. flagging single-observation individuals for within models.
Fit a fixed-effects logistic model via fixest::feglm
Description
Fit a fixed-effects logistic model via fixest::feglm
Usage
fit_logistic_fixed(
local_data,
formula,
effect,
weights,
id_col,
time_col,
family = "binomial"
)
Arguments
local_data |
A |
formula |
A formula object with a binary response (no fixed-effect
terms; those are added automatically from |
effect |
Character scalar: one of |
weights |
Numeric vector of kernel weights. |
id_col |
Name of the individual ID column. |
time_col |
Name of the time column. |
family |
Character scalar reserved for future extension. |
Value
A fixest object.
Fit a single local binary logistic panel model
Description
Dispatches to the correct backend (pooling, fixed, or
random) based on model, wrapping execution in a
tryCatch so that convergence failures or complete-separation errors
are captured rather than propagated.
Usage
fit_logistic_local_model(
local_data,
formula,
model,
effect,
weights,
index,
threshold = 0.5,
family = "binomial"
)
Arguments
local_data |
A |
formula |
A formula object. |
model |
Character scalar: |
effect |
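Character scalar: '"individual"', '"time"', '"two-way"', or '"nested"'. |
weights |
Numeric vector of kernel weights aligned with the rows of 'local_data'. |
index |
Character vector of length 2: 'c(id_col, time_col)'. |
threshold |
Numeric scalar; classification threshold (default 0.5). |
family |
Character scalar reserved for future extension (currently only '"binomial"' is supported). |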
Value
A list with elements:
- 'fit'
The fitted model object, or 'NULL' on failure.
- 'status'
'"ok"' or '"failed"'.
- 'error'
'NULL' or character error message.
- 'metadata'
Named list with additional fitting metadata.
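The capture-rather-than-propagate behaviour described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the helper name 'safe_local_fit' and the temporary '.w' column are hypothetical, and only the pooled (stats::glm) backend is shown.

```r
# Hypothetical helper (not part of GWPR.light): fit a weighted pooled
# logistic model and capture any error in the returned list.
safe_local_fit <- function(local_data, formula, weights) {
  local_data$.w <- weights  # assumed temporary column holding kernel weights
  tryCatch(
    {
      # Non-integer weights trigger a harmless binomial warning; silence it.
      fit <- suppressWarnings(
        stats::glm(formula, data = local_data,
                   family = stats::binomial(), weights = .w)
      )
      list(fit = fit, status = "ok", error = NULL, metadata = list())
    },
    error = function(e) {
      list(fit = NULL, status = "failed",
           error = conditionMessage(e), metadata = list())
    }
  )
}

set.seed(1)
d <- data.frame(y = rbinom(30, 1, 0.5), x1 = rnorm(30))
ok  <- safe_local_fit(d, y ~ x1, weights = runif(30))
bad <- safe_local_fit(d, y ~ missing_var, weights = runif(30))
ok$status
bad$status
```

Returning a status field instead of raising lets a caller loop over many focal units and count failures afterwards, which is presumably why the local fitters share this list shape.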
Fit a pooled logistic model via stats::glm
Description
Fit a pooled logistic model via stats::glm
Usage
fit_logistic_pooling(local_data, formula, weights, family = "binomial")
Arguments
local_data |
A data frame. |
formula |
A formula object with a binary response. |
weights |
Numeric vector of kernel weights (same length as 'nrow(local_data)'). |
family |
Character scalar reserved for future extension; currently only '"binomial"' is supported. |
Value
A glm object.
Fit a random-effects logistic model via glmmTMB::glmmTMB
Description
Fit a random-effects logistic model via glmmTMB::glmmTMB
Usage
fit_logistic_random(
local_data,
formula,
effect,
weights,
id_col,
time_col,
family = "binomial"
)
Arguments
local_data |
A data frame. |
formula |
A formula object with a binary response (no random-effect
terms; those are added automatically from |
effect |
Character scalar: one of '"individual"', '"time"', '"two-way"', or '"nested"'. |
weights |
Numeric vector of kernel weights. |
id_col |
Name of the individual ID column. |
time_col |
Name of the time column. |
family |
Character scalar reserved for future extension. |
Value
A glmmTMB object.
Format a human-readable memory warning message
Description
Converts a 'gwpr_memory_estimate' object (produced by
estimate_memory) into a readable character string. For
high-risk estimates the message also includes actionable suggestions.
Usage
format_memory_warning(memory_estimate)
Arguments
memory_estimate |
A 'gwpr_memory_estimate' list, typically the return value of 'estimate_memory()'. |
Value
A character string containing the warning text. The string is
suitable for passing to message() or warning().
Extract distances from one focus unit to all others
Description
Extract distances from one focus unit to all others
Usage
get_local_distances(distance_context, focus_id)
Arguments
distance_context |
A list as returned by 'build_distance_context()', or a plain n x n numeric distance matrix. If a plain matrix is supplied, the rows and columns must already be in the same order as the spatial units. |
focus_id |
Integer scalar (1-based) or character matching a row/column name of the distance matrix. The focal unit whose distances are extracted. |
Value
A numeric vector of length n giving the distance from the focus unit to every unit (including itself, which is 0).
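For the plain-matrix case, the lookup amounts to selecting one row of the distance matrix by position or by name. The sketch below assumes a square matrix and uses a hypothetical helper name, 'local_distances'; it does not handle the list form returned by 'build_distance_context()'.

```r
# Hypothetical helper (not part of GWPR.light): extract distances from one
# focal unit, given a plain n x n distance matrix.
local_distances <- function(dist_matrix, focus_id) {
  stopifnot(is.matrix(dist_matrix), nrow(dist_matrix) == ncol(dist_matrix))
  if (is.character(focus_id) && !focus_id %in% rownames(dist_matrix)) {
    stop("Unknown focus_id: ", focus_id)
  }
  # Works for a 1-based integer index or a row name; names are dropped.
  as.numeric(dist_matrix[focus_id, ])
}

coords <- cbind(x = c(0, 1, 0, 1), y = c(0, 0, 1, 1))
d <- as.matrix(stats::dist(coords))
rownames(d) <- colnames(d) <- c("a", "b", "c", "d")

by_index <- local_distances(d, 1L)   # distances from unit "a"
by_name  <- local_distances(d, "b")  # distances from unit "b"
```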
Fit a Geographically Weighted Panel Regression (main entry point)
Description
Orchestrates the complete GWPR pipeline: input validation, data preparation, optional memory estimation, optional bandwidth search, model fitting, and optional diagnostics.
Usage
gwpr(
formula,
data,
spatial,
id,
time,
family = c("gaussian", "binomial"),
model = c("within", "pooling", "random"),
effect = c("individual", "time", "two-way", "nested"),
bandwidth = NULL,
bandwidth_method = c("sgd", "grid", "random"),
bandwidth_control = list(),
kernel = c("bisquare", "gaussian", "exponential", "tricube", "boxcar"),
adaptive = FALSE,
threshold = 0.5,
workers = 1L,
seed = NULL,
diagnostics = TRUE,
...
)
Arguments
formula |
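A 'formula' object. |
data |
A data frame. |
spatial |
An 'sf' object. |
id |
Character scalar. Name of the unit (individual) ID column shared by 'data' and 'spatial'. |
time |
Character scalar. Name of the time-period column in 'data'. |
family |
Character scalar. Model family: '"gaussian"' or '"binomial"'. |
model |
Character scalar. Panel model type: '"within"', '"pooling"', or '"random"'. |
effect |
Character scalar. Panel effect: '"individual"', '"time"', '"two-way"', or '"nested"'. |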
bandwidth |
Numeric scalar or 'NULL'. |
bandwidth_method |
Character scalar. Method for automatic bandwidth search: '"sgd"', '"grid"', or '"random"'. |
bandwidth_control |
Named list of control parameters passed to the
bandwidth search function. For |
kernel |
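Character scalar. Kernel function: '"bisquare"', '"gaussian"', '"exponential"', '"tricube"', or '"boxcar"'. |
adaptive |
Logical scalar. 'TRUE' for adaptive (k-nearest-neighbour) bandwidth, 'FALSE' for fixed distance bandwidth. |
threshold |
Numeric scalar. Classification threshold for '"binomial"' models. |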
workers |
Positive integer. Number of parallel workers. |
seed |
Integer or 'NULL'. |
diagnostics |
Logical scalar. When |
... |
Additional arguments passed to the bandwidth search or fitting functions. |
Value
A 'gwpr_fit' object. Key fields:
- 'local_results'
Per-unit local model results.
- 'predictions'
In-sample predicted values / probabilities.
- 'residuals'
Residuals or Pearson residuals.
- 'metrics'
Overall goodness-of-fit metrics.
- 'spatial_results'
Data frame of per-unit coefficients.
- 'search'
Bandwidth search result ('gwpr_bandwidth'), or 'NULL' when bandwidth was supplied directly.
- 'diagnostics'
A 'gwpr_diagnostics' object, or 'NULL'.
Examples
# Minimal linear GWPR with a fixed bandwidth
library(sf)
pts <- sf::st_as_sf(
data.frame(id = 1:4, X = c(0,1,0,1), Y = c(0,0,1,1)),
coords = c("X", "Y"), crs = NA_integer_
)
dat <- data.frame(
id = rep(1:4, each = 5),
time = rep(1:5, 4),
y = rnorm(20),
x1 = rnorm(20)
)
fit <- gwpr(y ~ x1, data = dat, spatial = pts, id = "id", time = "time",
bandwidth = 2, diagnostics = FALSE, workers = 1)
print(fit)
Memory Estimation Module for GWPR.light 1.0.0
Description
Functions for estimating memory usage before running GWPR models. The module provides warnings about memory risk levels to help users avoid out-of-memory errors. These functions only warn; they never stop execution.
Metrics Module
Description
Functions for computing evaluation metrics for linear and logistic panel regression models.
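As a rough guide to what these metrics measure, the sketch below computes them with textbook formulas in base R. The function names 'linear_metrics' and 'logistic_metrics' are hypothetical, and the choices made here (probability clipping at 'eps', classifying as 1 when the probability is at or above the threshold) are assumptions rather than the package's documented behaviour.

```r
# Hypothetical helpers (not part of GWPR.light) using textbook definitions.
linear_metrics <- function(y, pred) {
  res <- y - pred
  mse <- mean(res^2)
  list(
    r2   = 1 - sum(res^2) / sum((y - mean(y))^2),
    mse  = mse,
    rmse = sqrt(mse),
    mae  = mean(abs(res))
  )
}

logistic_metrics <- function(y, prob, threshold = 0.5, eps = 1e-15) {
  p <- pmin(pmax(prob, eps), 1 - eps)    # clip to avoid log(0)
  pred <- as.integer(prob >= threshold)  # assumed ">=" convention
  tp <- sum(pred == 1 & y == 1)
  fp <- sum(pred == 1 & y == 0)
  fn <- sum(pred == 0 & y == 1)
  precision <- tp / (tp + fp)
  recall <- tp / (tp + fn)
  list(
    log_loss  = -mean(y * log(p) + (1 - y) * log(1 - p)),
    accuracy  = mean(pred == y),
    precision = precision,
    recall    = recall,
    f1_score  = 2 * precision * recall / (precision + recall)
  )
}

lin <- linear_metrics(y = c(1, 2, 3, 4), pred = c(1, 2, 3, 5))
lgt <- logistic_metrics(y = c(1, 0, 1, 0), prob = c(0.9, 0.2, 0.4, 0.6))
```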
Linear GWPR Engine for GWPR.light 1.0.0
Description
Internal functions for fitting Geographically Weighted Panel Regression with a Gaussian (linear) response. Supports pooling, within, and random panel models, plus individual, time, two-way, and nested effects.
Binary Panel Logistic Engine for GWPR.light 1.0.0
Description
Internal functions for fitting Geographically Weighted Panel Regression with a binary (binomial) response. Supports pooling (stats::glm), fixed effects (fixest::feglm), and random effects (glmmTMB::glmmTMB) panel models, plus individual, time, two-way, and nested effects.
The family parameter is reserved for future multi-class extension;
in version 1.0.0 only "binomial" is supported.
Construct a new gwpr_context object
Description
Creates the standardised internal context list used to pass state between GWPR modules. Any field not supplied defaults to 'NULL' (or, for 'metadata' and 'warnings', to their appropriate empty types).
Usage
new_gwpr_context(
call = NULL,
formula = NULL,
family = NULL,
model = NULL,
effect = NULL,
id = NULL,
time = NULL,
kernel = NULL,
adaptive = NULL,
threshold = NULL,
workers = NULL,
seed = NULL,
raw_data = NULL,
raw_spatial = NULL,
panel_data = NULL,
spatial_data = NULL,
id_map = NULL,
coords = NULL,
model_frame = NULL,
model_matrix = NULL,
response = NULL,
metadata = list(),
warnings = character(),
...
)
Arguments
call |
The matched call from the top-level API function. |
formula |
A formula object. |
family |
Character: '"gaussian"' or '"binomial"'. |
model |
Character: '"pooling"', '"within"', or '"random"'. |
effect |
Character: '"individual"', '"time"', '"two-way"', or '"nested"'. |
id |
Name of the unit ID column. |
time |
Name of the time column. |
kernel |
Kernel name. |
adaptive |
Logical; 'TRUE' for adaptive bandwidth. |
threshold |
Numeric classification threshold (binomial family only). |
workers |
Number of parallel workers. |
seed |
Integer random seed or 'NULL'. |
raw_data |
The original user-supplied data frame. |
raw_spatial |
The original user-supplied sf object. |
panel_data |
Processed panel data frame. |
spatial_data |
Processed sf object. |
id_map |
Named integer vector mapping unit IDs to row indices. |
coords |
Matrix of spatial coordinates. |
model_frame |
Model frame derived from formula and panel_data. |
model_matrix |
Design matrix. |
response |
Numeric response vector. |
metadata |
Named list of supplementary information. |
warnings |
Character vector of accumulated warnings. |
... |
Additional named fields stored in the context list. |
Value
A named list with class '"gwpr_context"'.
Minimal parallel_map implementation
Description
A thin wrapper that uses plain lapply when workers = 1 and
switches to future.apply::future_lapply with a multisession
plan for workers > 1. The global future plan is always
restored to sequential after the call, preventing side effects.
This implementation supersedes the earlier stub, which wrapped 'lapply' when 'workers = 1' and 'parallel::mclapply' when 'workers > 1'.
Usage
parallel_map(x, fn, workers = 1, seed = NULL, ..., packages = NULL)
Arguments
x |
A list (or vector) of inputs to iterate over. |
fn |
A function to apply to each element of 'x'. |
workers |
Integer scalar. Number of parallel workers. |
seed |
Integer scalar or 'NULL'. |
... |
Additional arguments forwarded to 'fn'. |
packages |
Character vector of package names that workers need to load, or 'NULL'. |
Details
Worker-level errors are caught and returned as character strings (prefixed
with "ERROR: ") rather than aborting the entire call.
Value
A list of the same length as x. Elements where fn
threw an error are replaced with a character string "ERROR: <msg>".
Examples
result <- parallel_map(1:3, function(x) x^2, workers = 1)
stopifnot(identical(result, list(1, 4, 9)))
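The error-capturing behaviour described under Details can be illustrated for the sequential case in base R. This is a sketch under assumptions: 'map_with_error_capture' is a hypothetical name, and the parallel branch, seeding, and package loading are omitted.

```r
# Hypothetical helper (not part of GWPR.light): apply fn to each element and
# turn errors into "ERROR: <msg>" strings instead of aborting the whole call.
map_with_error_capture <- function(x, fn, ...) {
  lapply(x, function(el) {
    tryCatch(
      fn(el, ...),
      error = function(e) paste0("ERROR: ", conditionMessage(e))
    )
  })
}

# The second element cannot be squared, so only that slot holds an error.
res <- map_with_error_capture(list(1, "a", 3), function(v) v^2)
failed <- vapply(
  res,
  function(r) is.character(r) && startsWith(r, "ERROR: "),
  logical(1)
)
```

Callers can then filter on the "ERROR: " prefix, as in the 'failed' vector above, to count or report failed elements without losing the successful results.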
Parallel Execution Module
Description
Unified parallel execution interface for bandwidth search, local model fitting, and diagnostics. Shields backend differences and ensures CRAN-friendly behaviour.
Predict response values for a local linear model
Description
Returns predicted values for the rows of 'local_data', using the fitted model object stored in 'local_result'.
Usage
predict_linear_local_model(local_result, local_data)
Arguments
local_result |
A list as returned by 'fit_linear_local_model()'. |
local_data |
The data frame used for prediction (same structure as the training data). |
Value
Numeric vector of predicted values (same length as 'nrow(local_data)'). Returns a vector of 'NA_real_' on failure.
Predict probabilities for a local logistic model
Description
Returns predicted probabilities (type = "response") for the rows of
local_data, using the fitted model stored in local_result.
Usage
predict_logistic_local_model(local_result, local_data)
Arguments
local_result |
A list as returned by 'fit_logistic_local_model()'. |
local_data |
A data frame. |
Value
Numeric vector of probabilities in [0, 1]. Returns a vector
of NA_real_ on failure.
Prepare all data structures for GWPR fitting
Description
Orchestrates the full data preparation pipeline: panel data extraction, spatial data extraction, ID mapping, model frame and model matrix construction. Updates and returns a 'gwpr_context'.
Usage
prepare_data(context)
Arguments
context |
A 'gwpr_context' object with at least 'formula', 'family', 'id', 'time', 'model', 'raw_data', and 'raw_spatial' populated. |
Value
An updated 'gwpr_context' with 'panel_data', 'spatial_data', 'id_map', 'coords', 'model_frame', 'model_matrix', 'response', and 'metadata' filled in.
Prepare panel data from the raw data in context
Description
Extracts the 'id', 'time', and formula variables from 'raw_data'. Adds a 'raw_row_id' column preserving the original row positions. Sorts by id then time. Records panel balance information in 'metadata'. For 'within' models, records single-observation individuals in 'metadata'.
Usage
prepare_panel_data(context)
Arguments
context |
A 'gwpr_context' with 'raw_data', 'formula', 'id', 'time', and 'model' populated. |
Value
Updated context with 'panel_data' and 'metadata' populated.
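The steps listed in the description can be sketched on a plain data frame. This illustration does not use a 'gwpr_context'; the function name 'prepare_panel_sketch' and the metadata field names 'balanced' and 'single_obs_ids' are hypothetical, and the balance check is simplified to comparing observation counts per unit.

```r
# Hypothetical helper (not part of GWPR.light): select id, time, and formula
# variables, tag original row positions, sort, and record simple metadata.
prepare_panel_sketch <- function(raw_data, formula, id, time) {
  vars <- unique(c(id, time, all.vars(formula)))
  panel <- raw_data[, vars, drop = FALSE]
  panel$raw_row_id <- seq_len(nrow(raw_data))  # original row positions
  panel <- panel[order(panel[[id]], panel[[time]]), , drop = FALSE]
  rownames(panel) <- NULL

  obs_per_id <- table(panel[[id]])
  metadata <- list(
    # Simplified: balanced if every unit has the same number of rows.
    balanced = length(unique(as.integer(obs_per_id))) == 1L,
    # Units with one observation carry no within-unit variation.
    single_obs_ids = names(obs_per_id)[obs_per_id == 1L]
  )
  list(panel_data = panel, metadata = metadata)
}

raw <- data.frame(
  id   = c(2, 1, 1, 3),
  time = c(1, 2, 1, 1),
  y    = c(0.5, 1.2, 0.3, 2.0),
  x1   = c(1, 2, 3, 4)
)
out <- prepare_panel_sketch(raw, y ~ x1, id = "id", time = "time")
out$panel_data
```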
Extract and align spatial data for GWPR fitting
Description
Extracts geometry and representative coordinates from the 'raw_spatial' sf object. Aligns spatial rows to the unique individual IDs present in 'panel_data' via 'id_map'.
Usage
prepare_spatial_data(context)
Arguments
context |
A 'gwpr_context' with 'raw_spatial', 'panel_data', and 'id' populated. |
Value
Updated context with 'spatial_data' and 'coords' populated.
Print a gwpr_bandwidth object
Description
Displays the search method, best bandwidth, criterion score, and number of iterations explored.
Usage
## S3 method for class 'gwpr_bandwidth'
print(x, ...)
Arguments
x |
A 'gwpr_bandwidth' object. |
... |
Currently ignored. |
Value
Invisibly returns 'x'.
Examples
library(sf)
pts <- sf::st_as_sf(
data.frame(id = 1:4, X = c(0,1,0,1), Y = c(0,0,1,1)),
coords = c("X", "Y"), crs = NA_integer_
)
dat <- data.frame(id = rep(1:4, each = 5), time = rep(1:5, 4),
y = rnorm(20), x1 = rnorm(20))
bw <- select_bandwidth(y ~ x1, data = dat, spatial = pts, id = "id",
time = "time", method = "grid",
control = list(lower = 1, upper = 3, step = 1), workers = 1)
print(bw)
Print a gwpr_diagnostics object
Description
Displays each diagnostic test name and, where available, its statistic and p-value.
Usage
## S3 method for class 'gwpr_diagnostics'
print(x, ...)
Arguments
x |
A 'gwpr_diagnostics' object. |
... |
Currently ignored. |
Value
Invisibly returns 'x'.
Examples
library(sf)
pts <- sf::st_as_sf(
data.frame(id = 1:4, X = c(0,1,0,1), Y = c(0,0,1,1)),
coords = c("X", "Y"), crs = NA_integer_
)
dat <- data.frame(id = rep(1:4, each = 5), time = rep(1:5, 4),
y = rnorm(20), x1 = rnorm(20))
fit <- fit_gwpr(y ~ x1, data = dat, spatial = pts, id = "id",
time = "time", bandwidth = 2, workers = 1)
diag_obj <- diagnose_gwpr(fit, diagnostics = c("f_test", "hausman"))
print(diag_obj)
Print a gwpr_fit object
Description
Displays a concise summary of a fitted GWPR model: family, panel model type, effect, bandwidth, and top-level goodness-of-fit metrics.
Usage
## S3 method for class 'gwpr_fit'
print(x, ...)
Arguments
x |
A 'gwpr_fit' object. |
... |
Currently ignored. |
Value
Invisibly returns 'x'.
Examples
library(sf)
pts <- sf::st_as_sf(
data.frame(id = 1:4, X = c(0,1,0,1), Y = c(0,0,1,1)),
coords = c("X", "Y"), crs = NA_integer_
)
dat <- data.frame(id = rep(1:4, each = 5), time = rep(1:5, 4),
y = rnorm(20), x1 = rnorm(20))
fit <- fit_gwpr(y ~ x1, data = dat, spatial = pts, id = "id",
time = "time", bandwidth = 2, workers = 1)
print(fit)
Result Object Module for GWPR.light 1.0.0
Description
S3 classes and constructor functions for the three result objects returned by the GWPR.light public API: 'gwpr_fit', 'gwpr_bandwidth', and 'gwpr_diagnostics'. Also provides 'build_spatial_results()' for assembling a data.frame that can be aligned with an 'sf' geometry column.
Score a single bandwidth candidate
Description
Calls the 'scorer' function for a given bandwidth and records the result together with timing information, model counts, and metric values.
Usage
score_bandwidth_candidate(context, bandwidth, scorer)
Arguments
context |
A 'gwpr_context' list. |
bandwidth |
Numeric scalar; the candidate bandwidth to evaluate. |
scorer |
A function with signature 'scorer(context, bandwidth)' returning a named list with at minimum:
|
Value
A named list describing the candidate result:
- 'bandwidth'
The evaluated bandwidth.
- 'score'
Numeric criterion score, or 'NA_real_' on failure.
- 'criterion'
Name of the scoring criterion.
- 'status'
'"ok"' or '"failed"'.
- 'error_message'
'NA_character_' or error text.
- 'warning_message'
'NA_character_' or warning text.
- 'elapsed_time'
Elapsed wall-clock time in seconds.
- 'n_local_models'
Number of local models attempted.
- 'n_failed_local_models'
Number of local models that failed.
- 'r2', 'mse', 'rmse', 'mae'
Linear metrics, or 'NA_real_'.
- 'log_loss', 'accuracy', 'precision', 'recall', 'f1_score'
Logistic metrics, or 'NA_real_'.
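A reduced version of this wrapper pattern is sketched below: time the scorer call and convert an error into a failed candidate record. The name 'score_candidate_sketch', the toy scorer, and the assumption that the scorer returns a list with a 'score' element are all illustrative; only a subset of the fields listed above is produced.

```r
# Hypothetical helper (not part of GWPR.light): evaluate one bandwidth and
# record score, status, error text, and elapsed time.
score_candidate_sketch <- function(context, bandwidth, scorer) {
  start <- proc.time()[["elapsed"]]
  out <- tryCatch(
    {
      res <- scorer(context, bandwidth)
      list(score = res$score, status = "ok", error_message = NA_character_)
    },
    error = function(e) {
      list(score = NA_real_, status = "failed",
           error_message = conditionMessage(e))
    }
  )
  elapsed <- proc.time()[["elapsed"]] - start
  c(list(bandwidth = bandwidth), out, list(elapsed_time = elapsed))
}

# Toy scorer (an assumption for illustration): minimised at bandwidth = 2.
toy_scorer <- function(context, bandwidth) {
  if (bandwidth <= 0) stop("bandwidth must be positive")
  list(score = (bandwidth - 2)^2)
}

good <- score_candidate_sketch(list(), 3, toy_scorer)
bad  <- score_candidate_sketch(list(), -1, toy_scorer)
```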
Select an optimal bandwidth for GWPR
Description
Validates inputs, prepares data, and dispatches to the appropriate bandwidth
search algorithm: grid search, stochastic gradient descent (sgd), or random
search, depending on the method argument.
Usage
select_bandwidth(
formula,
data,
spatial,
id,
time,
family = c("gaussian", "binomial"),
model = c("within", "pooling", "random"),
effect = c("individual", "time", "two-way", "nested"),
method = c("sgd", "grid", "random"),
control = list(),
kernel = c("bisquare", "gaussian", "exponential", "tricube", "boxcar"),
adaptive = FALSE,
threshold = 0.5,
workers = 1L,
seed = NULL,
...
)
Arguments
formula |
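A 'formula' object. |
data |
A data frame. |
spatial |
An 'sf' object. |
id |
Character scalar; unit ID column name. |
time |
Character scalar; time column name. |
family |
'"gaussian"' or '"binomial"'. |
model |
'"within"', '"pooling"', or '"random"'. |
effect |
'"individual"', '"time"', '"two-way"', or '"nested"'. |
method |
Bandwidth search method: '"sgd"', '"grid"', or '"random"'. |
control |
Named list of search control parameters. |
kernel |
Kernel function name. |
adaptive |
Logical; 'TRUE' for adaptive bandwidth. |
threshold |
Numeric; classification threshold (binomial only). |
workers |
Positive integer; number of parallel workers. |
seed |
Integer random seed, or 'NULL'. |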
... |
Additional arguments (currently unused). |
Value
A 'gwpr_bandwidth' object with fields:
- 'best_bandwidth'
The selected bandwidth value.
- 'best_score'
The criterion value at the best bandwidth.
- 'method'
The search method used.
- 'history'
Search history data frame.
Examples
library(sf)
pts <- sf::st_as_sf(
data.frame(id = 1:4, X = c(0,1,0,1), Y = c(0,0,1,1)),
coords = c("X", "Y"), crs = NA_integer_
)
dat <- data.frame(
id = rep(1:4, each = 5),
time = rep(1:5, 4),
y = rnorm(20),
x1 = rnorm(20)
)
bw <- select_bandwidth(
y ~ x1, data = dat, spatial = pts, id = "id", time = "time",
method = "grid",
control = list(lower = 0.5, upper = 2, step = 0.5),
workers = 1
)
bw$best_bandwidth
Spatial SF Module for GWPR.light 1.0.0
Description
Internal functions providing the 'sf'-first spatial interface. These functions abstract geometry extraction, coordinate derivation, spatial alignment to panel data, and neighbour-structure construction. They are used by the data preparation and diagnostics modules.
Standardise a binary response variable to 0/1 numeric
Description
Converts logical and two-level factor responses to 0/1 numeric values. Numeric 0/1 vectors are returned unchanged (as numeric). Other inputs raise an error.
Usage
standardize_binary_response(y)
Arguments
y |
A vector that is 0/1 numeric, logical, or a two-level factor. |
Details
For factor inputs the first level is mapped to 0 and the second level is mapped to 1.
Value
A numeric vector of 0s and 1s.
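The conversion rules above can be sketched in a few lines of base R. The function name 'to_binary_numeric' is hypothetical, and the treatment of missing values (rejected here) is an assumption, since the source does not state it.

```r
# Hypothetical helper (not part of GWPR.light): map a binary response to
# numeric 0/1 following the rules described above.
to_binary_numeric <- function(y) {
  if (is.logical(y)) {
    return(as.numeric(y))       # FALSE -> 0, TRUE -> 1
  }
  if (is.factor(y)) {
    if (nlevels(y) != 2L) {
      stop("factor response must have exactly two levels")
    }
    return(as.numeric(y) - 1)   # first level -> 0, second level -> 1
  }
  if (is.numeric(y) && all(y %in% c(0, 1))) {
    return(as.numeric(y))       # already 0/1 (NA values are rejected)
  }
  stop("response must be 0/1 numeric, logical, or a two-level factor")
}

from_lgl <- to_binary_numeric(c(TRUE, FALSE, TRUE))
from_fct <- to_binary_numeric(factor(c("no", "yes", "no")))
from_num <- to_binary_numeric(c(0, 1, 1))
```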
Coerce a binary response to a 0/1 integer vector
Description
Converts logical and two-level factor responses to 0/1 integer.
Numeric 0/1 vectors are returned unchanged.
Factors with more than two levels raise an error via
validate_binary_response().
Usage
standardize_logistic_response(y)
Arguments
y |
A numeric, logical, or factor vector. |
Value
An integer vector of 0s and 1s.
Summarise a gwpr_bandwidth object
Description
Prints the search method, best bandwidth, and a brief history overview.
Usage
## S3 method for class 'gwpr_bandwidth'
summary(object, ...)
Arguments
object |
A 'gwpr_bandwidth' object. |
... |
Currently ignored. |
Value
Invisibly returns 'object'.
Examples
library(sf)
pts <- sf::st_as_sf(
data.frame(id = 1:4, X = c(0,1,0,1), Y = c(0,0,1,1)),
coords = c("X", "Y"), crs = NA_integer_
)
dat <- data.frame(id = rep(1:4, each = 5), time = rep(1:5, 4),
y = rnorm(20), x1 = rnorm(20))
bw <- select_bandwidth(y ~ x1, data = dat, spatial = pts, id = "id",
time = "time", method = "grid",
control = list(lower = 1, upper = 3, step = 1), workers = 1)
summary(bw)
Summarise a gwpr_diagnostics object
Description
Prints each diagnostic test result with statistic and p-value where available.
Usage
## S3 method for class 'gwpr_diagnostics'
summary(object, ...)
Arguments
object |
A 'gwpr_diagnostics' object. |
... |
Currently ignored. |
Value
Invisibly returns 'object'.
Examples
library(sf)
pts <- sf::st_as_sf(
data.frame(id = 1:4, X = c(0,1,0,1), Y = c(0,0,1,1)),
coords = c("X", "Y"), crs = NA_integer_
)
dat <- data.frame(id = rep(1:4, each = 5), time = rep(1:5, 4),
y = rnorm(20), x1 = rnorm(20))
fit <- fit_gwpr(y ~ x1, data = dat, spatial = pts, id = "id",
time = "time", bandwidth = 2, workers = 1)
diag_obj <- diagnose_gwpr(fit, diagnostics = c("f_test", "hausman"))
summary(diag_obj)
Summarise a gwpr_fit object
Description
Prints the global model overview, quantile summary of local coefficients (when available), and goodness-of-fit metrics.
Usage
## S3 method for class 'gwpr_fit'
summary(object, ...)
Arguments
object |
A 'gwpr_fit' object. |
... |
Currently ignored. |
Value
Invisibly returns 'object'.
Examples
library(sf)
pts <- sf::st_as_sf(
data.frame(id = 1:4, X = c(0,1,0,1), Y = c(0,0,1,1)),
coords = c("X", "Y"), crs = NA_integer_
)
dat <- data.frame(id = rep(1:4, each = 5), time = rep(1:5, 4),
y = rnorm(20), x1 = rnorm(20))
fit <- fit_gwpr(y ~ x1, data = dat, spatial = pts, id = "id",
time = "time", bandwidth = 2, workers = 1)
summary(fit)
Validate bandwidth parameter
Description
Checks that the supplied bandwidth is legal:
- fixed (adaptive = FALSE): must be a finite positive numeric scalar.
- adaptive (adaptive = TRUE): must be a finite positive integer scalar.
Usage
validate_bandwidth(bandwidth, adaptive)
Arguments
bandwidth |
Numeric scalar. The bandwidth value to validate. |
adaptive |
Logical scalar. 'TRUE' for adaptive (k-nearest-neighbour) bandwidth, 'FALSE' for fixed distance bandwidth. |
Value
Invisibly returns 'TRUE' when validation passes.
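The two rules can be expressed as a short base R check. This is a sketch with a hypothetical name, 'check_bandwidth'; treating "integer" as "whole-valued numeric" (so that 5 and 5L both pass) is an assumption.

```r
# Hypothetical helper (not part of GWPR.light): validate a bandwidth value.
check_bandwidth <- function(bandwidth, adaptive) {
  if (!is.numeric(bandwidth) || length(bandwidth) != 1L ||
      !is.finite(bandwidth) || bandwidth <= 0) {
    stop("bandwidth must be a finite positive numeric scalar")
  }
  # Adaptive bandwidths count neighbours, so they must be whole numbers.
  if (isTRUE(adaptive) && bandwidth != round(bandwidth)) {
    stop("adaptive bandwidth must be a positive whole number of neighbours")
  }
  invisible(TRUE)
}

check_bandwidth(2.5, adaptive = FALSE)  # fixed distance bandwidth: passes
check_bandwidth(10, adaptive = TRUE)    # neighbour count: passes
```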
Validate bandwidth search control parameters
Description
Validate bandwidth search control parameters
Usage
validate_bandwidth_control(method, control, adaptive)
Arguments
method |
Character: '"grid"', '"sgd"', or '"random"'. |
control |
Named list of control parameters. |
adaptive |
Logical; if 'TRUE', bandwidth is a neighbour count. |
Value
Invisibly returns 'TRUE' when valid; stops otherwise.
Validate that a response vector is suitable for binary logistic regression
Description
Stops with an informative error when y is a factor with more than two
levels. Passes through numeric 0/1 vectors and two-level factors unchanged.
Usage
validate_binary_response(y)
Arguments
y |
A numeric, logical, or factor vector. |
Value
Invisibly TRUE when validation passes.
Validate the response variable against the specified family
Description
Validate the response variable against the specified family
Usage
validate_family_response(data, formula, family)
Arguments
data |
A data frame. |
formula |
A formula; the left-hand side is the response variable. |
family |
Character string: '"gaussian"' or '"binomial"'. |
Value
Invisibly returns 'TRUE' when valid; stops otherwise.
Validate a model formula against a data frame
Description
Validate a model formula against a data frame
Usage
validate_formula(formula, data)
Arguments
formula |
A formula object. |
data |
A data frame containing the model variables. |
Value
Invisibly returns 'TRUE' when valid; stops with an informative message when a problem is detected.
Validate a gwpr_context object
Description
Checks that all core fields required for model fitting are non-'NULL'. Stops with an informative message listing every missing field.
Usage
validate_gwpr_context(context)
Arguments
context |
A list (typically of class '"gwpr_context"') to validate. |
Details
Core fields: 'formula', 'family', 'id', 'time', 'model', 'effect', 'kernel', 'adaptive', 'threshold', 'workers'.
Value
Invisibly returns 'TRUE' when all core fields are present.
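A check of this kind can be sketched as follows, using the core field names listed under Details. The function name 'check_context_fields' and the wording of the error message are hypothetical.

```r
# Hypothetical helper (not part of GWPR.light): report every core field that
# is NULL (or absent) in a context list, in a single error message.
check_context_fields <- function(context) {
  core <- c("formula", "family", "id", "time", "model", "effect",
            "kernel", "adaptive", "threshold", "workers")
  is_missing <- vapply(core, function(f) is.null(context[[f]]), logical(1))
  if (any(is_missing)) {
    stop("Missing core context fields: ",
         paste(core[is_missing], collapse = ", "))
  }
  invisible(TRUE)
}

full <- list(formula = y ~ x1, family = "gaussian", id = "id", time = "time",
             model = "within", effect = "individual", kernel = "bisquare",
             adaptive = FALSE, threshold = 0.5, workers = 1L)
check_context_fields(full)

# Dropping two fields makes the check list both of them.
partial <- full[setdiff(names(full), c("kernel", "workers"))]
msg <- tryCatch(check_context_fields(partial),
                error = function(e) conditionMessage(e))
```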
Validate all inputs to the main GWPR functions
Description
This is the top-level validation entry point. It calls the individual 'validate_*' helpers and also checks 'model', 'effect', and 'kernel'.
Usage
validate_inputs(
formula,
data,
spatial,
id,
time,
family = c("gaussian", "binomial"),
model = c("within", "pooling", "random"),
effect = c("individual", "time", "two-way", "nested"),
kernel = c("bisquare", "gaussian", "exponential", "tricube", "boxcar"),
adaptive = FALSE,
workers = 1L
)
Arguments
formula |
A formula object. |
data |
A data frame. |
spatial |
An 'sf' object. |
id |
Name of the unit ID column (character). |
time |
Name of the time column (character). |
family |
'"gaussian"' or '"binomial"'. |
model |
'"pooling"', '"within"', or '"random"'. |
effect |
'"individual"', '"time"', '"two-way"', or '"nested"'. |
kernel |
One of '"bisquare"', '"gaussian"', '"exponential"', '"tricube"', '"boxcar"'. |
adaptive |
Logical. |
workers |
Positive integer. |
Value
Invisibly returns 'TRUE' when all checks pass; stops otherwise.
Validate panel index columns in a data frame
Description
Validate panel index columns in a data frame
Usage
validate_panel_index(data, id, time)
Arguments
data |
A data frame. |
id |
Name of the unit (individual) index column. |
time |
Name of the time index column. |
Value
Invisibly returns 'TRUE' when valid; stops otherwise.
Validate a spatial sf object
Description
Validate a spatial sf object
Usage
validate_spatial(spatial, id)
Arguments
spatial |
An 'sf' object representing spatial units. |
id |
Name of the ID column that must be present in 'spatial'. |
Value
Invisibly returns 'TRUE' when valid; stops otherwise.
Validate the workers argument
Description
Validate the workers argument
Usage
validate_workers(workers)
Arguments
workers |
Number of parallel workers. Must be a positive integer. |
Value
Invisibly returns 'TRUE' when valid; stops otherwise.
Input Validation Functions for GWPR.light 1.0.0
Description
These internal functions validate user inputs before any expensive computation, providing early and consistent error messages.
Distance and Kernel Weights Module for GWPR.light 1.0.0
Description
Internal functions for computing spatial distances and geographically weighted kernel weights. Supports fixed and adaptive bandwidths, and five kernel functions: bisquare, gaussian, exponential, tricube, and boxcar.
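For orientation, the five kernels are sketched below using their conventional geographically weighted regression definitions, with u = distance / bandwidth. Whether GWPR.light uses exactly these forms (for example the 0.5 factor in the Gaussian kernel, or a strict inequality at the bandwidth boundary) is an assumption; the function name 'kernel_weights_sketch' is hypothetical.

```r
# Hypothetical helper (not part of GWPR.light): conventional kernel weights.
kernel_weights_sketch <- function(d, bandwidth,
                                  kernel = c("bisquare", "gaussian",
                                             "exponential", "tricube",
                                             "boxcar")) {
  kernel <- match.arg(kernel)
  u <- d / bandwidth
  inside <- as.numeric(u < 1)  # compact kernels are zero at or beyond bandwidth
  switch(kernel,
    bisquare    = (1 - u^2)^2 * inside,
    gaussian    = exp(-0.5 * u^2),
    exponential = exp(-u),
    tricube     = (1 - u^3)^3 * inside,
    boxcar      = inside
  )
}

d <- c(0, 1, 2, 3)
w_bisq <- kernel_weights_sketch(d, bandwidth = 2, kernel = "bisquare")
w_box  <- kernel_weights_sketch(d, bandwidth = 2, kernel = "boxcar")

# Adaptive case (assumed convention): use the distance to the k-th nearest
# unit as the bandwidth for this focal unit.
k <- 3
adaptive_bw <- sort(d)[k]
```

The Gaussian and exponential kernels never reach zero, so every unit contributes to each local model; the compact kernels (bisquare, tricube, boxcar) drop units at or beyond the bandwidth entirely.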
Execute an expression with a reproducible seed
Description
Sets set.seed(seed) before evaluating expr, then restores the
prior RNG state so the caller's random stream is unaffected.
Usage
with_reproducible_seed(seed, expr)
Arguments
seed |
Integer scalar. Seed value passed to 'set.seed()'. |
expr |
An R expression to evaluate. |
Value
The value of expr.
Examples
r1 <- with_reproducible_seed(42, runif(3))
r2 <- with_reproducible_seed(42, runif(3))
stopifnot(identical(r1, r2))
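One common way to get this save-and-restore behaviour in base R is to snapshot '.Random.seed' before seeding and put it back on exit. The sketch below uses a hypothetical name, 'with_seed_sketch', and is not necessarily how the package implements it.

```r
# Hypothetical helper (not part of GWPR.light): evaluate expr under a fixed
# seed, then restore the caller's RNG state.
with_seed_sketch <- function(seed, expr) {
  genv <- globalenv()
  had_seed <- exists(".Random.seed", envir = genv, inherits = FALSE)
  old_seed <- if (had_seed) get(".Random.seed", envir = genv) else NULL
  on.exit({
    if (had_seed) {
      assign(".Random.seed", old_seed, envir = genv)
    } else if (exists(".Random.seed", envir = genv, inherits = FALSE)) {
      rm(list = ".Random.seed", envir = genv)
    }
  }, add = TRUE)
  set.seed(seed)
  force(expr)  # expr is a lazy promise, so it is evaluated after set.seed()
}

set.seed(123)
r1 <- with_seed_sketch(42, runif(3))
after_call <- runif(1)  # continues the caller's stream as if nothing happened

set.seed(123)
expected <- runif(1)
```

Because R evaluates arguments lazily, 'runif(3)' is not run until 'force(expr)', which is what makes the seed apply to it.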