---
title: "**levelSets** Problem Setup"
author: "Richard Raubertas"
date: "03 April 2026"
output: 
  pdf_document: 
    toc: true
    toc_depth: 3
    number_sections: true
  html_document:
    toc: true
    toc_depth: 3
    number_sections: true
vignette: >
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteIndexEntry{Problem Setup}
  %\VignetteEncoding{UTF-8}
---

<!-- Vignette 2 for 'levelSets' package -->

<!-- Global options -->
```{r, include=FALSE}
knitr::opts_chunk$set(echo=TRUE, fig.align="center")
has_ggplot <- requireNamespace("ggplot2", quietly=TRUE)
```

```{r, echo=FALSE, eval=has_ggplot}
ggplot2::theme_update(plot.background=ggplot2::element_rect(fill=NA))
```

# Introduction

This vignette goes beyond the *Introduction to the **levelSets** Package* 
vignette to provide 
more details about how to set up and run a level set problem.  It uses a 
simple 2D problem for illustration.  More extended examples are provided 
in the *Example* vignettes.

Recall that a level set of a response function $f(x)$ is the set of 
$d$-dimensional input 
points $x$ for which the function value is greater than or equal to a 
specified threshold $t$: 
$$L_f(t) = \{x: f(x) \ge t\}$$
If there 
are restrictions on the inputs that $f$ will accept, we call the allowed 
portion of the input space the *feasible region* for $f$, and any point 
outside the feasible region is *infeasible*.

This package maps out the boundary of a level set by finding 
its intersections with collections of 1-dimensional *rays*.  A ray is just 
a line with a defined origin (referred to as *position 0* of the ray) and 
orientation.  A ray's orientation is specified by a second, 
distinct point that defines position 1 along the ray.  Any other point 
on the ray has a *position* value obtained by interpolation or extrapolation 
from the origin and position 1 points.  (Ray points on the opposite
side of the origin from the position 1 point have negative position values.)

The package provides tools to generate rays, find intersections, and 
visualize results. 
  
# An example

The Rosenbrock "banana" function will be used for illustration.  It has a 
2-dimensional input space, and a minimum value of 0 at the point (1, 1). 
Since level sets are defined to have function values at or *above* a threshold, 
we use the negative of the function:
$$f(x_1, x_2) = -100((x_2 - x_1^2)^2) - (1 - x_1)^2$$
Contours of this function can be strongly curved, so level 
sets may not be convex.  The level set threshold we use is $f \ge -50$.

The response function can be coded in `R` as

```{r}
respf <- function(x, inclGrad=FALSE) {
  y <- -1 * (100*((x[, 2] - x[, 1]^2)^2) + (1 - x[, 1])^2)
  if (inclGrad) {
    grad <- cbind(400*(x[, 1]*(x[, 2] - x[, 1]^2)) + 2*(1 - x[, 1]), 
                  -200*(x[, 2] - x[, 1]^2))
    attr(y, "gradient") <- grad
  }
  y
}
thresh <- -50
```

The gradient calculation is simple so it is included, but gradients are 
not necessary in general.

Although 
this function is well-defined for any $(x_1, x_2)$, for illustration we 
define the feasible region to be $\{x: x_1 \ge -2\}$.  A feasibility 
function can be coded as

```{r}
feasf <- function(x) {
     (x[, 1] >= -2)
}
```

In this package point coordinates are always represented as rows of a 
$d$-column matrix, one row per point.  Both the response and feasibility 
functions must accept as their first argument a matrix of point coordinates, 
and return a vector with one value per point:  a numeric value for the 
response function, and TRUE or FALSE for the feasibility function.

The following sections describe package functions and classes, in the order 
they are likely to be used in studying a level set.

# Specifying the response function and its input space

## The `fnObj` function and class

The primary function for this step is `fnObj()`, which creates objects of 
that class.  `fnObj` objects contain the response and feasibility functions 
along with related information.  This section covers the basics for the 
main arguments; see `?fnObj` for additional arguments and full details. 
An example of its use for the banana response function is 

```{r}
library(levelSets)
fobj <- fnObj(c("x1", "x2"), respfn=respf, feasfn=feasf, hasgrad=TRUE)
```

The first argument must be a character vector of names for the input space 
dimensions.  Dimension names must be strings that would be valid names 
for `R` objects.

### `respfn` and `...` arguments

`respfn` is a user-supplied function that takes as its first argument a 
matrix of point coordinates, and returns a vector of response function 
values at those points.  Any additional arguments in `...` will be passed 
to this function on each call.

### `feasfn`, `feasbnds` and `...` arguments

`feasfn` is an optional user-supplied function that like `respfn` takes a 
matrix of point coordinates as its first argument, and returns a logical vector 
of values at those points.  TRUE 
means the point belongs to the feasible region of the response function. 
Any additional arguments in `...` are also passed to this function on each 
call.

`feasbnds` is an optional argument that allows simple box constraints on 
any or all of the input space dimensions to be specified without writing a 
separate function.  In the example of section 2, the constraint $x_1 \ge -2$ 
could also have been specified by `feasbnds=list(x1=c(-2, Inf))` or 
`feasbnds=list(x1=c(-2, NA))`.  (`NA` is interpreted as `-Inf` for lower 
bounds and `Inf` for upper bounds.)  If both 
`feasfn` and `feasbnds` are specified, a point is considered feasible only 
if it satisfies both sets of constraints.

The algorithms in this package assume the feasible region of the response 
function is a *closed* (although possibly unbounded) set.  So, for example, 
one can impose a non-negativity constraint ($x_1 \ge 0$), but not a strict 
positivity constraint ($x_1 > 0$).

`respfn` will never be called with points that are infeasible according to 
`feasfn` or `feasbnds`.  Instead the response function value at those points 
will be set to NA.

### `hasgrad` and `derivmethod` arguments

`hasgrad` is an optional logical scalar that indicates whether `respfn` is 
able to calculate the gradient of the response function at feasible points. 
The default is FALSE.

`derivmethod` is an optional character string indicating whether and how 
to calculate the derivative of the response function with respect to position 
along a line or ray.  One of `"gradient"`, `"finite difference"`, or `"none"` 
(to skip the calculation).  The `"gradient"` option will calculate analytic 
derivatives.  It requires that `respfn` has 
the ability to calculate the gradient of the response function, and is 
the default when `hasgrad` is TRUE.  See `?fnObj` 
for details.  `"finite difference"` will use 
a numerical approximation for derivatives.  

Note that derivative calculations are entirely optional, and in fact 
play only a minor role in finding level set boundary points.^[Briefly: 
When the search along a ray finds multiple points where the ray intersects 
the boundary of the level set, it can check the signs of the derivatives 
for consistency.  Two consecutive intersections with the same sign suggest 
that there is an intermediate intersection that has been missed.]
If the response function is discontinuous, or if one is confident that 
the level set has only a single part with a fairly smooth boundary, it 
is quite reasonable to set `derivmethod` to `"none"` to save some 
computation.

### `inptol` and `resptol` arguments

These optional arguments specify numerical tolerances for point coordinates 
in input space 
and for response function values, respectively.  The idea is that coordinates 
or response values that differ by no more than the tolerance can be considered
equivalent for practical purposes.  Tolerances also allow for the fact that 
computer calculations inevitably introduce roundoff and representation errors. 
These cause values that theoretically should be identical to differ slightly. 
Without allowing some slack in comparisons of such values, the algorithms in 
this package will fail.

Tolerance specifications consist of 
both a relative (or fractional) tolerance applied to the magnitude of a 
value, and a minimum absolute tolerance, independent of value magnitude. 
The actual absolute tolerance for a coordinate or response value is the 
larger of the two.  See `?fnObj.character`, `?lsetPkgOpt`, and `?absTol` 
for more details.

There are default tolerances, which can be 
seen or changed using the `lsetPkgOpt()` function.

```{r}
str(lsetPkgOpt(c("inptol", "resptol")))
```

## `fnSpec` class and function

`fnSpec` objects contain a subset of the information in an `fnObj` object, 
including `inpnames`, `feasbnds`, `inptol` and `resptol`.  They are created 
or extracted from other objects by the `fnSpec()` function.  Their main use 
is to provide a way to specify enough information about an input space to 
create `inpRect` 
and `inpRays` objects that are not tied to a specific response function (next 
two sections).

# Specifying the search region for a level set

The user must specify a finite region of input space where the search for 
level set boundary points will be focused.  The search region takes the 
form of a $d$-dimensional box or hyper-rectangle, represented by objects 
of class `inpRect`, and created by the `inpRect()` function.  See the 
documentation for the latter for details.

Ideally the search region should be slightly larger than the smallest 
hyper-rectangle that contains the level set.  However if an initial 
guess at the 
search region turns out to be too small, too large, or simply in the 
wrong part of input space, the search can be resumed with an updated 
search region.  Functions `boundingRect()` and `segBoundingRect()` may 
help with choosing that.

Note that the search region is distinct from the feasible region of the 
response function.  The search region is allowed to include infeasible 
points.  In fact, when the level set touches the boundary of the feasible 
region, it actually helps the search algorithm if the search region extends 
beyond the feasible region in that direction.

An example search region for the banana function:

```{r}
rect0 <- inpRect(cbind(c(-3, 3), c(-3, 6)), spec=fobj)
```

# Specifying search rays

Points on the boundary of a level set are identified by finding where 
1-dimensional rays intersect the boundary.  Collections of rays, with a 
shared origin, are represented by `inpRays` objects.  As mentioned in the 
Introduction, points along a ray are identified by their 
*position* relative 
to the ray origin (position 0) and second defining point for the ray 
(position 1).  Rays are in principle infinite so there are points at any 
position value, positive or negative.  There is no requirement that the 
Euclidean distance between positions 0 and 1 be the same for different rays 
in an `inpRays` object.

Degenerate rays, where the position 0 and position 1 points are equal or 
nearly equal, are not allowed.  Whether two points are considered nearly 
equal is determined by the numerical tolerances for point coordinates, as 
described for `inptol` in section 3.

A set of rays can be created with the `inpRays()` function.  The 
`randomRays()` and `axisRays()` functions create rays with specific 
properties; see their documentation.

## Rays and hyper-rectangles

The ray-creating functions have an optional argument `rect` that accepts 
an `inpRect` object.  If specified, the ray origin must belong to the 
hyper-rectangle, and positions along each ray are (re)defined so 
that position 1 is the ray's intersection with the boundary of the 
hyper-rectangle.  This was illustrated in the first figure of the 
*Introduction* vignette. 

The following creates 25 rays 
with random directions from an origin point (1, 1), and defined so that 
position 1 for each ray is its intersection with `rect0`:

```{r, out.width='45%'}
set.seed(1)
rays0 <- randomRays(25, origin=c(1, 1), rect=rect0)
plot(rays0, main="Random rays, position 1 defined by 'rect0'")
```

# Scaling of input space dimensions

For some problems the dimensions of the input space have different units, 
or point coordinates for different dimensions have different magnitudes, 
or the level set of interest is elongated in certain directions.  In these 
cases it is important to specify that package algorithms should rescale 
and/or weight 
the dimensions when they calculate distances or angles in input space. 
`inpScale` objects provide a way to specify such scaling.  (This is 
different from the redefinition of position 1 along rays discussed in 
the previous section.)

The following example in two dimensions illustrates the issue.  Suppose 
the level set is believed to have a wider range of $x_1$ values than $x_2$ 
values, and the search region reflects this:

```{r}
fspec <- fnSpec(inpnames=c("x1", "x2"))
rect1 <- inpRect(rbind(c(0.5, 3), c(5.5, 3.5)), spec=fspec)
```

By default `randomRays()` creates ray vectors that are uniformly distributed 
around the unit circle; i.e. it treats both dimensions equally and 
symmetrically.  When such rays are used with an asymmetric search region 
we can end up undersampling the long dimension of the region, as seen in 
the first plot below.  Scaling each dimension in inverse proportion to the 
lengths of the sides 
of the search region gives a better distribution (second plot).

```{r, fig.show='hold', out.width='45%'}
set.seed(1)
# Rays with uniformly distributed directions and equal position 1 lengths:
rays_u <- randomRays(50, origin=c(1, 3.25), spec=fspec)
# Position 1 redefined by intersection with a hyper-rectangle:
rays1 <- update(rays_u, rect=rect1)
plot(rays1, main="Random rays, no dimension scaling")
# Specify scaling for input space dimensions:
scl2 <- inpScale(rectSize(rect1), spec=fspec)
# Same rectangle, but ray generation now reflects input dimension scaling:
set.seed(1)
rays2 <- randomRays(50, origin=c(1, 3.25), scale=scl2, rect=rect1)
plot(rays2, main="Random rays, dimensions scaled by 'rect' sides")
```

Scaling factors should be chosen so that division by them standardizes 
point coordinates:  coordinates for different dimensions should have roughly 
similar magnitude and spread, preferably with a magnitude not too far from 1.

`inpScale` objects in fact allow for more general types of scaling, such 
as correlation between input space dimensions.  See `?inpScale` for details.

# Profiling a response function along rays

The `respProfiles()` function allows one to evaluate the response function 
at an arbitrary set of positions along each of a set of rays.  Returning to 
the banana response function of section 2:

```{r}
rays3 <- axisRays(degree=2, origin=c(1, 1), rect=rect0)
prof3 <- respProfiles(fobj, rays=rays3, lsetthresh=-50, posns=(-10:10)/10)
```

`axisRays()` with `degree=2` generates four rays:  one parallel to each of 
the two coordinate axes, and two parallel to the diagonals with slopes of 
$\pm 1$. The `posns` argument to `respProfiles` specifies the evaluation 
positions (the same positions 
are used for each ray).  The optional `lsetthresh` argument specifies 
the level set of interest.  It triggers a search for level set boundary points 
along each ray within the range of `posns`.  There are `summary` and `plot` 
methods for the returned `respProfiles` object.  (The `plot` method requires 
the `ggplot2` package.)

```{r}
summary(prof3)
if (requireNamespace("ggplot2", quietly=TRUE))  plot(prof3, groupBy=names(rays3))
```

Function `respInfo()` applied to a `respProfiles` object returns a data frame 
with information about response at each of the positions along each ray. 
Function `ptCoord()` returns a matrix with the corresponding point coordinates.

```{r}
head(respInfo(prof3))
head(ptCoord(prof3))
```

# Level set boundary search

The emphasis of `respProfiles()` is evaluation of the response function at 
a specified set of ray positions.  Finding level set boundary points is 
optional and secondary.  When finding boundary points is the primary goal, 
other functions are available.  

## `bdryFromRays()`: Fixed set of rays

This function finds boundary points along a single set of user-supplied rays. 
It takes as arguments the response function (`fnObj` object), 
a set of rays (an 
`inpRays` object, usually with an associated hyper-rectangle defining the 
search region), and the response 
threshold that defines the level set of interest (`lsetthresh`).  It 
also has a `control` argument, a list of options to control the search 
(see `?srchControl` for full details). 
Of these, the most important are `initPosns` and `initPosns2`.  

### The search algorithm

Along a given ray, a boundary of the level set is either (a) a point where 
the response function transitions from greater than or equal to `lsetthresh` 
to less than `lsetthresh`, or vice versa; or (b) a point where the ray 
crosses the boundary of the feasible region of the response function, with 
a response value at or above `lsetthresh`.  Both the level set and the feasible 
region are assumed to be closed (although possibly unbounded) sets, so boundary 
points are always feasible and have response values greater than or 
equal to `lsetthresh`.

For a given ray the response function is evaluated at positions 
specified in `initPosns`.  If any of those positions belongs to the 
level set, in general there will be a non-trivial segment of the ray, 
containing that 
position, that lies in the level set.  (Recall that it is assumed the 
level set has non-negligible volume in the input space.)  The function 
searches for the limits of that segment by looking for an `initPosns` 
position in both directions that is *not* in the level set, and 
using bisection to find the point(s) between them where the boundary 
is crossed.  If a segment appears to extend beyond the range of `initPosns`, 
or none of the positions in `initPosns` belongs to the level set, positions 
in `initPosns2` are added.

### `lsetSegs` objects

This approach to searching for level set boundary points yields not 
just points but *line segments* that belong to the level set, with boundary 
points as the segment endpoints.  That is the form in which `bdryFromRays()` 
returns its result:  an object of class `lsetSegs`.  There are `summary` and 
`plot` methods for these objects as shown in the example in the *Introduction* 
vignette. Information about segments can be extracted with the `segInfo()` 
function, and about segment endpoints with the `respInfo()` 
and `ptCoord()` functions.  Segment validity can be checked with the 
`lsetSegsCheck()` function.

**It is a huge help to the boundary search if the ray origin is a point 
inside the level set.**  In that case every ray is guaranteed to have a 
segment belonging to the level set.  Ray origins that are outside the 
level set are 
allowed, but there is a chance that few (or even no) rays will 
intersect the set, yielding little information about it.  The following 
two figures illustrate this for the banana response function.  Twenty-four 
random rays are used in both cases, but only 6 yield level set segments 
when the origin (plotted as `0`) is outside the level set.

```{r, out.width='45%', fig.show='hold', echo=FALSE}
set.seed(1)
# Origin in the level set.
rays1 <- randomRays(24, origin=c(1, 1), rect=rect0)
segs1 <- bdryFromRays(fobj, rays=rays1, lsetthresh=thresh)
  # Level set segment found for all rays (although some are in fact invalid).
chk <- lsetSegsCheck(segs1, fnobj=fobj, posns=(1:4)/5)
#table(chk$validIn)  # 2 of 24 invalid
#table(chk$validOut)  # all TRUE
segs1 <- lsetSegs(chk)
plot(segs1, rect=rect0, main="Origin inside the level set", pt0=c(1, 1))

# Origin outside the level set.
rays2 <- update(rays1, origin=c(-1, -2))
segs2 <- bdryFromRays(fobj, rays=rays2, lsetthresh=thresh)
chk <- lsetSegsCheck(segs2, fnobj=fobj, posns=(1:4)/5)
#table(chk$validIn)  #  0 of 6 invalid
#table(chk$validOut)  # all TRUE
segs2 <- lsetSegs(chk)
plot(segs2, rect=rect0, main="Origin outside the level set", pt0=c(-1, -2))
  # Much less informative, only 6 segs found.
```

For simple, convex level sets, with a ray origin inside the level set, 
the default `initPosns` and `initPosns2` will often be adequate. 
However when the level set has a more complicated shape or multiple parts, 
or the ray origin is not in the level set, a more closely-spaced set of 
initial positions may be necessary for informative results.  See 
`?bdryFromRays` and `?srchControl` for details.

## `bdrySearch()`: Adaptive selection of rays

In some cases more information about the boundary of a level set can be 
obtained by using multiple sets of rays, with different origins, than by 
using the same total number of rays with a single origin.  The function 
`bdrySearch()` performs an adaptive boundary search.  At each step a new 
ray origin is selected that attempts to explore portions of the boundary 
that have not been well-sampled at previous steps.  The user can control the 
number of steps and the number of rays used per step.  See `?bdrySearch` 
for details. 

The following plots show `bdrySearch()` results for the 
banana function.  There are 24 total rays as in the previous subsection, 
but now distributed over 6 adaptively selected origins.  Results are 
more informative about the overall shape and extent of the level set.
See the *Example* vignettes for more examples. 

```{r, out.width='45%', fig.show='hold', echo=FALSE}
# Initial origin in the level set.
srch1b <- bdrySearch(fobj, lsetthresh=thresh, rect=rect0, initOrigin=c(1, 1), 
                     control=list(axisDegree=2, maxSteps=6))  # 24 rays
segs1b <- srch1b$segs
chk <- lsetSegsCheck(segs1b, fnobj=fobj, posns=(1:4)/5)
#table(chk$validIn)  # 1 of 24 invalid
#table(chk$validOut)  # all TRUE
segs1b <- chk$segs
plot(segs1b, rect=rect0, main="Initial origin inside the level set", 
     pt0=c(1, 1))

# Initial origin outside the level set.
srch2b <- bdrySearch(fobj, lsetthresh=thresh, rect=rect0, initOrigin=c(-1, -2), 
                     control=list(axisDegree=2, maxSteps=6))  # 24 rays
segs2b <- srch2b$segs
chk <- lsetSegsCheck(segs2b, fnobj=fobj, posns=(1:4)/5)
#table(chk$validIn)  #  0 of 23 invalid
#table(chk$validOut)  # all TRUE
segs2b <- chk$segs
plot(segs2b, rect=rect0, main="Initial origin outside the level set", pt0=c(-1, -2))
```

## Slices of input space

When the input space has more than two dimensions, it may be useful to 
explore the boundary of the level set separately within lower-dimensional 
*slices* of the space.  A slice is just an axis-aligned hyperplane within 
the input space, which is equivalent to holding a subset of the $d$ coordinates 
at fixed values while the other coordinates are free to vary.  Therefore 
a set of $k$ slices can be represented as a matrix with $k$ rows and $d$ 
columns, where each row contains specific values for the fixed coordinates 
and `NA` for the free coordinates.  For example in a 3-dimensional input 
space

```{r, eval=FALSE, echo=TRUE}
slices <- rbind(c(1, NA, NA), c(2, NA, 3), c(NA, NA, NA))
```

represents three slices.  The first is a 2-dimensional slice consisting 
of all points with $x_1 = 1$, the second is a 1-dimensional slice (i.e., 
a line) consisting of all points with $x_1 = 2$ and $x_3 = 3$, and the 
third is the *identity slice* that allows all coordinates to vary and so 
is equivalent to the whole input space.  See `?sliceMat` for more 
information about creating slices.

Functions `slicedBdryFromRays()` and `slicedBdrySearch()` are extensions 
of `bdryFromRays()` and `bdrySearch()` respectively, that carry out 
level set boundary searches separately for one or more slices.  See 
their help pages for more information, and the *Example* vignettes for 
illustrations of their use.
