Introduction to optedr: optimal designs for non-linear models

library(optedr)

The optimal design problem

In non-linear regression the information that an experiment provides about the model parameters depends on where observations are taken, not just how many. The Fisher information matrix at a design point \(x\) is

\[M(\xi) = \int f(x)\, f(x)^{\top} \, \xi(dx),\]

where \(f(x) = \nabla_\theta \mu(x, \theta)\) is the gradient of the mean response. Different optimality criteria summarise \(M(\xi)\) into a single number to minimise:

Criterion Objective Key argument
D-Optimality \(-\log\det M\)
Ds-Optimality focuses on a subset of parameters par_int
A-Optimality \(\operatorname{tr} M^{-1}\)
I-Optimality integrated prediction variance over a region reg_int
L-Optimality \(\operatorname{tr}(B \, M^{-1})\) for user-supplied \(B\) matB

All are computed with opt_des(), which implements the cocktail algorithm (Yu, 2011).

A first example: D-Optimality in one factor

The Antoine equation models vapour pressure as a function of temperature. With two parameters and one design variable the D-optimal design has exactly two support points.

result_D <- opt_des(
  criterion    = "D-Optimality",
  model        = y ~ a * exp(-b / x),
  parameters   = c("a", "b"),
  par_values   = c(1, 1500),
  design_space = c(212, 422)
)
#> 
#> ℹ Stop condition not reached, max iterations performed
#> 
⠙ Calculating optimal design 22 done (33/s) | 664ms
ℹ The lower bound for efficiency is 99.9982181686267%
#> 
                                                    
result_D
#> Optimal design for D-Optimality:
#>      Point    Weight
#> 1 329.2963 0.5000089
#> 2 422.0000 0.4999911

plot() shows the sensitivity function \(d(x, \xi)\). By the Equivalence Theorem the design is optimal if and only if \(d(x, \xi) \leq k\) everywhere, with equality at every support point.

plot(result_D)
Sensitivity function for the D-optimal design.

Sensitivity function for the D-optimal design.

summary() gives a fuller breakdown including the criterion value and convergence diagnostics.

summary(result_D)
#> Model:
#> y ~ a * exp(-b/x)
#> Weight function:
#> function (x) 
#> 1 
#> 
#> Optimal design for D-Optimality:
#>      Point    Weight
#> 1 329.2963 0.5000089
#> 2 422.0000 0.4999911
#> 
#> Minimum efficiency (Atwood): 99.9982181686267%
#> Criterion value:             9.97281e+06

The $atwood component reports the Atwood criterion (percentage deviation from the theoretical optimum bound): values close to 100 % confirm convergence.

Other optimality criteria

Ds-Optimality

par_int selects the indices of the parameters of interest. Here we focus only on th0:

result_Ds <- opt_des(
  criterion    = "Ds-Optimality",
  model        = y ~ th0 * exp(x / th1),
  parameters   = c("th0", "th1"),
  par_values   = c(10.4963, -3.2940),
  design_space = c(0.94, 30),
  par_int      = c(1)
)
#> 
#> ℹ Stop condition reached: difference between sensitivity and criterion < 1e-05
#> 
⠙ Calculating optimal design 9 done (31/s) | 288ms
ℹ The lower bound for efficiency is 99.9998969351635%
#> 
                                                   
result_Ds
#> Optimal design for Ds-Optimality:
#>      Point    Weight
#> 1 0.940000 0.6042053
#> 2 5.147919 0.3957947

A-Optimality

Minimises the average variance of the parameter estimators:

result_A <- opt_des(
  criterion    = "A-Optimality",
  model        = y ~ a * exp(-b / x),
  parameters   = c("a", "b"),
  par_values   = c(1, 1500),
  design_space = c(212, 422)
)
#> 
#> ℹ Stop condition not reached, max iterations performed
#> 
⠙ Calculating optimal design 22 done (35/s) | 629ms
ℹ The lower bound for efficiency is 99.9999982911678%
#> 
                                                    
result_A
#> Optimal design for A-Optimality:
#>      Point    Weight
#> 1 310.3784 0.7821613
#> 2 422.0000 0.2178387

I-Optimality

Minimises the integrated prediction variance over a region of interest reg_int. Useful when prediction in a specific sub-range matters more than global estimation:

result_I <- opt_des(
  criterion    = "I-Optimality",
  model        = y ~ a * exp(-b / x),
  parameters   = c("a", "b"),
  par_values   = c(1, 1500),
  design_space = c(212, 422),
  reg_int      = c(380, 422)
)
#> 
#> ℹ Stop condition reached: difference between sensitivity and criterion < 1e-05
#> 
⠙ Calculating optimal design 5 done (42/s) | 119ms
ℹ The lower bound for efficiency is 99.9999993787934%
#> 
                                                   
result_I
#> Optimal design for I-Optimality:
#>     Point    Weight
#> 1 357.518 0.4346119
#> 2 422.000 0.5653881

L-Optimality

matB is a symmetric positive-semidefinite \(k \times k\) matrix that selects which variances to minimise. Setting matB = diag(c(1, 0)) focuses entirely on the variance of a:

result_L <- opt_des(
  criterion    = "L-Optimality",
  model        = y ~ a * exp(-b / x),
  parameters   = c("a", "b"),
  par_values   = c(1, 1500),
  design_space = c(212, 422),
  matB         = diag(c(1, 0))
)
#> 
#> ℹ Stop condition reached: difference between sensitivity and criterion < 1e-05
#> 
⠙ Calculating optimal design 22 done (34/s) | 642ms
ℹ The lower bound for efficiency is 99.9999982893632%
#> 
                                                    
result_L
#> Optimal design for L-Optimality:
#>      Point    Weight
#> 1 310.3784 0.7253377
#> 2 422.0000 0.2746623

Comparing D- and L-optimal designs shows how the support points shift when precision is concentrated on one parameter:

cat("D-optimal support:\n"); print(result_D$optdes)
#> D-optimal support:
#>      Point    Weight
#> 1 329.2963 0.5000089
#> 2 422.0000 0.4999911
cat("L-optimal support:\n"); print(result_L$optdes)
#> L-optimal support:
#>      Point    Weight
#> 1 310.3784 0.7253377
#> 2 422.0000 0.2746623

Multi-dimensional design spaces

For models with two or more design variables pass design_space as a named list. Variable names in the list must match those used in model.

Two-factor model: bisubstrate Michaelis-Menten

Enzyme kinetics with two substrates: \(\mu = V_{\max} x_1 x_2 / [(K_1 + x_1)(K_2 + x_2)]\).

result_2D <- opt_des(
  criterion    = "D-Optimality",
  model        = y ~ Vmax * x1 * x2 / ((K1 + x1) * (K2 + x2)),
  parameters   = c("Vmax", "K1", "K2"),
  par_values   = c(1, 1, 1),
  design_space = list(x1 = c(0.1, 10), x2 = c(0.1, 10))
)
#> 
#> ℹ Stop condition reached: difference between sensitivity and criterion < 1e-05
#> 
⠙ Calculating optimal design 14 done (13/s) | 1s
ℹ The lower bound for efficiency is 99.9990270465501%
#> 
                                                 
result_2D
#> Optimal design for D-Optimality (2 factors):
#>           x1         x2    Weight
#> 1  0.8322886 10.0000000 0.3333304
#> 2 10.0000000  0.8343059 0.3333391
#> 3 10.0000000 10.0000000 0.3333304

plot() for a two-factor model returns a viridis heatmap of the sensitivity function, with the support points overlaid in red:

plot(result_2D)
Sensitivity heatmap for the 2D D-optimal design.

Sensitivity heatmap for the 2D D-optimal design.

L-Optimality extends naturally to multi-dimensional spaces. Here we minimise only the variance of \(K_1\):

result_2D_L <- opt_des(
  criterion    = "L-Optimality",
  model        = y ~ Vmax * x1 * x2 / ((K1 + x1) * (K2 + x2)),
  parameters   = c("Vmax", "K1", "K2"),
  par_values   = c(1, 1, 1),
  design_space = list(x1 = c(0.1, 10), x2 = c(0.1, 10)),
  matB         = diag(c(0, 1, 0))
)
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> 
#> ℹ Stop condition not reached, max iterations performed
#> 
⠙ Calculating optimal design 22 done (19/s) | 1.2s
#> Warning: The Atwood efficiency bound is -412.6%, which is outside [0, 100].
#> This indicates a degenerate information matrix -the model is likely not
#> identifiable in all specified parameters. Check for parameter redundancy (see
#> any preceding warnings).
#> ℹ The lower bound for efficiency is -412.552180354584%
#> 
                                                   
result_2D_L
#> Optimal design for L-Optimality (2 factors):
#>           x1 x2    Weight
#> 1  0.6043674 10 0.7070278
#> 2 10.0000000 10 0.2929722

Three factors and beyond

For \(d \geq 3\) factors plot() switches to a scatter-matrix with \(\binom{d}{2}\) panels, one per pair of variables. Point size is proportional to weight.

result_3D <- opt_des(
  criterion    = "D-Optimality",
  model        = y ~ Vmax * x1 * x2 * x3 / ((K1+x1) * (K2+x2) * (K3+x3)),
  parameters   = c("Vmax", "K1", "K2", "K3"),
  par_values   = c(1, 1, 1, 1),
  design_space = list(x1 = c(0.1, 10), x2 = c(0.1, 10), x3 = c(0.1, 10))
)
#> 
⠙ Calculating optimal design 6 done (2.9/s) | 2.1s

⠹ Calculating optimal design 7 done (3/s) | 2.3s  

⠸ Calculating optimal design 8 done (3.1/s) | 2.6s

⠼ Calculating optimal design 9 done (3.1/s) | 2.9s

⠴ Calculating optimal design 10 done (3.2/s) | 3.2s

⠦ Calculating optimal design 11 done (3.2/s) | 3.4s

⠧ Calculating optimal design 12 done (3.2/s) | 3.7s

⠇ Calculating optimal design 13 done (3.3/s) | 4s  

⠏ Calculating optimal design 14 done (3.3/s) | 4.3s

⠋ Calculating optimal design 15 done (3.3/s) | 4.6s

⠙ Calculating optimal design 16 done (3.3/s) | 4.9s

⠹ Calculating optimal design 17 done (3.3/s) | 5.2s

⠸ Calculating optimal design 18 done (3.3/s) | 5.4s

⠼ Calculating optimal design 19 done (3.3/s) | 5.7s
#> ℹ Stop condition reached: difference between sensitivity and criterion < 1e-05
#> 
⠴ Calculating optimal design 20 done (3.3/s) | 6s  
ℹ The lower bound for efficiency is 99.9990957865302%
#> 
                                                  
result_3D
#> Optimal design for D-Optimality (3 factors):
#>           x1         x2        x3    Weight
#> 1  0.8326969 10.0000000 10.000000 0.2499979
#> 2 10.0000000 10.0000000 10.000000 0.2499979
#> 3 10.0000000 10.0000000  0.834917 0.2500063
#> 4 10.0000000  0.8340851 10.000000 0.2499979
plot(result_3D)

Compound criterion

Sometimes no single criterion captures the experimental goals. criterion = "Compound" combines any set of criteria as a weighted sum of their sensitivity functions. Weights are normalised automatically.

result_DI <- opt_des(
  criterion    = "Compound",
  model        = y ~ 10^(a - b / (c + x)),
  parameters   = c("a", "b", "c"),
  par_values   = c(8.07131, 1730.63, 233.426),
  design_space = c(1, 100),
  compound     = list(
    list(criterion = "D-Optimality", weight = 0.7),
    list(criterion = "I-Optimality", weight = 0.3, reg_int = c(60, 100))
  )
)
#> 
#> ℹ Stop condition not reached, max iterations performed
#> 
⠙ Calculating optimal design 22 done (12/s) | 1.8s
ℹ The lower bound for efficiency is 99.9984745356004%
#> 
                                                   
result_DI
#> Optimal design for Compound (0.70*D + 0.30*I):
#>       Point    Weight
#> 1  46.50267 0.3047727
#> 2  82.87444 0.4062833
#> 3 100.00000 0.2889441

Comparing with the pure D-optimal design shows how the composite shifts one support point toward the prediction region:

result_D_ant <- opt_des(
  "D-Optimality",
  y ~ 10^(a - b / (c + x)), c("a", "b", "c"),
  c(8.07131, 1730.63, 233.426), c(1, 100)
)
#> 
#> ℹ Stop condition not reached, max iterations performed
#> 
⠙ Calculating optimal design 22 done (27/s) | 807ms
ℹ The lower bound for efficiency is 99.9986187558838%
#> 
                                                    
cat("D-optimal:\n");         print(result_D_ant$optdes)
#> D-optimal:
#>       Point    Weight
#> 1  44.90080 0.3333415
#> 2  83.19032 0.3333292
#> 3 100.00000 0.3333293
cat("Compound D+I (70/30):\n"); print(result_DI$optdes)
#> Compound D+I (70/30):
#>       Point    Weight
#> 1  46.50267 0.3047727
#> 2  82.87444 0.4062833
#> 3 100.00000 0.2889441

Design efficiency

design_efficiency() measures the fraction of the optimal information that a given design achieves: an efficiency of 0.80 means the design needs \(1/0.8 = 1.25\times\) more observations to match the optimal.

design_ad_hoc <- data.frame(
  Point  = c(220, 300, 400),
  Weight = c(1/3, 1/3, 1/3)
)
eff <- design_efficiency(design_ad_hoc, result_D)
#> ℹ The efficiency of the design is 47.3464225136928%
cat("Efficiency of ad-hoc design:", round(eff * 100, 2), "%\n")
#> Efficiency of ad-hoc design: 47.35 %

For multi-dimensional designs pass a data frame with one column per factor plus a Weight column:

corners_2d <- data.frame(
  x1     = c(0.1, 10,  0.1, 10),
  x2     = c(0.1, 0.1, 10,  10),
  Weight = rep(0.25, 4)
)
eff_2d <- design_efficiency(corners_2d, result_2D)
#> ℹ The efficiency of the design is 19.3350707865126%
cat("Efficiency of corner design vs 2D D-optimal:", round(eff_2d * 100, 2), "%\n")
#> Efficiency of corner design vs 2D D-optimal: 19.34 %

Rounding to exact designs

Optimal designs assign continuous weights. Two functions convert them to integer replication counts for a given total \(n\).

efficient_round()

Uses the multiplier \(n - l/2\) (where \(l\) is the number of support points) and adjusts to guarantee the total is exactly \(n\):

exact_design <- efficient_round(result_D$optdes, n = 20)
print(exact_design)
#>      Point Weight
#> 1 329.2963     10
#> 2 422.0000     10
cat("Total observations:", sum(exact_design$Weight), "\n")
#> Total observations: 20

combinatorial_round()

Exhaustive floor/ceiling search: optimal for small \(k\) (number of support points) since it checks all \(2^k\) combinations. Pass the optdes object so the function can use the criterion value for comparison:

combo_design <- combinatorial_round(result_D, n = 10)
print(combo_design)
#>      Point Weight
#> 1 329.2963      5
#> 2 422.0000      5

Once rounded, compare the exact design’s efficiency against the approximate optimum:

approx <- exact_design
approx$Weight <- approx$Weight / sum(approx$Weight)
cat("Efficiency of rounded design:", round(design_efficiency(approx, result_D) * 100, 2), "%\n")
#> ℹ The efficiency of the design is 100.000000015874%
#> Efficiency of rounded design: 100 %