This case study was conducted by Claude (an AI assistant) to optimize tidypredict's random forest parsing. Three important notes about the debrief package usage:

- **No prior knowledge:** Claude had no prior knowledge of the debrief package before this session. The package was not part of Claude's training data.
- **No documentation lookup:** Claude did not read debrief's documentation, README, or help files. The only information provided was the user's prompt showing the basic usage pattern (`pv_print_debrief(p)`). All interpretation of the output was done from first principles, based on the section names and content.
- **Test-driven verification:** After each optimization, Claude ran the package's existing test suite (`devtools::test()`) to verify the changes didn't break functionality. All 38 relevant tests (ranger: 18, randomForest: 13, partykit: 7) passed after the optimizations.

This demonstrates that debrief's output is self-explanatory enough to guide optimization work without prior familiarity with the package. The section names (TOP FUNCTIONS BY SELF-TIME, HOT LINES, MEMORY ALLOCATION) clearly communicate what each part shows, and the "Next steps" suggestions in the output hinted at additional functions available.
This case study demonstrates how the debrief package can guide iterative performance optimization in R. We use tidypredict’s random forest parsing as a real-world example, achieving a 331x speedup through systematic profiling and targeted fixes.
Issue tidymodels/orbital#83 reported that parsing random forest models in tidypredict takes 45+ seconds for a single tree. We rolled back to the problematic version (commit 0dde432) to investigate.
The user provided the following instructions to guide the profiling session:

> profile the tidypredict_fit call in the following code.

```r
library(tidymodels)
library(orbital)

rec_spec <- recipe(Sale_Price ~ ., data = ames) |>
  step_normalize(all_numeric_predictors()) |>
  step_dummy(all_nominal_predictors())

mod_spec <- rand_forest("regression", trees = 1)
wf_spec <- workflow(rec_spec, mod_spec)
wf_fit <- fit(wf_spec, ames)

rf_tree <- extract_fit_engine(wf_fit)
tmp <- tidypredict_fit(rf_tree)
```

> I want you to use the profvis and debrief package to iteratively make improvements to the function

```r
library(profvis)
library(debrief)

# Profile some code
p <- profvis({
  # your code here
})
pv_print_debrief(p)
```

This prompt established the workflow: profile with profvis, analyze with `pv_print_debrief()`, optimize, and repeat.
```
## PROFILING SUMMARY
Total time: 47340 ms (4734 samples @ 10 ms interval)

### TOP FUNCTIONS BY SELF-TIME
15180 ms ( 32.1%) [.data.frame
 5280 ms ( 11.2%) <GC>
 4220 ms (  8.9%) sys.call
 3950 ms (  8.3%) [[.data.frame

### HOT LINES (by self-time)
12710 ms ( 26.8%) /R/model-ranger.R:6
    row <- tree[tree$nodeID == j, ]
 2880 ms (  6.1%) /R/model-ranger.R:7
    lc <- row["leftChild"][[1]] == find
 2580 ms (  5.4%) /R/model-ranger.R:8
    lr <- row["rightChild"][[1]] == find

### MEMORY ALLOCATION (by line)
15655.00 MB /R/model-ranger.R:6
    row <- tree[tree$nodeID == j, ]
 3450.47 MB /R/model-ranger.R:7
    lc <- row["leftChild"][[1]] == find
```
**TOP FUNCTIONS BY SELF-TIME** showed `[.data.frame` at 32.1%. This is the data.frame subsetting operator, suggesting repeated data.frame operations are the culprit.

**HOT LINES** pinpointed the exact line: `row <- tree[tree$nodeID == j, ]`, consuming 26.8% of total time. This single line was the primary bottleneck.

**MEMORY ALLOCATION** revealed that this same line was allocating 15.6 GB, explaining the high GC time (11.2%): each subsetting operation creates a new data.frame copy.

**The pattern:** line 6 is inside a loop (indicated by the variable `j`). Filtering a data.frame with `tree[tree$nodeID == j, ]` is O(n) per call. In a loop, this becomes O(n²).
The fix: pre-extract the columns as plain vectors and replace the data.frame filter with direct positional indexing.
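A minimal sketch of the pattern, reconstructed from the hot lines above rather than taken from the actual patch (the toy `tree` table and `find` value are made up for illustration; ranger's `nodeID` is 0-based and dense, so position in the vector is `nodeID + 1`):

```r
# Toy ranger-style tree table; nodeID is 0-based and dense
tree <- data.frame(
  nodeID     = 0:2,
  leftChild  = c(1L, NA, NA),
  rightChild = c(2L, NA, NA)
)
find <- 2L  # node whose parent we are searching for

# Before: an O(n) data.frame filter on every loop iteration
# row <- tree[tree$nodeID == j, ]
# lc  <- row["leftChild"][[1]] == find

# After: extract the columns once, then use O(1) positional indexing
left_child  <- tree$leftChild
right_child <- tree$rightChild

for (j in 2:0) {
  idx <- j + 1L  # position = nodeID + 1
  lc <- left_child[idx] == find
  lr <- right_child[idx] == find
  if (is.na(lc)) lc <- FALSE  # leaves have NA children
  if (is.na(lr)) lr <- FALSE
  if (lc || lr) parent_of_find <- j
}
parent_of_find  # 0
```

The extraction happens once, outside the loop, so each iteration does only constant-time vector lookups instead of allocating a new one-row data.frame.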
| Metric | Before | After |
|---|---|---|
| Time | 47,340 ms | 280 ms |
| Memory | ~18 GB | ~20 MB |
| Speedup | - | 169x |
```
## PROFILING SUMMARY
Total time: 290 ms (29 samples @ 10 ms interval)

### TOP FUNCTIONS BY SELF-TIME
60 ms ( 20.7%) is.na
20 ms (  6.9%) .f
20 ms (  6.9%) if (lc || lr) {
20 ms (  6.9%) lc <- left_child[idx] == find

### HOT LINES (by self-time)
60 ms ( 20.7%) /R/model-ranger.R:16
    if (is.na(lc)) {
20 ms (  6.9%) /R/model-ranger.R:14
    lc <- left_child[idx] == find
20 ms (  6.9%) /R/model-ranger.R:22
    if (lc || lr) {
```
**TOP FUNCTIONS BY SELF-TIME** now showed `is.na` as the top consumer at 20.7%. The bottleneck had shifted from data.frame operations to NA checking.

**HOT LINES** showed the `is.na` checks on lines 16 and 19 being called repeatedly in the loop.

**The insight:** the code was searching for parent nodes by iterating from `node_id` down to 0, checking each potential parent. This linear search could be replaced with a direct lookup.
Pre-compute parent relationships once, then traverse directly:

```r
# Before: linear search for the parent of `find`
for (j in node_id:0) {
  idx <- j + 1L
  lc <- left_child[idx] == find
  lr <- right_child[idx] == find
  if (is.na(lc)) lc <- FALSE
  if (is.na(lr)) lr <- FALSE
  if (lc || lr) { ... }
}

# After: direct traversal using a parent lookup vector
while (!is.na(parent[current + 1L])) {
  current <- parent[current + 1L]
  path <- c(path, current)
}
```

| Metric | Before | After |
|---|---|---|
| Time | 290 ms | 160 ms |
| Speedup | - | 45% faster |
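The construction of the `parent` vector isn't shown in the snippet above; here is a minimal sketch of how such a lookup can be precomputed (toy child vectors, assuming ranger's 0-based, dense nodeIDs). Inverting the child columns takes one O(n) pass, after which each parent lookup is O(1):

```r
# Toy tree: node 0 has children 1 and 2; node 1 has children 3 and 4
left_child  <- c(1L, 3L, NA, NA, NA)
right_child <- c(2L, 4L, NA, NA, NA)
n <- length(left_child)

# parent[i + 1] is the parent of node i (NA for the root)
parent <- rep(NA_integer_, n)
has_l <- !is.na(left_child)
has_r <- !is.na(right_child)
parent[left_child[has_l] + 1L]  <- which(has_l) - 1L
parent[right_child[has_r] + 1L] <- which(has_r) - 1L

# Walk from a leaf to the root with direct lookups (the "After" loop)
current <- 4L
path <- current
while (!is.na(parent[current + 1L])) {
  current <- parent[current + 1L]
  path <- c(path, current)
}
path  # 4 1 0
```

This trades a small amount of up-front work for replacing every per-node linear scan with a single subscript.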
```
## PROFILING SUMMARY
Total time: 1650 ms (165 samples @ 10 ms interval)

### TOP FUNCTIONS BY SELF-TIME
280 ms ( 17.0%) <GC>
200 ms ( 12.1%) enexpr
110 ms (  6.7%) reduce_impl

### HOT LINES (by self-time)
60 ms (  3.6%) /R/model-rf.R:80
    if (.x$op == "less") i <- expr(!!sym(.x$col) < !!.x$val)
30 ms (  1.8%) /R/model-rf.R:78
    if (.x$op == "more") i <- expr(!!sym(.x$col) > !!.x$val)

### TOP FUNCTIONS BY TOTAL TIME
1020 ms ( 61.8%) path_formulas
 500 ms ( 30.3%) expr
```
**TOP FUNCTIONS BY TOTAL TIME** showed `path_formulas` at 61.8%; the bottleneck had shifted to formula building.

**HOT LINES** showed multiple if-statements checking `.x$op` repeatedly (lines 78-81).

**TOP FUNCTIONS BY SELF-TIME** showed `enexpr` (expression building) and `reduce_impl` as top consumers; these are doing useful work.

**The insight:** the if-chain could be replaced with `switch()`, but we're now optimizing code that's doing inherent work. Further gains will be marginal.
Replace the if-chain with a `switch()` statement:

```r
# Before:
if (.x$op == "more") i <- expr(!!sym(.x$col) > !!.x$val)
if (.x$op == "more-equal") i <- expr(!!sym(.x$col) >= !!.x$val)
if (.x$op == "less") i <- expr(!!sym(.x$col) < !!.x$val)
if (.x$op == "less-equal") i <- expr(!!sym(.x$col) <= !!.x$val)

# After:
switch(
  .x$op,
  "more" = expr(!!col_sym > !!val),
  "more-equal" = expr(!!col_sym >= !!val),
  "less" = expr(!!col_sym < !!val),
  "less-equal" = expr(!!col_sym <= !!val)
)
```

This was a minor improvement (~4%). The remaining time is in expression building, which is inherent to the task.
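The ~4% figure came from re-profiling; for a change this small, a quick microbenchmark is a useful sanity check. A sketch, stripped of the rlang expression building so it runs standalone (the functions return operator strings instead of expressions):

```r
# If-chain: evaluates all four string comparisons on every call
if_chain <- function(op) {
  i <- NULL
  if (op == "more") i <- ">"
  if (op == "more-equal") i <- ">="
  if (op == "less") i <- "<"
  if (op == "less-equal") i <- "<="
  i
}

# switch(): dispatches once on the string
switch_version <- function(op) {
  switch(op,
    "more" = ">", "more-equal" = ">=",
    "less" = "<", "less-equal" = "<="
  )
}

system.time(for (k in 1:1e5) if_chain("less"))
system.time(for (k in 1:1e5) switch_version("less"))
```

In the real code the saving is diluted by the `expr()` work dominating each iteration, which is why the overall gain is small.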
The debrief package offers several functions, and `pv_print_debrief()` suggests others at the end of its output:

```
### Next steps
pv_source_context(p, "/R/model-ranger.R")
pv_suggestions(p)
pv_focus(p, ".f")
pv_help()
```
| Function | Purpose | Why I skipped it |
|---|---|---|
| `pv_source_context()` | Shows more code context around hot lines | I was already reading full source files directly. Once HOT LINES said "line 6 in model-ranger.R", I just read that file. |
| `pv_suggestions()` | Provides automated optimization suggestions | The bottlenecks were obvious enough (32% in `[.data.frame`, 15.6 GB on one line) that I could diagnose the issue myself. |
| `pv_focus()` | Filters analysis to a specific function | The initial output was already clear about where problems were. No need to drill down. |
`pv_print_debrief()` alone was sufficient for this session. I would reach for the additional functions where:

- `pv_source_context()` would help see surrounding code without switching tools
- `pv_suggestions()` might catch patterns I missed

| Stage | Time | Speedup | Key debrief insight |
|---|---|---|---|
| Original | 47,340 ms | 1x | [.data.frame at 32% |
| After vectorization | 280 ms | 169x | is.na now at 20% |
| After parent lookup | 160 ms | 296x | path_formulas at 60% |
| After switch | 143 ms | 331x | Diminishing returns |