The sobol package generates low‑discrepancy Sobol sequences and is designed from the ground up for parameter space exploration.
This tutorial will take you from the simplest possible usage to advanced reproducibility and parallel workflows.
sobol_design()

Imagine you are tuning a machine‑learning model with three hyperparameters:

- learning rate in [0.0001, 0.1]
- momentum in [0, 0.99]
- dropout in [0, 0.5]

You want to explore the space with 200 well‑spread points.
sobol_design() returns a data frame ready
to be fed into your objective function.
design <- sobol_design(
  lower = c(learning_rate = 0.0001, momentum = 0.00, dropout = 0.0),
  upper = c(learning_rate = 0.1000, momentum = 0.99, dropout = 0.5),
  nseq = 200
)
head(design)
#> learning_rate momentum dropout
#> 1 0.001270703 0.29003906 0.36914062
#> 2 0.051220703 0.78503906 0.11914062
#> 3 0.076195703 0.04253906 0.49414062
#> 4 0.026245703 0.53753906 0.24414062
#> 5 0.038733203 0.90878906 0.30664062
#> 6 0.088683203 0.41378906 0.05664062

Every point lies within the ranges you specified:
summary(design)
#> learning_rate momentum dropout
#> Min. :0.0004902 Min. :0.003867 Min. :0.0009766
#> 1st Qu.:0.0252701 1st Qu.:0.249917 1st Qu.:0.1252441
#> Median :0.0500500 Median :0.495967 Median :0.2495117
#> Mean :0.0498549 Mean :0.496779 Mean :0.2491016
#> 3rd Qu.:0.0748299 3rd Qu.:0.742017 3rd Qu.:0.3737793
#> Max. :0.0996098 Max. :0.988066 Max. :0.4980469

The design is deterministic and space‑filling, which is already an improvement over simple random sampling or grid search.
What makes sobol_design special?

Behind the scenes it calls sobol_points() to generate a
Sobol sequence in the unit cube and then scales each
column to your bounds.
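The scaling step described above amounts to an affine map applied column by column. The following is a minimal base‑R sketch of that idea, not the package's actual implementation; the helper name scale_to_bounds and the hand‑typed unit‑cube points are illustrative assumptions.

```r
# Hypothetical helper (not part of sobol): map points from the unit cube
# [0, 1)^d onto the box defined by `lower` and `upper`, column by column.
scale_to_bounds <- function(unit, lower, upper) {
  stopifnot(ncol(unit) == length(lower), length(lower) == length(upper))
  # x_scaled = lower + x * (upper - lower), applied per column
  scaled <- sweep(unit, 2, upper - lower, `*`)
  scaled <- sweep(scaled, 2, lower, `+`)
  colnames(scaled) <- names(lower)
  as.data.frame(scaled)
}

# Hand-typed stand-in for unit-cube points; in practice these would come
# from sobol_points().
unit <- matrix(c(0.00, 0.00, 0.00,
                 0.50, 0.50, 0.50,
                 0.75, 0.25, 0.25),
               ncol = 3, byrow = TRUE)

lower <- c(learning_rate = 0.0001, momentum = 0.00, dropout = 0.0)
upper <- c(learning_rate = 0.1000, momentum = 0.99, dropout = 0.5)

scaled <- scale_to_bounds(unit, lower, upper)
print(scaled)
```

Because the map is linear in each coordinate, the even spread of the unit‑cube points carries over to the scaled design.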
Try it against a random Latin hypercube:
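One hedged way to run that comparison without extra dependencies is sketched below. The Latin hypercube sampler random_lhs and the metric min_dist are illustrative helpers written for this example, not part of sobol. The Sobol half of the comparison runs only if the package is installed, and it assumes sobol_design() is exported with the arguments shown earlier.

```r
# Illustrative helper (not part of sobol): a basic random Latin hypercube
# in the unit cube. Each column places exactly one point in each of n strata.
random_lhs <- function(n, d) {
  sapply(seq_len(d), function(j) (sample(n) - runif(n)) / n)
}

# Smallest pairwise Euclidean distance: larger means fewer near-duplicates.
min_dist <- function(points) min(dist(points))

set.seed(42)                       # the LHS is random, so fix the seed
lhs_points <- random_lhs(200, 3)
cat("LHS   min distance:", min_dist(lhs_points), "\n")

# Run the Sobol side only when the package is available.
if (requireNamespace("sobol", quietly = TRUE)) {
  # Unit-cube bounds so both designs are compared on the same scale.
  sobol_unit <- sobol::sobol_design(
    lower = c(x1 = 0, x2 = 0, x3 = 0),
    upper = c(x1 = 1, x2 = 1, x3 = 1),
    nseq  = 200
  )
  cat("Sobol min distance:", min_dist(as.matrix(sobol_unit)), "\n")
}
```

Minimum pairwise distance is only one of several space‑filling measures; a pairs() plot of each design gives a quick visual check as well.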
If you already have your own scaling logic, or need the raw
[0,1) points, use sobol_points() directly.
raw <- sobol_points(n = 512, dim = 4)
dim(raw) # 512 rows, 4 columns
#> [1] 512 4
range(raw) # values in [0, 1)
#> [1] 0.0000000 0.9980469

sobol_points() accepts an optional skip
argument that lets you start from an arbitrary index, which is useful for
parallel workers (see below).
sobol_generator()

Sometimes you don’t know in advance how many points you’ll need. Maybe you want to evaluate a few, check convergence, then generate more. That’s where the stateful generator shines.
gen <- sobol_generator(dimensions = 3)
# Generate one point
sobol_next(gen)
#> [1] 0 0 0
# Generate a batch of 50
batch <- sobol_next_n(gen, n = 50)
dim(batch) # 50 x 3
#> [1] 50 3
# What’s the current index?
sobol_index(gen)
#> [1] 51

You can also jump to any position in the sequence, which is the key to parallel and restart‑friendly workflows.
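The evaluate, check, extend pattern mentioned at the start of this section might look like the sketch below. It estimates the mean of a toy function over the unit cube in batches and stops once the running estimate settles. The toy function, batch size, and tolerance are assumptions chosen for illustration. The sketch also assumes the generator is updated in place, as the sobol_index() output above suggests. If sobol is not installed, a uniform random stand‑in is used so the loop still runs.

```r
# Pick a batch source: the stateful Sobol generator when available,
# otherwise a plain random stand-in (NOT low-discrepancy) so the loop runs.
if (requireNamespace("sobol", quietly = TRUE)) {
  gen <- sobol::sobol_generator(dimensions = 3)
  next_batch <- function(n) sobol::sobol_next_n(gen, n = n)
} else {
  set.seed(1)
  next_batch <- function(n) matrix(runif(n * 3), ncol = 3)
}

# Toy objective: the mean of x1^2 + x2^2 + x3^2 over the unit cube is 1.
objective <- function(x) rowSums(x^2)

batch_size <- 64      # illustrative choice
tolerance  <- 1e-3    # stop when the running mean moves less than this
max_points <- 4096    # hard cap so the loop always terminates

total <- 0
n_seen <- 0
previous <- NA_real_

repeat {
  batch  <- next_batch(batch_size)
  total  <- total + sum(objective(batch))
  n_seen <- n_seen + nrow(batch)
  estimate <- total / n_seen

  converged <- !is.na(previous) && abs(estimate - previous) < tolerance
  if (converged || n_seen >= max_points) break
  previous <- estimate
}

cat("points used:", n_seen, " estimate:", estimate, "\n")
```

Because the generator remembers its position, each batch continues the same sequence rather than restarting it, so no point is evaluated twice.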
All sequences are deterministic, so two calls with the same parameters always return identical results:
a <- sobol_design(lower = c(p = 0), upper = c(p = 1), nseq = 32)
b <- sobol_design(lower = c(p = 0), upper = c(p = 1), nseq = 32)
identical(a, b) # TRUE
#> [1] TRUE

To distribute work across multiple cores or machines, assign each worker a non‑overlapping skip interval:

- Worker 1: 0 – 999
- Worker 2: 1000 – 1999
- Worker 3: 2000 – 2999

# Worker 1
w1 <- sobol_design(lower = c(lr = 0.0001, mom = 0, drop = 0),
                   upper = c(lr = 0.1, mom = 0.99, drop = 0.5),
                   nseq = 1000) # implicitly starts at 0
# Worker 2 (needs raw points + skip to 1000)
raw2 <- sobol_points(n = 1000, dim = 3, skip = 1000)
# Then scale raw2 manually, or use sobol_design in the future with a skip argument

(A skip argument for sobol_design() is
under consideration; once available, parallel designs will become
one‑liners.)
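The chunking scheme above generalises to any number of workers. The sketch below computes the skip offsets in base R and, if sobol is installed, generates and scales one worker's chunk using the skip argument of sobol_points() shown earlier. The variable names and the three‑worker setup are illustrative assumptions.

```r
n_workers  <- 3
chunk_size <- 1000

# Worker k covers indices (k - 1) * chunk_size ... k * chunk_size - 1.
plan <- data.frame(
  worker = seq_len(n_workers),
  skip   = (seq_len(n_workers) - 1) * chunk_size,
  n      = chunk_size
)
plan$last_index <- plan$skip + plan$n - 1
print(plan)

lower <- c(lr = 0.0001, mom = 0.00, drop = 0.0)
upper <- c(lr = 0.1000, mom = 0.99, drop = 0.5)

# Generate and scale the chunk for one worker (here: worker 2).
if (requireNamespace("sobol", quietly = TRUE)) {
  job  <- plan[plan$worker == 2, ]
  raw2 <- sobol::sobol_points(n = job$n, dim = length(lower), skip = job$skip)
  # Affine map from the unit cube to the target box, column by column.
  w2 <- sweep(sweep(raw2, 2, upper - lower, `*`), 2, lower, `+`)
  colnames(w2) <- names(lower)
  w2 <- as.data.frame(w2)
  print(head(w2))
}
```

Deriving every worker's offset from a single plan table keeps the intervals contiguous and non‑overlapping, so the combined chunks match one long sequential run.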
A generator can be “rewound” at any time to re‑evaluate a segment.
The C++ engine is heavily optimised. Even 1 000 000 points in 10 dimensions complete in under a second on modern hardware, freeing you to spend time on your actual model.
For extremely high dimensions (>1000) the engine falls back to runtime generation – still fast, but initialisation takes a tick longer. Precomputed tables cover the first 1000 dimensions instantly.
For more details, see:

- ?sobol_design, ?sobol_points, ?sobol_generator
- inst/examples/usage_examples.R file

The sobol_design() function in this package was inspired
by the sobol_design()
function from the pomp package by Aaron A. King et al., an R package for statistical inference using partially observed Markov processes.
While the interface and purpose are similar,
sobol is a ground-up reimplementation: the
core algorithm is written from scratch in C++17 and exposed to R via
Rcpp, with no shared code from pomp. We gratefully
acknowledge Aaron King’s project as the original source of inspiration
for the design of this interface.
That’s all you need to start exploring your parameter space smarter
and faster.
Welcome to sobol!