Frequentist model averaging (FMA) fits the specified
model once for each dose realization, computes AIC weights across those
realization-specific fits, and then samples from the fitted normal
approximations. The realization-specific fits are independent of one
another, so they are a natural place to use parallel computation.
ameras() supports optional parallel FMA through the future
framework and future.apply.
The package does not set a future plan internally. Instead, the user
decides whether FMA should run sequentially or in parallel by setting
future::plan() before calling ameras().
If future.apply is not installed, ameras()
falls back to the ordinary sequential lapply() path.
For more detail on future plans and future-backed apply loops, see
?future::plan and
?future.apply::future_lapply, or the linked Futureverse
documentation.
The default future plan is sequential, so this call uses the ordinary single-process FMA path.
future::plan(future::sequential)
fit <- ameras(Y.gaussian ~ dose(V1:V10) + X1 + X2,
data = data,
family = "gaussian",
methods = "FMA")For parallel FMA on a local machine, multisession is
usually the most portable choice. It starts separate background R
sessions and works across the main platforms supported by R, including
Windows, macOS, and Linux.
future::plan(future::multisession, workers = 2)
fit <- ameras(Y.gaussian ~ dose(V1:V10) + X1 + X2,
data = data,
family = "gaussian",
methods = "FMA")
future::plan(future::sequential)The multicore backend can be efficient on systems that
support forking, but it is not available on Windows. For code intended
to run on multiple platforms, prefer multisession.
For more specialized settings, the future ecosystem also supports
cluster and HPC-oriented backends. See ?future::plan or the
future package documentation for details.
By default, future.apply chooses how to split dose
realizations into futures. For FMA, this can be adjusted with
future.chunk.size.FMA, the average number of dose
realizations handled by each future.
Let K be the number of dose realizations, W
the number of workers, and C the value of
future.chunk.size.FMA. K and C
determine the approximate number of future tasks,
ceiling(K / C), and at most W of those tasks
can run at the same time.
For example, with K = 100 dose realizations and
W = 4 workers:
future.chunk.size.FMA |
Approximate chunks | Interpretation |
|---|---|---|
| 1 | 100 | many small tasks; more load balancing, more overhead |
| 5 | 20 | moderate load balancing and overhead |
| 25 | 4 | about one task per worker |
| 100 | 1 | one task; only one worker is used |
future::plan(future::multisession, workers = 2)
fit <- ameras(Y.gaussian ~ dose(V1:V10) + X1 + X2,
data = data,
family = "gaussian",
methods = "FMA",
future.chunk.size.FMA = 2)
future::plan(future::sequential)Smaller chunks can improve load balancing when some realizations take
longer to fit than others. Larger chunks can reduce overhead and may be
preferable when each realization-specific fit is fast. If the
realization-specific fits have similar runtimes, a reasonable starting
point is a chunk size near ceiling(K / W) or somewhat
smaller. If runtimes vary substantially across realizations, smaller
chunks may keep workers busier.
The example below compares the same Gaussian FMA fit with a sequential plan and a two-worker multisession plan. The seed is reset before each call so that the final FMA sampling step is comparable.
old_plan <- future::plan()
on.exit(future::plan(old_plan), add = TRUE)
future::plan(future::sequential)
set.seed(2024)
sequential_elapsed <- system.time(
sequential_fit <- suppressWarnings(
ameras(Y.gaussian ~ dose(V1:V10) + X1 + X2,
data = data,
family = "gaussian",
methods = "FMA",
MFMA = 10000)
)
)[["elapsed"]]
future::plan(future::multisession, workers = 2)
set.seed(2024)
parallel_elapsed <- system.time(
parallel_fit <- suppressWarnings(
ameras(Y.gaussian ~ dose(V1:V10) + X1 + X2,
data = data,
family = "gaussian",
methods = "FMA",
MFMA = 10000,
future.chunk.size.FMA = 2)
)
)[["elapsed"]]
future::plan(old_plan)Compare the coefficient estimates and elapsed runtimes.
coefficient_comparison <- data.frame(
term = names(sequential_fit$FMA$coefficients),
sequential = unname(sequential_fit$FMA$coefficients),
parallel = unname(parallel_fit$FMA$coefficients),
row.names = NULL
)
coefficient_comparison[-1] <- round(coefficient_comparison[-1], 4)
coefficient_comparison
#> term sequential parallel
#> 1 (Intercept) -1.2802 -1.2802
#> 2 X1 0.4829 0.4829
#> 3 X2 -0.5163 -0.5163
#> 4 dose 1.0793 1.0793
#> 5 sigma 1.1376 1.1376
data.frame(
plan = c("sequential", "multisession"),
elapsed_seconds = round(c(sequential_elapsed, parallel_elapsed), 2)
)
#> plan elapsed_seconds
#> 1 sequential 1.49
#> 2 multisession 2.81The estimates should agree up to ordinary numerical precision. Parallel execution is not guaranteed to be faster for small examples because starting workers and transferring data have overhead. It is most useful when the realization-specific fits are slow enough for that overhead to matter less.
The parallelized part of FMA is the realization-specific optimization
step, which is deterministic for ordinary model fits. The random FMA
sampling step is performed afterward in the main R process. For
reproducible FMA samples, set the seed before calling
ameras(), just as in the sequential case.
future::plan(future::multisession, workers = 2)
set.seed(2024)
fit <- ameras(Y.gaussian ~ dose(V1:V10) + X1 + X2,
data = data,
family = "gaussian",
methods = "FMA")
future::plan(future::sequential)User-supplied functions passed through transform,
transform.jacobian, or other runtime arguments should also
be deterministic. Random number generation inside those functions is not
recommended.
Use parallel FMA when a direct
ameras(..., methods = "FMA") call is feasible in one R
session, but the realization-specific fits are slow enough that using
multiple workers is worthwhile.
Start conservatively:
future::plan(future::multisession, workers = 2) for
a portable local parallel plan;workers only after checking memory use;future.chunk.size.FMA = 2 or larger when each
realization-specific fit is fast;future::sequential after the fit if
the rest of the session should not use parallel futures.With multisession, each worker is a separate R process.
Large datasets may therefore be copied to multiple workers. If memory is
tight, use fewer workers, increase the chunk size, or consider the
manual FMA workflow.
The parallel option shown here is best when the built-in FMA workflow
fits in memory and you want a regular amerasfit object
returned by ameras().
The manual FMA vignette is intended for
larger or more operationally complicated analyses where it is useful to
split the realization-specific fits across separate jobs, save
intermediate fit summaries, or resume work after partial completion.
Manual FMA is more flexible, but it requires more bookkeeping and does
not return a regular amerasfit object.
In short: