This function identifies the most stable variables among those selected as relevant by SGCCA. Stability is assessed with a criterion based on Variable Importance in the Projection (VIP).
Arguments
- rgcca_res
A fitted RGCCA object (see rgcca).
- keep
A numeric vector indicating the proportion of variables per block to select.
- n_boot
The number of bootstrap samples (default: 100).
- n_cores
The number of cores for parallelization.
- verbose
A logical value indicating if the progress of the procedure is reported.
- balanced
A logical value indicating if a balanced bootstrap procedure is performed or not (default is TRUE).
- keep_all_variables
A logical value indicating if all variables must be kept even when some of them have zero variance in at least one bootstrap sample (default is FALSE).
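To make the balanced argument more concrete, here is a minimal base R sketch of one common way to draw balanced bootstrap samples: every observation appears exactly n_boot times across all samples taken together. The helper balanced_bootstrap_indices is hypothetical and is not part of the RGCCA API. The package's internal resampling may differ in its details.

```r
# Hypothetical helper (not an RGCCA function): draw balanced bootstrap indices.
# Assumption: "balanced" means each observation is used exactly n_boot times overall.
balanced_bootstrap_indices <- function(n, n_boot) {
  # Permute n_boot copies of the observation indices 1..n
  pool <- sample(rep(seq_len(n), times = n_boot))
  # Cut the permuted pool into n_boot samples of size n
  split(pool, rep(seq_len(n_boot), each = n))
}

set.seed(1)
idx <- balanced_bootstrap_indices(n = 5, n_boot = 4)

# Each of the 5 observations appears exactly 4 times across the 4 samples
table(unlist(idx))
```

Within a single sample an observation can still appear several times or not at all. Only the overall counts are balanced.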
Value
An rgcca_stability object that can be printed and plotted. It contains the following elements:
- top
A data.frame giving the indicator (VIP) on which the variables are ranked.
- n_boot
The number of bootstrap samples, returned for further use.
- keepVar
The indices of the most stable variables.
- bootstrap
A data.frame with the block weight vectors computed on each bootstrap sample.
- rgcca_res
An RGCCA object fitted on the most stable variables.
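The relationship between top, keep and keepVar can be illustrated with a toy sketch. The values and names below are made up. The selection rule (keep the highest-ranked share of variables in each block, rounded up) is an assumption for illustration, and the exact rule used by rgcca_stability may differ.

```r
# Toy stability indicator for two blocks (made-up values, hypothetical names)
top <- list(
  block1 = c(v1 = 0.9, v2 = 0.1, v3 = 0.5, v4 = 0.7),
  block2 = c(w1 = 0.2, w2 = 0.8)
)

# Proportion of variables to keep in each block
keep <- c(0.5, 0.5)

# Assumed rule: keep the ceiling(keep * p) highest-ranked variables per block
keep_var <- Map(function(score, prop) {
  n_keep <- ceiling(prop * length(score))
  order(score, decreasing = TRUE)[seq_len(n_keep)]
}, top, keep)

# Indices of the retained variables in each block
keep_var
```

In this toy case, variables 1 and 4 are retained in the first block and variable 2 in the second.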
Examples
if (FALSE) { # \dontrun{
  ###########################
  # stability and bootstrap #
  ###########################
  library(RGCCA)

  # Load the glioma data and turn the response block into a factor
  data("ge_cgh_locIGR", package = "gliomaData")
  blocks <- ge_cgh_locIGR$multiblocks
  Loc <- factor(ge_cgh_locIGR$y)
  levels(Loc) <- colnames(ge_cgh_locIGR$multiblocks$y)
  blocks[[3]] <- Loc

  # Fit SGCCA with the third block as the response
  fit_sgcca <- rgcca(blocks,
    sparsity = c(.071, .2, 1),
    ncomp = c(1, 1, 1),
    scheme = "centroid",
    verbose = TRUE, response = 3
  )

  # Bootstrap the SGCCA fit
  boot_out <- rgcca_bootstrap(fit_sgcca, n_boot = 100, n_cores = 1)

  # Select the most stable variables; keep is set to the proportion of
  # nonzero weights per block in the fitted model
  fit_stab <- rgcca_stability(fit_sgcca,
    keep = sapply(fit_sgcca$a, function(x) mean(x != 0)),
    n_cores = 1, n_boot = 10,
    verbose = TRUE
  )

  # Bootstrap the model refitted on the most stable variables
  boot_out <- rgcca_bootstrap(
    fit_stab, n_boot = 500, n_cores = 1, verbose = TRUE
  )

  plot(boot_out, block = 1:2, n_mark = 2000, display_order = FALSE)
} # }
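The keep argument in the example above is computed as the proportion of nonzero weights in each block. The following self-contained sketch shows what that expression returns on toy weight vectors. The list a stands in for fit_sgcca$a, and its values and block names are made up for illustration.

```r
# Toy weight vectors standing in for fit_sgcca$a (made-up values and names)
a <- list(
  block1 = matrix(c(0.6, 0, 0, -0.8), ncol = 1),
  block2 = matrix(c(0, 0.5, 0.5, 0.5, 0, 0, 0, 0.5, 0, 0), ncol = 1)
)

# Proportion of nonzero weights per block, as used for keep in the example
keep <- sapply(a, function(x) mean(x != 0))
keep
```

Here block1 has 2 nonzero weights out of 4 and block2 has 4 out of 10, so keep asks the stability procedure to retain the same share of variables that the sparse fit selected.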