Run variations of SNF.
batch_snf.Rd
This is the core function of the metasnf package. Using the information stored in a settings_matrix (see ?generate_settings_matrix) and a data_list (see ?generate_data_list), run repeated complete SNF pipelines to generate a broad space of post-SNF cluster solutions.
Usage
batch_snf(
  data_list,
  settings_matrix,
  processes = 1,
  return_similarity_matrices = FALSE,
  similarity_matrix_dir = NULL,
  clust_algs_list = NULL,
  suppress_clustering = FALSE,
  distance_metrics_list = NULL,
  weights_matrix = NULL,
  automatic_standard_normalize = FALSE,
  quiet = FALSE
)
Arguments
- data_list
A nested list of input data from generate_data_list().
- settings_matrix
A data.frame where each row completely defines an SNF pipeline transforming individual input dataframes into a final cluster solution. See ?generate_settings_matrix or https://branchlab.github.io/metasnf/articles/settings_matrix.html for more details.
- processes
Specify the number of processes used to complete SNF iterations.
  - 1 (default): Sequential processing. The function will iterate through the settings_matrix one row at a time with a for loop. This option will not make use of multiple CPU cores, but will show a progress bar.
  - 2 or higher: Parallel processing. The function will use future.apply::future_apply to distribute the SNF iterations across the specified number of CPU cores. If higher than the number of available cores, a warning will be printed and the maximum number of cores will be used.
  - max: All available cores will be used.
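The rules for processes can be sketched in base R as follows. This is only an illustration and not the package's own implementation: resolve_processes is a hypothetical helper, and it assumes that max is passed as the string "max".

```r
# Hypothetical helper (not part of metasnf): map a `processes` value to a
# worker count following the rules described for the `processes` argument.
resolve_processes <- function(processes, available = parallel::detectCores()) {
  # detectCores() can return NA on some platforms; fall back to one core.
  if (is.na(available) || available < 1) {
    available <- 1L
  }
  # Assumption: "max" is supplied as a character string.
  if (identical(processes, "max")) {
    return(as.integer(available))
  }
  if (!is.numeric(processes) || length(processes) != 1 ||
      is.na(processes) || processes < 1) {
    stop("`processes` must be a positive whole number or \"max\".")
  }
  # Requests above the core count are capped, with a warning.
  if (processes > available) {
    warning("Requested more processes than available cores; using all cores.")
    return(as.integer(available))
  }
  as.integer(processes)
}
```

A value of 1 would then select the sequential for loop with its progress bar, and anything larger would select the parallel path.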
- return_similarity_matrices
If TRUE, function will return a list where the first element is the solutions matrix and the second element is a list of similarity matrices for each row in the solutions_matrix. Default FALSE.
- similarity_matrix_dir
If specified, this directory will be used to save all generated similarity matrices.
- clust_algs_list
List of custom clustering algorithms to apply to the final fused network. See ?generate_clust_algs_list.
- suppress_clustering
If FALSE (default), will apply default or custom clustering algorithms to provide cluster solutions on every iteration of SNF. If TRUE, parameter similarity_matrix_dir must be specified.
- distance_metrics_list
An optional nested list specifying which distance metric functions should be used for each variable type (continuous, discrete, ordinal, categorical, and mixed). See ?generate_distance_metrics_list for details on how to build this.
- weights_matrix
A matrix containing variable weights to use during distance matrix calculation. See ?generate_weights_matrix for details on how to build this.
- automatic_standard_normalize
If TRUE, will automatically apply standard normalization prior to calculation of any distance matrices. This parameter cannot be used in conjunction with a custom distance metrics list. If you wish to supply custom distance metrics but also always have standard normalization, simply ensure that the numeric (continuous, discrete, and ordinal) distance metrics are only populated with distance metric functions that apply standard normalization. See https://branchlab.github.io/metasnf/articles/distance_metrics.html to learn more.
- quiet
If TRUE, the function won't print out time remaining estimates.
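A minimal sketch of a call that uses several of the arguments above. It assumes metasnf is installed and that data_list and settings_matrix were already built with generate_data_list() and generate_settings_matrix(). The wrapper run_snf_variations and the names of its returned elements are hypothetical and used only for illustration.

```r
# Hypothetical wrapper around batch_snf(); `data_list` and `settings_matrix`
# are assumed to come from generate_data_list() and
# generate_settings_matrix() respectively.
run_snf_variations <- function(data_list, settings_matrix) {
  if (!requireNamespace("metasnf", quietly = TRUE)) {
    stop("The metasnf package is required.")
  }
  results <- metasnf::batch_snf(
    data_list,
    settings_matrix,
    processes = 1,                      # sequential, one row at a time
    return_similarity_matrices = TRUE,  # also return similarity matrices
    quiet = TRUE                        # no time remaining estimates
  )
  # With return_similarity_matrices = TRUE the result is a list: the first
  # element is the solutions matrix and the second is the list of
  # similarity matrices, one per row of the solutions matrix.
  list(
    solutions_matrix = results[[1]],
    similarity_matrices = results[[2]]
  )
}
```

Leaving return_similarity_matrices at its default of FALSE would make the wrapper's positional indexing unnecessary, since the documentation describes the list return only for the TRUE case.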