These functions calculate conventional metrics of cluster solution quality.
Arguments
- sol_df
A solutions_df class object created by batch_snf() with the parameter
return_sim_mats = TRUE.
Value
A list of silhouette class objects, a vector of Dunn indices, or a vector
of Davies-Bouldin indices, depending on which function was used.
Details
calculate_silhouettes: A wrapper for cluster::silhouette that calculates
silhouette scores for all cluster solutions in a provided solutions data
frame. Silhouette values range from -1 to +1 and compare how close each
observation is to the other members of its own cluster with how far it is
from the nearest other cluster. Values near +1 indicate compact,
well-separated clusters, while values near -1 suggest that observations may
sit in the wrong cluster. You can learn more about interpreting the results
of this function by calling ?cluster::silhouette.
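To make the silhouette scale concrete, here is a minimal sketch that calls cluster::silhouette directly on toy data. The vectors x and labels are invented for illustration and are not metasnf objects; the functions documented here run the same kind of calculation on each cluster solution.

```r
library(cluster)

# Toy data (hypothetical): two well-separated groups of three points on a line
x <- c(0, 1, 2, 10, 11, 12)
labels <- c(1, 1, 1, 2, 2, 2)

# silhouette() takes integer cluster labels and a dissimilarity object
sil <- silhouette(labels, dist(x))

# Per-observation silhouette widths are stored in the "sil_width" column
widths <- sil[, "sil_width"]
mean_width <- mean(widths)
```

Because the two groups are far apart relative to their spread, every width in this toy example is close to +1.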
calculate_dunn_indices: A wrapper for clv::clv.Dunn that calculates Dunn
indices for all cluster solutions in a provided solutions data frame. Like
silhouette scores, Dunn indices reflect similarity within clusters and
separation across clusters; higher values indicate more compact,
better-separated clusters. You can learn more about interpreting the
results of this function by calling ?clv::clv.Dunn.
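As a rough illustration of what a Dunn index measures, the sketch below computes one common formulation in base R: the smallest distance between observations in different clusters divided by the largest distance between observations in the same cluster. The helper dunn_sketch is hypothetical and is not part of metasnf or clv; clv::clv.Dunn supports several definitions of between- and within-cluster distance, so its values may differ from this sketch.

```r
# Hypothetical helper for illustration only (not part of metasnf or clv).
# Assumes at least one cluster has two or more members.
dunn_sketch <- function(d, labels) {
  d <- as.matrix(d)
  same <- outer(labels, labels, "==")
  off_diag <- row(d) != col(d)
  # Largest distance between two observations in the same cluster
  max_diameter <- max(d[same & off_diag])
  # Smallest distance between two observations in different clusters
  min_separation <- min(d[!same])
  min_separation / max_diameter
}

# Toy data (hypothetical): two tight groups on a line
x <- c(0, 1, 2, 10, 11, 12)

# A labelling that matches the groups, and one that interleaves them
dunn_good <- dunn_sketch(dist(x), c(1, 1, 1, 2, 2, 2))
dunn_bad <- dunn_sketch(dist(x), c(1, 2, 1, 2, 1, 2))
```

Under this formulation, the labelling that matches the two groups yields a much larger index than the interleaved one.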
calculate_db_indices: A wrapper for clv::clv.Davies.Bouldin that
calculates Davies-Bouldin indices for all cluster solutions in a provided
solutions data frame. Like the metrics above, these values reflect
similarity within clusters and separation across clusters, but lower values
indicate better-separated clusters. You can learn more about interpreting
the results of this function by calling ?clv::clv.Davies.Bouldin.
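The following base R sketch shows the classical centroid-based Davies-Bouldin calculation on raw coordinates. For each cluster it takes the worst ratio of combined within-cluster scatter to between-centroid distance, then averages those ratios. The helper db_sketch is hypothetical and is not part of metasnf or clv; clv::clv.Davies.Bouldin works from its own scatter measures and options, so treat this only as an aid to interpretation.

```r
# Hypothetical helper for illustration only (not part of metasnf or clv).
db_sketch <- function(x, labels) {
  x <- as.matrix(x)
  ids <- sort(unique(labels))
  k <- length(ids)
  # Cluster centroids, one row per cluster
  centroids <- do.call(rbind, lapply(ids, function(id) {
    colMeans(x[labels == id, , drop = FALSE])
  }))
  # Mean Euclidean distance from each observation to its own centroid
  scatter <- sapply(seq_len(k), function(i) {
    members <- x[labels == ids[i], , drop = FALSE]
    mean(sqrt(rowSums(sweep(members, 2, centroids[i, ])^2)))
  })
  centroid_dist <- as.matrix(dist(centroids))
  # For each cluster, the worst (largest) scatter-to-separation ratio
  worst_ratio <- sapply(seq_len(k), function(i) {
    others <- setdiff(seq_len(k), i)
    max((scatter[i] + scatter[others]) / centroid_dist[i, others])
  })
  mean(worst_ratio)
}

# Toy data (hypothetical): two tight groups on a line
x <- matrix(c(0, 1, 2, 10, 11, 12), ncol = 1)

# A labelling that matches the groups, and one that interleaves them
db_good <- db_sketch(x, c(1, 1, 1, 2, 2, 2))
db_bad <- db_sketch(x, c(1, 2, 1, 2, 1, 2))
```

Unlike the silhouette and Dunn sketches, the better labelling here produces the smaller value.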
Examples
library(metasnf)

input_dl <- data_list(
    list(gender_df, "gender", "demographics", "categorical"),
    list(diagnosis_df, "diagnosis", "clinical", "categorical"),
    uid = "patient_id"
)
sc <- snf_config(input_dl, n_solutions = 5)
#> ℹ No distance functions specified. Using defaults.
#> ℹ No clustering functions specified. Using defaults.
sol_df <- batch_snf(input_dl, sc, return_sim_mats = TRUE)
# calculate Davies-Bouldin indices
davies_bouldin_indices <- calculate_db_indices(sol_df)
# calculate Dunn indices
dunn_indices <- calculate_dunn_indices(sol_df)
# calculate silhouette scores
silhouette_scores <- calculate_silhouettes(sol_df)