Getting Started

Getting Started

An introduction to the metasnf package and installation instructions.

Example Workflows

A Simple Example

A minimal example of generating cluster solutions.

A Complete Example

Step through a complex subtyping workflow.

Essential objects

The Settings Matrix

The object that controls all hyperparameters defining the space of cluster solutions to explore.

The Data List

The main object used to store data in the metasnf package.

Further customization

SNF Schemes

Control how individual input data frames are combined into a final fused network.

Distance Metrics

Vary distance metrics to expand or refine the space of generated cluster solutions.

Clustering Algorithms

Vary the clustering algorithm to expand or refine the space of generated cluster solutions.

Feature Weighting

Vary feature weights to expand or refine the space of generated cluster solutions.

Additional functionality

Stability Measures

Evaluate the robustness of cluster solutions through resampling methods.

Quality Measures

Calculate context-agnostic measures of clustering compactness and separation.

Confounders

Linearly regress out unwanted signal from features for clustering.

Parallel Processing

Leverage parallel processing to speed up the metasnf pipeline.

Label Propagation

Validate or extend cluster insights to new observations through semi-supervised label propagation.

Imputations

Incorporate the choice of imputation approach as another source of variability in the generated space of cluster solutions.

NMI Scores

Calculate how important various features were to the final SNF cluster solution.

Plotting

Correlation Plots

Visualize correlations between data prior to clustering.

Similarity Matrices

Visualize the affinity matrices produced by SNF and how they associate with other data attributes.

Manhattan Plots

Visualize a summary of cluster-feature and feature-feature associations.

Feature Plots

Visualize how features are distributed within a cluster solution.

Alluvial Plots

Visualize how cluster number influences the distribution of observations.

Troubleshooting

Troubleshooting

What to do when things aren’t working.