, groupby, n_pcs=None, use_rep=None, var_names=None, use_raw=None, cor_method='pearson', linkage_method='complete', key_added=None)

Computes a hierarchical clustering for the given groupby categories.

By default, the PCA representation is used unless .X has fewer than 50 variables.

Alternatively, a list of var_names (e.g. genes) can be given.

Average values of either var_names or components are used to compute a correlation matrix.

The hierarchical clustering can be visualized using or multiple other visualizations that can include a dendrogram: matrixplot, heatmap, dotplot and stacked_violin


The hierarchical clustering is computed on the predefined groups, not per cell. The correlation matrix is computed with the Pearson method by default, but other methods are available.
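The steps described above (average per group, correlation matrix, hierarchical clustering) can be sketched with NumPy and SciPy. This is a minimal illustration rather than the library's implementation: the helper name `group_dendrogram` and the toy data are hypothetical, Pearson correlation is computed with `numpy.corrcoef`, and passing the correlation matrix rows to `scipy.cluster.hierarchy.linkage` as observation vectors is an assumption of this sketch.

```python
import numpy as np
from scipy.cluster import hierarchy as sch


def group_dendrogram(values, labels, linkage_method="complete"):
    """Cluster predefined groups (not individual cells) by their mean profiles.

    Hypothetical helper for illustration only.
    """
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)
    categories = sorted(set(labels.tolist()))
    # One mean profile per group: shape (n_groups, n_features).
    means = np.vstack([values[labels == c].mean(axis=0) for c in categories])
    # Pearson correlation between the group profiles: shape (n_groups, n_groups).
    corr = np.corrcoef(means)
    # Assumption: each row of the correlation matrix is used as an observation vector.
    z = sch.linkage(corr, method=linkage_method)
    # no_plot=True returns the dendrogram layout without drawing anything.
    dendro = sch.dendrogram(z, labels=categories, no_plot=True)
    return categories, corr, z, dendro


# Toy data: six "cells", three features, three predefined groups.
values = [
    [1.0, 2.0, 3.0],
    [1.2, 2.1, 2.9],
    [3.0, 2.0, 1.0],
    [2.9, 2.2, 1.1],
    [1.0, 3.0, 2.0],
    [1.1, 2.9, 2.2],
]
labels = ["a", "a", "b", "b", "c", "c"]
categories, corr, z, dendro = group_dendrogram(values, labels)
```

The returned `dendro["ivl"]` holds the group labels in leaf order, which is the kind of information a plotting function needs to reorder categories.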

adata : AnnData

Annotated data matrix

n_pcs : int or None, optional (default: None)

Use this many PCs. If n_pcs == 0 and use_rep is None, .X is used.

use_rep : {None, ‘X’} or any key for .obsm, optional (default: None)

Use the indicated representation. If None, the representation is chosen automatically: for .n_vars < 50, .X is used, otherwise ‘X_pca’ is used. If ‘X_pca’ is not present, it’s computed with default parameters.
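The selection rules for n_pcs and use_rep can be summarized as a small decision function. The sketch below only mirrors the rules as documented here; `choose_representation`, its arguments, and the `threshold` parameter are hypothetical names, and it returns the name of the representation rather than the data itself.

```python
def choose_representation(n_vars, obsm_keys, use_rep=None, n_pcs=None, threshold=50):
    """Return the name of the representation to use (illustrative sketch only)."""
    # n_pcs == 0 with no explicit representation means: use .X directly.
    if use_rep is None and n_pcs == 0:
        return "X"
    # Automatic choice: .X for small matrices, otherwise the PCA representation.
    if use_rep is None:
        # 'X_pca' may not exist yet; per the description it would then be computed.
        return "X" if n_vars < threshold else "X_pca"
    # Explicit choice: either 'X' or any key present in .obsm.
    if use_rep == "X" or use_rep in obsm_keys:
        return use_rep
    raise ValueError(f"unknown representation: {use_rep!r}")


small = choose_representation(n_vars=30, obsm_keys=[])
large = choose_representation(n_vars=2000, obsm_keys=["X_pca"])
forced = choose_representation(n_vars=2000, obsm_keys=["X_pca"], n_pcs=0)
```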

var_names : Optional[Sequence[str]] (default: None)

List of var_names to use for computing the hierarchical clustering. If var_names is given, then use_rep and n_pcs are ignored.

use_raw : Optional[bool] (default: None)

Only used when var_names is not None. Use the raw attribute of adata if present.

cor_method : str (default: 'pearson')

Correlation method to use. Options are ‘pearson’, ‘kendall’, and ‘spearman’.
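To illustrate how the three options differ, the sketch below compares them on a monotonic but non-linear pair of profiles, using `scipy.stats` as a stand-in. The `correlate` helper and the toy vectors are hypothetical, and this is not necessarily how the function computes its correlation matrix internally.

```python
from scipy import stats


def correlate(x, y, cor_method="pearson"):
    """Correlation between two profiles with one of the three named methods."""
    if cor_method == "pearson":
        value, _ = stats.pearsonr(x, y)      # linear association
    elif cor_method == "spearman":
        value, _ = stats.spearmanr(x, y)     # rank-based association
    elif cor_method == "kendall":
        value, _ = stats.kendalltau(x, y)    # concordant vs. discordant pairs
    else:
        raise ValueError(f"unknown cor_method: {cor_method!r}")
    return float(value)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 4.0, 9.0, 16.0, 25.0]  # y = x**2: monotonic, not linear
results = {m: correlate(x, y, m) for m in ("pearson", "kendall", "spearman")}
```

On such data the rank-based methods report a perfect association while Pearson stays below 1, so the choice of cor_method can change which groups end up closest in the dendrogram.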

linkage_method : str (default: 'complete')

Linkage method to use. See scipy.cluster.hierarchy.linkage() for more information.

key_added : Optional[str] (default: None)

By default, the dendrogram information is added to .uns[f'dendrogram_{groupby}']. Notice that the groupby information is added to the dendrogram.
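The default key described above can be expressed in a few lines. This is a sketch of the naming rule only: `dendrogram_key` is a hypothetical helper, a plain dict stands in for .uns, and the stored payload is a placeholder rather than the real dendrogram structure.

```python
def dendrogram_key(groupby, key_added=None):
    """Return the .uns key under which the dendrogram information is stored."""
    # An explicit key_added wins; otherwise the key is derived from groupby.
    return key_added if key_added is not None else f"dendrogram_{groupby}"


uns = {}  # stand-in for adata.uns
# Placeholder payload; the groupby value is stored alongside the dendrogram.
uns[dendrogram_key("bulk_labels")] = {"groupby": "bulk_labels"}
uns[dendrogram_key("bulk_labels", key_added="my_dendrogram")] = {"groupby": "bulk_labels"}
```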

Return type



adata.uns[f'dendrogram_{groupby}'] (or the key given in key_added, if one was selected) is updated with the dendrogram information.


>>> import scanpy as sc
>>> adata = sc.datasets.pbmc68k_reduced()
>>>, groupby='bulk_labels')
>>> markers = ['C1QA', 'PSAP', 'CD79A', 'CD79B', 'CST3', 'LYZ']
>>>, markers, groupby='bulk_labels', dendrogram=True)