scanpy.api.pp.magic(adata, name_list=None, k=10, a=15, t='auto', n_pca=100, knn_dist='euclidean', random_state=None, n_jobs=None, verbose=False, copy=None, **kwargs)

Markov Affinity-based Graph Imputation of Cells (MAGIC) API [vanDijk18].

MAGIC is an algorithm for denoising and transcript recovery in single-cell sequencing data. MAGIC builds a graph from the data and uses diffusion to smooth out noise and recover the data manifold.
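The graph-and-diffusion idea can be illustrated with a small NumPy sketch. This is a toy illustration under simplifying assumptions, not the code in magic.MAGIC: the function name `toy_magic_smooth` is hypothetical, no PCA step is performed, `t` is fixed rather than chosen automatically, and the exact kernel construction in the real library may differ. The arguments `k`, `a` and `t` are named after the parameters documented below.

```python
import numpy as np

def toy_magic_smooth(X, k=3, a=15, t=3):
    """Toy diffusion smoothing; NOT the magic.MAGIC implementation."""
    # Pairwise Euclidean distances between cells (rows of X).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Per-cell bandwidth: distance to the k-th nearest neighbor
    # (index 0 of each sorted row is the cell itself, at distance 0).
    eps = np.sort(dist, axis=1)[:, k]
    # Alpha-decaying kernel: a larger `a` makes the tails fall off faster.
    K = np.exp(-(dist / eps[:, None]) ** a)
    # Symmetrize, then row-normalize into a Markov transition matrix.
    K = (K + K.T) / 2
    P = K / K.sum(axis=1, keepdims=True)
    # Diffuse for t steps: each cell becomes a weighted average of its
    # graph neighborhood.
    return np.linalg.matrix_power(P, t) @ X

rng = np.random.default_rng(0)
# Two well-separated groups of 20 cells each, with additive noise.
clean = np.repeat(np.array([[0.0, 0.0], [10.0, 10.0]]), 20, axis=0)
noisy = clean + rng.normal(scale=0.5, size=clean.shape)
smoothed = toy_magic_smooth(noisy, k=5, a=15, t=3)
```

In this sketch, raising `t` spreads each cell's value over a wider neighborhood, which removes more noise but also blurs more structure.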


adata : AnnData

An AnnData object whose .raw attribute holds the raw counts.

name_list : list, 'all_genes', or 'pca_only', optional (default: 'all_genes')

Denoised genes to return. Default is all genes, but this may require a large amount of memory if the input data is sparse.

k : int, optional, default: 10

Number of nearest neighbors on which to build the kernel.

a : int, optional, default: 15

Sets the decay rate of the kernel tails. If None, the alpha-decaying kernel is not used.

t : int or 'auto', optional, default: 'auto'

Power to which the diffusion operator is raised. This sets the level of diffusion. If 'auto', t is selected according to the Procrustes disparity of the diffused data.

n_pca : int, optional, default: 100

Number of principal components to use for calculating neighborhoods. For extremely large datasets, using n_pca < 20 allows neighborhoods to be calculated in roughly log(n_samples) time.

knn_dist : string, optional, default: 'euclidean'

Distance metric for building the kNN graph. Recommended values: 'euclidean', 'cosine', 'precomputed'. Any metric from scipy.spatial.distance can be used. If 'precomputed', data should be an n_samples x n_samples distance or affinity matrix.

random_state : int, numpy.random.RandomState or None, optional (default: None)

Random seed. Defaults to the global numpy random number generator.

n_jobs : int or None, optional. Default: None

Number of threads to use in training. All cores are used by default.

verbose : bool, int or None, optional (default: sc.settings.verbosity)

If True or an integer >= 2, print status messages. If None, sc.settings.verbosity is used.

copy : bool or None, optional. Default: None.

If True, a copy of adata is returned. If None, copy is True unless name_list is 'all_genes' or 'pca_only'. copy may only be False if name_list is 'all_genes' or 'pca_only', as the resultant data will otherwise have different column names from the input data.

kwargs : additional arguments to magic.MAGIC
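The default rule for `copy` described above can be written out as a small helper. This is a sketch of the documented behavior only: `resolve_copy` is a hypothetical name, not part of scanpy, the gene selector is read as the `name_list` argument, and two details are assumptions: that `name_list=None` means `'all_genes'` (following the documented default), and that an invalid combination raises `ValueError`.

```python
def resolve_copy(name_list=None, copy=None):
    """Hypothetical helper mirroring the documented `copy` rules."""
    # Assumption: None stands for the documented default, 'all_genes'.
    if name_list is None:
        name_list = 'all_genes'
    # In-place results are only possible when the output keeps the
    # input's columns ('all_genes') or is stored separately ('pca_only').
    in_place_ok = isinstance(name_list, str) and name_list in ('all_genes', 'pca_only')
    if copy is None:
        # Default: copy only when a subset of genes is requested.
        return not in_place_ok
    if not copy and not in_place_ok:
        # Assumption: the error type is illustrative, not taken from scanpy.
        raise ValueError("copy=False requires name_list 'all_genes' or 'pca_only'")
    return bool(copy)

print(resolve_copy('pca_only'))        # in-place is allowed
print(resolve_copy(['Mpo', 'Klf1']))   # a gene subset forces a copy
```

The rule exists because a gene subset yields a matrix with different columns from the input, so it cannot be written back into adata.X.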


If copy is True, an AnnData object is returned.

If name_list is 'pca_only', the PCA of the MAGIC values is stored in adata.obsm['X_magic'] and adata.X is not modified.

The raw counts are stored in the .raw attribute of the AnnData object.


>>> import scanpy.api as sc
>>> import magic
>>> adata = sc.datasets.paul15()
>>> sc.pp.normalize_per_cell(adata)
>>> sc.pp.sqrt(adata)  # or sc.pp.log1p(adata)
>>> adata_magic = sc.pp.magic(adata, name_list=['Mpo', 'Klf1', 'Ifitm1'], k=5)
>>> adata_magic.shape
(2730, 3)
>>> sc.pp.magic(adata, name_list='pca_only', k=5)
>>> adata.obsm['X_magic'].shape
(2730, 100)
>>> sc.pp.magic(adata, name_list='all_genes', k=5)
>>> adata.X.shape
(2730, 3451)