scanpy.external.pp.magic
- scanpy.external.pp.magic(adata, name_list=None, *, knn=5, decay=1, knn_max=None, t=3, n_pca=100, solver='exact', knn_dist='euclidean', random_state=None, n_jobs=None, verbose=False, copy=None, **kwargs)
- Markov Affinity-based Graph Imputation of Cells (MAGIC) API [van Dijk et al., 2018].
- MAGIC is an algorithm for denoising and transcript recovery of single cells applied to single-cell sequencing data. MAGIC builds a graph from the data and uses diffusion to smooth out noise and recover the data manifold.
- The algorithm implemented here differs from the one described in van Dijk et al. [2018] in two main ways. First, it uses the adaptive kernel described in Moon et al. [2019] for improved stability. Second, data diffusion is applied in the PCA space, rather than the data space, for speed and memory improvements.
- More information and bug reports here. For help, visit <https://krishnaswamylab.org/get-help>.
- Parameters:
- adata AnnData
- An anndata file with a .raw attribute representing raw counts.
- name_list Union[Literal['all_genes','pca_only'],Sequence[str],None] (default:None)
- Denoised genes to return. The default 'all_genes'/None may require a large amount of memory if the input data is sparse. Another possibility is 'pca_only'.
- knn int(default:5)
- number of nearest neighbors on which to build kernel. 
- decay float|None(default:1)
- sets decay rate of kernel tails. If None, alpha decaying kernel is not used. 
- knn_max int|None(default:None)
- maximum number of nearest neighbors with nonzero connection. If None, will be set to 3 * knn.
- t Union[Literal['auto'],int] (default:3)
- power to which the diffusion operator is raised. This sets the level of diffusion. If ‘auto’, t is selected according to the Procrustes disparity of the diffused data.
- n_pca int|None(default:100)
- Number of principal components to use for calculating neighborhoods. For extremely large datasets, using n_pca < 20 allows neighborhoods to be calculated in roughly log(n_samples) time. If None, no PCA is performed.
- solver Literal['exact','approximate'] (default:'exact')
- Which solver to use. “exact” uses the implementation described in van Dijk et al. [2018]. “approximate” uses a faster implementation that performs imputation in the PCA space and then projects back to the gene space. Note, the “approximate” solver may return negative values. 
- knn_dist str(default:'euclidean')
- Distance metric for building the kNN graph. Recommended values: ‘euclidean’, ‘cosine’, ‘precomputed’. Any metric from scipy.spatial.distance can be used. If ‘precomputed’, data should be an n_samples x n_samples distance or affinity matrix.
- random_state int|RandomState|None(default:None)
- Random seed. Defaults to the global numpy random number generator.
- n_jobs int|None(default:None)
- Number of threads to use in training. All cores are used by default. 
- verbose bool(default:False)
- If True or an integer >= 2, print status messages. If None, sc.settings.verbosity is used.
- copy bool|None(default:None)
- If True, a copy of anndata is returned. If None, copy is True if name_list is not 'all_genes' or 'pca_only'. copy may only be False if name_list is 'all_genes' or 'pca_only', as the resultant data will otherwise have different column names from the input data.
- kwargs
- Additional arguments to magic.MAGIC.
 
- Return type:
- Returns:
- If copy is True, AnnData object is returned.
- If name_list is 'pca_only', PCA on MAGIC values of cells is stored in adata.obsm['X_magic'] and adata.X is not modified.
- The raw counts are stored in the .raw attribute of the AnnData object.
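The interaction between name_list and copy described above can be easier to follow as code. The sketch below is a hypothetical helper, resolve_copy, written only to mirror the documented rule; it is not part of scanpy, and the choice of ValueError for the disallowed combination is an assumption.

```python
from __future__ import annotations

from collections.abc import Sequence


def resolve_copy(name_list: str | Sequence[str] | None, copy: bool | None) -> bool:
    """Hypothetical helper: return the effective value of copy per the documented rule."""
    # None is documented as equivalent to the default 'all_genes'.
    if name_list is None:
        name_list = "all_genes"
    # Only these two modes keep the input's column names, so only they can run in place.
    in_place_ok = isinstance(name_list, str) and name_list in ("all_genes", "pca_only")
    if copy is None:
        # Documented default: copy only when a gene subset is requested.
        return not in_place_ok
    if not copy and not in_place_ok:
        # Assumption: the error type is illustrative, not taken from scanpy.
        raise ValueError("copy=False requires name_list to be 'all_genes' or 'pca_only'")
    return bool(copy)


print(resolve_copy(["Mpo", "Klf1", "Ifitm1"], None))  # True
print(resolve_copy("pca_only", None))  # False
print(resolve_copy(None, None))  # False
```

Read this way, a gene list always yields a new AnnData object, while 'all_genes' and 'pca_only' modify the input unless copy=True is passed explicitly.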
- Examples
>>> import scanpy as sc
>>> import scanpy.external as sce
>>> adata = sc.datasets.paul15()
>>> sc.pp.normalize_per_cell(adata)
>>> sc.pp.sqrt(adata)  # or sc.pp.log1p(adata)
>>> adata_magic = sce.pp.magic(adata, name_list=["Mpo", "Klf1", "Ifitm1"], knn=5)
>>> adata_magic.shape
(2730, 3)
>>> sce.pp.magic(adata, name_list="pca_only", knn=5)
>>> adata.obsm["X_magic"].shape
(2730, 100)
>>> sce.pp.magic(adata, name_list="all_genes", knn=5)
>>> adata.X.shape
(2730, 3451)
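To give some intuition for how knn, decay and t act together, here is a minimal NumPy sketch of graph diffusion on a dense matrix. It is a toy illustration only, not the code in the magic package: the function name diffuse is hypothetical, the kernel form exp(-(d / sigma)^decay) with sigma set to the distance to the knn-th neighbor is an assumed simplification of the adaptive kernel, and the PCA step, knn_max truncation and sparse matrices are all left out.

```python
import numpy as np


def diffuse(data, knn=5, decay=1.0, t=3):
    """Toy graph diffusion: adaptive kernel -> Markov matrix -> t-step smoothing."""
    # Pairwise Euclidean distances between cells (rows of data).
    diff = data[:, None, :] - data[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Adaptive bandwidth: distance from each cell to its knn-th nearest neighbor.
    # Column 0 of the sorted distances is the cell itself (distance 0).
    bandwidth = np.sort(dist, axis=1)[:, knn]
    bandwidth = np.maximum(bandwidth, 1e-12)  # guard against duplicate cells
    # Decaying kernel; a larger decay makes affinities fall off faster in the tails.
    kernel = np.exp(-((dist / bandwidth[:, None]) ** decay))
    # Symmetrize, then row-normalize into a Markov transition matrix.
    kernel = (kernel + kernel.T) / 2
    markov = kernel / kernel.sum(axis=1, keepdims=True)
    # Raising the operator to the power t sets the level of diffusion.
    return np.linalg.matrix_power(markov, t) @ data


rng = np.random.default_rng(0)
data = rng.normal(size=(50, 4))  # 50 synthetic "cells", 4 synthetic "genes"
smoothed = diffuse(data, knn=5, decay=1.0, t=3)
print(smoothed.shape)  # (50, 4)
```

Because each row of the transition matrix sums to one, every smoothed value is a weighted average of the input values, which is why larger t pulls cells further toward their neighborhood.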