Scanpy – Single-Cell Analysis in Python
Scanpy is a scalable toolkit for analyzing single-cell gene expression data built jointly with anndata. It includes preprocessing, visualization, clustering, trajectory inference and differential expression testing. The Python-based implementation efficiently deals with datasets of more than one million cells.
Get started by browsing tutorials, usage principles or the main API.
Follow changes in the release notes.
Find tools that harmonize well with anndata & Scanpy via the external API and the ecosystem page.
Consider citing Genome Biology (2018) along with original references.
News
scVelo on the cover of Nature Biotechnology 2020-12-01
Scanpy’s counterpart for RNA velocity, scVelo, made it on the cover of Nature Biotechnology [tweet].
Scanpy selected among 20 papers for 20 years of Genome Biology 2020-08-01
Genome Biology: Celebrating 20 Years of Genome Biology selected the initial Scanpy paper for the year 2018 among 20 papers for 20 years [tweet].
COVID-19 datasets distributed as h5ad 2020-04-01
In a joint initiative, the Wellcome Sanger Institute, the Human Cell Atlas, and the CZI distribute datasets related to COVID-19 via anndata’s h5ad files: covid19cellatlas.org. It wasn’t anticipated that the initial idea of sharing and backing an on-disk representation of AnnData would become so widely adopted. Curious? Read up more on the format.
Latest additions
Version 1.8
1.8.3 the future
Docs
Bug fixes
Fixed finding variables with use_raw=True and basis=None in scanpy.pl.scatter() PR 2027 E Rice
Fixed scanpy.external.pp.scrublet() to address issue 1957 FlMai and ensure raw counts are used for simulation
Functions in scanpy.datasets no longer throw OldFormatWarnings when using anndata 0.8 PR 2096 I Virshup
Performance
Ecosystem
1.8.2 2021-11-03
Docs
Update conda installation instructions PR 1974 L Heumos
Bug fixes
Fix plotting after scanpy.tl.filter_rank_genes_groups() PR 1942 S Rybakov
Fix use_raw=None using anndata.AnnData.var_names if anndata.AnnData.raw is present in scanpy.tl.score_genes() PR 1999 M Klein
Fix compatibility with UMAP 0.5.2 PR 2028 L Mcinnes
Fixed non-determinism in scanpy.pl.paga() node positions PR 1922 I Virshup
Ecosystem
Added PASTE (a tool to align and integrate spatial transcriptomics data) to the Scanpy ecosystem.
1.8.1 2021-07-07
Bug fixes
Fixed reproducibility of scanpy.tl.score_genes(). Calculation and output is now float64 type. PR 1890 I Kucinski
Workarounds for some changes/bugs in pandas 1.3 PR 1918 I Virshup
Fixed bug where sc.pl.paga_compare could mislabel nodes on the paga graph PR 1898 I Virshup
Fixed handling of use_raw with scanpy.tl.rank_genes_groups() PR 1934 I Virshup
1.8.0 2021-06-28
Metrics module
Added scanpy.metrics module!
Added scanpy.metrics.gearys_c() for spatial autocorrelation PR 915 I Virshup
Added scanpy.metrics.morans_i() for global spatial autocorrelation PR 1740 I Virshup, G Palla
Added scanpy.metrics.confusion_matrix() for comparing labellings PR 915 I Virshup
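For orientation, the two autocorrelation statistics named in this list can be sketched in plain Python from their textbook definitions. This is an illustrative sketch only, not Scanpy's implementation: the helper names morans_i_sketch and gearys_c_sketch, the dense list-of-lists weight matrix, and the toy path graph are all assumptions made for this example.

```python
# Illustrative sketch only; these helpers are hypothetical and are NOT
# part of scanpy.metrics. They follow the textbook definitions of
# Moran's I and Geary's C for a dense, symmetric weight matrix.


def _deviations(values):
    """Return each value minus the mean, plus the sum of squared deviations."""
    mean = sum(values) / len(values)
    dev = [v - mean for v in values]
    return dev, sum(d * d for d in dev)


def morans_i_sketch(weights, values):
    """Moran's I: (N / W) * sum_ij w_ij (x_i - m)(x_j - m) / sum_i (x_i - m)^2."""
    n = len(values)
    dev, sum_sq = _deviations(values)
    if sum_sq == 0:
        raise ValueError("values are constant; the statistic is undefined")
    total_w = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return (n / total_w) * num / sum_sq


def gearys_c_sketch(weights, values):
    """Geary's C: ((N - 1) / (2 W)) * sum_ij w_ij (x_i - x_j)^2 / sum_i (x_i - m)^2."""
    n = len(values)
    _, sum_sq = _deviations(values)
    if sum_sq == 0:
        raise ValueError("values are constant; the statistic is undefined")
    total_w = sum(sum(row) for row in weights)
    num = sum(
        weights[i][j] * (values[i] - values[j]) ** 2
        for i in range(n)
        for j in range(n)
    )
    return ((n - 1) / (2 * total_w)) * num / sum_sq


# Toy neighbor graph: four nodes on a path 0 - 1 - 2 - 3 (symmetric weights).
path_graph = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

smooth = [0, 0, 1, 1]       # neighbors mostly share a value
alternating = [0, 1, 0, 1]  # neighbors always differ

print(morans_i_sketch(path_graph, smooth))       # positive (about 0.33)
print(gearys_c_sketch(path_graph, smooth))       # below 1 (0.5)
print(morans_i_sketch(path_graph, alternating))  # negative (-1.0)
print(gearys_c_sketch(path_graph, alternating))  # above 1 (1.5)
```

On this toy graph the smooth signal gives a positive Moran's I and a Geary's C below 1, and the alternating signal gives the opposite. This is the usual reading of the two statistics. In practice the weights would come from a cell neighborhood graph, not a hand-written matrix.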
Features
Added layer and copy kwargs to normalize_total() PR 1667 I Virshup
Added vcenter and norm arguments to the plotting functions PR 1551 G Eraslan
Standardized and expanded available arguments to the sc.pl.rank_genes_groups* family of functions. PR 1529 F Ramirez I Virshup - See examples sections of rank_genes_groups_dotplot() and rank_genes_groups_matrixplot() for demonstrations.
scanpy.tl.tsne() now supports the metric argument and records the passed parameters PR 1854 I Virshup
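As background for the normalize_total() entry in this list, the idea of total-count normalization can be sketched in plain Python. This is a hedged illustration, not Scanpy's code: the function name normalize_total_sketch, the nested-list count matrix, and the choice to take the median over cells with nonzero totals when no target is given are assumptions of this example.

```python
# Illustrative sketch only; normalize_total_sketch is a hypothetical helper,
# not scanpy.pp.normalize_total. Rows are cells, columns are genes.
from statistics import median


def normalize_total_sketch(counts, target_sum=None):
    """Scale each row so its total equals target_sum.

    If target_sum is None, the median of the nonzero row totals is used
    (an assumption of this sketch). Rows with a zero total are left as
    zeros. A new nested list is always returned; the input is not modified.
    """
    totals = [sum(row) for row in counts]
    positive = [t for t in totals if t > 0]
    if not positive:
        # Nothing to scale: return an unmodified copy.
        return [list(row) for row in counts]
    if target_sum is None:
        target_sum = median(positive)
    return [
        [v * target_sum / t for v in row] if t > 0 else [0.0 for _ in row]
        for row, t in zip(counts, totals)
    ]


raw_counts = [
    [1, 1, 2],  # total 4
    [2, 2, 4],  # total 8
    [0, 3, 0],  # total 3
]

normalized = normalize_total_sketch(raw_counts)
for row in normalized:
    print(row, sum(row))  # every row now sums to the median total, 4
```

After scaling, cells with different sequencing depth become comparable because every row has the same total. The layer and copy kwargs from the release note are not modeled here.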
Ecosystem
Added triku, a feature selection method, to the ecosystem page PR 1722 AM Ascensión
Added dorothea and progeny to the ecosystem page PR 1767 P Badia-i-Mompel
Documentation
Added rendered examples to many plotting functions issue 1664 A Schaar L Zappia bio-la L Hetzel L Dony M Buttner K Hrovatin F Ramirez I Virshup LouisK92 mayarali
Integrated DocSearch, a find-as-you-type documentation index search. PR 1754 P Angerer
Reorganized reference docs PR 1753 I Virshup
Clarified docs issues for neighbors(), diffmap(), calculate_qc_metrics() PR 1680 G Palla
Fixed typos in grouped plot doc-strings PR 1877 C Rands
Extended examples for differential expression plotting. PR 1529 F Ramirez - See rank_genes_groups_dotplot() or rank_genes_groups_matrixplot() for examples.
Bug fixes
Fix scanpy.pl.paga_path() TypeError with recent versions of anndata PR 1047 P Angerer
Fix detection of whether IPython is running PR 1844 I Virshup
Fixed reproducibility of scanpy.tl.diffmap() (added random_state) PR 1858 I Kucinski
Fixed errors and warnings from embedding plots with small numbers of categories after sns.set_palette was called PR 1886 I Virshup
Fixed handling of gene_symbols argument in a number of sc.pl.rank_genes_groups* functions PR 1529 F Ramirez I Virshup
Fixed handling of use_raw for sc.tl.rank_genes_groups when no .raw is present PR 1895 I Virshup
scanpy.pl.rank_genes_groups_violin() now works for raw=False PR 1669 M van den Beek
Development processes
Switched to flit for building and deploying the package, a simple tool with an easy to understand command line interface and metadata PR 1527 P Angerer
Use pre-commit for style checks PR 1684 PR 1848 L Heumos I Virshup
Deprecations
Dropped support for Python 3.6. More details here. PR 1897 I Virshup
Deprecated layers and layers_norm kwargs to normalize_total() PR 1667 I Virshup
Deprecated MulticoreTSNE backend for scanpy.tl.tsne() PR 1854 I Virshup