Scanpy – Single-Cell Analysis in Python
Scanpy is a scalable toolkit for analyzing single-cell gene expression data built jointly with anndata. It includes preprocessing, visualization, clustering, trajectory inference and differential expression testing. The Python-based implementation efficiently deals with datasets of more than one million cells.
Get started by browsing tutorials, usage principles or the main API.
Follow changes in the release notes.
Find tools that harmonize well with anndata & Scanpy via the external API and the ecosystem page.
Consider citing Genome Biology (2018) along with original references.
News
scVelo on the cover of Nature Biotechnology 2020-12-01
Scanpy’s counterpart for RNA velocity, scVelo, made it on the cover of Nature Biotechnology [tweet].
Scanpy selected among 20 papers for 20 years of Genome Biology 2020-08-01
Genome Biology: Celebrating 20 Years of Genome Biology selected the initial Scanpy paper for the year 2018 among 20 papers for 20 years [tweet].
COVID-19 datasets distributed as h5ad
2020-04-01
In a joint initiative, the Wellcome Sanger Institute, the Human Cell Atlas, and the CZI distribute datasets related to COVID-19 via anndata’s h5ad files: covid19cellatlas.org. It wasn’t anticipated that the initial idea of sharing and backing an on-disk representation of AnnData would become so widely adopted. Curious? Read up more on the format.
Latest additions
Version 1.8
1.8.3 the future
Docs
Bug fixes
Fixed finding variables with use_raw=True and basis=None in scanpy.pl.scatter() PR 2027 E Rice
Fixed scanpy.external.pp.scrublet() to address issue 1957 FlMai and ensure raw counts are used for simulation
Functions in scanpy.datasets no longer throw OldFormatWarnings when using anndata 0.8 PR 2096 I Virshup
Performance
Ecosystem
1.8.2 2021-11-03
Docs
Update conda installation instructions PR 1974 L Heumos
Bug fixes
Fix plotting after scanpy.tl.filter_rank_genes_groups() PR 1942 S Rybakov
Fix use_raw=None using anndata.AnnData.var_names if anndata.AnnData.raw is present in scanpy.tl.score_genes() PR 1999 M Klein
Fix compatibility with UMAP 0.5.2 PR 2028 L Mcinnes
Fixed non-determinism in scanpy.pl.paga() node positions PR 1922 I Virshup
Ecosystem
Added PASTE (a tool to align and integrate spatial transcriptomics data) to scanpy ecosystem.
1.8.1 2021-07-07
Bug fixes
Fixed reproducibility of scanpy.tl.score_genes(). Calculation and output is now float64 type. PR 1890 I Kucinski
Workarounds for some changes/bugs in pandas 1.3 PR 1918 I Virshup
Fixed bug where sc.pl.paga_compare could mislabel nodes on the paga graph PR 1898 I Virshup
Fixed handling of use_raw with scanpy.tl.rank_genes_groups() PR 1934 I Virshup
1.8.0 2021-06-28
Metrics module
Added scanpy.metrics module!
Added scanpy.metrics.gearys_c() for spatial autocorrelation PR 915 I Virshup
Added scanpy.metrics.morans_i() for global spatial autocorrelation PR 1740 I Virshup, G Palla
Added scanpy.metrics.confusion_matrix() for comparing labellings PR 915 I Virshup
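Moran's I and Geary's C, the two autocorrelation statistics named above, have standard textbook definitions. The following is a minimal pure-Python sketch of those definitions on a small neighbor graph. It is not Scanpy's implementation and does not call the scanpy.metrics functions. The names morans_i_sketch and gearys_c_sketch and the dense list-of-lists weight matrix are assumptions made for illustration only.

```python
# Illustrative sketch only: hypothetical helper names, dense weights,
# standard library only. Not the scanpy.metrics implementation.


def _deviations(values):
    """Return each value minus the mean of all values."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def morans_i_sketch(weights, values):
    """Moran's I: (N / W) * sum_ij w_ij * d_i * d_j / sum_i d_i^2."""
    n = len(values)
    dev = _deviations(values)
    total_weight = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    return (n / total_weight) * cross / sum(d * d for d in dev)


def gearys_c_sketch(weights, values):
    """Geary's C: ((N - 1) / (2 W)) * sum_ij w_ij * (x_i - x_j)^2 / sum_i d_i^2."""
    n = len(values)
    dev = _deviations(values)
    total_weight = sum(sum(row) for row in weights)
    sq_diff = sum(
        weights[i][j] * (values[i] - values[j]) ** 2
        for i in range(n)
        for j in range(n)
    )
    return ((n - 1) / (2 * total_weight)) * sq_diff / sum(d * d for d in dev)


# Four observations on a path graph 0-1-2-3 (symmetric 0/1 weights).
path_graph = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
smooth = [1.0, 2.0, 3.0, 4.0]          # neighbors have similar values
alternating = [1.0, -1.0, 1.0, -1.0]   # neighbors have opposite values

print(morans_i_sketch(path_graph, smooth), gearys_c_sketch(path_graph, smooth))
print(morans_i_sketch(path_graph, alternating), gearys_c_sketch(path_graph, alternating))
```

With these definitions, a signal that varies smoothly over the graph gives a positive Moran's I and a Geary's C below 1, while a signal that flips sign between neighbors gives a negative Moran's I and a Geary's C above 1.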
Features
Added layer and copy kwargs to normalize_total() PR 1667 I Virshup
Added vcenter and norm arguments to the plotting functions PR 1551 G Eraslan
Standardized and expanded available arguments to the sc.pl.rank_genes_groups* family of functions. PR 1529 F Ramirez I Virshup - See examples sections of rank_genes_groups_dotplot() and rank_genes_groups_matrixplot() for demonstrations.
scanpy.tl.tsne() now supports the metric argument and records the passed parameters PR 1854 I Virshup
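To illustrate what the layer and copy kwargs of normalize_total() mean conceptually, here is a small pure-Python sketch of total-count normalization. It is not Scanpy's code. The function name normalize_total_sketch, the plain dict standing in for an AnnData object, and the layer name "raw_counts" are hypothetical. Using the median of per-cell totals as the default target and skipping all-zero cells are simplifying assumptions of this sketch.

```python
# Illustrative sketch only: a dict of list-of-lists matrices stands in for
# an AnnData object; this is not how Scanpy stores or normalizes data.
from statistics import median


def normalize_total_sketch(layers, layer="X", target_sum=None, copy=False):
    """Scale each row (cell) of layers[layer] so its counts sum to target_sum.

    With copy=True a normalized deep copy is returned and the input is left
    untouched; otherwise the chosen layer is modified in place and None is
    returned.
    """
    if copy:
        layers = {name: [row[:] for row in rows] for name, rows in layers.items()}
    matrix = layers[layer]
    totals = [sum(row) for row in matrix]
    if target_sum is None:
        # Assumed default for this sketch: median of the per-cell totals.
        target_sum = median(totals)
    for row, total in zip(matrix, totals):
        if total == 0:
            continue  # sketch choice: leave empty cells unchanged
        scale = target_sum / total
        for j in range(len(row)):
            row[j] *= scale
    return layers if copy else None


data = {
    "X": [[1, 1], [2, 6]],
    "raw_counts": [[2, 2], [1, 3]],  # hypothetical extra layer
}

# Normalize only the named layer, on a copy; `data` itself is unchanged.
normalized = normalize_total_sketch(data, layer="raw_counts", target_sum=10, copy=True)
print(normalized["raw_counts"])
```

Only the selected layer is rescaled, so other layers keep their original counts. Returning None for the in-place case keeps the two modes easy to tell apart in calling code.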
Ecosystem
Added triku, a feature selection method, to the ecosystem page PR 1722 AM Ascensión
Added dorothea and progeny to the ecosystem page PR 1767 P Badia-i-Mompel
Documentation
Added rendered examples to many plotting functions issue 1664 A Schaar L Zappia bio-la L Hetzel L Dony M Buttner K Hrovatin F Ramirez I Virshup LouisK92 mayarali
Integrated DocSearch, a find-as-you-type documentation index search. PR 1754 P Angerer
Reorganized reference docs PR 1753 I Virshup
Clarified docs issues for neighbors(), diffmap(), calculate_qc_metrics() PR 1680 G Palla
Fixed typos in grouped plot doc-strings PR 1877 C Rands
Extended examples for differential expression plotting. PR 1529 F Ramirez - See rank_genes_groups_dotplot() or rank_genes_groups_matrixplot() for examples.
Bug fixes
Fix scanpy.pl.paga_path() TypeError with recent versions of anndata PR 1047 P Angerer
Fix detection of whether IPython is running PR 1844 I Virshup
Fixed reproducibility of scanpy.tl.diffmap() (added random_state) PR 1858 I Kucinski
Fixed errors and warnings from embedding plots with small numbers of categories after sns.set_palette was called PR 1886 I Virshup
Fixed handling of gene_symbols argument in a number of sc.pl.rank_genes_groups* functions PR 1529 F Ramirez I Virshup
Fixed handling of use_raw for sc.tl.rank_genes_groups when no .raw is present PR 1895 I Virshup
scanpy.pl.rank_genes_groups_violin() now works for raw=False PR 1669 M van den Beek
Development processes
Switched to flit for building and deploying the package, a simple tool with an easy to understand command line interface and metadata PR 1527 P Angerer
Use pre-commit for style checks PR 1684 PR 1848 L Heumos I Virshup
Deprecations
Dropped support for Python 3.6. More details here. PR 1897 I Virshup
Deprecated layers and layers_norm kwargs to normalize_total() PR 1667 I Virshup
Deprecated MulticoreTSNE backend for scanpy.tl.tsne() PR 1854 I Virshup