
Scanpy – Single-Cell Analysis in Python

Scanpy is a scalable toolkit for analyzing single-cell gene expression data, built jointly with anndata. It includes preprocessing, visualization, clustering, trajectory inference, and differential expression testing. The Python-based implementation efficiently handles datasets of more than one million cells.
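The steps named above can be strung together roughly as follows. This is a minimal sketch, not the project's canonical pipeline: the function name run_basic_workflow and all threshold values (min_genes, min_cells, target_sum, n_top_genes) are illustrative assumptions, and the input is assumed to be an AnnData object holding raw counts. Trajectory inference is left out for brevity.

```python
def run_basic_workflow(adata, n_top_genes=2000, resolution=1.0):
    """Preprocess, embed, cluster, and rank marker genes on an AnnData object.

    Hypothetical helper for illustration; all thresholds are assumptions.
    """
    # Imported lazily so that this sketch can be loaded without scanpy installed.
    import scanpy as sc

    # Preprocessing: basic quality filtering, normalization, log transform.
    sc.pp.filter_cells(adata, min_genes=200)
    sc.pp.filter_genes(adata, min_cells=3)
    sc.pp.normalize_total(adata, target_sum=1e4)
    sc.pp.log1p(adata)
    sc.pp.highly_variable_genes(adata, n_top_genes=n_top_genes)

    # Dimensionality reduction and neighborhood graph.
    sc.tl.pca(adata)
    sc.pp.neighbors(adata)

    # Visualization embedding and clustering
    # (leiden needs the optional leidenalg dependency).
    sc.tl.umap(adata)
    sc.tl.leiden(adata, resolution=resolution)

    # Differential expression testing between clusters.
    sc.tl.rank_genes_groups(adata, groupby="leiden", method="wilcoxon")
    return adata
```

With scanpy installed, the result could then be inspected with, for example, sc.pl.umap(adata, color="leiden").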


scVelo on the cover of Nature Biotechnology 2020-12-01

Scanpy’s counterpart for RNA velocity, scVelo, made it onto the cover of Nature Biotechnology [tweet].

Scanpy selected among 20 papers for 20 years of Genome Biology 2020-08-01

In “Celebrating 20 Years of Genome Biology”, Genome Biology selected the initial Scanpy paper for the year 2018 as one of its 20 papers for 20 years [tweet].

COVID-19 datasets distributed as h5ad 2020-04-01

In a joint initiative, the Wellcome Sanger Institute, the Human Cell Atlas, and the CZI distribute datasets related to COVID-19 via anndata’s h5ad files. It wasn’t anticipated that the initial idea of sharing and backing an on-disk representation of AnnData would become so widely adopted. Curious? Read more about the format.

Latest additions

Version 1.8

1.8.0 the future


External tools

Performance enhancements

Bug fixes


Version 1.7

1.7.2 the future

Performance enhancements

Bug fixes



1.7.1 2021-02-24


  • More Twitter handles for core devs PR 1676 G Eraslan

Bug fixes

1.7.0 2021-02-03


  • Add new 10x Visium datasets to visium_sge() PR 1473 G Palla

  • Enable download of source image for 10x Visium datasets in visium_sge() PR 1506 H Spitzer

  • Refactor of spatial(). Better support for plotting without an image, as well as directly providing images PR 1512 G Palla

  • Dict input for scanpy.queries.enrich() PR 1488 G Eraslan

  • rank_genes_groups_df() can now return fraction of cells in a group expressing a gene, and allows retrieving values for multiple groups at once PR 1388 G Eraslan

  • Color annotations for gene sets in heatmap() are now matched to color for cluster PR 1511 L Sikkema

  • PCA plots can now annotate axes with variance explained PR 1470 bfurtwa

  • Plots with groupby arguments can now group by values in the index by passing the index’s name (like pd.DataFrame.groupby). PR 1583 F Ramirez

  • Added na_color and na_in_legend keyword arguments to embedding() plots. Allows specifying color for missing or filtered values in plots like umap() or spatial() PR 1356 I Virshup

  • embedding() plots now support passing dict of {cluster_name: cluster_color, ...} for palette argument PR 1392 I Virshup
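As an illustration of the last two items, the sketch below builds a {cluster_name: cluster_color} dict and passes it as palette, together with na_color and na_in_legend, to umap(). The helper names make_palette and plot_umap_with_palette, the "leiden" key, and the color values are assumptions made for this example; only the palette, na_color, and na_in_legend arguments come from the notes above.

```python
from itertools import cycle


def make_palette(cluster_names, colors):
    """Map each cluster name to a color, reusing colors if there are too few.

    Hypothetical helper for illustration; not part of scanpy.
    """
    return dict(zip(cluster_names, cycle(colors)))


def plot_umap_with_palette(adata, key="leiden"):
    """Plot a UMAP colored by `key` with an explicit color per cluster."""
    # Imported lazily so that make_palette can be used without scanpy installed.
    import scanpy as sc

    # Cluster names are taken from the categories of the annotation column.
    clusters = list(adata.obs[key].astype("category").cat.categories)
    palette = make_palette(clusters, ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728"])

    sc.pl.umap(
        adata,
        color=key,
        palette=palette,       # dict of {cluster_name: cluster_color}
        na_color="lightgray",  # color for missing or filtered values
        na_in_legend=True,     # list missing values in the legend
    )
```

A dict palette keeps cluster colors stable across plots, because each color is tied to a cluster name rather than to its position in the category order.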

External tools (new)

External tools (changes)




  • Consistent fold-change, fractions calculation for filter_rank_genes_groups PR 1391 S Rybakov

  • Fixed bug where score_genes would error if one gene was passed PR 1398 I Virshup

  • Fixed log1p inplace on integer dense arrays PR 1400 I Virshup

  • Fix docstring formatting for rank_genes_groups() PR 1417 P Weiler

  • Removed PendingDeprecationWarnings from use of np.matrix PR 1424 P Weiler

  • Fixed indexing bug in highly_variable_genes() PR 1456 V Bergen

  • Fix default number of genes for marker_gene_overlap PR 1464 MD Luecken

  • Fixed passing groupby and dendrogram_key to dendrogram() PR 1465 M Varma

  • Fixed download path of pbmc3k_processed PR 1472 D Strobl

  • Better error message when computing DE with a group of size 1 PR 1490 J Manning

  • Update cugraph API usage for v0.16 PR 1494 R Ilango

  • Fixed marker_gene_overlap default value for top_n_markers PR 1464 MD Luecken

  • Pass random_state to RAPIDS UMAP PR 1474 C Nolet

  • Fixed anndata version requirement for concat() (re-exported from scanpy as sc.concat) PR 1491 I Virshup

  • Fixed the width of the progress bar when downloading data PR 1507 M Klein

  • Updated link for moignard15 dataset PR 1542 I Virshup

  • Fixed bug where calling set_figure_params could block if IPython was installed, but not used. PR 1547 I Virshup

  • violin() no longer fails if .raw not present PR 1548 I Virshup

  • spatial() refactoring and better handling of spatial data PR 1512 G Palla

  • pca() works with chunked=True again PR 1592 I Virshup

  • ingest() now works with umap-learn 0.5.0 PR 1601 S Rybakov