
Scanpy – Single-Cell Analysis in Python

Scanpy is a scalable toolkit for analyzing single-cell gene expression data, built jointly with anndata. It includes preprocessing, visualization, clustering, trajectory inference, and differential expression testing. The Python-based implementation efficiently handles datasets of more than one million cells.


scVelo on the cover of Nature Biotechnology 2020-12-01

Scanpy’s counterpart for RNA velocity, scVelo, made it onto the cover of Nature Biotechnology [tweet].

Scanpy selected among 20 papers for 20 years of Genome Biology 2020-08-01

In Celebrating 20 Years of Genome Biology, the journal selected the initial Scanpy paper as its pick for the year 2018, one of 20 papers for 20 years [tweet].

COVID-19 datasets distributed as h5ad 2020-04-01

In a joint initiative, the Wellcome Sanger Institute, the Human Cell Atlas, and the CZI distribute datasets related to COVID-19 via anndata’s h5ad files. It wasn’t anticipated that the initial idea of sharing and backing an on-disk representation of AnnData would become so widely adopted. Curious? Read more about the format.

Latest additions

Version 1.7

1.7.2 2021-04-07

Bug fixes


  • Added triku, a feature selection method, to the ecosystem page PR 1722 AM Ascensión

  • Added dorothea and progeny to the ecosystem page PR 1767 P Badia-i-Mompel

1.7.1 2021-02-24


  • More Twitter handles for core devs PR 1676 G Eraslan

Bug fixes

1.7.0 2021-02-03


  • Add new 10x Visium datasets to visium_sge() PR 1473 G Palla

  • Enable download of source image for 10x Visium datasets in visium_sge() PR 1506 H Spitzer

  • Refactor of spatial(): better support for plotting without an image, as well as directly providing images PR 1512 G Palla

  • Dict input for scanpy.queries.enrich() PR 1488 G Eraslan

  • rank_genes_groups_df() can now return fraction of cells in a group expressing a gene, and allows retrieving values for multiple groups at once PR 1388 G Eraslan

  • Color annotations for gene sets in heatmap() are now matched to color for cluster PR 1511 L Sikkema

  • PCA plots can now annotate axes with variance explained PR 1470 bfurtwa

  • Plots with groupby arguments can now group by values in the index by passing the index’s name (like pd.DataFrame.groupby). PR 1583 F Ramirez

  • Added na_color and na_in_legend keyword arguments to embedding() plots. Allows specifying color for missing or filtered values in plots like umap() or spatial() PR 1356 I Virshup

  • embedding() plots now support passing dict of {cluster_name: cluster_color, ...} for palette argument PR 1392 I Virshup
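As a rough illustration of the dict form of the palette argument mentioned in the last item, the sketch below builds a {cluster_name: cluster_color, ...} mapping with the standard library only. The helper name make_palette and the cluster names are hypothetical and not part of Scanpy; only the dict shape comes from the release note.

```python
import colorsys


def make_palette(cluster_names):
    """Map each cluster name to a distinct hex color.

    Hypothetical helper for illustration; not part of Scanpy.
    """
    names = list(dict.fromkeys(cluster_names))  # de-duplicate, keep order
    palette = {}
    for i, name in enumerate(names):
        # Spread hues evenly around the color wheel.
        r, g, b = colorsys.hsv_to_rgb(i / len(names), 0.65, 0.9)
        palette[name] = "#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)
        )
    return palette


# Example cluster names are made up for this sketch.
palette = make_palette(["B cells", "T cells", "Monocytes"])
print(palette)
```

Assuming an AnnData object adata with a categorical column such as adata.obs["cell_type"] (an assumed name), a dict like this could then be passed as palette=palette to umap() or another embedding() plot, so that each cluster keeps the same color across figures.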

External tools (new)

External tools (changes)




  • Consistent fold-change, fractions calculation for filter_rank_genes_groups PR 1391 S Rybakov

  • Fixed bug where score_genes would error if one gene was passed PR 1398 I Virshup

  • Fixed log1p inplace on integer dense arrays PR 1400 I Virshup

  • Fix docstring formatting for rank_genes_groups() PR 1417 P Weiler

  • Removed PendingDeprecationWarnings from use of np.matrix PR 1424 P Weiler

  • Fixed indexing bug in highly_variable_genes() PR 1456 V Bergen

  • Fix default number of genes for marker_gene_overlap PR 1464 MD Luecken

  • Fixed passing groupby and dendrogram_key to dendrogram() PR 1465 M Varma

  • Fixed download path of pbmc3k_processed PR 1472 D Strobl

  • Better error message when computing DE with a group of size 1 PR 1490 J Manning

  • Update cugraph API usage for v0.16 PR 1494 R Ilango

  • Fixed marker_gene_overlap default value for top_n_markers PR 1464 MD Luecken

  • Pass random_state to RAPIDS UMAP PR 1474 C Nolet

  • Fixed anndata version requirement for concat() (re-exported from scanpy as sc.concat) PR 1491 I Virshup

  • Fixed the width of the progress bar when downloading data PR 1507 M Klein

  • Updated link for moignard15 dataset PR 1542 I Virshup

  • Fixed bug where calling set_figure_params could block if IPython was installed, but not used. PR 1547 I Virshup

  • violin() no longer fails if .raw not present PR 1548 I Virshup

  • spatial() refactoring and better handling of spatial data PR 1512 G Palla

  • pca() works with chunked=True again PR 1592 I Virshup

  • Compatibility with UMAP v0.5 PR 1601 PR 1589 S Rybakov, I Virshup
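To illustrate the kind of check behind the "group of size 1" item above, here is a minimal sketch, using only the standard library, of validating group sizes before a differential expression test. The function check_group_sizes, its min_cells parameter, and the error wording are assumptions made for illustration; this is not Scanpy's actual implementation or message.

```python
from collections import Counter


def check_group_sizes(labels, min_cells=2):
    """Raise a descriptive error if any group has fewer than `min_cells` cells.

    Hypothetical helper for illustration; not part of Scanpy.
    """
    counts = Counter(labels)
    too_small = sorted(group for group, n in counts.items() if n < min_cells)
    if too_small:
        raise ValueError(
            f"Groups {too_small} have fewer than {min_cells} cells; "
            "variance cannot be estimated for them."
        )
    return dict(counts)


# Cluster "2" contains a single cell, so the check fails early
# with a message that names the offending group.
labels = ["0", "0", "1", "1", "1", "2"]
try:
    check_group_sizes(labels)
except ValueError as err:
    message = str(err)
    print(message)
```

Failing early with the group name in the message is usually easier to act on than a numerical warning raised deep inside the statistical test.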