Scanpy – Single-Cell Analysis in Python
Scanpy is a scalable toolkit for analyzing single-cell gene expression data. It includes preprocessing, visualization, clustering, pseudotime and trajectory inference, differential expression testing and simulation of gene regulatory networks. The Python-based implementation efficiently deals with datasets of more than one million cells.
See all releases here. The following lists selected improvements.
- better scalability when analyzing data with more than 100K cells; new graph class…
- canonical analysis steps like clustering genes, computing correlations…
- exporting to Gephi…
February 26, 2018: version 0.4.4
- embed cells using
- score sets of genes, e.g. for cell cycle, using
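As an illustration of the gene-set scoring mentioned above, here is a minimal sketch in plain Python. It assumes one common approach: a cell's score is the mean expression of the genes in the set minus the mean expression of a reference set. The helper `score_gene_set`, the gene names, and the toy expression values are hypothetical and are not Scanpy's API or implementation.

```python
from statistics import mean


def score_gene_set(expression, gene_set, reference_set):
    """Score each cell as mean(gene_set) - mean(reference_set).

    `expression` maps a cell name to a dict of gene -> expression value.
    Hypothetical helper for illustration only.
    """
    scores = {}
    for cell, profile in expression.items():
        target = mean([profile[gene] for gene in gene_set])
        background = mean([profile[gene] for gene in reference_set])
        scores[cell] = target - background
    return scores


# Toy data: two cells, four genes (all values invented for the example).
expression = {
    "cell_a": {"G1": 4.0, "G2": 2.0, "G3": 1.0, "G4": 1.0},
    "cell_b": {"G1": 0.0, "G2": 1.0, "G3": 2.0, "G4": 2.0},
}
scores = score_gene_set(expression, ["G1", "G2"], ["G3", "G4"])
```

A positive score suggests the set is expressed above the reference level in that cell; a negative score suggests the opposite.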
February 9, 2018: version 0.4.3
- clustermap(): heatmap from hierarchical clustering, based on seaborn.clustermap [Waskom16]
- only return matplotlib.Axis in plotting functions of sc.pl when show=False, otherwise None
… and through anndata v0.5
- inform about duplicates in var_names and resolve them using
- by default, generate unique observation names in
- automatically remove unused categories after slicing
- read/write .loom files using loompy 2
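To make the duplicate-name handling above concrete, the following plain-Python sketch shows one way to detect repeated variable names and make them unique by appending a numeric suffix. The helpers `find_duplicates` and `make_names_unique`, the `-` separator, and the example names are assumptions for illustration; this is not the anndata implementation.

```python
from collections import Counter


def find_duplicates(names):
    """Return the sorted list of names that occur more than once."""
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


def make_names_unique(names, sep="-"):
    """Keep first occurrences; give later repeats a numeric suffix.

    Suffixes that would collide with an existing name are skipped.
    Hypothetical helper for illustration only.
    """
    taken = set(names)   # every name that is already in use
    seen = set()         # names whose first occurrence was emitted
    next_suffix = {}     # next suffix number to try per base name
    result = []
    for name in names:
        if name not in seen:
            seen.add(name)
            result.append(name)
            continue
        k = next_suffix.get(name, 1)
        candidate = f"{name}{sep}{k}"
        while candidate in taken:
            k += 1
            candidate = f"{name}{sep}{k}"
        taken.add(candidate)
        next_suffix[name] = k + 1
        result.append(candidate)
    return result


names = ["Gapdh", "Actb", "Gapdh", "Gapdh-1", "Gapdh"]
duplicates = find_duplicates(names)
unique_names = make_names_unique(names)
```

Keeping the first occurrence unchanged means existing references to that name stay valid, and only the later repeats are renamed.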
January 7, 2018: version 0.4.2
- amendments in AGA and its plotting functions
December 23, 2017: version 0.4
… and through anndata v0.4
- towards a common file format for exchanging AnnData with packages such as Seurat and SCDE by reading and writing .loom files
- AnnData provides scalability beyond dataset sizes that fit into memory: see this blog post
- raw attribute that simplifies storing the data matrix when you consider it “raw”: see the clustering tutorial
November 29, 2017: version 0.3.2
November 16, 2017: version 0.3
- AnnData can be concatenated
- AnnData is available as a separate package
- results of approximate graph abstraction (AGA) are simplified
October 25, 2017: version 0.2.9
Initial release of approximate graph abstraction (AGA).
July 24, 2017: version 0.2.1
Scanpy now includes preprocessing, visualization, clustering, pseudotime and trajectory inference, differential expression testing and simulation of gene regulatory networks. The implementation efficiently deals with datasets of more than one million cells.
May 1, 2017: version 0.1
Scanpy computationally outperforms the Cell Ranger R kit and allows reproducing most of Seurat’s guided clustering tutorial.