Release Notes

Note

Also see the release notes of anndata.

Version 1.5

1.5.1 2020-05-21

Bug fixes

  • Fixed a bug in pca(), where random_state did not have an effect for sparse input PR 1240 I Virshup

  • Fixed docstring in pca() which included an unused argument PR 1240 I Virshup

1.5.0 2020-05-15

The 1.5.0 release adds a lot of new functionality, much of which takes advantage of anndata updates 0.7.0 - 0.7.2. Highlights of this release include support for spatial data, dedicated handling of graphs in AnnData, sparse PCA, an interface with scvi, and others.

Spatial data support

New functionality

External tools

Performance

  • pca() now uses efficient implicit centering for sparse matrices. This can lead to significantly improved performance for large datasets PR 1066 A Tarashansky

  • score_genes() now has an efficient implementation for sparse matrices with missing values PR 1196 redst4r.

Warning

The new pca() implementation can give slightly different results for sparse matrices. See PR 1066 and the documentation for more information.
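The idea behind implicit centering can be illustrated without scanpy. The following is a minimal sketch, not the code from PR 1066: it assumes the centered matrix is exposed as a SciPy LinearOperator, and the helper name implicitly_centered_operator is hypothetical. The column means are subtracted inside the matrix-vector products, so the sparse matrix is never densified.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import LinearOperator, svds


def implicitly_centered_operator(X):
    """Operator for X - 1 @ mu.T that keeps X sparse (hypothetical helper)."""
    n_obs = X.shape[0]
    mu = np.asarray(X.mean(axis=0)).ravel()  # column means, shape (n_vars,)

    def matvec(v):
        v = np.asarray(v).ravel()
        # (X - 1 mu^T) v = X v - 1 (mu . v)
        return X @ v - np.ones(n_obs) * (mu @ v)

    def rmatvec(u):
        u = np.asarray(u).ravel()
        # (X - 1 mu^T)^T u = X^T u - mu (1 . u)
        return X.T @ u - mu * u.sum()

    return LinearOperator(X.shape, matvec=matvec, rmatvec=rmatvec, dtype=np.float64)


# Toy sparse matrix standing in for a cells-by-genes count matrix.
X = sparse.random(200, 50, density=0.1, format="csr", random_state=0)

# Truncated SVD of the implicitly centered matrix (svds returns ascending order).
op = implicitly_centered_operator(X)
_, s_implicit, _ = svds(op, k=5)
s_implicit = np.sort(s_implicit)[::-1]

# Reference: dense, explicitly centered SVD.
X_dense = X.toarray()
s_dense = np.linalg.svd(X_dense - X_dense.mean(axis=0), compute_uv=False)[:5]
```

Both routes yield the same leading singular values up to solver tolerance. An iterative solver can still differ from a dense one in the signs of the components, which is one plausible source of the small differences the warning mentions.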

Code design

Bug fixes

Version 1.4

1.4.6 2020-03-17

Functionality in external

Code design

Bug fixes

1.4.5 2019-12-30

Please install scanpy==1.4.5.post3 instead of scanpy==1.4.5.

New functionality

Code design

Warning

  • changed default solver in pca() from auto to arpack

  • changed default use_raw in score_genes() from False to None

1.4.4 2019-07-20

New functionality

  • scanpy.get adds helper functions for extracting data in convenient formats PR 619 I Virshup

Bug fixes

  • Stopped deprecation warnings from AnnData 0.6.22 I Virshup

Code design

  • normalize_total() gains param exclude_highly_expressed, and fraction is renamed to max_fraction with better docs A Wolf
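What exclude_highly_expressed and max_fraction do can be sketched in plain NumPy. This is an illustration under stated assumptions, not scanpy's implementation: the function name normalize_total_sketch and its defaults are hypothetical. A gene is assumed to count as highly expressed if it exceeds max_fraction of the total counts in at least one cell, and such genes are left out when the per-cell size factor is computed.

```python
import numpy as np


def normalize_total_sketch(counts, target_sum=1.0,
                           exclude_highly_expressed=False, max_fraction=0.05):
    """Scale each cell (row) so that the retained genes sum to target_sum."""
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)
    keep = np.ones(counts.shape[1], dtype=bool)
    if exclude_highly_expressed:
        # Drop genes exceeding max_fraction of the total in at least one cell.
        keep = ~(counts > max_fraction * totals).any(axis=0)
    # Size factor per cell, computed from the retained genes only.
    size = counts[:, keep].sum(axis=1, keepdims=True)
    return counts / size * target_sum


counts = np.array([[90, 5, 5],
                   [10, 10, 10]])
plain = normalize_total_sketch(counts)
robust = normalize_total_sketch(counts, exclude_highly_expressed=True, max_fraction=0.5)
```

In this toy example the first gene dominates the first cell. With the exclusion enabled it no longer affects the size factors, so the two remaining genes end up on the same scale in both cells.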

1.4.3 2019-05-14

Bug fixes

  • neighbors() correctly infers n_neighbors again from params, which was temporarily broken in v1.4.2 I Virshup

Code design

1.4.2 2019-05-06

New functionality

  • combat() supports additional covariates which may include adjustment variables or biological condition PR 618 G Eraslan

  • highly_variable_genes() has a batch_key option which performs HVG selection in each batch separately to avoid selecting genes that vary strongly across batches PR 622 G Eraslan

Bug fixes

  • rank_genes_groups() t-test implementation doesn’t return NaN when variance is 0, also changed to scipy’s implementation PR 621 I Virshup

  • umap() with init_pos='paga' detects correct dtype A Wolf

  • louvain() and leiden() auto-generate key_added=louvain_R upon passing restrict_to, which was temporarily changed in 1.4.1 A Wolf
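The zero-variance case of the t-test fix above can be illustrated with a small Welch statistic in pure Python. This is a sketch only: the function welch_t is hypothetical, and returning 0 for the undefined 0/0 case is an assumption about reasonable handling, not a statement of what rank_genes_groups() does internally.

```python
import math


def welch_t(mean1, var1, n1, mean2, var2, n2):
    """Welch t statistic from summary statistics, guarded against 0/0."""
    denom = math.sqrt(var1 / n1 + var2 / n2)
    diff = mean1 - mean2
    if denom == 0.0:
        if diff == 0.0:
            # Assumption: a gene that is constant and equal in both groups
            # gets a score of 0 rather than NaN.
            return 0.0
        # Constant but different groups: the statistic diverges.
        return math.copysign(math.inf, diff)
    return diff / denom


constant_gene = welch_t(0.0, 0.0, 10, 0.0, 0.0, 10)
regular_gene = welch_t(2.0, 1.0, 4, 1.0, 1.0, 4)
```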

Code design

1.4.1 2019-04-26

New functionality

Code design

  • .layers support of scatter plots F Ramirez

  • fix double-logarithmization in compute of log fold change in rank_genes_groups() A Muñoz-Rojas

  • fix return sections of docs P Angerer
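The double-logarithmization issue mentioned above is easy to reproduce in isolation. The sketch below assumes expression values were log1p-transformed before group means were taken. The function log2_fold_change and the pseudocount eps are illustrative choices, not necessarily the exact code in rank_genes_groups().

```python
import math


def log2_fold_change(mean_log1p_group, mean_log1p_rest, eps=1e-9):
    """log2 fold change computed from means of log1p-transformed values."""
    # Undo log1p first, so that a logarithm is applied only once.
    linear_group = math.expm1(mean_log1p_group)
    linear_rest = math.expm1(mean_log1p_rest)
    return math.log2((linear_group + eps) / (linear_rest + eps))


# Two groups whose linear-scale means are 8 and 2, i.e. a 4-fold difference.
mean_group = math.log1p(8.0)
mean_rest = math.log1p(2.0)

correct = log2_fold_change(mean_group, mean_rest)  # close to 2
double_logged = math.log2(mean_group / mean_rest)  # logarithm of already logged values
```

Taking the logarithm of a ratio of already logged means understates the effect: here it gives 1 instead of the expected log2 fold change of 2.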

Version 1.3

1.3.8 2019-02-05

  • read_10x_h5() throws more stringent errors and doesn’t require specifying default genomes anymore. PR 442 and PR 444 I Virshup

1.3.7 2019-01-02

Major updates

  • one can import scanpy as sc instead of import scanpy.api as sc, see scanpy

New functionality

1.3.6 2018-12-11

Major updates

Interactive exploration of analysis results through manifold viewers

Code design

1.3.5 2018-12-09

  • countless figure improvements PR 369 F Ramirez

1.3.4 2018-11-24

1.3.3 2018-11-05

Major updates

  • a fully distributed preprocessing backend T White and the Laserson Lab

Code design

Note

Also see changes in anndata 0.6.

  • changed default compression to None in write_h5ad() to speed up read and write; disk space use is usually less critical

  • performance gains in write_h5ad() due to better handling of strings and categories S Rybakov

1.3.1 2018-09-03

RNA velocity in single cells [Manno18]

  • Scanpy and AnnData support loom’s layers so that computations for single-cell RNA velocity [Manno18] become feasible S Rybakov and V Bergen

  • scvelo harmonizes with Scanpy and is able to process loom files with splicing information produced by Velocyto [Manno18]; it runs a lot faster than the count matrix analysis of Velocyto and provides several conceptual developments

Plotting (Generic)

There now is a section on imputation in external:

Version 1.2

1.2.1 2018-06-08

Plotting of Generic marker genes and quality control.

1.2.0 2018-06-08

  • paga() improved, see PAGA; the default model changed, restore the previous default model by passing model='v1.0'

Version 1.1

1.1.0 2018-06-01

Version 1.0

1.0.0 2018-03-30

Major updates

  • Scanpy is much faster and more memory efficient: preprocess, cluster and visualize 1.3M cells in 6h, 130K cells in 14min, and 68K cells in 3min A Wolf

  • the API gained a preprocessing function neighbors() and a class Neighbors() to which all basic graph computations are delegated A Wolf

Warning

Upgrading to 1.0 isn’t fully backwards compatible because of the following changes:

  • the graph-based tools louvain(), dpt(), draw_graph(), umap(), diffmap() and paga() require prior computation of the graph: sc.pp.neighbors(adata, n_neighbors=5); sc.tl.louvain(adata) instead of the previous sc.tl.louvain(adata, n_neighbors=5)

  • install numba via conda install numba, which replaces cython

  • the default connectivity measure changed (dpt will look different using default settings); setting method='gauss' in sc.pp.neighbors uses Gauss kernel connectivities and reproduces the previous behavior, see, for instance, the example paul15.

  • namings of returned annotation have changed for less bloated AnnData objects, which means that some of the unstructured annotation of old AnnData files is not recognized anymore

  • replace occurrences of group_by with groupby (consistency with pandas)

  • it is worth checking out the notebook examples to see changes, e.g. the seurat example.

  • upgrading scikit-learn from 0.18 to 0.19 changed the implementation of PCA, some results might therefore look slightly different

Further updates

  • UMAP [McInnes18] can serve as a first visualization of the data just as tSNE; in contrast to tSNE, UMAP directly embeds the single-cell graph and is faster; UMAP is also used for measuring connectivities and computing neighbors, see neighbors() A Wolf

  • graph abstraction: AGA is renamed to PAGA: paga(); now, it only measures connectivities between partitions of the single-cell graph, pseudotime and clustering need to be computed separately via louvain() and dpt(), the connectivity measure has been improved A Wolf

  • logistic regression for finding marker genes rank_genes_groups() with parameter method='logreg' A Wolf

  • louvain() provides a better implementation for reclustering via restrict_to A Wolf

  • scanpy no longer modifies rcParams upon import, call settings.set_figure_params to set the ‘scanpy style’ A Wolf

  • default cache directory is ./cache/, set settings.cachedir to change this; nested directories in this are avoided A Wolf

  • show edges in scatter plots based on graph visualization draw_graph() and umap() by passing edges=True A Wolf

  • downsample_counts() for downsampling counts MD Luecken

  • default 'louvain_groups' are called 'louvain' A Wolf

  • 'X_diffmap' contains the zero component, plotting remains unchanged A Wolf
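The downsampling idea behind downsample_counts() in the list above can be sketched in pure Python. It is a conceptual illustration that assumes sampling of individual counts without replacement. The function downsample_cell and its parameters are hypothetical and do not mirror scanpy's signature or implementation.

```python
import random
from collections import Counter


def downsample_cell(counts, target_total, seed=0):
    """Downsample one cell's per-gene counts to at most target_total counts."""
    if sum(counts) <= target_total:
        return list(counts)  # nothing to remove
    # One entry per observed count, labeled with its gene index.
    molecules = [gene for gene, n in enumerate(counts) for _ in range(n)]
    # Draw target_total of them without replacement.
    kept = Counter(random.Random(seed).sample(molecules, target_total))
    return [kept.get(gene, 0) for gene in range(len(counts))]


cell = [50, 30, 20, 0]
down = downsample_cell(cell, 10)
```

Sampling without replacement guarantees that no gene ends up with more counts than it started with, and genes with zero counts stay at zero.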

Version 0.4

0.4.4 2018-02-26

0.4.3 2018-02-09

0.4.2 2018-01-07

  • amendments in PAGA and its plotting functions A Wolf

0.4.0 2017-12-23

Version 0.3

0.3.2 2017-11-29

0.3.0 2017-11-16

Version 0.2

0.2.9 2017-10-25

Initial release of the new trajectory inference method PAGA

  • paga() computes an abstracted, coarse-grained (PAGA) graph of the neighborhood graph A Wolf

  • paga_compare() plots this graph next to an embedding A Wolf

  • paga_path() plots a heatmap through a node sequence in the PAGA graph A Wolf

0.2.1 2017-07-24

Scanpy includes preprocessing, visualization, clustering, pseudotime and trajectory inference, differential expression testing and simulation of gene regulatory networks. The implementation efficiently deals with datasets of more than one million cells. A Wolf, P Angerer

Version 0.1

0.1.0 2017-05-17

Scanpy computationally outperforms and allows reproducing both the Cell Ranger R kit’s and most of Seurat’s clustering workflows. A Wolf, P Angerer