Release Notes

Note

Also see the release notes of anndata.

Version 1.6

1.6.0 2020-07-31

This release includes several new visualization options and improvements after an overhaul of the dotplot, matrixplot and stacked_violin functions (see PR 1210 F Ramirez). In addition, the internals for the differential expression code were overhauled (rank_genes_groups(), PR 1156 S Rybakov).

Plotting improvements for dotplot(), matrixplot() and stacked_violin()

  • Plots are now wrappers around classes that allow fine-tuning of the images through additional options. The classes can be accessed directly (e.g. DotPlot) or via the new return_fig parameter.

  • If the plots are called after scanpy.tl.rank_genes_groups() (e.g. rank_genes_groups_dotplot()), it is now also possible to plot log fold changes and p-values.

  • Added ax parameter which allows embedding the plot in other images

  • Added option to show, instead of the dendrogram, a bar plot of the cell/observation totals per category.

  • Return a dictionary of axes for further manipulation. This includes the main plot, the legend, and the dendrogram or totals.

  • Set a title to the image.

  • Legend can be removed

  • groupby can be a list of categories, e.g. groupby=['tissue', 'cell type']

  • Added padding parameter to dotplot and stacked_violin to address PR 1270

  • Updated documentation and tutorial

dotplot() changes

  • Improved the colorbar and size legend for dotplots. The colorbar and size legend now have titles, which can be modified using the colorbar_title and size_title arguments. They also align at the bottom of the image and do not shrink if the dotplot image is smaller.

  • Allow plotting genes in rows and categories in columns (swap_axes).

  • Using the DotPlot object, the dot_edge_color and line width can be set, a grid can be added, and several other features are available.

  • A new style was added in which the dots are replaced by an empty circle and the square behind the circle is colored (like in matrixplots).

stacked_violin() changes

  • Violins can be colored based on average gene expression, as in dotplots.

  • Reduced the line width of the violin plots.

  • Removed the ticks for the y axis, as they tend to overlap with each other. They can be shown again using the style method if needed.

Other visualization changes

  • For matrixplot(), added a title for the colorbar and positioned it as in dotplot.

  • heatmap() and tracksplot() now return a dictionary of axes when show=False, as the other plots do.

  • interpolation can be passed as parameter for heatmap()

Additions

Bug fixes

Version 1.5

1.5.1 2020-05-21

Bug fixes

  • Fixed a bug in pca(), where random_state did not have an effect for sparse input PR 1240 I Virshup

  • Fixed docstring in pca() which included an unused argument PR 1240 I Virshup

1.5.0 2020-05-15

The 1.5.0 release adds a lot of new functionality, much of which takes advantage of anndata updates 0.7.0 - 0.7.2. Highlights of this release include support for spatial data, dedicated handling of graphs in AnnData, sparse PCA, an interface with scvi, and others.

Spatial data support

New functionality

External tools

Performance

  • pca() now uses efficient implicit centering for sparse matrices. This can lead to significantly improved performance for large datasets PR 1066 A Tarashansky

  • score_genes() now has an efficient implementation for sparse matrices with missing values PR 1196 redst4r.

Warning

The new pca() implementation can result in slightly different results for sparse matrices. See PR 1066 and the documentation for more info.
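The idea behind implicit centering can be sketched with NumPy and SciPy alone. The snippet below is not Scanpy's implementation; it is a minimal illustration, with a hypothetical helper name (implicitly_centered_svd), of how the column means can be subtracted inside the matrix-vector products so that the sparse matrix is never densified.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import LinearOperator, svds


def implicitly_centered_svd(X, k):
    """Top-k SVD of the column-centered X without forming X - mean explicitly.

    Hypothetical helper for illustration; not part of Scanpy.
    """
    n, d = X.shape
    mu = np.asarray(X.mean(axis=0)).ravel()  # column means, shape (d,)

    def matvec(v):
        # (X - 1 mu^T) v = X v - 1 (mu . v)
        v = np.asarray(v).ravel()
        return X @ v - np.full(n, mu @ v)

    def rmatvec(u):
        # (X - 1 mu^T)^T u = X^T u - mu (1 . u)
        u = np.asarray(u).ravel()
        return X.T @ u - mu * u.sum()

    op = LinearOperator((n, d), matvec=matvec, rmatvec=rmatvec, dtype=np.float64)
    u, s, vt = svds(op, k=k)
    order = np.argsort(s)[::-1]  # sort singular values in descending order
    return u[:, order], s[order], vt[order]


# Small sparse test matrix (synthetic, for illustration only).
X = sparse.random(200, 50, density=0.1, format="csr", random_state=0)
u, s, vt = implicitly_centered_svd(X, k=5)

# Reference: explicit centering on the dense matrix.
dense = X.toarray()
s_ref = np.linalg.svd(dense - dense.mean(axis=0), compute_uv=False)[:5]
print(np.max(np.abs(s - s_ref)))
```

Because an iterative solver is used here, small numerical differences to a dense, explicitly centered decomposition are to be expected.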

Code design

Bug fixes

Version 1.4

1.4.6 2020-03-17

Functionality in external

Code design

Bug fixes

1.4.5 2019-12-30

Please install scanpy==1.4.5.post3 instead of scanpy==1.4.5.

New functionality

Code design

Warning

  • changed default solver in pca() from auto to arpack

  • changed default use_raw in score_genes() from False to None

1.4.4 2019-07-20

New functionality

  • scanpy.get adds helper functions for extracting data in convenient formats PR 619 I Virshup

Bug fixes

  • Stopped deprecation warnings from AnnData 0.6.22 I Virshup

Code design

  • normalize_total() gains param exclude_highly_expressed, and fraction is renamed to max_fraction with better docs A Wolf

1.4.3 2019-05-14

Bug fixes

  • neighbors() correctly infers n_neighbors again from params, which was temporarily broken in v1.4.2 I Virshup

Code design

1.4.2 2019-05-06

New functionality

  • combat() supports additional covariates which may include adjustment variables or biological condition PR 618 G Eraslan

  • highly_variable_genes() has a batch_key option which performs HVG selection in each batch separately to avoid selecting genes that vary strongly across batches PR 622 G Eraslan

Bug fixes

  • rank_genes_groups() t-test implementation no longer returns NaN when the variance is 0, and now uses scipy’s implementation PR 621 I Virshup

  • umap() with init_pos='paga' detects correct dtype A Wolf

  • louvain() and leiden() auto-generate key_added=louvain_R upon passing restrict_to, which was temporarily changed in 1.4.1 A Wolf

Code design

1.4.1 2019-04-26

New functionality

Code design

  • .layers support of scatter plots F Ramirez

  • fix double-logarithmization in the computation of the log fold change in rank_genes_groups() A Muñoz-Rojas

  • fix return sections of docs P Angerer

Version 1.3

1.3.8 2019-02-05

  • read_10x_h5() throws more stringent errors and doesn’t require specifying default genomes anymore. PR 442 and PR 444 I Virshup

1.3.7 2019-01-02

Major updates

  • one can import scanpy as sc instead of import scanpy.api as sc, see scanpy

New functionality

1.3.6 2018-12-11

Major updates

Interactive exploration of analysis results through manifold viewers

Code design

1.3.5 2018-12-09

  • uncountable figure improvements PR 369 F Ramirez

1.3.4 2018-11-24

1.3.3 2018-11-05

Major updates

  • a fully distributed preprocessing backend T White and the Laserson Lab

Code design

Note

Also see changes in anndata 0.6.

  • changed default compression to None in write_h5ad() to speed up reading and writing; disk space use is usually less critical

  • performance gains in write_h5ad() due to better handling of strings and categories S Rybakov

1.3.1 2018-09-03

RNA velocity in single cells [Manno18]

  • Scanpy and AnnData support loom’s layers so that computations for single-cell RNA velocity [Manno18] become feasible S Rybakov and V Bergen

  • scvelo harmonizes with Scanpy and is able to process loom files with splicing information produced by Velocyto [Manno18]; it runs a lot faster than the count matrix analysis of Velocyto and provides several conceptual developments

Plotting (Generic)

There now is a section on imputation in external:

Version 1.2

1.2.1 2018-06-08

Plotting of Generic marker genes and quality control.

1.2.0 2018-06-08

  • paga() improved, see PAGA; the default model changed, restore the previous default model by passing model='v1.0'

Version 1.1

1.1.0 2018-06-01

Version 1.0

1.0.0 2018-03-30

Major updates

  • Scanpy is much faster and more memory efficient: preprocess, cluster and visualize 1.3M cells in 6h, 130K cells in 14min, and 68K cells in 3min A Wolf

  • the API gained a preprocessing function neighbors() and a class Neighbors() to which all basic graph computations are delegated A Wolf

Warning

Upgrading to 1.0 isn’t fully backwards compatible because of the following changes:

  • the graph-based tools louvain(), dpt(), draw_graph(), umap(), diffmap() and paga() require prior computation of the graph: use sc.pp.neighbors(adata, n_neighbors=5); sc.tl.louvain(adata) instead of the previous sc.tl.louvain(adata, n_neighbors=5)

  • install numba via conda install numba, which replaces cython

  • the default connectivity measure changed (dpt will look different using default settings); setting method='gauss' in sc.pp.neighbors uses Gauss kernel connectivities and reproduces the previous behavior, see, for instance, the example paul15.

  • names of returned annotations have changed for less bloated AnnData objects, which means that some of the unstructured annotation of old AnnData files is not recognized anymore

  • replace occurrences of group_by with groupby (consistency with pandas)

  • it is worth checking out the notebook examples to see changes, e.g. the seurat example.

  • upgrading scikit-learn from 0.18 to 0.19 changed the implementation of PCA, some results might therefore look slightly different

Further updates

  • UMAP [McInnes18] can serve as a first visualization of the data just as tSNE does; in contrast to tSNE, UMAP directly embeds the single-cell graph and is faster; UMAP is also used for measuring connectivities and computing neighbors, see neighbors() A Wolf

  • graph abstraction: AGA is renamed to PAGA: paga(); now, it only measures connectivities between partitions of the single-cell graph, pseudotime and clustering need to be computed separately via louvain() and dpt(), the connectivity measure has been improved A Wolf

  • logistic regression for finding marker genes rank_genes_groups() with parameter method='logreg' A Wolf

  • louvain() provides a better implementation for reclustering via restrict_to A Wolf

  • scanpy no longer modifies rcParams upon import, call settings.set_figure_params to set the ‘scanpy style’ A Wolf

  • default cache directory is ./cache/, set settings.cachedir to change this; nested directories in this are avoided A Wolf

  • show edges in scatter plots based on graph visualization draw_graph() and umap() by passing edges=True A Wolf

  • downsample_counts() for downsampling counts MD Luecken

  • default 'louvain_groups' are called 'louvain' A Wolf

  • 'X_diffmap' contains the zero component, plotting remains unchanged A Wolf

Version 0.4

0.4.4 2018-02-26

0.4.3 2018-02-09

0.4.2 2018-01-07

  • amendments in PAGA and its plotting functions A Wolf

0.4.0 2017-12-23

Version 0.3

0.3.2 2017-11-29

0.3.0 2017-11-16

Version 0.2

0.2.9 2017-10-25

Initial release of the new trajectory inference method PAGA

  • paga() computes an abstracted, coarse-grained (PAGA) graph of the neighborhood graph A Wolf

  • paga_compare() plots this graph next to an embedding A Wolf

  • paga_path() plots a heatmap through a node sequence in the PAGA graph A Wolf

0.2.1 2017-07-24

Scanpy includes preprocessing, visualization, clustering, pseudotime and trajectory inference, differential expression testing and simulation of gene regulatory networks. The implementation efficiently deals with datasets of more than one million cells. A Wolf, P Angerer

Version 0.1

0.1.0 2017-05-17

Scanpy computationally outperforms and allows reproducing both the Cell Ranger R kit’s and most of Seurat’s clustering workflows. A Wolf, P Angerer