Release notes#

Version 1.12#

1.12.0.dev94+g260d33bd 2025-07-24#

Breaking changes#

Adopt the Scientific Python deprecation schedule: remove Python 3.10 support and add Python 3.13 support, require anndata≥0.9 P Angerer (#3485)

Bug fixes#

Raise fewer redundant warnings, mainly in scanpy.pl and scanpy.datasets functions P Angerer (#3724)

Development Process#

Replaced several internal utilities with their fast_array_utils counterparts P Angerer (#3598)

Features#

Added n_components parameter to tsne() Kitsune (#2803)
Add zarr support and convert_strings_to_categoricals parameter to scanpy.write() P Angerer (#3498)
Add support for scipy.sparse.csr_array and scipy.sparse.csc_array P Angerer (#3563)
Added a compressed parameter to read_10x_mtx() to support reading uncompressed matrix files from tools such as STARsolo, which do not gzip their output by default (#3564)
Make scanpy.get.aggregate() dask compatible with all aggregations except median. I Gold (#3700)
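
To illustrate what the compressed flag on read_10x_mtx distinguishes, here is a minimal sketch using only the standard library. The helper open_matrix_text() is hypothetical and is not scanpy's implementation; it only contrasts reading a gzipped matrix.mtx.gz with reading a plain matrix.mtx, as written by tools that skip compression.

```python
import gzip
import tempfile
from pathlib import Path


def open_matrix_text(path, *, compressed=True):
    """Hypothetical helper: open a text file, gunzipping it if compressed=True."""
    opener = gzip.open if compressed else open
    return opener(path, "rt")


# Write the same Matrix Market header once gzipped and once as plain text.
header = "%%MatrixMarket matrix coordinate integer general\n"
tmp_dir = Path(tempfile.mkdtemp())
with gzip.open(tmp_dir / "matrix.mtx.gz", "wt") as f:
    f.write(header)
(tmp_dir / "matrix.mtx").write_text(header)

# Both variants yield the same text once the right opener is chosen.
with open_matrix_text(tmp_dir / "matrix.mtx.gz", compressed=True) as f:
    from_gzip = f.readline()
with open_matrix_text(tmp_dir / "matrix.mtx", compressed=False) as f:
    from_plain = f.readline()
```

Passing the wrong flag for a given file would fail or return garbage, which is why an explicit parameter is useful for uncompressed outputs.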

Miscellaneous improvements#

Deprecate scanpy.tl.louvain(). P Angerer (#3658)

Version 1.11#

1.11.3 2025-07-01#

Bug fixes#

Ensure axis_nnz calculates its chunk size/shape correctly with dask I Gold (#3667)
Replace deprecated np.in1d with np.isin to silence deprecation warnings. E Ferdman (#3685)
Upperbound scipy to 1.16.0 due to statsmodels/statsmodels#9584 I Gold (#3695)

Documentation#

Fix documentation location for scanpy.settings P Angerer (#3672)

1.11.2 2025-05-28#

Bug fixes#

Fix zappy compatibility for clip_array P Angerer (#3351)
Fixes an error where regress_out would fail to work with integer types S Dicks (#3461)
Prevent plotting with mask_obs from mutating data V Menon (#3496)
Prevent scanpy.pp.scale() from creating a dask Array with numpy.matrix chunks P Angerer (#3597)
Allow using sklearn ≥1.6, Dask ≥2024.8, and sphinx ≥8.2.1 P Angerer (#3611)
Fixed handling of ext argument in scanpy.read() I Gold (#3643)
Fix error message when trying to use sc.pp.pca(x, zero_center=False) with a sparse dask array. P Angerer (#3646)

Documentation#

Clarify use of implementations in scanpy.pp.pca() docs. P Angerer (#3655)

Performance#

Speed up for a categorical regressor in regress_out() S Dicks I Gold (#3353)
In pp.normalize_total, the median is now computed in-memory when using Dask S Dicks (#3379)
Speed up pp.normalize_total with a numba kernel for csr-matrices S Dicks (#3571)
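
For context on the median mentioned in the normalize_total items above: when no target sum is given, each cell is scaled so that its total equals the median of the per-cell totals. The following is a simplified pure-Python sketch of that idea, not scanpy's code; normalize_total_sketch() is a hypothetical name, and taking the median over non-zero totals only is an assumption of this sketch.

```python
from statistics import median


def normalize_total_sketch(rows, target_sum=None):
    """Scale every row (cell) so that its counts sum to target_sum.

    If target_sum is None, use the median of the non-zero row totals.
    """
    totals = [sum(row) for row in rows]
    if target_sum is None:
        target_sum = median(t for t in totals if t > 0)
    # Rows with a zero total are left as zeros to avoid dividing by zero.
    return [
        [value * target_sum / total if total else 0.0 for value in row]
        for row, total in zip(rows, totals)
    ]


counts = [[1, 1], [2, 6], [3, 1]]  # row totals: 2, 8, 4 -> median 4
normalized = normalize_total_sketch(counts)
```

After normalization every row in this toy example sums to 4, the median of the original totals.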

1.11.1 2025-03-31#

Bug fixes#

Fix compatibility with IPython 9 P Angerer (#3499)
Prevent too-low matplotlib version from being used P Angerer (#3534)

Features#

Allow covariance_eigh as a solver option for pca() with dask.array.Array dense data ilan-gold (#3528)

Performance#

Speed up wilcoxon rank-sum test with numba G Wu (#3529)

1.11.0 2025-02-14#

Release candidates:

rc2 2025-01-24
rc1 2024-12-20

Features#

rc1 sample() supports both upsampling and downsampling of observations and variables. subsample() is now deprecated. G Eraslan & P Angerer (#943)
rc1 Add layer argument to scanpy.tl.score_genes() and scanpy.tl.score_genes_cell_cycle() L Zappia (#2921)
rc1 Prevent raw conflict with layer in score_genes() S Dicks (#3155)
rc1 Add support for median as an aggregation function to aggregate(). This allows for median-based aggregation of data (e.g., pseudobulk), complementing existing methods like mean- and sum-based aggregation M Dehkordi (Farhad) (#3180)
rc1 Add key_added argument to pca(), tsne() and umap() P Angerer (#3184)
rc1 Support running scanpy.pp.pca() on sparse Dask arrays with the 'covariance_eigh' solver P Angerer (#3263)
rc1 Use upstreamed PCA implementation for csr_array and csr_matrix (see scikit-learn Version 1.4.0) P Angerer (#3267)
rc1 Add explicit support to scanpy.pp.pca() for svd_solver='covariance_eigh' P Angerer (#3296)
rc1 Add support for dask.array.Array to scanpy.pp.calculate_qc_metrics() I Gold (#3307)
rc1 Support layer parameter in scanpy.pl.highest_expr_genes() P Angerer (#3324)
rc1 Run numba functions single-threaded when called from inside of a ThreadPool P Angerer (#3335)
rc1 Switch print_header() and print_versions() to session_info2 P Angerer (#3384)
rc1 Add sampling probabilities/mask parameter p to sample() P Angerer (#3410)
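
The median aggregation added to aggregate() in this release can be contrasted with mean- and sum-based pseudobulking using a small standard-library sketch. The function aggregate_by_group() is hypothetical and only mimics the grouping idea on a single gene; it is not the scanpy.get.aggregate() API.

```python
from collections import defaultdict
from statistics import mean, median


def aggregate_by_group(values, groups, func):
    """Group values by label and reduce each group with func."""
    buckets = defaultdict(list)
    for value, group in zip(values, groups):
        buckets[group].append(value)
    return {group: func(vals) for group, vals in buckets.items()}


# Counts of one gene across six cells; the B group contains one outlier (100).
counts = [0, 2, 100, 1, 3, 5]
cell_types = ["B", "B", "B", "T", "T", "T"]

by_median = aggregate_by_group(counts, cell_types, median)
by_mean = aggregate_by_group(counts, cell_types, mean)
by_sum = aggregate_by_group(counts, cell_types, sum)
```

In this toy data the median for group B stays at 2 while the mean is pulled up to 34 by the outlier, which is the usual motivation for median-based aggregation.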

Performance#

rc1 Speed up regress_out() P Ashish, P Angerer & S Dicks (#3284)

Documentation#

rc1 Improve harmony_integrate() docs D Kühl (#3362)
rc1 Raise FutureWarning when calling deprecated scanpy.pp functions P Angerer (#3380)
rc1 Deprecate the following functions P Angerer (#3407):
- scanpy.read_visium() in favor of squidpy.read.visium()
- scanpy.datasets.visium_sge() in favor of squidpy.datasets.visium()
- scanpy.pl.spatial() in favor of squidpy.pl.spatial_scatter()
rc2 Fix reference in scanpy.pp page D Kazemi (#3418)

Bug fixes#

rc1 Upper-bound sklearn <1.6.0 due to dask/dask-ml#1002 Ilan Gold (#3393)
rc2 Fix rank_genes_groups() compatibility with data >10M cells P Angerer (#3426)
rc2 Fix scanpy.pl.rank_genes_groups()’s ax parameter P Angerer (#3428)

Development Process#

rc2 Fix version number inference in development environments (CI and local) P Angerer (#3441)

Version 1.10#

1.10.4 2024-11-12#

Breaking changes#

Remove Python 3.9 support P Angerer (#3283)

Bug fixes#

Fix scanpy.pl.DotPlot.style(), scanpy.pl.MatrixPlot.style(), and scanpy.pl.StackedViolin.style() resetting all non-specified parameters P Angerer (#3206)
Accept 'group' instead of 'obs' for standard_scale parameter in stacked_violin() P Angerer (#3243)
Use density_norm instead of scale (cont. from #2844) in violin() and stacked_violin() P Angerer (#3244)
Switched all compatibility adapters for positional parameters to FutureWarning P Angerer (#3264)
Catch PerfectSeparationWarning during regress_out() J Wagner (#3275)
Fix scanpy.pp.highly_variable_genes() for batches of size 1 P Angerer (#3286)
Fix scanpy.pl.scatter()’s color parameter to take collections as advertised P Angerer (#3299)
Fix scanpy.pl.highest_expr_genes() when used with a categorical gene symbol column P Angerer (#3302)

1.10.3 2024-09-17#

Bug fixes#

Prevent empty control gene set in score_genes() M Müller (#2875)
Fix subset=True of highly_variable_genes() when flavor is seurat or cell_ranger, and batch_key!=None E Roellin (#3042)
Add compatibility with numpy 2.0 P Angerer (#3065 and #3115)
Fix legend_loc argument in scanpy.pl.embedding() not accepting matplotlib parameters P Angerer (#3163)
Fix dispersion cutoff in highly_variable_genes() in presence of NaNs P Angerer (#3176)
Fix axis labeling for swapped axes in rank_genes_groups_stacked_violin() Ilan Gold (#3196)
Upper bound dask on account of scverse/anndata#1579 Ilan Gold (#3217)
The fa2-modified package replaces forceatlas2 for the latter’s lack of maintenance A Alam (#3220)

1.10.2 2024-06-25#

Development Process#

Add performance benchmarking #2977 R Shrestha, P Angerer

Documentation#

Document several missing parameters in docstring #2888 S Cheney
Fixed incorrect instructions in “testing” dev docs #2994 I Virshup
Update marsilea tutorial to use group_ methods #3001 I Virshup
Fixed citations #3032 P Angerer
Improve dataset documentation #3060 P Angerer

Bug fixes#

Compatibility with matplotlib 3.9 #2999 I Virshup
Add clear errors where backed mode-like matrices (i.e., from sparse_dataset) are not supported #3048 I Gold
Write out full pca results when _choose_representation is called, i.e., neighbors() without pca() #3078 I Gold
Fix deprecated use of .A with sparse matrices #3084 P Angerer
Fix zappy support #3089 P Angerer
Fix dotplot group order with pandas 1.x #3101 P Angerer

Performance#

sparse_mean_variance_axis now uses all cores for the calculations #3015 S Dicks
pp.highly_variable_genes with flavor=seurat_v3 now uses a numba kernel #3017 S Dicks
Speed up scrublet() #3044 S Dicks and #3056 P Angerer
Speed up clipping of array in scale() #3100 P Ashish & S Dicks

1.10.1 2024-04-09#

Documentation#

Added how-to example on plotting with Marsilea #2974 Y Zheng

Bug fixes#

Fix aggregate when aggregating by more than two groups #2965 I Virshup

Performance#

scale() now uses numba kernels for sparse.csr_matrix and sparse.csc_matrix when zero_center==False and mask_obs is provided. This greatly speeds up execution #2942 S Dicks

1.10.0 2024-03-26#

scanpy 1.10 brings a large number of new features, performance improvements, and improved documentation.

Some highlights:

Improved support for out-of-core workflows via dask. See new tutorial: Using dask with Scanpy demonstrating counts-to-clusters for 1.4 million cells in <10 min.
A new basic clustering tutorial demonstrating an updated workflow.
Opt-in increased performance for neighbor search and clustering (how to guide).
Ability to mask observations or variables from a number of methods (see Customizing Scanpy plots for an example with plotting embeddings)
A new function aggregate() for computing aggregations of your data, very useful for pseudo bulking!

Features#

scrublet() and scrublet_simulate_doublets() were moved from scanpy.external.pp to scanpy.pp. The scrublet implementation is now maintained as part of scanpy #2703 P Angerer
scanpy.pp.pca(), scanpy.pp.scale(), scanpy.pl.embedding(), and scanpy.experimental.pp.normalize_pearson_residuals_pca() now support a mask parameter #2272 C Bright, T Marcella, & P Angerer
Enhanced dask support for some internal utilities, paving the way for more extensive dask support #2696 P Angerer
scanpy.pp.highly_variable_genes() supports dask for the default seurat and cell_ranger flavors #2809 P Angerer
New function scanpy.get.aggregate() which allows grouped aggregations over your data. Useful for pseudobulking! #2590 Isaac Virshup Ilan Gold Jon Bloom
scanpy.pp.neighbors() now has a transformer argument allowing the use of different ANN/ KNN libraries #2536 P Angerer
scanpy.experimental.pp.highly_variable_genes() using flavor='pearson_residuals' now uses numba for variance computation and is faster #2612 S Dicks & P Angerer
scanpy.tl.leiden() now offers igraph’s implementation of the leiden algorithm when flavor is set to igraph. leidenalg’s implementation is still default, but discouraged. #2815 I Gold
scanpy.pp.highly_variable_genes() has a new flavor seurat_v3_paper whose implementation is consistent with the description in the paper by Stuart et al 2018. #2792 E Roellin
scanpy.datasets.blobs() now accepts a random_state argument #2683 E Roellin
scanpy.pp.pca() and scanpy.pp.regress_out() now accept a layer argument #2588 S Dicks
scanpy.pp.subsample() with copy=True can now be called in backed mode #2624 E Roellin
scanpy.external.pp.harmony_integrate() now runs with 64 bit floats improving reproducibility #2655 S Dicks
scanpy.tl.rank_genes_groups() no longer warns that its default was changed from t-test_overestim_var to t-test #2798 L Heumos
scanpy.pp.calculate_qc_metrics now allows qc_vars to be passed as a string #2859 N Teyssier
scanpy.tl.leiden() and scanpy.tl.louvain() now store clustering parameters in the key provided by the key_added parameter instead of always writing to (or overwriting) a default key #2864 J Fan
scanpy.pp.scale() now clips np.ndarray also at -max_value for zero-centering #2913 S Dicks
Support sparse chunks in dask scale(), normalize_total() and highly_variable_genes() (seurat and cell-ranger tested) #2856 ilan-gold
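
The change to scale() listed above, clipping at the negative of max_value as well when zero-centering, amounts to symmetric clipping. A minimal sketch in plain Python follows; clip_symmetric() is a hypothetical helper for illustration and does not reproduce scanpy's array code.

```python
def clip_symmetric(values, max_value):
    """Clip each value to the closed interval [-max_value, max_value]."""
    return [max(-max_value, min(max_value, v)) for v in values]


# Zero-centered, scaled values with one extreme entry on each side.
scaled = [-12.5, -3.0, 0.0, 4.2, 15.0]
clipped = clip_symmetric(scaled, max_value=10)
```

Without the lower bound, the value -12.5 would pass through unchanged even though 15.0 is capped at 10.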

Documentation#

Doc style overhaul #2220 A Gayoso
Re-add search-as-you-type, this time via readthedocs-sphinx-search #2805 P Angerer
Fixed a lot of broken usage examples #2605 P Angerer
Improved harmonization of return field of sc.pp and sc.tl functions #2742 E Roellin
Improved docs for percent_top argument of calculate_qc_metrics() #2849 I Virshup
New basic clustering tutorial (Preprocessing and clustering), based on one from scverse-tutorials #2901 I Virshup
Overhauled Tutorials page, and added new How to section to docs #2901 I Virshup
Added a new tutorial on working with dask (Using dask with Scanpy) #2901 I Gold I Virshup

Bug fixes#

Updated read_visium() such that it can read spaceranger 2.0 files L Lehner
Fix normalize_total() for dask #2466 P Angerer
Fix setting scanpy.settings.verbosity in some cases #2605 P Angerer
Fix all remaining pandas warnings #2789 P Angerer
Fix some annoying plotting warnings around violin plots #2844 P Angerer
Scanpy now has a test job which tests against the minimum versions of the dependencies. In the process of implementing this, many bugs associated with using older versions of pandas, anndata, numpy, and matplotlib were fixed. #2816 I Virshup
Fix warnings caused by internal usage of pandas.DataFrame.stack with pandas>=2.1 #2864 I Virshup
scanpy.get.aggregate() now always returns numpy.ndarray #2893 S Dicks
Removes self from array of neighbors for use_approx_neighbors = True in scrublet() #2896 S Dicks
Compatibility with scipy 1.13 #2943 I Virshup
Fix use of dendrogram() on highly correlated low precision data #2928 P Angerer
Fix pytest deprecation warning #2879 P Angerer

Development Process#

Scanpy is now tested against Python 3.12 #2863 ivirshup
Fix testing package build #2468 P Angerer

Deprecations#

Dropped support for Python 3.8. More details here. #2695 P Angerer
Deprecated specifying large numbers of function parameters by position as opposed to by name/keyword in all public APIs. e.g. prefer sc.tl.umap(adata, min_dist=0.1, spread=0.8) over sc.tl.umap(adata, 0.1, 0.8) #2702 P Angerer
Dropped support for umap<0.5 for performance reasons. #2870 P Angerer
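
The deprecation of positional parameters listed above can be pictured as a compatibility adapter that still accepts the old call style but emits a FutureWarning. The sketch below is a minimal illustration under that assumption; the decorator accept_old_positionals() and the stand-in function umap_like() are hypothetical names, not scanpy's actual implementation.

```python
import functools
import inspect
import warnings


def accept_old_positionals(*old_names):
    """Hypothetical adapter: map extra positional args onto keyword-only params."""

    def decorator(func):
        # Number of parameters that may still be passed by position.
        n_positional = sum(
            p.kind is inspect.Parameter.POSITIONAL_OR_KEYWORD
            for p in inspect.signature(func).parameters.values()
        )

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            extra = args[n_positional:]
            if extra:
                names = ", ".join(old_names[: len(extra)])
                warnings.warn(
                    f"Pass {names} by keyword instead of by position.",
                    FutureWarning,
                    stacklevel=2,
                )
                kwargs.update(zip(old_names, extra))
            return func(*args[:n_positional], **kwargs)

        return wrapper

    return decorator


@accept_old_positionals("min_dist", "spread")
def umap_like(data, *, min_dist=0.5, spread=1.0):
    """Stand-in for a public function with keyword-only options."""
    return {"n": len(data), "min_dist": min_dist, "spread": spread}


preferred = umap_like([1, 2, 3], min_dist=0.1, spread=0.8)
```

Calling umap_like([1, 2, 3], 0.1, 0.8) returns the same result but triggers the warning, so existing code keeps working while users are nudged toward keywords.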

Version 1.9#

1.9.8 2024-01-26#

Bug fixes#

Fix handling of numpy array palettes for old numpy versions #2832 P Angerer

1.9.7 2024-01-25#

Bug fixes#

Fix handling of numpy array palettes (e.g. after write-read cycle) #2734 P Angerer
Specify correct version of matplotlib dependency #2733 P Fisher
Fix scanpy.pl.violin() usage of seaborn.catplot #2739 E Roellin
Fix scanpy.pp.highly_variable_genes() to handle the combinations of inplace and subset consistently #2757 E Roellin
Replace usage of various deprecated functionality from anndata and pandas #2678 #2779 P Angerer
Allow using the default n_top_genes when using scanpy.pp.highly_variable_genes() flavor 'seurat_v3' #2782 P Angerer
Fix scanpy.read_10x_mtx()’s gex_only=True mode #2801 P Angerer

1.9.6 2023-10-31#

Bug fixes#

Allow scanpy.pl.scatter() to accept a str palette name #2571 P Angerer
Make scanpy.external.tl.palantir() compatible with palantir >=1.3 #2672 DJ Otto
Fix scanpy.pl.pca() when return_fig=True and annotate_var_explained=True #2682 J Wagner
Temp fix for #2680 by skipping seaborn version 0.13.0 #2661 P Angerer
Fix scanpy.pp.highly_variable_genes() to not modify the used layer when flavor=seurat #2698 E Roellin
Prevent pandas from causing infinite recursion when setting a slice of a categorical column #2719 P Angerer

1.9.5 2023-09-08#

Bug fixes#

Remove use of deprecated dtype argument to AnnData constructor #2658 Isaac Virshup

1.9.4 2023-08-24#

Bug fixes#

Support scikit-learn 1.3 #2515 P Angerer
Deal with None value vanishing from things like .uns['log1p'] #2546 SP Shen
Depend on igraph instead of python-igraph #2566 P Angerer
rank_genes_groups() now handles unsorted groups as intended #2589 S Dicks
rank_genes_groups_df() now works for rank_genes_groups() with method="logreg" #2601 S Dicks
scanpy.tl._utils._choose_representation now works with n_pcs if bigger than settings.N_PCS #2610 S Dicks

1.9.3 2023-03-02#

Bug fixes#

Variety of fixes against pandas 2.0.0rc0 #2434 I Virshup

1.9.2 2023-02-16#

Bug fixes#

highly_variable_genes() layer argument now works in tandem with batches #2302 D Schaumont
highly_variable_genes() with flavor='cell_ranger' now handles the case in #2230 where the number of calculated dispersions is less than n_top_genes #2231 L Zappia
Fix compatibility with matplotlib 3.7 #2414 I Virshup P Fisher
Fix scrublet numpy matrix compatibility issue #2395 A Gayoso

1.9.1 2022-04-05#

Bug fixes#

normalize_total() works when Dask is not installed #2209 R Cannoodt
Fix embedding plots by bumping matplotlib dependency to version 3.4 #2212 I Virshup

1.9.0 2022-04-01#

Tutorials#

New tutorial on the usage of Pearson Residuals: How to preprocess UMI count data with analytic Pearson residuals J Lause, G Palla
Materials and recordings for Scanpy workshops by Maren Büttner

Experimental module#

Added scanpy.experimental module! Currently contains functionality related to pearson residuals in scanpy.experimental.pp #1715 J Lause, G Palla, I Virshup. This includes:
- normalize_pearson_residuals() for Pearson Residuals normalization
- highly_variable_genes() for HVG selection with Pearson Residuals
- normalize_pearson_residuals_pca() for Pearson Residuals normalization and dimensionality reduction with PCA
- recipe_pearson_residuals() for Pearson Residuals normalization, HVG selection and dimensionality reduction with PCA

Features#

filter_rank_genes_groups() now allows filtering by absolute values of log fold change #1649 S Rybakov
_choose_representation now subsets the provided representation to n_pcs, regardless of the name of the provided representation (should affect mostly neighbors()) #2179 I Virshup PG Majev
scanpy.pp.scrublet() (and related functions) can now be used on AnnData objects containing multiple batches #1965 J Manning
Number of variables plotted with pca_loadings() can now be controlled with n_points argument. Additionally, variables are no longer repeated if the anndata has fewer than 30 variables #2075 Yves33
Dask arrays now work with scanpy.pp.normalize_total() #1663 G Buckley, I Virshup
embedding_density() now allows more than 10 groups #1936 A Wolf
Embedding plots can now pass colorbar_loc to specify the location of colorbar legend, or pass None to not show a colorbar #1821 A Schaar I Virshup
Embedding plots now have a dimensions argument, which lets users select which dimensions of their embedding to plot and uses the same broadcasting rules as other arguments #1538 I Virshup
print_versions() now uses session_info #2089 P Angerer I Virshup

Ecosystem#

Multiple packages have been added to our ecosystem page, including:

decoupler for footprint analysis and pathway enrichment #2186 PB Mompel
dandelion for B-cell receptor analysis #1953 Z Tuong
CIARA, a feature selection tool for identifying rare cell types #2175 M Stock

Bug fixes#

Fixed finding variables with use_raw=True and basis=None in scanpy.pl.scatter() #2027 E Rice
Fixed scanpy.pp.scrublet() to address #1957 FlMai and ensure raw counts are used for simulation
Functions in scanpy.datasets no longer throw OldFormatWarnings when using anndata 0.8 #2096 I Virshup
Fixed use of scanpy.pp.neighbors() with method='rapids': RAPIDS cuML no longer returns a squared Euclidean distance matrix, so we should not square-root the kNN distance matrix. #1828 M Zaslavsky
Removed pytables dependency by implementing read_10x_h5 with h5py due to installation errors on Windows #2064
Fixed bug in scanpy.external.pp.hashsolo() where default value was set improperly #2190 B Reiz
Fixed bug in scanpy.pl.embedding() functions where an error could be raised when there were missing values and large numbers of categories #2187 I Virshup

Version 1.8#

1.8.2 2021-11-03#

Documentation#

Update conda installation instructions #1974 L Heumos

Bug fixes#

Fix plotting after scanpy.tl.filter_rank_genes_groups() #1942 S Rybakov
Fix use_raw=None using anndata.AnnData.var_names if anndata.AnnData.raw is present in scanpy.tl.score_genes() #1999 M Klein
Fix compatibility with UMAP 0.5.2 #2028 L Mcinnes
Fixed non-determinism in scanpy.pl.paga() node positions #1922 I Virshup

Ecosystem#

Added PASTE (a tool to align and integrate spatial transcriptomics data) to scanpy ecosystem.

1.8.1 2021-07-07#

Bug fixes#

Fixed reproducibility of scanpy.tl.score_genes(). Calculation and output is now float64 type. #1890 I Kucinski
Workarounds for some changes/ bugs in pandas 1.3 #1918 I Virshup
Fixed bug where sc.pl.paga_compare could mislabel nodes on the paga graph #1898 I Virshup
Fixed handling of use_raw with scanpy.tl.rank_genes_groups() #1934 I Virshup

1.8.0 2021-06-28#

Metrics module#

Added scanpy.metrics module!
- Added scanpy.metrics.gearys_c() for spatial autocorrelation #915 I Virshup
- Added scanpy.metrics.morans_i() for global spatial autocorrelation #1740 I Virshup, G Palla
- Added scanpy.metrics.confusion_matrix() for comparing labellings #915 I Virshup
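
As background for the two autocorrelation metrics above, the textbook definitions of Moran's I and Geary's C can be written out in a few lines of plain Python. This is a sketch of the standard formulas on a dense weight matrix, not the scanpy.metrics implementation; the function names here simply mirror the metric names.

```python
def morans_i(weights, x):
    """Moran's I: (N / W) * sum_ij w_ij (x_i - m)(x_j - m) / sum_i (x_i - m)^2."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    total_weight = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    return (n / total_weight) * cross / sum(d * d for d in dev)


def gearys_c(weights, x):
    """Geary's C: (N - 1) * sum_ij w_ij (x_i - x_j)^2 / (2 W sum_i (x_i - m)^2)."""
    n = len(x)
    mean = sum(x) / n
    total_weight = sum(sum(row) for row in weights)
    sq_diff = sum(
        weights[i][j] * (x[i] - x[j]) ** 2 for i in range(n) for j in range(n)
    )
    return (n - 1) * sq_diff / (2 * total_weight * sum((v - mean) ** 2 for v in x))


# Path graph 0-1-2-3 with unit weights and a smooth gradient of values.
path_graph = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
values = [1.0, 2.0, 3.0, 4.0]
i_stat = morans_i(path_graph, values)
c_stat = gearys_c(path_graph, values)
```

For this smooth gradient, Moran's I is positive and Geary's C is below 1, both of which indicate positive spatial autocorrelation.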

Features#

Added layer and copy kwargs to normalize_total() #1667 I Virshup
Added vcenter and norm arguments to the plotting functions #1551 G Eraslan
Standardized and expanded available arguments to the sc.pl.rank_genes_groups* family of functions. #1529 F Ramirez I Virshup
- See examples sections of rank_genes_groups_dotplot() and rank_genes_groups_matrixplot() for demonstrations.
scanpy.tl.tsne() now supports the metric argument and records the passed parameters #1854 I Virshup
scanpy.pl.scrublet_score_distribution() now uses same API as other scanpy functions for saving/ showing plots #1741 J Manning

Ecosystem#

Added Cubé to ecosystem page #1878 C Lambden
Added triku a feature selection method to the ecosystem page #1722 AM Ascensión
Added dorothea and progeny to the ecosystem page #1767 P Badia-i-Mompel

Documentation#

Added Community page to docs #1856 I Virshup
Added rendered examples to many plotting functions #1664 A Schaar L Zappia bio-la L Hetzel L Dony M Buttner K Hrovatin F Ramirez I Virshup LouisK92 mayarali
Integrated DocSearch, a find-as-you-type documentation index search. #1754 P Angerer
Reorganized reference docs #1753 I Virshup
Clarified docs issues for neighbors(), diffmap(), calculate_qc_metrics() #1680 G Palla
Fixed typos in grouped plot doc-strings #1877 C Rands
Extended examples for differential expression plotting. #1529 F Ramirez
- See rank_genes_groups_dotplot() or rank_genes_groups_matrixplot() for examples.

Bug fixes#

Fix scanpy.pl.paga_path() TypeError with recent versions of anndata #1047 P Angerer
Fix detection of whether IPython is running #1844 I Virshup
Fixed reproducibility of scanpy.tl.diffmap() (added random_state) #1858 I Kucinski
Fixed errors and warnings from embedding plots with small numbers of categories after sns.set_palette was called #1886 I Virshup
Fixed handling of gene_symbols argument in a number of sc.pl.rank_genes_groups* functions #1529 F Ramirez I Virshup
Fixed handling of use_raw for sc.tl.rank_genes_groups when no .raw is present #1895 I Virshup
scanpy.pl.rank_genes_groups_violin() now works for raw=False #1669 M van den Beek
scanpy.pl.dotplot() now uses smallest_dot argument correctly #1771 S Flemming

Development Process#

Switched to flit for building and deploying the package, a simple tool with an easy to understand command line interface and metadata #1527 P Angerer
Use pre-commit for style checks #1684 #1848 L Heumos I Virshup

Deprecations#

Dropped support for Python 3.6. More details here. #1897 I Virshup
Deprecated layers and layers_norm kwargs to normalize_total() #1667 I Virshup
Deprecated MulticoreTSNE backend for scanpy.tl.tsne() #1854 I Virshup

Version 1.7#

1.7.2 2021-04-07#

Bug fixes#

scanpy.logging.print_versions() now works when python<3.8 #1691 I Virshup
scanpy.pp.regress_out() now uses joblib as the parallel backend, and should stop oversubscribing threads #1694 I Virshup
scanpy.pp.highly_variable_genes() with flavor="seurat_v3" now returns correct gene means and -variances when used with batch_key #1732 J Lause
scanpy.pp.highly_variable_genes() now throws a warning instead of an error when non-integer values are passed for method "seurat_v3". The check can be skipped by passing check_values=False. #1679 G Palla

Ecosystem#

Added triku a feature selection method to the ecosystem page #1722 AM Ascensión
Added dorothea and progeny to the ecosystem page #1767 P Badia-i-Mompel

1.7.1 2021-02-24#

Documentation#

More twitter handles for core devs #1676 G Eraslan

Bug fixes#

dendrogram() now uses 1 - correlation as the distance matrix to compute the dendrogram #1614 F Ramirez
Fixed obs_df()/ var_df() erroring when keys not passed #1637 I Virshup
Fixed argument handling for scanpy.pp.scrublet() J Manning
Fixed passing of kwargs to scanpy.pl.violin() when stripplot was also used #1655 M van den Beek
Fixed colorbar creation in scanpy.pl.timeseries_as_heatmap #1654 M van den Beek
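
The dendrogram() fix above relies on turning a correlation into a distance via 1 - correlation. A minimal sketch of that conversion with Pearson correlation in plain Python follows; the helper names are hypothetical and the snippet is not scanpy's code.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def correlation_distance(x, y):
    """Distance in [0, 2]: 0 for perfect correlation, 2 for perfect anti-correlation."""
    return 1.0 - pearson(x, y)


a = [1.0, 2.0, 3.0, 4.0]
b = [2.0, 4.0, 6.0, 8.0]  # perfectly correlated with a
c = [4.0, 3.0, 2.0, 1.0]  # perfectly anti-correlated with a

d_ab = correlation_distance(a, b)
d_ac = correlation_distance(a, c)
```

Using the raw correlation as a distance would place the most similar groups farthest apart, which is why the conversion matters for hierarchical clustering.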

1.7.0 2021-02-03#

Features#

Add new 10x Visium datasets to visium_sge() #1473 G Palla
Enable download of source image for 10x visium datasets in visium_sge() #1506 H Spitzer
Refactor of scanpy.pl.spatial(). Better support for plotting without an image, as well as directly providing images #1512 G Palla
Dict input for scanpy.queries.enrich() #1488 G Eraslan
rank_genes_groups_df() can now return fraction of cells in a group expressing a gene, and allows retrieving values for multiple groups at once #1388 G Eraslan
Color annotations for gene sets in heatmap() are now matched to color for cluster #1511 L Sikkema
PCA plots can now annotate axes with variance explained #1470 bfurtwa
Plots with groupby arguments can now group by values in the index by passing the index’s name (like pd.DataFrame.groupby). #1583 F Ramirez
Added na_color and na_in_legend keyword arguments to embedding() plots. Allows specifying color for missing or filtered values in plots like umap() or spatial() #1356 I Virshup
embedding() plots now support passing dict of {cluster_name: cluster_color, ...} for palette argument #1392 I Virshup

External tools (new)#

Add Scanorama integration to scanpy external API (scanorama_integrate(), Hie et al. [2019]) #1332 B Hie
Scrublet [Wolock et al., 2019] integration: scrublet(), scrublet_simulate_doublets(), and plotting method scrublet_score_distribution() #1476 J Manning
hashsolo() for HTO demultiplexing [Bernstein et al., 2020] #1432 NJ Bernstein
Added scirpy (sc-AIRR analysis) to ecosystem page #1453 G Sturm
Added scvi-tools to ecosystem page #1421 A Gayoso

External tools (changes)#

Updates for palantir() and palantir_results() #1245 A Mousa
Fixes to harmony_timeseries() docs #1248 A Mousa
Support for leiden clustering by scanpy.external.tl.phenograph() #1080 A Mousa
Deprecate scanpy.external.pp.scvi #1554 G Xing
Updated default params of sam() to work with larger data #1540 A Tarashansky

Documentation#

New contribution guide #1544 I Virshup
zsh installation instructions #1444 P Angerer

Performance#

Speed up read_10x_h5() #1402 P Weiler
Speed ups for obs_df() #1499 F Ramirez

Bugfixes#

Consistent fold-change, fractions calculation for filter_rank_genes_groups #1391 S Rybakov
Fixed bug where score_genes would error if one gene was passed #1398 I Virshup
Fixed log1p inplace on integer dense arrays #1400 I Virshup
Fix docstring formatting for rank_genes_groups() #1417 P Weiler
Removed PendingDeprecationWarnings from use of np.matrix #1424 P Weiler
Fixed indexing bug in scanpy.pp.highly_variable_genes #1456 V Bergen
Fix default number of genes for marker_gene_overlap #1464 MD Luecken
Fixed passing groupby and dendrogram_key to dendrogram() #1465 M Varma
Fixed download path of pbmc3k_processed #1472 D Strobl
Better error message when computing DE with a group of size 1 #1490 J Manning
Update cugraph API usage for v0.16 #1494 R Ilango
Fixed marker_gene_overlap default value for top_n_markers #1464 MD Luecken
Pass random_state to RAPIDS UMAP #1474 C Nolet
Fixed anndata version requirement for concat() (re-exported from scanpy as sc.concat) #1491 I Virshup
Fixed the width of the progress bar when downloading data #1507 M Klein
Updated link for moignard15 dataset #1542 I Virshup
Fixed bug where calling set_figure_params could block if IPython was installed, but not used. #1547 I Virshup
violin() no longer fails if .raw not present #1548 I Virshup
spatial() refactoring and better handling of spatial data #1512 G Palla
pca() works with chunked=True again #1592 I Virshup
ingest() now works with umap-learn 0.5.0 #1601 S Rybakov

Version 1.6#

1.6.0 2020-08-15#

This release includes an overhaul of dotplot(), matrixplot(), and stacked_violin() (#1210 F Ramirez), and of the internals of rank_genes_groups() (#1156 S Rybakov).

Overhaul of `dotplot()`, `matrixplot()`, and `stacked_violin()` #1210 F Ramirez#

An overhauled tutorial Core plotting functions.
New plotting classes can be accessed directly (e.g., DotPlot) or using the return_fig param.
It is possible to plot log fold change and p-values in the rank_genes_groups_dotplot() family of functions.
Added ax parameter which allows embedding the plot in other images.
Added option to include a bar plot instead of the dendrogram containing the cell/observation totals per category.
Return a dictionary of axes for further manipulation. This includes the main plot, legend, and dendrogram or totals.
Legends can be removed.
The groupby param can take a list of categories, e.g., groupby=['tissue', 'cell type'].
Added padding parameter to dotplot and stacked_violin. #1270
Added title for colorbar and positioned as in dotplot for matrixplot().
dotplot() changes:
- Improved the colorbar and size legend for dotplots. Now the colorbar and size have titles, which can be modified using the colorbar_title and size_title params. They also align at the bottom of the image and do not shrink if the dotplot image is smaller.
- Allow plotting genes in rows and categories in columns (swap_axes).
- Using DotPlot, the dot_edge_color and line width can be modified, a grid can be added, and other modifications are enabled.
- A new style was added in which the dots are replaced by an empty circle and the square behind the circle is colored (like in matrixplots).
stacked_violin() changes:
- Violins can be colored based on average gene expression, as in dotplots.
- The linewidth of the violin plots is thinner.
- Removed the ticks for the y-axis as they tend to overlap with each other. Using the style method they can be displayed if needed.

Additions#

concat() is now exported from scanpy, see Concatenation for more info. #1338 I Virshup
Added highly variable gene selection strategy from Seurat v3 #1204 A Gayoso
Added CellRank to scanpy ecosystem #1304 giovp
Added backup_url param to read_10x_h5() #1296 A Gayoso
Allow prefix for read_10x_mtx() #1250 G Sturm
Optional tie correction for the 'wilcoxon' method in rank_genes_groups() #1330 S Rybakov
Use sinfo for print_versions() and add print_header() to do what it previously did. #1338 I Virshup #1373

Bug fixes#

Avoid warning in rank_genes_groups() if ‘t-test’ is passed #1303 A Wolf
Restrict sphinx version to <3.1, >3.0 #1297 I Virshup
Clean up _ranks and fix dendrogram for scipy 1.5 #1290 S Rybakov
Use .raw to translate gene symbols if applicable #1278 E Rice
Fix diffmap (#1262) G Eraslan
Fix neighbors in spring_project #1260 S Rybakov
Fix default size of dot in spatial plots #1255 #1253 giovp
Bumped version requirement of scipy to scipy>1.4 to support rmatmat argument of LinearOperator #1246 I Virshup
Fix asymmetry of scores for the 'wilcoxon' method in rank_genes_groups() #754 S Rybakov
Avoid trimming of gene names in rank_genes_groups() #753 S Rybakov

Version 1.5#

1.5.1 2020-05-21#

Bug fixes#

Fixed a bug in pca(), where random_state did not have an effect for sparse input #1240 I Virshup
Fixed docstring in pca() which included an unused argument #1240 I Virshup

1.5.0 2020-05-15#

The 1.5.0 release adds a lot of new functionality, much of which takes advantage of anndata updates 0.7.0 - 0.7.2. Highlights of this release include support for spatial data, dedicated handling of graphs in AnnData, sparse PCA, an interface with scvi, and others.

Spatial data support#

Tutorials for basic analysis and integration with single cell data G Palla
read_visium() read 10x Visium data #1034 G Palla, P Angerer, I Virshup
visium_sge() load Visium data directly from 10x Genomics #1013 M Mirkazemi, G Palla, P Angerer
spatial() plot spatial data #1012 G Palla, P Angerer

New functionality#

Many functions, like neighbors() and umap(), now store cell-by-cell graphs in obsp #1118 S Rybakov
scale() and log1p() can be used on any element in layers or obsm #1173 I Virshup

External tools#

scanpy.external.pp.scvi for preprocessing with scVI #1085 G Xing
Guide for using Scanpy in R #1186 L Zappia

Performance#

pca() now uses efficient implicit centering for sparse matrices. This can lead to significantly improved performance for large datasets #1066 A Tarashansky
score_genes() now has an efficient implementation for sparse matrices with missing values #1196 redst4r.
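
The note on implicit centering does not spell out the mechanism. One common way to center implicitly, sketched here under that assumption, is to never build the dense centered matrix and instead subtract the mean term inside each matrix-vector product: (X - 1·meanᵀ)·v = X·v - (mean·v)·1. The dict-of-rows layout and the function names below are hypothetical stand-ins for a real sparse matrix type.

```python
def column_means(rows, n_cols):
    """Column means of a sparse matrix stored as a list of {col: value} dicts."""
    sums = [0.0] * n_cols
    for row in rows:
        for j, value in row.items():
            sums[j] += value
    return [s / len(rows) for s in sums]


def centered_matvec(rows, col_means, v):
    """Compute (X - 1 * mean^T) @ v while touching only the stored entries of X."""
    # The mean contribution is the same scalar for every row, so compute it once.
    shift = sum(m * x for m, x in zip(col_means, v))
    return [sum(value * v[j] for j, value in row.items()) - shift for row in rows]


# Three observations, three variables; missing keys are implicit zeros.
rows = [{0: 2.0}, {1: 4.0, 2: 1.0}, {}]
means = column_means(rows, 3)
product = centered_matvec(rows, means, [1.0, 0.5, -2.0])
```

An iterative solver only needs products like this one, which is why the sparse input never has to be densified.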

Warning

The new pca() implementation can result in slightly different results for sparse matrices. See the PR (#1066) and documentation for more info.

Code design#

stacked_violin() can now be used as a subplot #1084 P Angerer
score_genes() has improved logging #1119 G Eraslan
scale() now saves mean and standard deviation in the var #1173 A Wolf
harmony_timeseries() #1091 A Mousa

Bug fixes#

combat() now works when obs_names aren’t unique. #1215 I Virshup
scale() can now be used on dense arrays without centering #1160 simonwm
regress_out() now works when some features are constant #1194 simonwm
normalize_total() no longer errors if the passed object is a view #1200 I Virshup
neighbors() no longer ignores the n_pcs param in some cases #1124 V Bergen
Fixed ebi_expression_atlas(), which contained some out-of-date URLs #1102 I Virshup
Fixed ingest() for UMAP 0.4 #1165 S Rybakov
Fixed louvain() for Louvain 0.6 #1197 I Virshup
Fixed highly_variable_genes(), which could return incorrect results when the batch_key argument was used #1180 G Eraslan
Fixed ingest(), where an inconsistent number of neighbors was used #1111 S Rybakov

Version 1.4#

1.4.6 2020-03-17#

Functionality in `external`#

sam() self-assembling manifolds [Tarashansky et al., 2019] #903 A Tarashansky
harmony_timeseries() for trajectory inference on discrete time points #994 A Mousa
wishbone() for trajectory inference (bifurcations) #1063 A Mousa

Code design#

violin now reads .uns['colors_...'] #1029 michalk8

Bug fixes#

adapt ingest() for UMAP 0.4 #1038 #1106 S Rybakov
compat with matplotlib 3.1 and 3.2 #1090 I Virshup, P Angerer
fix PAGA for new igraph #1037 P Angerer
fix rapids compat of louvain #1079 LouisFaure

1.4.5 2019-12-30#

Please install scanpy==1.4.5.post3 instead of scanpy==1.4.5.

New functionality#

ingest() maps labels and embeddings of reference data to new data, see the tutorial “Integrating data using ingest and BBKNN” #651 S Rybakov, A Wolf
queries received many updates, including enrichment through gprofiler and more advanced biomart queries #467 I Virshup
set_figure_params() allows setting figsize and accepts facecolor='white', useful for working in dark mode A Wolf

Code design#

downsample_counts now always preserves the dtype of its input, instead of converting floats to ints #865 I Virshup
allow specifying a base for log1p() #931 G Eraslan
run neighbors on a GPU using rapids #830 T White
param docs from typed params P Angerer
scanpy.pl.embedding_density() now only takes one positional argument; similar for scanpy.tl.embedding_density(), which gains a param groupby #965 A Wolf
webpage overhaul, ecosystem page, release notes, tutorials overhaul #960 #966 A Wolf
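
For the base option of log1p() mentioned above, the underlying arithmetic is a change of base applied to the natural-log result. A minimal sketch on plain floats, assuming that behavior (the helper name log1p_with_base is hypothetical and not a scanpy function):

```python
from math import log, log1p


def log1p_with_base(x, base=None):
    """Return log(1 + x) in the given base; natural log when base is None."""
    value = log1p(x)
    return value if base is None else value / log(base)


# A count of 7 becomes log2(8) when base=2.
log2_value = log1p_with_base(7, base=2)
natural_value = log1p_with_base(7)
```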

Warning

changed default solver in pca() from auto to arpack
changed default use_raw in score_genes() from False to None

1.4.4 2019-07-20#

New functionality#

scanpy.get adds helper functions for extracting data in convenient formats #619 I Virshup

Bug fixes#

Stopped deprecation warnings from AnnData 0.6.22 I Virshup

Code design#

normalize_total() gains param exclude_highly_expressed, and fraction is renamed to max_fraction with better docs A Wolf

1.4.3 2019-05-14#

Bug fixes#

neighbors() correctly infers n_neighbors again from params, which was temporarily broken in v1.4.2 I Virshup

Code design#

calculate_qc_metrics() is single threaded by default for datasets under 300,000 cells – allowing cached compilation #615 I Virshup

1.4.2 2019-05-06#

New functionality#

combat() supports additional covariates which may include adjustment variables or biological condition #618 G Eraslan
highly_variable_genes() has a batch_key option which performs HVG selection in each batch separately to avoid selecting genes that vary strongly across batches #622 G Eraslan
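
The idea behind per-batch HVG selection can be sketched as follows: pick the top genes within each batch separately, then count in how many batches each gene was selected. This is a simplified illustration only; the input layout, the scores, and the function name are hypothetical, and scanpy's actual ranking and merging rules are not reproduced here.

```python
from collections import Counter


def count_batches_selected(variability_by_batch, n_top):
    """Count, per gene, the number of batches in which it ranks among the top n_top.

    variability_by_batch: {batch_name: {gene_name: variability_score}}
    """
    votes = Counter()
    for scores in variability_by_batch.values():
        top_genes = sorted(scores, key=scores.get, reverse=True)[:n_top]
        votes.update(top_genes)
    return dict(votes)


scores = {
    "batch1": {"g1": 3.0, "g2": 2.0, "g3": 1.0},
    "batch2": {"g1": 1.0, "g2": 3.0, "g3": 2.0},
}
n_batches_selected = count_batches_selected(scores, n_top=2)
```

Genes selected in many batches vary within batches rather than between them, which is the property the batch_key option is meant to favor.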

Bug fixes#

rank_genes_groups() t-test implementation doesn’t return NaN when variance is 0, also changed to scipy’s implementation #621 I Virshup
umap() with init_pos='paga' detects correct dtype A Wolf
louvain() and leiden() auto-generate key_added=louvain_R upon passing restrict_to, which was temporarily changed in 1.4.1 A Wolf

Code design#

neighbors() and umap() got rid of UMAP legacy code and introduced UMAP as a dependency #576 S Rybakov

1.4.1 2019-04-26#

New functionality#

Scanpy has a command line interface again. Invoking it with scanpy somecommand [args] calls scanpy-somecommand [args], except for builtin commands (currently scanpy settings) #604 P Angerer
ebi_expression_atlas() allows convenient download of EBI expression atlas I Virshup
marker_gene_overlap() computes overlaps of marker genes M Luecken
filter_rank_genes_groups() filters out genes based on fold change and fraction of cells expressing genes F Ramirez
normalize_total() replaces normalize_per_cell(), is more efficient and provides a parameter to only normalize using a fraction of expressed genes S Rybakov
downsample_counts() has been sped up, changed default value of replace parameter to False #474 I Virshup
embedding_density() computes densities on embeddings #543 M Luecken
palantir() interfaces Palantir [Setty et al., 2019] #493 A Mousa
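
To make the replace=False default of downsample_counts() concrete: sampling without replacement treats every observed count as one molecule that can be kept at most once, so no gene can end up with more counts than it started with. The sketch below shows this for a single cell in pure Python; it is not scanpy's implementation, and the function name is hypothetical.

```python
import random


def downsample_without_replacement(counts, target_total, seed=0):
    """Subsample integer counts for one cell down to target_total."""
    if target_total >= sum(counts):
        return list(counts)  # nothing to remove
    # One pool entry per observed molecule, labelled by its gene index.
    pool = [gene for gene, c in enumerate(counts) for _ in range(c)]
    kept = random.Random(seed).sample(pool, target_total)
    result = [0] * len(counts)
    for gene in kept:
        result[gene] += 1
    return result


original = [5, 0, 3, 2]
downsampled = downsample_without_replacement(original, target_total=4)
```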

Code design#

.layers support of scatter plots F Ramirez
fix double-logarithmization in compute of log fold change in rank_genes_groups() A Muñoz-Rojas
fix return sections of docs P Angerer

Version 1.3#

1.3.8 2019-02-05#

various documentation and dev process improvements
Added combat() function for batch effect correction [Johnson et al., 2006, Leek et al., 2017, Pedersen, 2012] #398 M Lange

1.3.7 2019-01-02#

API changed from import scanpy.api as sc to import scanpy as sc.
phenograph() wraps the graph clustering package Phenograph [Levine et al., 2015] thanks to A Mousa

1.3.6 2018-12-11#

Major updates#

a new plotting gallery for visualizing-marker-genes F Ramirez
tutorials are integrated on ReadTheDocs, pbmc3k and paga-paul15 A Wolf

Interactive exploration of analysis results through manifold viewers#

CZI’s cellxgene directly reads .h5ad files the cellxgene developers
the UCSC Single Cell Browser requires exporting via cellbrowser() M Haeussler

Code design#

highly_variable_genes() supersedes filter_genes_dispersion(), it gives the same results but, by default, expects logarithmized data and doesn’t subset A Wolf

1.3.5 2018-12-09#

uncountable figure improvements #369 F Ramirez

1.3.4 2018-11-24#

leiden() wraps the recent graph clustering package by Traag et al. [2019] K Polanski
bbknn() wraps the recent batch correction package [Polański et al., 2019] K Polanski
calculate_qc_metrics() calculates a number of quality control metrics, similar to calculateQCMetrics from Scater [McCarthy et al., 2017] I Virshup
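
The kind of per-cell quality control metrics meant here can be sketched for a single row of a count matrix: number of detected genes, total counts, and the percentage of counts falling in a flagged gene set such as mitochondrial genes. The function and key names below are hypothetical, and this is an illustration rather than the output format of calculate_qc_metrics().

```python
def qc_metrics_for_cell(counts, flagged_genes):
    """Summarize one cell.

    counts: non-negative counts, one per gene.
    flagged_genes: set of gene indices of interest (e.g. mitochondrial genes).
    """
    total = sum(counts)
    flagged_total = sum(counts[j] for j in flagged_genes)
    return {
        "n_genes_detected": sum(1 for c in counts if c > 0),
        "total_counts": total,
        # Guard against empty cells to avoid dividing by zero.
        "pct_counts_flagged": 100.0 * flagged_total / total if total else 0.0,
    }


metrics = qc_metrics_for_cell([0, 3, 1, 6], flagged_genes={3})
```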

1.3.3 2018-11-05#

Major updates#

a fully distributed preprocessing backend T White and the Laserson Lab

Code design#

read_10x_h5() and read_10x_mtx() read Cell Ranger 3.0 outputs #334 Q Gong

Note

Also see changes in anndata 0.6.

changed default compression to None in write_h5ad() to speed up read and write, disk space use is usually less critical
performance gains in write_h5ad() due to better handling of strings and categories S Rybakov

1.3.1 2018-09-03#

RNA velocity in single cells [La Manno et al., 2018]#

Scanpy and AnnData support loom’s layers so that computations for single-cell RNA velocity [La Manno et al., 2018] become feasible S Rybakov and V Bergen
scvelo harmonizes with Scanpy and is able to process loom files with splicing information produced by Velocyto [La Manno et al., 2018], it runs a lot faster than the count matrix analysis of Velocyto and provides several conceptual developments

Plotting (Generic)#

dotplot() for visualizing genes across conditions and clusters #199 F Ramirez
heatmap() for pretty heatmaps #175 F Ramirez
violin() produces very compact overview figures with many panels #175 F Ramirez

There now is a section on imputation in external:#

magic() for imputation using data diffusion [van Dijk et al., 2018] #187 S Gigante
dca() for imputation and latent space construction using an autoencoder [Eraslan et al., 2019] #186 G Eraslan

Version 1.2#

1.2.1 2018-06-08#

Plotting of Generic marker genes and quality control.#

highest_expr_genes() for quality control; plot genes with highest mean fraction of cells, similar to plotQC of Scater [McCarthy et al., 2017] #169 F Ramirez
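
The ranking behind such a plot can be sketched as: for each cell, compute every gene's share of that cell's total counts, average the shares over cells, and keep the genes with the highest means. The pure-Python sketch below assumes exactly that procedure; the function name is hypothetical and no plotting is done.

```python
def top_genes_by_mean_share(matrix, gene_names, n_top=30):
    """Rank genes by their mean fraction of each cell's total counts."""
    shares = [0.0] * len(gene_names)
    n_cells = 0
    for cell in matrix:
        total = sum(cell)
        if total == 0:
            continue  # skip empty cells to avoid dividing by zero
        n_cells += 1
        for j, count in enumerate(cell):
            shares[j] += count / total
    if n_cells == 0:
        return []
    ranked = sorted(
        ((name, share / n_cells) for name, share in zip(gene_names, shares)),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:n_top]


# Two cells, three genes; gene "A" dominates both cells.
top = top_genes_by_mean_share([[8, 1, 1], [6, 2, 2]], ["A", "B", "C"], n_top=1)
```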

1.2.0 2018-06-08#

paga() improved, see PAGA; the default model changed, restore the previous default model by passing model='v1.0'

Version 1.1#

1.1.0 2018-06-01#

set_figure_params() by default passes vector_friendly=True and allows you to produce reasonably sized pdfs by rasterizing large scatter plots A Wolf
draw_graph() defaults to the ForceAtlas2 layout [Chippada, 2018, Jacomy et al., 2014], which is often more visually appealing and whose computation is much faster S Wollock
scatter() also plots along variables axis MD Luecken
pca() and log1p() support chunk processing S Rybakov
regress_out() is back to multiprocessing F Ramirez
read() reads compressed text files G Eraslan
mitochondrial_genes() for querying mito genes FG Brundu
mnn_correct() for batch correction [Haghverdi et al., 2018, Kang, 2018]
phate() for low-dimensional embedding [Moon et al., 2019] S Gigante
sandbag(), cyclone() for scoring genes [Fechtner, 2018, Scialdone et al., 2015]

Version 1.0#

1.0.0 2018-03-30#

Major updates#

Scanpy is much faster and more memory efficient: preprocess, cluster and visualize 1.3M cells in 6h, 130K cells in 14min, and 68K cells in 3min A Wolf
the API gained a preprocessing function neighbors() and a class Neighbors() to which all basic graph computations are delegated A Wolf

Warning

Upgrading to 1.0 isn’t fully backwards compatible because of the following changes:

the graph-based tools louvain(), dpt(), draw_graph(), umap(), diffmap(), and paga() require prior computation of the graph: sc.pp.neighbors(adata, n_neighbors=5); sc.tl.louvain(adata) instead of the previous sc.tl.louvain(adata, n_neighbors=5)
install numba via conda install numba, which replaces cython
the default connectivity measure changed (dpt will look different using default settings). Setting method='gauss' in sc.pp.neighbors uses Gaussian kernel connectivities and reproduces the previous behavior; see, for instance, the paul15 example.
namings of returned annotation have changed for less bloated AnnData objects, which means that some of the unstructured annotation of old AnnData files is not recognized anymore
replace occurrences of group_by with groupby (consistency with pandas)
it is worth checking out the notebook examples to see changes, e.g. the seurat example.
upgrading scikit-learn from 0.18 to 0.19 changed the implementation of PCA, some results might therefore look slightly different

Further updates#

UMAP [McInnes et al., 2018] can serve as a first visualization of the data just as tSNE, in contrast to tSNE, UMAP directly embeds the single-cell graph and is faster; UMAP is also used for measuring connectivities and computing neighbors, see neighbors() A Wolf
graph abstraction: AGA is renamed to PAGA: paga(); now, it only measures connectivities between partitions of the single-cell graph, pseudotime and clustering need to be computed separately via louvain() and dpt(), the connectivity measure has been improved A Wolf
logistic regression for finding marker genes rank_genes_groups() with parameter method='logreg' A Wolf
louvain() provides a better implementation for reclustering via restrict_to A Wolf
scanpy no longer modifies rcParams upon import, call scanpy.set_figure_params() to set the ‘scanpy style’ A Wolf
default cache directory is ./cache/, set settings.cachedir to change this; nested directories in this are avoided A Wolf
show edges in scatter plots based on graph visualization draw_graph() and umap() by passing edges=True A Wolf
downsample_counts() for downsampling counts MD Luecken
default 'louvain_groups' are called 'louvain' A Wolf
'X_diffmap' contains the zero component, plotting remains unchanged A Wolf

Version 0.4#

0.4.4 2018-02-26#

embed cells using umap() [McInnes et al., 2018] #92 G Eraslan
score sets of genes, e.g. for cell cycle, using score_genes() [Satija et al., 2015]: notebook

0.4.3 2018-02-09#

clustermap(): heatmap from hierarchical clustering, based on seaborn.clustermap() [Waskom et al., 2016] A Wolf
only return matplotlib.axes.Axes in plotting functions of sc.pl when show=False, otherwise None A Wolf

0.4.2 2018-01-07#

amendments in PAGA and its plotting functions A Wolf

0.4.0 2017-12-23#

export to SPRING [Weinreb et al., 2017] for interactive visualization of data: spring tutorial S Wollock

Version 0.3#

0.3.2 2017-11-29#

finding marker genes via rank_genes_groups_violin() improved, see #51 F Ramirez

0.3.0 2017-11-16#

AnnData gains method concatenate() A Wolf
AnnData is available as the separate anndata package P Angerer, A Wolf
results of PAGA simplified A Wolf

Version 0.2#

0.2.9 2017-10-25#

Initial release of the new trajectory inference method PAGA#

paga() computes an abstracted, coarse-grained (PAGA) graph of the neighborhood graph A Wolf
paga_compare() plots this graph next to an embedding A Wolf
paga_path() plots a heatmap through a node sequence in the PAGA graph A Wolf

0.2.1 2017-07-24#

Scanpy includes preprocessing, visualization, clustering, pseudotime and trajectory inference, differential expression testing and simulation of gene regulatory networks. The implementation efficiently deals with datasets of more than one million cells. A Wolf, P Angerer

Version 0.1#

0.1.0 2017-05-17#

Scanpy computationally outperforms and allows reproducing both the Cell Ranger R kit’s and most of Seurat’s clustering workflows. A Wolf, P Angerer

Deprecate …	in favor of …
`scanpy.read_visium()`	`squidpy.read.visium()`
`scanpy.datasets.visium_sge()`	`squidpy.datasets.visium()`
`scanpy.pl.spatial()`	`squidpy.pl.spatial_scatter()`
