Release notes#
Version 1.11#
1.11.0.dev11+g0cfd0224 2024-12-16#
Bug fixes#
Raise FutureWarning when calling deprecated scanpy.pp functions P Angerer (pr3380)
Upper-bound sklearn<1.6.0 due to issue dask/dask-ml#1002 Ilan Gold (pr3393)
Documentation#
Improve harmony_integrate() docs D Kühl (pr3362)
Performance#
Speed up regress_out() P Ashish, P Angerer & S Dicks (pr3284)
Version 1.10#
1.10.4 2024-11-12#
Breaking changes#
Remove Python 3.9 support P Angerer (pr3283)
Bug fixes#
Fix scanpy.pl.DotPlot.style(), scanpy.pl.MatrixPlot.style(), and scanpy.pl.StackedViolin.style() resetting all non-specified parameters P Angerer (pr3206)
Accept 'group' instead of 'obs' for standard_scale parameter in stacked_violin() P Angerer (pr3243)
Use density_norm instead of scale (cont. from pr2844) in violin() and stacked_violin() P Angerer (pr3244)
Switched all compatibility adapters for positional parameters to FutureWarning P Angerer (pr3264)
Catch PerfectSeparationWarning during regress_out() J Wagner (pr3275)
Fix scanpy.pp.highly_variable_genes() for batches of size 1 P Angerer (pr3286)
Fix scanpy.pl.scatter()’s color parameter to take collections as advertised P Angerer (pr3299)
Fix scanpy.pl.highest_expr_genes() when used with a categorical gene symbol column P Angerer (pr3302)
1.10.3 2024-09-17#
Bug fixes#
Prevent empty control gene set in score_genes() M Müller (pr2875)
Fix subset=True of highly_variable_genes() when flavor is seurat or cell_ranger, and batch_key!=None E Roellin (pr3042)
Add compatibility with numpy 2.0 P Angerer (pr3065 and pr3115)
Fix legend_loc argument in scanpy.pl.embedding() not accepting matplotlib parameters P Angerer (pr3163)
Fix dispersion cutoff in highly_variable_genes() in presence of NaNs P Angerer (pr3176)
Fix axis labeling for swapped axes in rank_genes_groups_stacked_violin() Ilan Gold (pr3196)
Upper bound dask on account of issue scverse/anndata#1579 Ilan Gold (pr3217)
The fa2-modified package replaces forceatlas2 for the latter’s lack of maintenance A Alam (pr3220)
1.10.2 2024-06-25#
Development Process#
Add performance benchmarking pr2977 R Shrestha, P Angerer
Documentation#
Bug fixes#
Compatibility with matplotlib 3.9 pr2999 I Virshup
Add clear errors where backed mode-like matrices (i.e., from sparse_dataset) are not supported pr3048 I Gold
Write out full pca results when _choose_representation is called, i.e., neighbors() without pca() pr3079 I Gold
Fix deprecated use of .A with sparse matrices pr3084 P Angerer
Fix zappy support pr3089 P Angerer
Performance#
1.10.1 2024-04-09#
Documentation#
Added how-to example on plotting with Marsilea pr2974 Y Zheng
Bug fixes#
Fix aggregate when aggregating by more than two groups pr2965 I Virshup
Performance#
1.10.0 2024-03-26#
scanpy 1.10 brings a large number of new features, performance improvements, and improved documentation.
Some highlights:
Improved support for out-of-core workflows via dask. See new tutorial: Using dask with Scanpy demonstrating counts-to-clusters for 1.4 million cells in <10 min.
A new basic clustering tutorial demonstrating an updated workflow.
Opt-in increased performance for neighbor search and clustering (how to guide).
Ability to mask observations or variables from a number of methods (see Customizing Scanpy plots for an example with plotting embeddings)
A new function aggregate() for computing aggregations of your data, very useful for pseudo bulking!
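As a rough illustration of what "pseudo bulking" means in the last highlight, the sketch below sums per-cell count vectors within each group label. It is plain Python and does not call aggregate(); the function name pseudobulk_sum and the toy data are assumptions made for this example only.

```python
def pseudobulk_sum(counts, groups):
    """Sum per-cell count vectors within each group label.

    Hypothetical helper for illustration; it is not scanpy's aggregate().
    `counts` is a list of equal-length rows (one row per cell) and
    `groups` holds one label per cell.
    """
    if len(counts) != len(groups):
        raise ValueError("counts and groups must have the same length")
    totals = {}
    for row, label in zip(counts, groups):
        # Start each group's accumulator at zero for every gene.
        acc = totals.setdefault(label, [0] * len(row))
        for gene_index, value in enumerate(row):
            acc[gene_index] += value
    return totals


# Three cells, two genes, two groups.
toy_counts = [[1, 0], [2, 3], [0, 5]]
toy_groups = ["a", "b", "a"]
bulk = pseudobulk_sum(toy_counts, toy_groups)
```

Each key of the returned dictionary corresponds to one pseudobulk sample, with one summed value per gene.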
Features#
scrublet() and scrublet_simulate_doublets() were moved from scanpy.external.pp to scanpy.pp. The scrublet implementation is now maintained as part of scanpy pr2703 P Angerer
scanpy.pp.pca(), scanpy.pp.scale(), scanpy.pl.embedding(), and scanpy.experimental.pp.normalize_pearson_residuals_pca() now support a mask parameter pr2272 C Bright, T Marcella, & P Angerer
Enhanced dask support for some internal utilities, paving the way for more extensive dask support pr2696 P Angerer
scanpy.pp.highly_variable_genes() supports dask for the default seurat and cell_ranger flavors pr2809 P Angerer
New function scanpy.get.aggregate() which allows grouped aggregations over your data. Useful for pseudobulking! pr2590 Isaac Virshup, Ilan Gold, Jon Bloom
scanpy.pp.neighbors() now has a transformer argument allowing the use of different ANN/KNN libraries pr2536 P Angerer
scanpy.experimental.pp.highly_variable_genes() using flavor='pearson_residuals' now uses numba for variance computation and is faster pr2612 S Dicks & P Angerer
scanpy.tl.leiden() now offers igraph’s implementation of the leiden algorithm via flavor when set to igraph. leidenalg’s implementation is still default, but discouraged. pr2815 I Gold
scanpy.pp.highly_variable_genes() has new flavor seurat_v3_paper that is in its implementation consistent with the paper description in Stuart et al 2018. pr2792 E Roellin
scanpy.datasets.blobs() now accepts a random_state argument pr2683 E Roellin
scanpy.pp.pca() and scanpy.pp.regress_out() now accept a layer argument pr2588 S Dicks
scanpy.pp.subsample() with copy=True can now be called in backed mode pr2624 E Roellin
scanpy.external.pp.harmony_integrate() now runs with 64 bit floats improving reproducibility pr2655 S Dicks
scanpy.tl.rank_genes_groups() no longer warns that its default was changed from t-test_overestim_var to t-test pr2798 L Heumos
scanpy.pp.calculate_qc_metrics now allows qc_vars to be passed as a string pr2859 N Teyssier
scanpy.tl.leiden() and scanpy.tl.louvain() now store clustering parameters in the key provided by the key_added parameter instead of always writing to (or overwriting) a default key pr2864 J Fan
scanpy.pp.scale() now clips np.ndarray also at -max_value for zero-centering pr2913 S Dicks
Support sparse chunks in dask scale(), normalize_total() and highly_variable_genes() (seurat and cell-ranger tested) pr2856 ilan-gold
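The scale() entry above concerns clipping on both sides once values are zero-centered. A minimal sketch of that idea for a single variable follows, in plain Python. The function name scale_and_clip, the use of the population standard deviation, and the fallback for constant columns are assumptions made for this example and are not taken from scanpy's implementation.

```python
import math


def scale_and_clip(column, max_value=None):
    """Zero-center and unit-scale one variable, then clip symmetrically.

    Hypothetical helper for illustration only.
    """
    n = len(column)
    mean = sum(column) / n
    # Assumption: population variance (divide by n).
    variance = sum((x - mean) ** 2 for x in column) / n
    std = math.sqrt(variance) or 1.0  # avoid dividing by zero for constant columns
    scaled = [(x - mean) / std for x in column]
    if max_value is not None:
        # After centering, values can be negative, so clip at both
        # -max_value and +max_value.
        scaled = [min(max(v, -max_value), max_value) for v in scaled]
    return scaled


high_outlier = scale_and_clip([0, 0, 0, 0, 10], max_value=1)
low_outlier = scale_and_clip([10, 10, 10, 10, 0], max_value=1)
```

Clipping only at the upper bound would leave the low outlier in the second call untouched, which is the asymmetry the entry describes as fixed.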
Documentation#
Doc style overhaul pr2220 A Gayoso
Re-add search-as-you-type, this time via readthedocs-sphinx-search pr2805 P Angerer
Fixed a lot of broken usage examples pr2605 P Angerer
Improved harmonization of return field of sc.pp and sc.tl functions pr2742 E Roellin
Improved docs for percent_top argument of calculate_qc_metrics() pr2849 I Virshup
New basic clustering tutorial (Preprocessing and clustering), based on one from scverse-tutorials pr2901 I Virshup
Overhauled Tutorials page, and added new How to section to docs pr2901 I Virshup
Added a new tutorial on working with dask (Using dask with Scanpy) pr2901 I Gold I Virshup
Bug fixes#
Updated read_visium() such that it can read spaceranger 2.0 files L Lehner
Fix normalize_total() for dask pr2466 P Angerer
Fix setting sc.settings.verbosity in some cases pr2605 P Angerer
Fix all remaining pandas warnings pr2789 P Angerer
Fix some annoying plotting warnings around violin plots pr2844 P Angerer
Scanpy now has a test job which tests against the minimum versions of the dependencies. In the process of implementing this, many bugs associated with using older versions of pandas, anndata, numpy, and matplotlib were fixed. pr2816 I Virshup
Fix warnings caused by internal usage of pandas.DataFrame.stack with pandas>=2.1 pr2864 I Virshup
scanpy.get.aggregate() now always returns numpy.ndarray pr2893 S Dicks
Removes self from array of neighbors for use_approx_neighbors = True in scrublet() pr2896 S Dicks
Compatibility with scipy 1.13 pr2943 I Virshup
Fix use of dendrogram() on highly correlated low precision data pr2928 P Angerer
Fix pytest deprecation warning pr2879 P Angerer
Development Process#
Deprecations#
Dropped support for Python 3.8. More details here. pr2695 P Angerer
Deprecated specifying large numbers of function parameters by position as opposed to by name/keyword in all public APIs. e.g. prefer sc.tl.umap(adata, min_dist=0.1, spread=0.8) over sc.tl.umap(adata, 0.1, 0.8) pr2702 P Angerer
Dropped support for umap<0.5 for performance reasons. pr2870 P Angerer
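The positional-parameter deprecation above can be pictured with a small stand-alone sketch: a keyword-only signature keeps accepting legacy positional calls but emits a FutureWarning. The decorator warn_positional and the toy function umap_like are hypothetical names chosen for this example; this is not scanpy's actual compatibility adapter.

```python
import functools
import inspect
import warnings


def warn_positional(func):
    """Map legacy positional arguments onto keyword-only parameters.

    Hypothetical helper for illustration; not scanpy's implementation.
    """
    params = list(inspect.signature(func).parameters.values())
    n_positional = sum(
        p.kind in (p.POSITIONAL_ONLY, p.POSITIONAL_OR_KEYWORD) for p in params
    )
    kw_only = [p.name for p in params if p.kind is p.KEYWORD_ONLY]

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        extra = args[n_positional:]
        if extra:
            if len(extra) > len(kw_only):
                raise TypeError(f"{func.__name__}() got too many positional arguments")
            names = kw_only[: len(extra)]
            duplicated = set(names) & kwargs.keys()
            if duplicated:
                raise TypeError(
                    f"{func.__name__}() got multiple values for {sorted(duplicated)}"
                )
            warnings.warn(
                f"Pass {', '.join(names)} to {func.__name__}() by keyword; "
                "positional use is deprecated.",
                FutureWarning,
                stacklevel=2,
            )
            # Re-route the extra positional values to their keyword names.
            kwargs.update(zip(names, extra))
            args = args[:n_positional]
        return func(*args, **kwargs)

    return wrapper


@warn_positional
def umap_like(data, *, min_dist=0.5, spread=1.0):
    """Toy stand-in for a function with keyword-only tuning parameters."""
    return {"n": len(data), "min_dist": min_dist, "spread": spread}
```

With this wrapper, umap_like(data, 0.1, 0.8) still works but warns, while umap_like(data, min_dist=0.1, spread=0.8) runs silently.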
Version 1.9#
1.9.8 2024-01-26#
Bug fixes#
Fix handling of numpy array palettes for old numpy versions pr2832 P Angerer
1.9.7 2024-01-25#
Bug fixes#
Fix handling of numpy array palettes (e.g. after write-read cycle) pr2734 P Angerer
Specify correct version of matplotlib dependency pr2733 P Fisher
Fix scanpy.pl.violin() usage of seaborn.catplot pr2739 E Roellin
Fix scanpy.pp.highly_variable_genes() to handle the combinations of inplace and subset consistently pr2757 E Roellin
Replace usage of various deprecated functionality from anndata and pandas pr2678 pr2779 P Angerer
Allow to use default n_top_genes when using scanpy.pp.highly_variable_genes() flavor 'seurat_v3' pr2782 P Angerer
Fix scanpy.read_10x_mtx()’s gex_only=True mode pr2801 P Angerer
1.9.6 2023-10-31#
Bug fixes#
Allow scanpy.pl.scatter() to accept a str palette name pr2571 P Angerer
Make scanpy.external.tl.palantir() compatible with palantir >=1.3 pr2672 DJ Otto
Fix scanpy.pl.pca() when return_fig=True and annotate_var_explained=True pr2682 J Wagner
Temp fix for issue2680 by skipping seaborn version 0.13.0 pr2661 P Angerer
Fix scanpy.pp.highly_variable_genes() to not modify the used layer when flavor=seurat pr2698 E Roellin
Prevent pandas from causing infinite recursion when setting a slice of a categorical column pr2719 P Angerer
1.9.5 2023-09-08#
Bug fixes#
Remove use of deprecated dtype argument to AnnData constructor pr2658 Isaac Virshup
1.9.4 2023-08-24#
Bug fixes#
Support scikit-learn 1.3 pr2515 P Angerer
Deal with None value vanishing from things like .uns['log1p'] pr2546 SP Shen
Depend on igraph instead of python-igraph pr2566 P Angerer
rank_genes_groups() now handles unsorted groups as intended pr2589 S Dicks
rank_genes_groups_df() now works for rank_genes_groups() with method="logreg" pr2601 S Dicks
scanpy.tl._utils._choose_representation now works with n_pcs if bigger than settings.N_PCS pr2610 S Dicks
1.9.3 2023-03-02#
Bug fixes#
Variety of fixes against pandas 2.0.0rc0 pr2434 I Virshup
1.9.2 2023-02-16#
Bug fixes#
highly_variable_genes() layer argument now works in tandem with batches pr2302 D Schaumont
highly_variable_genes() with flavor='cell_ranger' now handles the case in issue2230 where the number of calculated dispersions is less than n_top_genes pr2231 L Zappia
Fix compatibility with matplotlib 3.7 pr2414 I Virshup, P Fisher
Fix scrublet numpy matrix compatibility issue pr2395 A Gayoso
1.9.1 2022-04-05#
Bug fixes#
normalize_total() works when Dask is not installed pr2209 R Cannoodt
Fix embedding plots by bumping matplotlib dependency to version 3.4 pr2212 I Virshup
1.9.0 2022-04-01#
Tutorials#
New tutorial on the usage of Pearson Residuals: How to preprocess UMI count data with analytic Pearson residuals J Lause, G Palla
Materials and recordings for Scanpy workshops by Maren Büttner
Experimental module#
Added scanpy.experimental module! Currently contains functionality related to pearson residuals in scanpy.experimental.pp pr1715 J Lause, G Palla, I Virshup. This includes:
normalize_pearson_residuals() for Pearson Residuals normalization
highly_variable_genes() for HVG selection with Pearson Residuals
normalize_pearson_residuals_pca() for Pearson Residuals normalization and dimensionality reduction with PCA
recipe_pearson_residuals() for Pearson Residuals normalization, HVG selection and dimensionality reduction with PCA
Features#
filter_rank_genes_groups() now allows filtering with absolute values of log fold change pr1649 S Rybakov
_choose_representation now subsets the provided representation to n_pcs, regardless of the name of the provided representation (should affect mostly neighbors()) pr2179 I Virshup PG Majev
scanpy.pp.scrublet() (and related functions) can now be used on AnnData objects containing multiple batches pr1965 J Manning
Number of variables plotted with pca_loadings() can now be controlled with n_points argument. Additionally, variables are no longer repeated if the anndata has fewer than 30 variables pr2075 Yves33
Dask arrays now work with scanpy.pp.normalize_total() pr1663 G Buckley, I Virshup
embedding_density() now allows more than 10 groups pr1936 A Wolf
Embedding plots can now pass colorbar_loc to specify the location of colorbar legend, or pass None to not show a colorbar pr1821 A Schaar I Virshup
Embedding plots now have a dimensions argument, which lets users select which dimensions of their embedding to plot and uses the same broadcasting rules as other arguments pr1538 I Virshup
print_versions() now uses session_info pr2089 P Angerer I Virshup
Ecosystem#
Multiple packages have been added to our ecosystem page, including:
Bug fixes#
Fixed finding variables with use_raw=True and basis=None in scanpy.pl.scatter() pr2027 E Rice
Fixed scanpy.pp.scrublet() to address issue1957 FlMai and ensure raw counts are used for simulation
Functions in scanpy.datasets no longer throw OldFormatWarnings when using anndata 0.8 pr2096 I Virshup
Fixed use of scanpy.pp.neighbors() with method='rapids': RAPIDS cuML no longer returns a squared Euclidean distance matrix, so we should not square-root the kNN distance matrix. pr1828 M Zaslavsky
Removed pytables dependency by implementing read_10x_h5 with h5py due to installation errors on Windows pr2064
Fixed bug in scanpy.external.pp.hashsolo() where default value was set improperly pr2190 B Reiz
Fixed bug in scanpy.pl.embedding() functions where an error could be raised when there were missing values and large numbers of categories pr2187 I Virshup
Version 1.8#
1.8.2 2021-11-03#
Documentation#
Update conda installation instructions pr1974 L Heumos
Bug fixes#
Fix plotting after scanpy.tl.filter_rank_genes_groups() pr1942 S Rybakov
Fix use_raw=None using anndata.AnnData.var_names if anndata.AnnData.raw is present in scanpy.tl.score_genes() pr1999 M Klein
Fix compatibility with UMAP 0.5.2 pr2028 L Mcinnes
Fixed non-determinism in scanpy.pl.paga() node positions pr1922 I Virshup
Ecosystem#
Added PASTE (a tool to align and integrate spatial transcriptomics data) to scanpy ecosystem.
1.8.1 2021-07-07#
Bug fixes#
Fixed reproducibility of scanpy.tl.score_genes(). Calculation and output is now float64 type. pr1890 I Kucinski
Workarounds for some changes/bugs in pandas 1.3 pr1918 I Virshup
Fixed bug where sc.pl.paga_compare could mislabel nodes on the paga graph pr1898 I Virshup
Fixed handling of use_raw with scanpy.tl.rank_genes_groups() pr1934 I Virshup
1.8.0 2021-06-28#
Metrics module#
Added scanpy.metrics module!
Added scanpy.metrics.gearys_c() for spatial autocorrelation pr915 I Virshup
Added scanpy.metrics.morans_i() for global spatial autocorrelation pr1740 I Virshup, G Palla
Added scanpy.metrics.confusion_matrix() for comparing labellings pr915 I Virshup
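For readers unfamiliar with the two autocorrelation statistics named above, here is a minimal sketch of their textbook definitions in plain Python, using a dense list-of-lists weight matrix. The functions below are written for this example and only share their names with the scanpy.metrics functions; they do not reproduce scanpy's implementation, input types, or performance.

```python
def morans_i(weights, values):
    """Textbook Moran's I for a dense weight matrix (illustration only)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    # Weighted cross-products of deviations from the mean.
    num = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    den = sum(d * d for d in dev)
    return (n / w_sum) * num / den


def gearys_c(weights, values):
    """Textbook Geary's C for a dense weight matrix (illustration only)."""
    n = len(values)
    mean = sum(values) / n
    w_sum = sum(sum(row) for row in weights)
    # Weighted squared differences between connected observations.
    num = sum(
        weights[i][j] * (values[i] - values[j]) ** 2
        for i in range(n)
        for j in range(n)
    )
    den = sum((v - mean) ** 2 for v in values)
    return ((n - 1) / (2 * w_sum)) * num / den


# A chain of four nodes: 0 - 1 - 2 - 3 (symmetric, unweighted).
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
smooth = [1.0, 2.0, 3.0, 4.0]          # neighbors have similar values
alternating = [1.0, -1.0, 1.0, -1.0]   # neighbors have opposite values
```

On this toy graph, the smooth signal gives a positive Moran's I and a Geary's C below 1, while the alternating signal gives a negative Moran's I and a Geary's C above 1.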
Features#
Added layer and copy kwargs to normalize_total() pr1667 I Virshup
Added vcenter and norm arguments to the plotting functions pr1551 G Eraslan
Standardized and expanded available arguments to the sc.pl.rank_genes_groups* family of functions. pr1529 F Ramirez I Virshup
See examples sections of rank_genes_groups_dotplot() and rank_genes_groups_matrixplot() for demonstrations.
scanpy.tl.tsne() now supports the metric argument and records the passed parameters pr1854 I Virshup
scanpy.pl.scrublet_score_distribution() now uses same API as other scanpy functions for saving/showing plots pr1741 J Manning
Ecosystem#
Documentation#
Added rendered examples to many plotting functions issue1664 A Schaar L Zappia bio-la L Hetzel L Dony M Buttner K Hrovatin F Ramirez I Virshup LouisK92 mayarali
Integrated DocSearch, a find-as-you-type documentation index search. pr1754 P Angerer
Reorganized reference docs pr1753 I Virshup
Clarified docs issues for neighbors(), diffmap(), calculate_qc_metrics() pr1680 G Palla
Fixed typos in grouped plot doc-strings pr1877 C Rands
Extended examples for differential expression plotting. pr1529 F Ramirez
See rank_genes_groups_dotplot() or rank_genes_groups_matrixplot() for examples.
Bug fixes#
Fix scanpy.pl.paga_path() TypeError with recent versions of anndata pr1047 P Angerer
Fix detection of whether IPython is running pr1844 I Virshup
Fixed reproducibility of scanpy.tl.diffmap() (added random_state) pr1858 I Kucinski
Fixed errors and warnings from embedding plots with small numbers of categories after sns.set_palette was called pr1886 I Virshup
Fixed handling of gene_symbols argument in a number of sc.pl.rank_genes_groups* functions pr1529 F Ramirez I Virshup
Fixed handling of use_raw for sc.tl.rank_genes_groups when no .raw is present pr1895 I Virshup
scanpy.pl.rank_genes_groups_violin() now works for raw=False pr1669 M van den Beek
scanpy.pl.dotplot() now uses smallest_dot argument correctly pr1771 S Flemming
Development Process#
Switched to flit for building and deploying the package, a simple tool with an easy to understand command line interface and metadata pr1527 P Angerer
Use pre-commit for style checks pr1684 pr1848 L Heumos I Virshup
Deprecations#
Dropped support for Python 3.6. More details here. pr1897 I Virshup
Deprecated layers and layers_norm kwargs to normalize_total() pr1667 I Virshup
Deprecated MulticoreTSNE backend for scanpy.tl.tsne() pr1854 I Virshup
Version 1.7#
1.7.2 2021-04-07#
Bug fixes#
scanpy.logging.print_versions() now works when python<3.8 pr1691 I Virshup
scanpy.pp.regress_out() now uses joblib as the parallel backend, and should stop oversubscribing threads pr1694 I Virshup
scanpy.pp.highly_variable_genes() with flavor="seurat_v3" now returns correct gene means and variances when used with batch_key pr1732 J Lause
scanpy.pp.highly_variable_genes() now throws a warning instead of an error when non-integer values are passed for method "seurat_v3". The check can be skipped by passing check_values=False. pr1679 G Palla
Ecosystem#
1.7.1 2021-02-24#
Documentation#
More twitter handles for core devs pr1676 G Eraslan
Bug fixes#
dendrogram() now uses 1 - correlation as distance matrix to compute the dendrogram pr1614 F Ramirez
Fixed obs_df()/var_df() erroring when keys not passed pr1637 I Virshup
Fixed argument handling for scanpy.pp.scrublet() J Manning
Fixed passing of kwargs to scanpy.pl.violin() when stripplot was also used pr1655 M van den Beek
Fixed colorbar creation in scanpy.pl.timeseries_as_heatmap pr1654 M van den Beek
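The dendrogram() entry above refers to using 1 - correlation as a distance. A minimal sketch of that conversion in plain Python follows; the helper names pearson and correlation_distance_matrix and the toy profiles are assumptions for this example, and the sketch says nothing about how dendrogram() builds the tree from the resulting matrix.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_distance_matrix(profiles):
    """Pairwise 1 - correlation: 0 for perfectly correlated profiles,
    2 for perfectly anti-correlated ones."""
    return [[1 - pearson(a, b) for b in profiles] for a in profiles]


# Hypothetical mean-expression profiles of three groups over three genes.
profiles = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],  # same shape as the first profile
    [3.0, 2.0, 1.0],  # mirror image of the first profile
]
distances = correlation_distance_matrix(profiles)
```

Profiles with the same shape end up at distance 0 regardless of magnitude, which is why a correlation-based distance groups categories by expression pattern rather than by overall level.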
1.7.0 2021-02-03#
Features#
Add new 10x Visium datasets to visium_sge() pr1473 G Palla
Enable download of source image for 10x visium datasets in visium_sge() pr1506 H Spitzer
Refactor of scanpy.pl.spatial(). Better support for plotting without an image, as well as directly providing images pr1512 G Palla
Dict input for scanpy.queries.enrich() pr1488 G Eraslan
rank_genes_groups_df() can now return fraction of cells in a group expressing a gene, and allows retrieving values for multiple groups at once pr1388 G Eraslan
Color annotations for gene sets in heatmap() are now matched to color for cluster pr1511 L Sikkema
PCA plots can now annotate axes with variance explained pr1470 bfurtwa
Plots with groupby arguments can now group by values in the index by passing the index’s name (like pd.DataFrame.groupby). pr1583 F Ramirez
Added na_color and na_in_legend keyword arguments to embedding() plots. Allows specifying color for missing or filtered values in plots like umap() or spatial() pr1356 I Virshup
embedding() plots now support passing dict of {cluster_name: cluster_color, ...} for palette argument pr1392 I Virshup
External tools (new)#
Add Scanorama integration to scanpy external API (scanorama_integrate(), Hie et al. [2019]) pr1332 B Hie
Scrublet [Wolock et al., 2019] integration: scrublet(), scrublet_simulate_doublets(), and plotting method scrublet_score_distribution() pr1476 J Manning
hashsolo() for HTO demultiplexing [Bernstein et al., 2020] pr1432 NJ Bernstein
Added scirpy (sc-AIRR analysis) to ecosystem page pr1453 G Sturm
Added scvi-tools to ecosystem page pr1421 A Gayoso
External tools (changes)#
Updates for palantir() and palantir_results() pr1245 A Mousa
Fixes to harmony_timeseries() docs pr1248 A Mousa
Support for leiden clustering by scanpy.external.tl.phenograph() pr1080 A Mousa
Deprecate scanpy.external.pp.scvi pr1554 G Xing
Updated default params of sam() to work with larger data pr1540 A Tarashansky
Documentation#
New contribution guide pr1544 I Virshup
zsh installation instructions pr1444 P Angerer
Performance#
Speed up read_10x_h5() pr1402 P Weiler
Bugfixes#
Consistent fold-change, fractions calculation for filter_rank_genes_groups pr1391 S Rybakov
Fixed bug where score_genes would error if one gene was passed pr1398 I Virshup
Fixed log1p inplace on integer dense arrays pr1400 I Virshup
Fix docstring formatting for rank_genes_groups() pr1417 P Weiler
Removed PendingDeprecationWarnings from use of np.matrix pr1424 P Weiler
Fixed indexing bug in scanpy.pp.highly_variable_genes pr1456 V Bergen
Fix default number of genes for marker_genes_overlap pr1464 MD Luecken
Fixed passing groupby and dendrogram_key to dendrogram() pr1465 M Varma
Fixed download path of pbmc3k_processed pr1472 D Strobl
Better error message when computing DE with a group of size 1 pr1490 J Manning
Update cugraph API usage for v0.16 pr1494 R Ilango
Fixed marker_gene_overlap default value for top_n_markers pr1464 MD Luecken
Pass random_state to RAPIDs UMAP pr1474 C Nolet
Fixed anndata version requirement for concat() (re-exported from scanpy as sc.concat) pr1491 I Virshup
Fixed the width of the progress bar when downloading data pr1507 M Klein
Updated link for moignard15 dataset pr1542 I Virshup
Fixed bug where calling set_figure_params could block if IPython was installed, but not used. pr1547 I Virshup
violin() no longer fails if .raw not present pr1548 I Virshup
spatial() refactoring and better handling of spatial data pr1512 G Palla
Version 1.6#
1.6.0 2020-08-15#
This release includes an overhaul of dotplot(), matrixplot(), and stacked_violin() (pr1210 F Ramirez), and of the internals of rank_genes_groups() (pr1156 S Rybakov).
Overhaul of dotplot(), matrixplot(), and stacked_violin() pr1210 F Ramirez#
An overhauled tutorial Core plotting functions.
New plotting classes can be accessed directly (e.g., DotPlot) or using the return_fig param.
It is possible to plot log fold change and p-values in the rank_genes_groups_dotplot() family of functions.
Added ax parameter which allows embedding the plot in other images.
Added option to include a bar plot instead of the dendrogram containing the cell/observation totals per category.
Return a dictionary of axes for further manipulation. This includes the main plot, legend and dendrogram to totals
Legends can be removed.
The groupby param can take a list of categories, e.g., groupby=['tissue', 'cell type'].
Added padding parameter to dotplot and stacked_violin. pr1270
Added title for colorbar and positioned as in dotplot for matrixplot().
dotplot() changes:
Improved the colorbar and size legend for dotplots. Now the colorbar and size have titles, which can be modified using the colorbar_title and size_title params. They also align at the bottom of the image and do not shrink if the dotplot image is smaller.
Allow plotting genes in rows and categories in columns (swap_axes).
Using DotPlot, the dot_edge_color and line width can be modified, a grid can be added, and other modifications are enabled.
A new style was added in which the dots are replaced by an empty circle and the square behind the circle is colored (like in matrixplots).
stacked_violin() changes:
Violin colors can be colored based on average gene expression as in dotplots.
The linewidth of the violin plots is thinner.
Removed the ticks for the y-axis as they tend to overlap with each other. Using the style method they can be displayed if needed.
Additions#
concat() is now exported from scanpy, see Concatenation for more info. pr1338 I Virshup
Added highly variable gene selection strategy from Seurat v3 pr1204 A Gayoso
Added backup_url param to read_10x_h5() pr1296 A Gayoso
Allow prefix for read_10x_mtx() pr1250 G Sturm
Optional tie correction for the 'wilcoxon' method in rank_genes_groups() pr1330 S Rybakov
Use sinfo for print_versions() and add print_header() to do what it previously did. pr1338 I Virshup pr1373
Bug fixes#
Avoid warning in rank_genes_groups() if 't-test' is passed pr1303 A Wolf
Restrict sphinx version to <3.1, >3.0 pr1297 I Virshup
Clean up _ranks and fix dendrogram for scipy 1.5 pr1290 S Rybakov
Use .raw to translate gene symbols if applicable pr1278 E Rice
Fix diffmap (issue1262) G Eraslan
Fix neighbors in spring_project issue1260 S Rybakov
Fix default size of dot in spatial plots pr1255 issue1253 giovp
Bumped version requirement of scipy to scipy>1.4 to support rmatmat argument of LinearOperator issue1246 I Virshup
Fix asymmetry of scores for the 'wilcoxon' method in rank_genes_groups() issue754 S Rybakov
Avoid trimming of gene names in rank_genes_groups() issue753 S Rybakov
Version 1.5#
1.5.1 2020-05-21#
Bug fixes#
1.5.0 2020-05-15#
The 1.5.0 release adds a lot of new functionality, much of which takes advantage of anndata updates 0.7.0 - 0.7.2. Highlights of this release include support for spatial data, dedicated handling of graphs in AnnData, sparse PCA, an interface with scvi, and others.
Spatial data support#
Basic analysis (Analysis and visualization of spatial transcriptomics data) and integration with single cell data (Integrating spatial data with scRNA-seq using scanorama) G Palla
read_visium() read 10x Visium data pr1034 G Palla, P Angerer, I Virshup
visium_sge() load Visium data directly from 10x Genomics pr1013 M Mirkazemi, G Palla, P Angerer
New functionality#
External tools#
Performance#
pca() now uses efficient implicit centering for sparse matrices. This can lead to significantly improved performance for large datasets pr1066 A Tarashansky
score_genes() now has an efficient implementation for sparse matrices with missing values pr1196 redst4r.
Code design#
stacked_violin() can now be used as a subplot pr1084 P Angerer
score_genes() has improved logging pr1119 G Eraslan
scale() now saves mean and standard deviation in the var pr1173 A Wolf
harmony_timeseries() pr1091 A Mousa
Bug fixes#
combat() now works when obs_names aren’t unique. pr1215 I Virshup
scale() can now be used on dense arrays without centering pr1160 simonwm
regress_out() now works when some features are constant pr1194 simonwm
normalize_total() errored if the passed object was a view pr1200 I Virshup
neighbors() sometimes ignored the n_pcs param pr1124 V Bergen
ebi_expression_atlas() which contained some out-of-date URLs pr1102 I Virshup
highly_variable_genes() which could lead to incorrect results when the batch_key argument was used pr1180 G Eraslan
ingest() where an inconsistent number of neighbors was used pr1111 S Rybakov
Version 1.4#
1.4.6 2020-03-17#
Functionality in external#
sam() self-assembling manifolds [Tarashansky et al., 2019] pr903 A Tarashansky
harmony_timeseries() for trajectory inference on discrete time points pr994 A Mousa
wishbone() for trajectory inference (bifurcations) pr1063 A Mousa
Code design#
Bug fixes#
1.4.5 2019-12-30#
Please install scanpy==1.4.5.post3 instead of scanpy==1.4.5.
New functionality#
ingest() maps labels and embeddings of reference data to new data (Integrating data using ingest and BBKNN) pr651 S Rybakov, A Wolf
queries received many updates including enrichment through gprofiler and more advanced biomart queries pr467 I Virshup
set_figure_params() allows setting figsize and accepts facecolor='white', useful for working in dark mode A Wolf
Code design#
downsample_counts now always preserves the dtype of its input, instead of converting floats to ints pr865 I Virshup
run neighbors on a GPU using rapids pr830 T White
param docs from typed params P Angerer
embedding_density() now only takes one positional argument; similar for embedding_density(), which gains a param groupby pr965 A Wolf
webpage overhaul, ecosystem page, release notes, tutorials overhaul pr960 pr966 A Wolf
Warning
changed default solver in pca() from auto to arpack
changed default use_raw in score_genes() from False to None
1.4.4 2019-07-20#
New functionality#
scanpy.get adds helper functions for extracting data in convenient formats pr619 I Virshup
Bug fixes#
Stopped deprecation warnings from AnnData 0.6.22 I Virshup
Code design#
normalize_total() gains param exclude_highly_expressed, and fraction is renamed to max_fraction with better docs A Wolf
1.4.3 2019-05-14#
Bug fixes#
neighbors() correctly infers n_neighbors again from params, which was temporarily broken in v1.4.2 I Virshup
Code design#
calculate_qc_metrics() is single threaded by default for datasets under 300,000 cells, allowing cached compilation pr615 I Virshup
1.4.2 2019-05-06#
New functionality#
combat() supports additional covariates which may include adjustment variables or biological condition pr618 G Eraslan
highly_variable_genes() has a batch_key option which performs HVG selection in each batch separately to avoid selecting genes that vary strongly across batches pr622 G Eraslan
Bug fixes#
rank_genes_groups() t-test implementation doesn’t return NaN when variance is 0, also changed to scipy’s implementation pr621 I Virshup
umap() with init_pos='paga' detects correct dtype A Wolf
louvain() and leiden() auto-generate key_added=louvain_R upon passing restrict_to, which was temporarily changed in 1.4.1 A Wolf
Code design#
neighbors() and umap() got rid of UMAP legacy code and introduced UMAP as a dependency pr576 S Rybakov
1.4.1 2019-04-26#
New functionality#
Scanpy has a command line interface again. Invoking it with scanpy somecommand [args] calls scanpy-somecommand [args], except for builtin commands (currently scanpy settings) pr604 P Angerer
ebi_expression_atlas() allows convenient download of EBI expression atlas I Virshup
marker_gene_overlap() computes overlaps of marker genes M Luecken
filter_rank_genes_groups() filters out genes based on fold change and fraction of cells expressing genes F Ramirez
normalize_total() replaces normalize_per_cell(), is more efficient and provides a parameter to only normalize using a fraction of expressed genes S Rybakov
downsample_counts() has been sped up, changed default value of replace parameter to False pr474 I Virshup
embedding_density() computes densities on embeddings pr543 M Luecken
palantir() interfaces Palantir [Setty et al., 2019] pr493 A Mousa
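The command line behavior described in the first entry above (forwarding scanpy somecommand to an executable named scanpy-somecommand unless the command is builtin) follows a common subcommand-dispatch pattern. The sketch below illustrates that pattern only; resolve_command, main, and BUILTIN_COMMANDS are hypothetical names for this example and are not taken from scanpy's source.

```python
import shutil
import subprocess
import sys

# Assumption for this sketch: the entry above names "settings" as the
# only builtin command at the time.
BUILTIN_COMMANDS = {"settings"}


def resolve_command(argv, prog="scanpy", which=shutil.which):
    """Decide how to run `prog <command> [args]`.

    Returns ("builtin", name, args) for builtin commands, otherwise
    ("external", path_to_executable, args) for `prog-<command>` found on PATH.
    """
    if not argv:
        raise SystemExit(f"usage: {prog} <command> [args]")
    name, *args = argv
    if name in BUILTIN_COMMANDS:
        return ("builtin", name, args)
    executable = which(f"{prog}-{name}")
    if executable is None:
        raise SystemExit(f"{prog}: unknown command {name!r}")
    return ("external", executable, args)


def main(argv=None):
    """Dispatch to a builtin handler or to the external executable."""
    kind, target, args = resolve_command(sys.argv[1:] if argv is None else argv)
    if kind == "builtin":
        print(f"running builtin {target!r} with arguments {args}")
        return 0
    # Forward the remaining arguments unchanged to the external program.
    return subprocess.call([target, *args])
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default PATH search applies.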
Code design#
.layers support of scatter plots F Ramirez
fix double-logarithmization in compute of log fold change in rank_genes_groups() A Muñoz-Rojas
fix return sections of docs P Angerer
Version 1.3#
1.3.8 2019-02-05#
various documentation and dev process improvements
Added combat() function for batch effect correction [Johnson et al., 2006, Leek et al., 2017, Pedersen, 2012] pr398 M Lange
1.3.7 2019-01-02#
API changed from import scanpy as sc to import scanpy.api as sc.
phenograph() wraps the graph clustering package Phenograph [Levine et al., 2015] thanks to A Mousa
1.3.6 2018-12-11#
Major updates#
a new plotting gallery for visualizing-marker-genes F Ramirez
tutorials are integrated on ReadTheDocs, pbmc3k and paga-paul15 A Wolf
Interactive exploration of analysis results through manifold viewers#
CZI’s cellxgene directly reads .h5ad files the cellxgene developers
the UCSC Single Cell Browser requires exporting via cellbrowser() M Haeussler
Code design#
highly_variable_genes() supersedes filter_genes_dispersion(), it gives the same results but, by default, expects logarithmized data and doesn’t subset A Wolf
1.3.5 2018-12-09#
uncountable figure improvements pr369 F Ramirez
1.3.4 2018-11-24#
leiden() wraps the recent graph clustering package by Traag et al. [2019] K Polanski
bbknn() wraps the recent batch correction package [Polański et al., 2019] K Polanski
calculate_qc_metrics() calculates a number of quality control metrics, similar to calculateQCMetrics from Scater [McCarthy et al., 2017] I Virshup
1.3.3 2018-11-05#
Major updates#
a fully distributed preprocessing backend T White and the Laserson Lab
Code design#
read_10x_h5() and read_10x_mtx() read Cell Ranger 3.0 outputs pr334 Q Gong
Note
Also see changes in anndata 0.6.
changed default compression to None in write_h5ad() to speed up read and write, disk space use is usually less critical
performance gains in write_h5ad() due to better handling of strings and categories S Rybakov
1.3.1 2018-09-03#
RNA velocity in single cells [La Manno et al., 2018]#
Scanpy and AnnData support loom’s layers so that computations for single-cell RNA velocity [La Manno et al., 2018] become feasible S Rybakov and V Bergen
scvelo harmonizes with Scanpy and is able to process loom files with splicing information produced by Velocyto [La Manno et al., 2018], it runs a lot faster than the count matrix analysis of Velocyto and provides several conceptual developments
Plotting (Generic)#
There now is a section on imputation in external:#
magic() for imputation using data diffusion [van Dijk et al., 2018] pr187 S Gigante
dca() for imputation and latent space construction using an autoencoder [Eraslan et al., 2019] pr186 G Eraslan
Version 1.2#
1.2.1 2018-06-08#
Plotting of Generic marker genes and quality control.#
highest_expr_genes() for quality control; plot genes with highest mean fraction of cells, similar to plotQC of Scater [McCarthy et al., 2017] pr169 F Ramirez
1.2.0 2018-06-08#
Version 1.1#
1.1.0 2018-06-01#
set_figure_params() by default passes vector_friendly=True and allows you to produce reasonably sized pdfs by rasterizing large scatter plots A Wolf
draw_graph() defaults to the ForceAtlas2 layout [Chippada, 2018, Jacomy et al., 2014], which is often more visually appealing and whose computation is much faster S Wollock
scatter() also plots along variables axis MD Luecken
regress_out() is back to multiprocessing F Ramirez
read() reads compressed text files G Eraslan
mitochondrial_genes() for querying mito genes FG Brundu
mnn_correct() for batch correction [Haghverdi et al., 2018, Kang, 2018]
phate() for low-dimensional embedding [Moon et al., 2019] S Gigante
sandbag(), cyclone() for scoring genes [Fechtner, 2018, Scialdone et al., 2015]
Version 1.0#
1.0.0 2018-03-30#
Major updates#
Scanpy is much faster and more memory efficient: preprocess, cluster and visualize 1.3M cells in 6h, 130K cells in 14min, and 68K cells in 3min A Wolf
the API gained a preprocessing function neighbors() and a class Neighbors() to which all basic graph computations are delegated A Wolf
Warning
Upgrading to 1.0 isn’t fully backwards compatible in the following changes
the graph-based tools louvain(), dpt(), draw_graph(), umap(), diffmap(), paga() require prior computation of the graph: sc.pp.neighbors(adata, n_neighbors=5); sc.tl.louvain(adata) instead of previously sc.tl.louvain(adata, n_neighbors=5)
install numba via conda install numba, which replaces cython
the default connectivity measure changed (dpt will look different using default settings). Setting method='gauss' in sc.pp.neighbors uses gauss kernel connectivities and reproduces the previous behavior; see, for instance, the paul15 example.
namings of returned annotation have changed for less bloated AnnData objects, which means that some of the unstructured annotation of old AnnData files is not recognized anymore
replace occurrences of group_by with groupby (consistency with pandas)
it is worth checking out the notebook examples to see changes, e.g. the seurat example.
upgrading scikit-learn from 0.18 to 0.19 changed the implementation of PCA, some results might therefore look slightly different
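As a rough illustration of what "gauss kernel connectivities" means for the method='gauss' option mentioned above, the sketch below turns pairwise distances into connectivity weights with a Gaussian kernel. It is a simplification under stated assumptions: a single fixed bandwidth sigma is used, the function name gaussian_connectivities is hypothetical, and how Scanpy actually chooses the kernel width is not described in these notes.

```python
import math


def gaussian_connectivities(distances, sigma):
    """Map a square matrix of pairwise distances to Gaussian kernel weights.

    Assumption: one global bandwidth `sigma` for all cells. Self-connections
    (the diagonal) are set to zero.
    """
    n = len(distances)
    return [
        [
            0.0 if i == j else math.exp(-(distances[i][j] ** 2) / (2 * sigma ** 2))
            for j in range(n)
        ]
        for i in range(n)
    ]


# Toy symmetric distance matrix for three cells (hypothetical data).
toy_distances = [
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 1.0],
    [2.0, 1.0, 0.0],
]

weights = gaussian_connectivities(toy_distances, sigma=1.0)

# Closer cells get larger weights: cell 0 is nearer to cell 1 than to cell 2.
print(weights[0][1] > weights[0][2])  # True
```

Weights decay smoothly with distance, so nearby cells are strongly connected and distant cells contribute almost nothing.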
Further updates#
UMAP [McInnes et al., 2018] can serve as a first visualization of the data just as tSNE; in contrast to tSNE, UMAP directly embeds the single-cell graph and is faster; UMAP is also used for measuring connectivities and computing neighbors, see neighbors() A Wolf
graph abstraction: AGA is renamed to PAGA: paga(); now, it only measures connectivities between partitions of the single-cell graph, pseudotime and clustering need to be computed separately via louvain() and dpt(), the connectivity measure has been improved A Wolf
logistic regression for finding marker genes rank_genes_groups() with parameter method='logreg' A Wolf
louvain() provides a better implementation for reclustering via restrict_to A Wolf
scanpy no longer modifies rcParams upon import, call settings.set_figure_params to set the ‘scanpy style’ A Wolf
default cache directory is ./cache/, set settings.cachedir to change this; nested directories in this are avoided A Wolf
show edges in scatter plots based on graph visualization draw_graph() and umap() by passing edges=True A Wolf
downsample_counts() for downsampling counts MD Luecken
default 'louvain_groups' are called 'louvain' A Wolf
'X_diffmap' contains the zero component, plotting remains unchanged A Wolf
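The idea behind downsampling counts, as offered by downsample_counts() above, can be sketched in a few lines of pure Python: draw a fixed number of molecules from a cell without replacement and recount them per gene. This is only an illustration under assumptions; the helper name downsample_row, its parameters, and the toy data are hypothetical and do not reflect Scanpy's actual signature or implementation.

```python
import random


def downsample_row(row, target_total, rng):
    """Downsample one cell's counts to `target_total` molecules without replacement.

    Cells that already have at most `target_total` counts are returned unchanged.
    """
    if sum(row) <= target_total:
        return list(row)
    # Expand the counts into one entry (the gene index) per observed molecule.
    molecules = [gene for gene, count in enumerate(row) for _ in range(count)]
    kept = rng.sample(molecules, target_total)
    # Recount the surviving molecules per gene.
    downsampled = [0] * len(row)
    for gene in kept:
        downsampled[gene] += 1
    return downsampled


rng = random.Random(0)  # fixed seed so the sketch is reproducible
cell = [5, 0, 3, 12]    # hypothetical counts for four genes, 20 molecules in total
result = downsample_row(cell, 10, rng)

print(sum(result))  # 10
```

Sampling without replacement guarantees that no gene ends up with more counts than it started with, and genes with zero counts stay at zero.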
Version 0.4#
0.4.4 2018-02-26#
embed cells using umap() [McInnes et al., 2018] pr92 G Eraslan
score sets of genes, e.g. for cell cycle, using score_genes() [Satija et al., 2015]: notebook
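The principle of gene set scoring can be sketched as follows: for each cell, take the average expression of the genes of interest and subtract the average expression of a reference set. The sketch is a simplified illustration, not Scanpy's implementation; the function name score_cells and the toy data are hypothetical, and the reference set is hand-picked here as an assumption rather than chosen by any sampling procedure.

```python
from statistics import mean


def score_cells(matrix, gene_names, gene_set, control_set):
    """Score each cell as mean(gene_set expression) - mean(control_set expression)."""
    index = {name: i for i, name in enumerate(gene_names)}
    target_columns = [index[gene] for gene in gene_set]
    control_columns = [index[gene] for gene in control_set]
    return [
        mean(row[i] for i in target_columns) - mean(row[i] for i in control_columns)
        for row in matrix
    ]


# Toy expression matrix (hypothetical data): rows are cells, columns are genes.
gene_names = ["g1", "g2", "g3", "g4"]
expression = [
    [2.0, 4.0, 1.0, 1.0],
    [0.0, 0.0, 3.0, 1.0],
]

scores = score_cells(expression, gene_names, gene_set=["g1", "g2"], control_set=["g3", "g4"])
print(scores)  # [2.0, -2.0]
```

A positive score means the gene set is expressed above the reference level in that cell; a negative score means it is expressed below it.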
0.4.3 2018-02-09#
clustermap(): heatmap from hierarchical clustering, based on seaborn.clustermap() [Waskom et al., 2016] A Wolf
only return matplotlib.axes.Axes in plotting functions of sc.pl when show=False, otherwise None A Wolf
0.4.2 2018-01-07#
amendments in PAGA and its plotting functions A Wolf
0.4.0 2017-12-23#
export to SPRING [Weinreb et al., 2017] for interactive visualization of data: spring tutorial S Wollock
Version 0.3#
0.3.2 2017-11-29#
finding marker genes via rank_genes_groups_violin() improved, see issue51 F Ramirez
0.3.0 2017-11-16#
AnnData gains method concatenate() A Wolf
AnnData is available as the separate anndata package P Angerer, A Wolf
results of PAGA simplified A Wolf
Version 0.2#
0.2.9 2017-10-25#
Initial release of the new trajectory inference method PAGA#
paga() computes an abstracted, coarse-grained (PAGA) graph of the neighborhood graph A Wolf
paga_compare() plots this graph next to an embedding A Wolf
paga_path() plots a heatmap through a node sequence in the PAGA graph A Wolf
0.2.1 2017-07-24#
Scanpy includes preprocessing, visualization, clustering, pseudotime and trajectory inference, differential expression testing and simulation of gene regulatory networks. The implementation efficiently deals with datasets of more than one million cells. A Wolf, P Angerer
Version 0.1#
0.1.0 2017-05-17#
Scanpy computationally outperforms and allows reproducing both the Cell Ranger R kit’s and most of Seurat’s clustering workflows. A Wolf, P Angerer