Release notes
Version 1.9
1.9.8 2024-01-26
Bug fixes
Fix handling of numpy array palettes for old numpy versions PR 2832 P Angerer
1.9.7 2024-01-25
Bug fixes
Fix handling of numpy array palettes (e.g. after write-read cycle) PR 2734 P Angerer
Specify correct version of matplotlib dependency PR 2733 P Fisher
Fix scanpy.pl.violin() usage of seaborn.catplot PR 2739 E Roellin
Fix scanpy.pp.highly_variable_genes() to handle the combinations of inplace and subset consistently PR 2757 E Roellin
Replace usage of various deprecated functionality from anndata and pandas PR 2678 PR 2779 P Angerer
Allow to use default n_top_genes when using scanpy.pp.highly_variable_genes() flavor 'seurat_v3' PR 2782 P Angerer
Fix scanpy.read_10x_mtx()’s gex_only=True mode PR 2801 P Angerer
1.9.6 2023-10-31
Bug fixes
Allow scanpy.pl.scatter() to accept a str palette name PR 2571 P Angerer
Make scanpy.external.tl.palantir() compatible with palantir >=1.3 PR 2672 DJ Otto
Fix scanpy.pl.pca() when return_fig=True and annotate_var_explained=True PR 2682 J Wagner
Temp fix for issue 2680 by skipping seaborn version 0.13.0 PR 2661 P Angerer
Fix scanpy.pp.highly_variable_genes() to not modify the used layer when flavor=seurat PR 2698 E Roellin
Prevent pandas from causing infinite recursion when setting a slice of a categorical column PR 2719 P Angerer
1.9.5 2023-09-08
Bug fixes
Remove use of deprecated dtype argument to AnnData constructor PR 2658 Isaac Virshup
1.9.4 2023-08-24
Bug fixes
Support scikit-learn 1.3 PR 2515 P Angerer
Deal with None value vanishing from things like .uns['log1p'] PR 2546 SP Shen
Depend on igraph instead of python-igraph PR 2566 P Angerer
rank_genes_groups() now handles unsorted groups as intended PR 2589 S Dicks
rank_genes_groups_df() now works for rank_genes_groups() with method="logreg" PR 2601 S Dicks
_choose_representation() now works with n_pcs if bigger than settings.N_PCS PR 2610 S Dicks
1.9.3 2023-03-02
Bug fixes
1.9.2 2023-02-16
Bug fixes
highly_variable_genes() layer argument now works in tandem with batches PR 2302 D Schaumont
highly_variable_genes() with flavor='cell_ranger' now handles the case in issue 2230 where the number of calculated dispersions is less than n_top_genes PR 2231 L Zappia
Fix compatibility with matplotlib 3.7 PR 2414 I Virshup P Fisher
Fix scrublet numpy matrix compatibility issue PR 2395 A Gayoso
1.9.1 2022-04-05
Bug fixes
normalize_total() works when Dask is not installed PR 2209 R Cannoodt
Fix embedding plots by bumping matplotlib dependency to version 3.4 PR 2212 I Virshup
1.9.0 2022-04-01
Tutorials
New tutorial on the usage of Pearson Residuals: → tutorial: tutorial_pearson_residuals J Lause, G Palla
Materials and recordings for Scanpy workshops by Maren Büttner
Experimental module
Added scanpy.experimental module! Currently contains functionality related to Pearson residuals in scanpy.experimental.pp PR 1715 J Lause, G Palla, I Virshup. This includes:
normalize_pearson_residuals() for Pearson Residuals normalization
highly_variable_genes() for HVG selection with Pearson Residuals
normalize_pearson_residuals_pca() for Pearson Residuals normalization and dimensionality reduction with PCA
recipe_pearson_residuals() for Pearson Residuals normalization, HVG selection and dimensionality reduction with PCA
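As a rough illustration of what Pearson residuals normalization computes, the sketch below works on a small cells x genes count matrix in pure Python. It is not the scanpy.experimental implementation: the function name pearson_residuals_sketch is hypothetical, and the overdispersion value theta=100 and the clipping at sqrt(n_cells) are assumptions taken from the commonly described analytic formulation, so check the library documentation before relying on them.

```python
import math


def pearson_residuals_sketch(counts, theta=100.0):
    """Analytic Pearson residuals for a cells x genes matrix (list of lists).

    Hypothetical helper for illustration only. `theta` (negative binomial
    overdispersion) and the clip at sqrt(n_cells) are assumed defaults.
    """
    n_cells = len(counts)
    cell_sums = [sum(row) for row in counts]
    gene_sums = [sum(col) for col in zip(*counts)]
    total = sum(cell_sums)
    clip = math.sqrt(n_cells)

    residuals = []
    for i, row in enumerate(counts):
        out = []
        for j, x in enumerate(row):
            # Expected count under a model with only depth and gene effects.
            mu = cell_sums[i] * gene_sums[j] / total
            if mu == 0:
                out.append(0.0)
                continue
            # Negative binomial variance: mu + mu^2 / theta.
            r = (x - mu) / math.sqrt(mu + mu * mu / theta)
            out.append(max(-clip, min(clip, r)))
        residuals.append(out)
    return residuals


# A matrix whose rows are exact multiples of each other has no residual signal.
flat = pearson_residuals_sketch([[2, 4], [1, 2]])
# A matrix with cell-specific genes produces positive and negative residuals.
structured = pearson_residuals_sketch([[1, 0], [0, 1]])
```

Counts that are fully explained by sequencing depth and gene abundance give residuals of zero, while genes that deviate from that expectation get large absolute residuals, which is why the same quantity can drive HVG selection.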
Features
filter_rank_genes_groups() now allows to filter with absolute values of log fold change PR 1649 S Rybakov
_choose_representation now subsets the provided representation to n_pcs, regardless of the name of the provided representation (should affect mostly neighbors()) PR 2179 I Virshup PG Majev
scanpy.external.pp.scrublet() (and related functions) can now be used on AnnData objects containing multiple batches PR 1965 J Manning
Number of variables plotted with pca_loadings() can now be controlled with n_points argument. Additionally, variables are no longer repeated if the anndata has less than 30 variables PR 2075 Yves33
Dask arrays now work with scanpy.pp.normalize_total() PR 1663 G Buckley, I Virshup
embedding_density() now allows more than 10 groups PR 1936 A Wolf
Embedding plots can now pass colorbar_loc to specify the location of colorbar legend, or pass None to not show a colorbar PR 1821 A Schaar I Virshup
Embedding plots now have a dimensions argument, which lets users select which dimensions of their embedding to plot and uses the same broadcasting rules as other arguments PR 1538 I Virshup
print_versions() now uses session_info PR 2089 P Angerer I Virshup
Ecosystem
Multiple packages have been added to our ecosystem page, including:
decoupler, a package for footprint analysis and pathway enrichment PR 2186 PB Mompel
CIARA, a feature selection tool for identifying rare cell types PR 2175 M Stock
Bug fixes
Fixed finding variables with use_raw=True and basis=None in scanpy.pl.scatter() PR 2027 E Rice
Fixed scanpy.external.pp.scrublet() to address issue 1957 FlMai and ensure raw counts are used for simulation
Functions in scanpy.datasets no longer throw OldFormatWarnings when using anndata 0.8 PR 2096 I Virshup
Fixed use of scanpy.pp.neighbors() with method='rapids': RAPIDS cuML no longer returns a squared Euclidean distance matrix, so we should not square-root the kNN distance matrix. PR 1828 M Zaslavsky
Removed pytables dependency by implementing read_10x_h5 with h5py due to installation errors on Windows PR 2064
Fixed bug in scanpy.external.pp.hashsolo() where default value was set improperly PR 2190 B Reiz
Fixed bug in scanpy.pl.embedding() functions where an error could be raised when there were missing values and large numbers of categories PR 2187 I Virshup
Version 1.8
1.8.2 2021-11-03
Docs
Update conda installation instructions PR 1974 L Heumos
Bug fixes
Fix plotting after scanpy.tl.filter_rank_genes_groups() PR 1942 S Rybakov
Fix use_raw=None using anndata.AnnData.var_names if anndata.AnnData.raw is present in scanpy.tl.score_genes() PR 1999 M Klein
Fix compatibility with UMAP 0.5.2 PR 2028 L Mcinnes
Fixed non-determinism in scanpy.pl.paga() node positions PR 1922 I Virshup
Ecosystem
Added PASTE (a tool to align and integrate spatial transcriptomics data) to scanpy ecosystem.
1.8.1 2021-07-07
Bug fixes
Fixed reproducibility of scanpy.tl.score_genes(). Calculation and output is now float64 type. PR 1890 I Kucinski
Workarounds for some changes/bugs in pandas 1.3 PR 1918 I Virshup
Fixed bug where sc.pl.paga_compare could mislabel nodes on the paga graph PR 1898 I Virshup
Fixed handling of use_raw with scanpy.tl.rank_genes_groups() PR 1934 I Virshup
1.8.0 2021-06-28
Metrics module
Added scanpy.metrics module!
Added scanpy.metrics.gearys_c() for spatial autocorrelation PR 915 I Virshup
Added scanpy.metrics.morans_i() for global spatial autocorrelation PR 1740 I Virshup, G Palla
Added scanpy.metrics.confusion_matrix() for comparing labellings PR 915 I Virshup
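For readers unfamiliar with the two autocorrelation statistics named above, here is a minimal pure-Python sketch of the textbook definitions of Moran's I and Geary's C on a dense weight matrix. The function names and the list-of-lists input are illustrative assumptions and do not reproduce the signatures or the sparse-graph implementation of scanpy.metrics.

```python
def morans_i_sketch(weights, values):
    """Textbook Moran's I: (N / W) * sum_ij w_ij d_i d_j / sum_i d_i^2."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_total = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)  # zero for constant values; not handled here
    return (n / w_total) * num / den


def gearys_c_sketch(weights, values):
    """Textbook Geary's C: (N - 1) * sum_ij w_ij (x_i - x_j)^2 / (2 W sum_i d_i^2)."""
    n = len(values)
    mean = sum(values) / n
    w_total = sum(sum(row) for row in weights)
    num = sum(
        weights[i][j] * (values[i] - values[j]) ** 2
        for i in range(n)
        for j in range(n)
    )
    den = sum((v - mean) ** 2 for v in values)
    return (n - 1) * num / (2 * w_total * den)


# Path graph 0-1-2-3 with symmetric unit weights.
path = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
smooth = [1, 1, 0, 0]       # neighbours mostly agree
alternating = [1, 0, 1, 0]  # neighbours always disagree
```

On this toy graph the smooth signal gives a positive Moran's I and a Geary's C below 1, while the alternating signal gives a negative Moran's I and a Geary's C above 1, which matches the usual reading of both statistics.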
Features
Added layer and copy kwargs to normalize_total() PR 1667 I Virshup
Added vcenter and norm arguments to the plotting functions PR 1551 G Eraslan
Standardized and expanded available arguments to the sc.pl.rank_genes_groups* family of functions. PR 1529 F Ramirez I Virshup - See examples sections of rank_genes_groups_dotplot() and rank_genes_groups_matrixplot() for demonstrations.
scanpy.tl.tsne() now supports the metric argument and records the passed parameters PR 1854 I Virshup
scanpy.external.pl.scrublet_score_distribution() now uses same API as other scanpy functions for saving/showing plots PR 1741 J Manning
Ecosystem
Added triku, a feature selection method, to the ecosystem page PR 1722 AM Ascensión
Added dorothea and progeny to the ecosystem page PR 1767 P Badia-i-Mompel
Documentation
Added rendered examples to many plotting functions issue 1664 A Schaar L Zappia bio-la L Hetzel L Dony M Buttner K Hrovatin F Ramirez I Virshup LouisK92 mayarali
Integrated DocSearch, a find-as-you-type documentation index search. PR 1754 P Angerer
Reorganized reference docs PR 1753 I Virshup
Clarified docs issues for neighbors(), diffmap(), calculate_qc_metrics() PR 1680 G Palla
Fixed typos in grouped plot doc-strings PR 1877 C Rands
Extended examples for differential expression plotting. PR 1529 F Ramirez - See rank_genes_groups_dotplot() or rank_genes_groups_matrixplot() for examples.
Bug fixes
Fix scanpy.pl.paga_path() TypeError with recent versions of anndata PR 1047 P Angerer
Fix detection of whether IPython is running PR 1844 I Virshup
Fixed reproducibility of scanpy.tl.diffmap() (added random_state) PR 1858 I Kucinski
Fixed errors and warnings from embedding plots with small numbers of categories after sns.set_palette was called PR 1886 I Virshup
Fixed handling of gene_symbols argument in a number of sc.pl.rank_genes_groups* functions PR 1529 F Ramirez I Virshup
Fixed handling of use_raw for sc.tl.rank_genes_groups when no .raw is present PR 1895 I Virshup
scanpy.pl.rank_genes_groups_violin() now works for raw=False PR 1669 M van den Beek
scanpy.pl.dotplot() now uses smallest_dot argument correctly PR 1771 S Flemming
Development processes
Switched to flit for building and deploying the package, a simple tool with an easy to understand command line interface and metadata PR 1527 P Angerer
Use pre-commit for style checks PR 1684 PR 1848 L Heumos I Virshup
Deprecations
Dropped support for Python 3.6. More details here. PR 1897 I Virshup
Deprecated layers and layers_norm kwargs to normalize_total() PR 1667 I Virshup
Deprecated MulticoreTSNE backend for scanpy.tl.tsne() PR 1854 I Virshup
Version 1.7
1.7.2 2021-04-07
Bug fixes
scanpy.logging.print_versions() now works when python<3.8 PR 1691 I Virshup
scanpy.pp.regress_out() now uses joblib as the parallel backend, and should stop oversubscribing threads PR 1694 I Virshup
scanpy.pp.highly_variable_genes() with flavor="seurat_v3" now returns correct gene means and -variances when used with batch_key PR 1732 J Lause
scanpy.pp.highly_variable_genes() now throws a warning instead of an error when non-integer values are passed for method "seurat_v3". The check can be skipped by passing check_values=False. PR 1679 G Palla
Ecosystem
1.7.1 2021-02-24
Documentation
More twitter handles for core devs PR 1676 G Eraslan
Bug fixes
dendrogram() now uses 1 - correlation as the distance matrix to compute the dendrogram PR 1614 F Ramirez
Fixed obs_df()/var_df() erroring when keys not passed PR 1637 I Virshup
Fixed argument handling for scanpy.external.pp.scrublet() J Manning
Fixed passing of kwargs to scanpy.pl.violin() when stripplot was also used PR 1655 M van den Beek
Fixed colorbar creation in scanpy.pl.timeseries_as_heatmap PR 1654 M van den Beek
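The dendrogram() entry above refers to a correlation-based distance. The following sketch shows how a 1 - correlation distance matrix can be built from group profiles in pure Python. The helper names are hypothetical and the profiles are toy data, so this illustrates the distance only, not the linkage step or scanpy's own code.

```python
import math


def pearson_correlation(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_distance_matrix(profiles):
    """Pairwise distances d(i, j) = 1 - corr(profile_i, profile_j)."""
    n = len(profiles)
    return [
        [1.0 - pearson_correlation(profiles[i], profiles[j]) for j in range(n)]
        for i in range(n)
    ]


# Toy mean-expression profiles for three groups over three genes.
profiles = [
    [1.0, 2.0, 3.0],  # group A
    [2.0, 4.0, 6.0],  # group B: same shape as A, different scale
    [3.0, 2.0, 1.0],  # group C: opposite trend
]
distances = correlation_distance_matrix(profiles)
```

With this distance, groups A and B are at distance 0 because their profiles are perfectly correlated despite the scale difference, while A and C sit at the maximum distance of 2.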
1.7.0 2021-02-03
Features
Add new 10x Visium datasets to visium_sge() PR 1473 G Palla
Enable download of source image for 10x visium datasets in visium_sge() PR 1506 H Spitzer
Refactor of scanpy.pl.spatial(). Better support for plotting without an image, as well as directly providing images PR 1512 G Palla
Dict input for scanpy.queries.enrich() PR 1488 G Eraslan
rank_genes_groups_df() can now return fraction of cells in a group expressing a gene, and allows retrieving values for multiple groups at once PR 1388 G Eraslan
Color annotations for gene sets in heatmap() are now matched to color for cluster PR 1511 L Sikkema
PCA plots can now annotate axes with variance explained PR 1470 bfurtwa
Plots with groupby arguments can now group by values in the index by passing the index’s name (like pd.DataFrame.groupby). PR 1583 F Ramirez
Added na_color and na_in_legend keyword arguments to embedding() plots. Allows specifying color for missing or filtered values in plots like umap() or spatial() PR 1356 I Virshup
embedding() plots now support passing dict of {cluster_name: cluster_color, ...} for palette argument PR 1392 I Virshup
External tools (new)
Add Scanorama integration to scanpy external API (scanorama_integrate()) [^cite_hie19] PR 1332 B Hie
Scrublet [^cite_wolock19] integration: scrublet(), scrublet_simulate_doublets(), and plotting method scrublet_score_distribution() PR 1476 J Manning
hashsolo() for HTO demultiplexing [^cite_bernstein20] PR 1432 NJ Bernstein
Added scirpy (sc-AIRR analysis) to ecosystem page PR 1453 G Sturm
Added scvi-tools to ecosystem page PR 1421 A Gayoso
External tools (changes)
Updates for palantir() and palantir_results() PR 1245 A Mousa
Fixes to harmony_timeseries() docs PR 1248 A Mousa
Support for leiden clustering by scanpy.external.tl.phenograph() PR 1080 A Mousa
Deprecate scanpy.external.pp.scvi PR 1554 G Xing
Updated default params of sam() to work with larger data PR 1540 A Tarashansky
Documentation
New contribution guide PR 1544 I Virshup
zsh installation instructions PR 1444 P Angerer
Performance
Speed up read_10x_h5() PR 1402 P Weiler
Bugfixes
Consistent fold-change, fractions calculation for filter_rank_genes_groups PR 1391 S Rybakov
Fixed bug where score_genes would error if one gene was passed PR 1398 I Virshup
Fixed log1p inplace on integer dense arrays PR 1400 I Virshup
Fix docstring formatting for rank_genes_groups() PR 1417 P Weiler
Removed PendingDeprecationWarnings from use of np.matrix PR 1424 P Weiler
Fixed indexing bug in scanpy.pp.highly_variable_genes PR 1456 V Bergen
Fix default number of genes for marker_genes_overlap PR 1464 MD Luecken
Fixed passing groupby and dendrogram_key to dendrogram() PR 1465 M Varma
Fixed download path of pbmc3k_processed PR 1472 D Strobl
Better error message when computing DE with a group of size 1 PR 1490 J Manning
Update cugraph API usage for v0.16 PR 1494 R Ilango
Fixed marker_gene_overlap default value for top_n_markers PR 1464 MD Luecken
Pass random_state to RAPIDs UMAP PR 1474 C Nolet
Fixed anndata version requirement for concat() (re-exported from scanpy as sc.concat) PR 1491 I Virshup
Fixed the width of the progress bar when downloading data PR 1507 M Klein
Updated link for moignard15 dataset PR 1542 I Virshup
Fixed bug where calling set_figure_params could block if IPython was installed, but not used. PR 1547 I Virshup
violin() no longer fails if .raw not present PR 1548 I Virshup
spatial() refactoring and better handling of spatial data PR 1512 G Palla
Version 1.6
1.6.0 2020-08-15
This release includes an overhaul of dotplot(), matrixplot(), and stacked_violin() (PR 1210 F Ramirez), and of the internals of rank_genes_groups() (PR 1156 S Rybakov).
Overhaul of dotplot(), matrixplot(), and stacked_violin() PR 1210 F Ramirez
An overhauled tutorial → tutorial: plotting/core.
New plotting classes can be accessed directly (e.g., DotPlot) or using the return_fig param.
It is possible to plot log fold change and p-values in the rank_genes_groups_dotplot() family of functions.
Added ax parameter which allows embedding the plot in other images.
Added option to include a bar plot instead of the dendrogram containing the cell/observation totals per category.
Return a dictionary of axes for further manipulation. This includes the main plot, legend and dendrogram to totals
Legends can be removed.
The groupby param can take a list of categories, e.g., groupby=['tissue', 'cell type'].
Added padding parameter to dotplot and stacked_violin. PR 1270
Added title for colorbar and positioned as in dotplot for matrixplot().
dotplot() changes:
Improved the colorbar and size legend for dotplots. Now the colorbar and size have titles, which can be modified using the colorbar_title and size_title params. They also align at the bottom of the image and do not shrink if the dotplot image is smaller.
Allow plotting genes in rows and categories in columns (swap_axes).
Using DotPlot, the dot_edge_color and line width can be modified, a grid can be added, and other modifications are enabled.
A new style was added in which the dots are replaced by an empty circle and the square behind the circle is colored (like in matrixplots).
stacked_violin() changes:
Violin colors can be colored based on average gene expression as in dotplots.
The linewidth of the violin plots is thinner.
Removed the ticks for the y-axis as they tend to overlap with each other. Using the style method they can be displayed if needed.
Additions
concat() is now exported from scanpy, see Concatenation for more info. PR 1338 I Virshup
Added highly variable gene selection strategy from Seurat v3 PR 1204 A Gayoso
Added backup_url param to read_10x_h5() PR 1296 A Gayoso
Allow prefix for read_10x_mtx() PR 1250 G Sturm
Optional tie correction for the 'wilcoxon' method in rank_genes_groups() PR 1330 S Rybakov
Use sinfo for print_versions() and add print_header() to do what it previously did. PR 1338 I Virshup PR 1373
Bug fixes
Avoid warning in rank_genes_groups() if ‘t-test’ is passed PR 1303 A Wolf
Restrict sphinx version to <3.1, >3.0 PR 1297 I Virshup
Clean up _ranks and fix dendrogram for scipy 1.5 PR 1290 S Rybakov
Use .raw to translate gene symbols if applicable PR 1278 E Rice
Fix diffmap (issue 1262) G Eraslan
Fix neighbors in spring_project issue 1260 S Rybakov
Fix default size of dot in spatial plots PR 1255 issue 1253 giovp
Bumped version requirement of scipy to scipy>1.4 to support rmatmat argument of LinearOperator issue 1246 I Virshup
Fix asymmetry of scores for the 'wilcoxon' method in rank_genes_groups() issue 754 S Rybakov
Avoid trimming of gene names in rank_genes_groups() issue 753 S Rybakov
Version 1.5
1.5.1 2020-05-21
Bug fixes
1.5.0 2020-05-15
The 1.5.0 release adds a lot of new functionality, much of which takes advantage of anndata updates 0.7.0 - 0.7.2. Highlights of this release include support for spatial data, dedicated handling of graphs in AnnData, sparse PCA, an interface with scvi, and others.
Spatial data support
Basic analysis → tutorial: spatial/basic-analysis and integration with single cell data → tutorial: spatial/integration-scanorama G Palla
read_visium() read 10x Visium data PR 1034 G Palla, P Angerer, I Virshup
visium_sge() load Visium data directly from 10x Genomics PR 1013 M Mirkazemi, G Palla, P Angerer
New functionality
Many functions, like neighbors() and umap(), now store cell-by-cell graphs in obsp PR 1118 S Rybakov
scale() and log1p() can be used on any element in layers or obsm PR 1173 I Virshup
External tools
scanpy.external.pp.scvi for preprocessing with scVI PR 1085 G Xing
Guide for using Scanpy in R PR 1186 L Zappia
Performance
pca() now uses efficient implicit centering for sparse matrices. This can lead to significantly improved performance for large datasets PR 1066 A Tarashansky
score_genes() now has an efficient implementation for sparse matrices with missing values PR 1196 redst4r.
Warning
The new pca() implementation can result in slightly different results for sparse matrices. See the PR (PR 1066) and documentation for more info.
Code design
stacked_violin() can now be used as a subplot PR 1084 P Angerer
score_genes() has improved logging PR 1119 G Eraslan
scale() now saves mean and standard deviation in the var PR 1173 A Wolf
harmony_timeseries() PR 1091 A Mousa
Bug fixes
combat() now works when obs_names aren’t unique. PR 1215 I Virshup
scale() can now be used on dense arrays without centering PR 1160 simonwm
regress_out() now works when some features are constant PR 1194 simonwm
normalize_total() errored if the passed object was a view PR 1200 I Virshup
neighbors() sometimes ignored the n_pcs param PR 1124 V Bergen
ebi_expression_atlas() which contained some out-of-date URLs PR 1102 I Virshup
highly_variable_genes() which could lead to incorrect results when the batch_key argument was used PR 1180 G Eraslan
ingest() where an inconsistent number of neighbors was used PR 1111 S Rybakov
Version 1.4
1.4.6 2020-03-17
Functionality in external
sam() self-assembling manifolds [^cite_tarashansky19] PR 903 A Tarashansky
harmony_timeseries() for trajectory inference on discrete time points PR 994 A Mousa
wishbone() for trajectory inference (bifurcations) PR 1063 A Mousa
Code design
Bug fixes
1.4.5 2019-12-30
Please install scanpy==1.4.5.post3 instead of scanpy==1.4.5.
New functionality
ingest() maps labels and embeddings of reference data to new data → tutorial: integrating-data-using-ingest PR 651 S Rybakov, A Wolf
queries received many updates including enrichment through gprofiler and more advanced biomart queries PR 467 I Virshup
set_figure_params() allows setting figsize and accepts facecolor='white', useful for working in dark mode A Wolf
Code design
downsample_counts now always preserves the dtype of its input, instead of converting floats to ints PR 865 I Virshup
run neighbors on a GPU using rapids PR 830 T White
param docs from typed params P Angerer
embedding_density() now only takes one positional argument; similar for embedding_density(), which gains a param groupby PR 965 A Wolf
webpage overhaul, ecosystem page, release notes, tutorials overhaul PR 960 PR 966 A Wolf
Warning
changed default solver in pca() from auto to arpack
changed default use_raw in score_genes() from False to None
1.4.4 2019-07-20
New functionality
scanpy.get adds helper functions for extracting data in convenient formats PR 619 I Virshup
Bug fixes
Stopped deprecation warnings from AnnData 0.6.22 I Virshup
Code design
normalize_total() gains param exclude_highly_expressed, and fraction is renamed to max_fraction with better docs A Wolf
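The interaction of exclude_highly_expressed and max_fraction can be sketched in pure Python as below. This is an illustration under stated assumptions, not scanpy's code: the function name is hypothetical, target_sum is required here for simplicity, the 0.05 default for max_fraction is assumed, and a gene is treated as highly expressed when it exceeds max_fraction of the total counts in at least one cell.

```python
def normalize_total_sketch(
    counts, target_sum, exclude_highly_expressed=False, max_fraction=0.05
):
    """Scale each cell (row) so its size factor maps to `target_sum`.

    Hypothetical helper. When `exclude_highly_expressed` is set, genes that
    take up more than `max_fraction` of any cell's counts are left out of
    the size factor, but every gene is still rescaled.
    """
    totals = [sum(row) for row in counts]

    excluded = set()
    if exclude_highly_expressed:
        for row, total in zip(counts, totals):
            for j, x in enumerate(row):
                if total > 0 and x > max_fraction * total:
                    excluded.add(j)

    sizes = [
        sum(x for j, x in enumerate(row) if j not in excluded) for row in counts
    ]
    # Cells with a zero size factor are returned unscaled.
    return [
        [x * target_sum / size if size > 0 else float(x) for x in row]
        for row, size in zip(counts, sizes)
    ]


counts = [
    [90, 5, 5],    # gene 0 dominates this cell
    [10, 10, 10],
]
plain = normalize_total_sketch(counts, target_sum=10)
robust = normalize_total_sketch(
    counts, target_sum=10, exclude_highly_expressed=True, max_fraction=0.5
)
```

In the plain result the dominant gene pushes the other genes of the first cell down to 0.5, whereas excluding it from the size factor keeps the remaining genes comparable across both cells.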
1.4.3 2019-05-14
Bug fixes
neighbors() correctly infers n_neighbors again from params, which was temporarily broken in v1.4.2 I Virshup
Code design
calculate_qc_metrics() is single threaded by default for datasets under 300,000 cells, allowing cached compilation PR 615 I Virshup
1.4.2 2019-05-06
New functionality
combat() supports additional covariates which may include adjustment variables or biological condition PR 618 G Eraslan
highly_variable_genes() has a batch_key option which performs HVG selection in each batch separately to avoid selecting genes that vary strongly across batches PR 622 G Eraslan
Bug fixes
rank_genes_groups() t-test implementation doesn’t return NaN when variance is 0, also changed to scipy’s implementation PR 621 I Virshup
umap() with init_pos='paga' detects correct dtype A Wolf
louvain() and leiden() auto-generate key_added=louvain_R upon passing restrict_to, which was temporarily changed in 1.4.1 A Wolf
Code design
neighbors() and umap() got rid of UMAP legacy code and introduced UMAP as a dependency PR 576 S Rybakov
1.4.1 2019-04-26
New functionality
Scanpy has a command line interface again. Invoking it with scanpy somecommand [args] calls scanpy-somecommand [args], except for builtin commands (currently scanpy settings) PR 604 P Angerer
ebi_expression_atlas() allows convenient download of EBI expression atlas I Virshup
marker_gene_overlap() computes overlaps of marker genes M Luecken
filter_rank_genes_groups() filters out genes based on fold change and fraction of cells expressing genes F Ramirez
normalize_total() replaces normalize_per_cell(), is more efficient and provides a parameter to only normalize using a fraction of expressed genes S Rybakov
downsample_counts() has been sped up, changed default value of replace parameter to False PR 474 I Virshup
embedding_density() computes densities on embeddings PR 543 M Luecken
palantir() interfaces Palantir [^cite_setty18] PR 493 A Mousa
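The command line dispatch described in the first item of this list (scanpy somecommand [args] calling scanpy-somecommand [args], with settings as the one builtin) follows a common plugin pattern. The sketch below illustrates that pattern with the standard library only. The names BUILTIN_COMMANDS, resolve_command, and dispatch are hypothetical and this is not scanpy's actual CLI code.

```python
import shutil
import subprocess

# Assumption: `settings` is the only builtin, as stated in the release note.
BUILTIN_COMMANDS = {"settings"}


def resolve_command(argv):
    """Map ['somecommand', *args] to a builtin or an external executable name."""
    if not argv:
        raise ValueError("expected a subcommand")
    name, *args = argv
    if name in BUILTIN_COMMANDS:
        return ("builtin", name, args)
    return ("external", f"scanpy-{name}", args)


def dispatch(argv):
    """Run an external subcommand found on PATH and return its exit code."""
    kind, target, args = resolve_command(argv)
    if kind == "builtin":
        raise NotImplementedError("builtin handling is outside this sketch")
    executable = shutil.which(target)
    if executable is None:
        raise FileNotFoundError(f"no executable named {target!r} on PATH")
    return subprocess.call([executable, *args])
```

The appeal of this design is that third-party packages can add subcommands simply by installing an executable with the right prefix, without the main package knowing about them in advance.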
Code design
.layers support of scatter plots F Ramirez
fix double-logarithmization in compute of log fold change in rank_genes_groups() A Muñoz-Rojas
fix return sections of docs P Angerer
Version 1.3
1.3.6 2018-12-11
Major updates
a new plotting gallery for visualizing-marker-genes F Ramirez
tutorials are integrated on ReadTheDocs, pbmc3k and paga-paul15 A Wolf
Interactive exploration of analysis results through manifold viewers
CZI’s cellxgene directly reads .h5ad files the cellxgene developers
the UCSC Single Cell Browser requires exporting via cellbrowser() M Haeussler
Code design
highly_variable_genes() supersedes filter_genes_dispersion(), it gives the same results but, by default, expects logarithmized data and doesn’t subset A Wolf
1.3.5 2018-12-09
uncountable figure improvements PR 369 F Ramirez
1.3.4 2018-11-24
leiden() wraps the recent graph clustering package by [^cite_traag18] K Polanski
bbknn() wraps the recent batch correction package [^cite_polanski19] K Polanski
calculate_qc_metrics() calculates a number of quality control metrics, similar to calculateQCMetrics from Scater [^cite_mccarthy17] I Virshup
1.3.3 2018-11-05
Major updates
a fully distributed preprocessing backend T White and the Laserson Lab
Code design
read_10x_h5() and read_10x_mtx() read Cell Ranger 3.0 outputs PR 334 Q Gong
Note
Also see changes in anndata 0.6.
changed default compression to None in write_h5ad() to speed up read and write, disk space use is usually less critical
performance gains in write_h5ad() due to better handling of strings and categories S Rybakov
1.3.1 2018-09-03
RNA velocity in single cells [^cite_manno18]
Scanpy and AnnData support loom’s layers so that computations for single-cell RNA velocity [^cite_manno18] become feasible S Rybakov and V Bergen
scvelo harmonizes with Scanpy and is able to process loom files with splicing information produced by Velocyto [^cite_manno18], it runs a lot faster than the count matrix analysis of Velocyto and provides several conceptual developments
Plotting (Generic)
dotplot() for visualizing genes across conditions and clusters, see here PR 199 F Ramirez
violin() produces very compact overview figures with many panels PR 175 F Ramirez
There now is a section on imputation in external:
Version 1.2
1.2.1 2018-06-08
Plotting of Generic marker genes and quality control.
highest_expr_genes() for quality control; plot genes with highest mean fraction of cells, similar to plotQC of Scater [^cite_mccarthy17] PR 169 F Ramirez
1.2.0 2018-06-08
Version 1.1
1.1.0 2018-06-01
set_figure_params() by default passes vector_friendly=True and allows you to produce reasonably sized pdfs by rasterizing large scatter plots A Wolf
draw_graph() defaults to the ForceAtlas2 layout [^cite_jacomy14] [^cite_chippada18], which is often more visually appealing and whose computation is much faster S Wollock
scatter() also plots along variables axis MD Luecken
regress_out() is back to multiprocessing F Ramirez
read() reads compressed text files G Eraslan
mitochondrial_genes() for querying mito genes FG Brundu
mnn_correct() for batch correction [^cite_haghverdi18] [^cite_kang18]
phate() for low-dimensional embedding [^cite_moon17] S Gigante
sandbag(), cyclone() for scoring genes [^cite_scialdone15] [^cite_fechtner18]
Version 1.0
1.0.0 2018-03-30
Major updates
Scanpy is much faster and more memory efficient: preprocess, cluster and visualize 1.3M cells in 6h, 130K cells in 14min, and 68K cells in 3min A Wolf
the API gained a preprocessing function neighbors() and a class Neighbors() to which all basic graph computations are delegated A Wolf
Warning
Upgrading to 1.0 isn’t fully backwards compatible in the following changes
the graph-based tools louvain(), dpt(), draw_graph(), umap(), diffmap(), paga() require prior computation of the graph: sc.pp.neighbors(adata, n_neighbors=5); sc.tl.louvain(adata) instead of previously sc.tl.louvain(adata, n_neighbors=5)
install numba via conda install numba, which replaces cython
the default connectivity measure (dpt will look different using default settings) changed. setting method='gauss' in sc.pp.neighbors uses gauss kernel connectivities and reproduces the previous behavior, see, for instance in the example paul15.
namings of returned annotation have changed for less bloated AnnData objects, which means that some of the unstructured annotation of old AnnData files is not recognized anymore
replace occurrences of group_by with groupby (consistency with pandas)
it is worth checking out the notebook examples to see changes, e.g. the seurat example.
upgrading scikit-learn from 0.18 to 0.19 changed the implementation of PCA, some results might therefore look slightly different
Further updates
UMAP [^cite_mcinnes18] can serve as a first visualization of the data just as tSNE, in contrast to tSNE, UMAP directly embeds the single-cell graph and is faster; UMAP is also used for measuring connectivities and computing neighbors, see neighbors() A Wolf
graph abstraction: AGA is renamed to PAGA: paga(); now, it only measures connectivities between partitions of the single-cell graph, pseudotime and clustering need to be computed separately via louvain() and dpt(), the connectivity measure has been improved A Wolf
logistic regression for finding marker genes rank_genes_groups() with parameter method='logreg' A Wolf
louvain() provides a better implementation for reclustering via restrict_to A Wolf
scanpy no longer modifies rcParams upon import, call settings.set_figure_params to set the ‘scanpy style’ A Wolf
default cache directory is ./cache/, set settings.cachedir to change this; nested directories in this are avoided A Wolf
show edges in scatter plots based on graph visualization draw_graph() and umap() by passing edges=True A Wolf
downsample_counts() for downsampling counts MD Luecken
default 'louvain_groups' are called 'louvain' A Wolf
'X_diffmap' contains the zero component, plotting remains unchanged A Wolf
Version 0.4
0.4.3 2018-02-09
clustermap(): heatmap from hierarchical clustering, based on seaborn.clustermap() [^cite_waskom16] A Wolf
only return matplotlib.axes.Axes in plotting functions of sc.pl when show=False, otherwise None A Wolf
0.4.2 2018-01-07
amendments in PAGA and its plotting functions A Wolf
0.4.0 2017-12-23
export to SPRING [^cite_weinreb17] for interactive visualization of data: spring tutorial S Wollock
Version 0.3
0.3.2 2017-11-29
finding marker genes via rank_genes_groups_violin() improved, see issue 51 F Ramirez
0.3.0 2017-11-16
AnnData gains method concatenate() A Wolf
AnnData is available as the separate anndata package P Angerer, A Wolf
results of PAGA simplified A Wolf
Version 0.2
0.2.9 2017-10-25
Initial release of the new trajectory inference method PAGA
paga() computes an abstracted, coarse-grained (PAGA) graph of the neighborhood graph A Wolf
paga_compare() plots this graph next to an embedding A Wolf
paga_path() plots a heatmap through a node sequence in the PAGA graph A Wolf
0.2.1 2017-07-24
Scanpy includes preprocessing, visualization, clustering, pseudotime and trajectory inference, differential expression testing and simulation of gene regulatory networks. The implementation efficiently deals with datasets of more than one million cells. A Wolf, P Angerer
Version 0.1
0.1.0 2017-05-17
Scanpy computationally outperforms and allows reproducing both the Cell Ranger R kit’s and most of Seurat’s clustering workflows. A Wolf, P Angerer