scanpy.external.pp.hashsolo
- scanpy.external.pp.hashsolo(adata, cell_hashing_columns, *, priors=(0.01, 0.8, 0.19), pre_existing_clusters=None, number_of_noise_barcodes=None, inplace=True)
Probabilistic demultiplexing of cell hashing data using HashSolo [Bernstein et al., 2020].
Note
More information and bug reports here.
- Parameters:
  - adata : AnnData
    The (annotated) data matrix of shape n_obs × n_vars. Rows correspond to cells and columns to genes.
  - cell_hashing_columns : Sequence[str]
    .obs columns that contain cell hashing counts.
  - priors : tuple[float, float, float] (default: (0.01, 0.8, 0.19))
    Prior probabilities of each hypothesis, in the order [negative, singlet, doublet]. The default is set to [0.01, 0.8, 0.19] assuming barcode counts are from cells that have passed QC in the transcriptome space, e.g. UMI counts, pct mito reads, etc.
  - pre_existing_clusters : str | None (default: None)
    The column in .obs containing pre-existing cluster assignments (e.g. Leiden clusters or cell types, but not batch assignments). If provided, demultiplexing will be performed separately for each cluster.
  - number_of_noise_barcodes : int | None (default: None)
    The number of barcodes used to create the noise distribution. Defaults to len(cell_hashing_columns) - 2.
  - inplace : bool (default: True)
    Whether to update adata in-place or return a copy.
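The defaults described above can be checked before calling the function. The following is a minimal sketch, not part of scanpy: check_priors and default_noise_barcodes are hypothetical helpers, and the requirement that the three priors sum to 1 is an assumption based on the default values.

```python
import math

# Order of hypotheses as documented for the priors parameter.
HYPOTHESES = ("negative", "singlet", "doublet")


def check_priors(priors):
    """Return priors as a (negative, singlet, doublet) tuple after sanity checks.

    Hypothetical helper. Assumes the priors should be non-negative and
    sum to 1, as the documented default (0.01, 0.8, 0.19) does.
    """
    priors = tuple(float(p) for p in priors)
    if len(priors) != len(HYPOTHESES):
        raise ValueError(f"expected {len(HYPOTHESES)} priors, got {len(priors)}")
    if any(p < 0 for p in priors):
        raise ValueError("priors must be non-negative")
    if not math.isclose(sum(priors), 1.0, abs_tol=1e-9):
        raise ValueError("priors are assumed to sum to 1")
    return priors


def default_noise_barcodes(cell_hashing_columns):
    """Documented default for number_of_noise_barcodes when None is passed."""
    return len(cell_hashing_columns) - 2
```

With three hashing columns, for example, the documented default yields a single noise barcode.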
- Return type:
- Returns:
  A copy of the input adata if inplace=False, otherwise the input adata. The following fields are added:
  - .obs["most_likely_hypothesis"]
    Index of the most likely hypothesis, where 0 corresponds to negative, 1 to singlet, and 2 to doublet.
  - .obs["cluster_feature"]
    The cluster assignments used for demultiplexing.
  - .obs["negative_hypothesis_probability"]
    Probability of the negative hypothesis.
  - .obs["singlet_hypothesis_probability"]
    Probability of the singlet hypothesis.
  - .obs["doublet_hypothesis_probability"]
    Probability of the doublet hypothesis.
  - .obs["Classification"]
    Classification of the cell, one of the barcodes in cell_hashing_columns, "Negative", or "Doublet".
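The integer codes and labels listed above can be turned into a readable summary. The sketch below uses only the standard library. label_hypotheses and summarize_classification are hypothetical helpers, and the example values are illustrative, not real output.

```python
from collections import Counter

# Mapping documented for .obs["most_likely_hypothesis"].
HYPOTHESIS_LABELS = {0: "negative", 1: "singlet", 2: "doublet"}


def label_hypotheses(indices):
    """Translate hypothesis indices (0, 1, 2) into their documented names."""
    return [HYPOTHESIS_LABELS[int(i)] for i in indices]


def summarize_classification(classifications):
    """Count how many cells received each classification label."""
    return Counter(classifications)


# Illustrative values only; real values come from adata.obs after hashsolo runs.
example_indices = [1, 1, 2, 0]
example_classification = ["Hash1", "Hash2", "Doublet", "Negative"]

labels = label_hypotheses(example_indices)
counts = summarize_classification(example_classification)
```

On a real AnnData object, adata.obs["most_likely_hypothesis"] and adata.obs["Classification"] can be passed in place of the example lists.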
Examples
>>> import anndata
>>> import scanpy.external as sce
>>> adata = anndata.read_h5ad("data.h5ad")
>>> sce.pp.hashsolo(adata, ["Hash1", "Hash2", "Hash3"])
>>> adata.obs.head()
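A variant of the example above that demultiplexes per cluster and returns a copy might look like the following sketch. demultiplex_per_cluster and missing_obs_columns are hypothetical wrappers, not scanpy functions. Only the keyword arguments documented on this page are passed through, and scanpy is imported inside the function so the helpers can be defined without it installed.

```python
def missing_obs_columns(obs_columns, required):
    """Return the required column names that are absent from obs_columns."""
    present = set(obs_columns)
    return [name for name in required if name not in present]


def demultiplex_per_cluster(adata, hashing_columns, cluster_key,
                            priors=(0.01, 0.8, 0.19)):
    """Run HashSolo separately for each cluster and return an annotated copy.

    Hypothetical wrapper around scanpy.external.pp.hashsolo; cluster_key is
    assumed to name an existing .obs column with cluster assignments.
    """
    # Imported lazily so this module loads even when scanpy is not installed.
    import scanpy.external as sce

    missing = missing_obs_columns(adata.obs.columns,
                                  [*hashing_columns, cluster_key])
    if missing:
        raise KeyError(f"columns not found in adata.obs: {missing}")

    # inplace=False leaves the input untouched and returns a copy.
    return sce.pp.hashsolo(
        adata,
        list(hashing_columns),
        priors=priors,
        pre_existing_clusters=cluster_key,
        inplace=False,
    )
```

For instance, demultiplex_per_cluster(adata, ["Hash1", "Hash2", "Hash3"], "leiden") would apply here if adata.obs contains a "leiden" column, which is an assumption about the dataset.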