scanpy.external.pp.hashsolo
- scanpy.external.pp.hashsolo(adata, cell_hashing_columns, *, priors=(0.01, 0.8, 0.19), pre_existing_clusters=None, number_of_noise_barcodes=None, inplace=True)
Probabilistic demultiplexing of cell hashing data using HashSolo [Bernstein et al., 2020].
Note
More information and bug reports here.
- Parameters:
- adata
AnnData
The (annotated) data matrix of shape n_obs × n_vars. Rows correspond to cells and columns to genes.
- cell_hashing_columns
Sequence[str]
.obs columns that contain cell hashing counts.
- priors
tuple[float, float, float] (default: (0.01, 0.8, 0.19))
Prior probabilities of each hypothesis, in the order [negative, singlet, doublet]. The default is set to [0.01, 0.8, 0.19], assuming barcode counts are from cells that have passed QC in the transcriptome space, e.g. UMI counts, pct mito reads, etc.
- pre_existing_clusters
str | None (default: None)
The column in .obs containing pre-existing cluster assignments (e.g. Leiden clusters or cell types, but not batch assignments). If provided, demultiplexing will be performed separately for each cluster.
- number_of_noise_barcodes
int | None (default: None)
The number of barcodes used to create the noise distribution. Defaults to len(cell_hashing_columns) - 2.
- inplace
bool (default: True)
Whether to update adata in-place or return a copy.
- Return type:
- Returns:
A copy of the input adata if inplace=False, otherwise the input adata. The following fields are added:
- .obs["most_likely_hypothesis"]
Index of the most likely hypothesis, where 0 corresponds to negative, 1 to singlet, and 2 to doublet.
- .obs["cluster_feature"]
The cluster assignments used for demultiplexing.
- .obs["negative_hypothesis_probability"]
Probability of the negative hypothesis.
- .obs["singlet_hypothesis_probability"]
Probability of the singlet hypothesis.
- .obs["doublet_hypothesis_probability"]
Probability of the doublet hypothesis.
- .obs["Classification"]
Classification of the cell: one of the barcodes in cell_hashing_columns, "Negative", or "Doublet".
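The fields listed above are ordinary .obs columns, so they can be post-processed with pandas. The following is a minimal sketch, not part of scanpy: the `obs` table is a hypothetical stand-in for `adata.obs` after demultiplexing, its values are made up, and the `hypothesis_label` column and `HYPOTHESIS_LABELS` mapping are names introduced here for illustration only.

```python
import pandas as pd

# Hypothetical stand-in for adata.obs after hashsolo has run.
# The cell names and values are invented for illustration.
obs = pd.DataFrame(
    {
        "most_likely_hypothesis": [1, 2, 0, 1],
        "Classification": ["Hash1", "Doublet", "Negative", "Hash3"],
    },
    index=["cell_a", "cell_b", "cell_c", "cell_d"],
)

# Index-to-label mapping as documented: 0 negative, 1 singlet, 2 doublet.
HYPOTHESIS_LABELS = {0: "negative", 1: "singlet", 2: "doublet"}
obs["hypothesis_label"] = obs["most_likely_hypothesis"].map(HYPOTHESIS_LABELS)

# Keep only the cells whose most likely hypothesis is "singlet".
singlet_mask = obs["most_likely_hypothesis"] == 1
singlets = obs[singlet_mask]

# Count singlets per hashing barcode.
singlets_per_hash = singlets["Classification"].value_counts().to_dict()
```

With a real AnnData object, the same boolean mask built from `adata.obs` could presumably be used to subset the object itself, e.g. `adata[adata.obs["most_likely_hypothesis"] == 1].copy()`.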
Examples
>>> import anndata
>>> import scanpy.external as sce
>>> adata = anndata.read_h5ad("data.h5ad")
>>> sce.pp.hashsolo(adata, ["Hash1", "Hash2", "Hash3"])
>>> adata.obs.head()
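Before making a call like the one in the example, it may help to check the arguments against the parameter descriptions. The sketch below is a hypothetical helper, not part of scanpy. It assumes the three priors should be non-negative and sum to 1 (an assumption suggested by the default values, not a documented requirement), and it reproduces the documented default for number_of_noise_barcodes.

```python
import math


def check_hashsolo_args(
    cell_hashing_columns,
    priors=(0.01, 0.8, 0.19),
    number_of_noise_barcodes=None,
):
    """Hypothetical pre-flight check for hashsolo arguments (not a scanpy API)."""
    if len(priors) != 3:
        raise ValueError("priors must have three entries: negative, singlet, doublet")
    # Assumption: the priors form a probability distribution over the hypotheses.
    if any(p < 0 for p in priors) or not math.isclose(sum(priors), 1.0):
        raise ValueError("priors should be non-negative and sum to 1")
    # Documented default: len(cell_hashing_columns) - 2.
    if number_of_noise_barcodes is None:
        number_of_noise_barcodes = len(cell_hashing_columns) - 2
    # Label the priors in the documented order [negative, singlet, doublet].
    labelled_priors = dict(zip(("negative", "singlet", "doublet"), priors))
    return labelled_priors, number_of_noise_barcodes


labelled_priors, n_noise = check_hashsolo_args(["Hash1", "Hash2", "Hash3"])
```

For the three hashing columns used in the example, the documented default works out to a single noise barcode.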