scanpy.external.pp.hashsolo

scanpy.external.pp.hashsolo(adata, cell_hashing_columns, priors=[0.01, 0.8, 0.19], pre_existing_clusters=None, number_of_noise_barcodes=2, inplace=True)

Probabilistic demultiplexing of cell hashing data using HashSolo [Bernstein20].

Note

More information and bug reports here.

Parameters
adata : AnnData

AnnData object with cell hashes in .obs columns.

cell_hashing_columns : list

List specifying which columns in adata.obs contain cell hashing counts.

priors : list (default: [0.01, 0.8, 0.19])

A list of prior probabilities, one per hypothesis, in this order: the first element is the prior for the negative hypothesis, the second is the prior for the singlet hypothesis, and the third is the prior for the doublet hypothesis. The default is [0.01, 0.8, 0.19] because the barcodes in the cell hashing matrix are assumed to belong to cells that have already passed QC in transcriptome space, e.g. UMI counts, percentage of mitochondrial reads, etc.
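To make the role of the three priors concrete, the following is a minimal sketch of how a prior over the negative, singlet, and doublet hypotheses combines with per-hypothesis likelihoods through Bayes' rule. It is not HashSolo's actual model: the function name `posterior` and the likelihood values are assumptions made up for illustration, and only the default prior values come from the signature above.

```python
# Illustrative only: a generic Bayes' rule update over three hypotheses.
# This is NOT the HashSolo implementation; `posterior` and the likelihood
# values below are hypothetical and chosen purely for demonstration.

HYPOTHESES = ("negative", "singlet", "doublet")


def posterior(priors, likelihoods):
    """Return normalized posterior probabilities for the three hypotheses."""
    if len(priors) != 3 or len(likelihoods) != 3:
        raise ValueError("expected one prior and one likelihood per hypothesis")
    # Unnormalized posterior: prior times likelihood for each hypothesis.
    joint = [p * lk for p, lk in zip(priors, likelihoods)]
    evidence = sum(joint)
    if evidence <= 0:
        raise ValueError("at least one hypothesis needs non-zero support")
    # Dividing by the evidence makes the posterior sum to 1.
    return [j / evidence for j in joint]


default_priors = [0.01, 0.8, 0.19]     # same order as the `priors` parameter
example_likelihoods = [0.2, 0.5, 0.3]  # made-up likelihoods for one barcode

post = posterior(default_priors, example_likelihoods)
best = HYPOTHESES[post.index(max(post))]

for name, prob in zip(HYPOTHESES, post):
    print(f"{name}: {prob:.4f}")
print("most likely hypothesis:", best)
```

In this sketch a small negative prior means a barcode is called negative only when the data support that hypothesis strongly, which matches the stated assumption that the cells have already passed QC.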

pre_existing_clusters : str | None (default: None)

Column in adata.obs that defines how to break up demultiplexing, for example Leiden clusters or cell types, but not batches.

number_of_noise_barcodes : int (default: 2)

Use this if you wish to change the number of barcodes used to create the noise distribution. The default is number of cell hashes - 2.

inplace : bool (default: True)

Whether to perform the operation in place.

Returns

If inplace is False, returns adata, an AnnData object with the demultiplexing results in its .obs attribute; otherwise the operation is performed in place.

Examples

>>> import anndata
>>> import scanpy.external as sce
>>> data = anndata.read_h5ad("data.h5ad")
>>> sce.pp.hashsolo(data, ['Hash1', 'Hash2', 'Hash3'])
>>> data.obs.head()