scanpy.external.pp.hashsolo(adata, cell_hashing_columns, priors=[0.01, 0.8, 0.19], pre_existing_clusters=None, number_of_noise_barcodes=None, inplace=True)

Probabilistic demultiplexing of cell hashing data using HashSolo [Bernstein20].



adata : AnnData

AnnData object with cell hashing counts stored in .obs columns.

cell_hashing_columns : list

List of the column names in adata.obs that contain cell hashing counts.

priors : list (default: [0.01, 0.8, 0.19])

A list of your priors for each hypothesis, in order: the first element is the prior for the negative hypothesis, the second is the prior for the singlet hypothesis, and the third is the prior for the doublet hypothesis. The default is [0.01, 0.8, 0.19] because we assume the barcodes in your cell hashing matrix belong to cells that have already passed QC in transcriptome space (e.g. UMI counts, percentage of mitochondrial reads).

pre_existing_clusters : Optional[str] (default: None)

Column in adata.obs used to split the demultiplexing into groups, for example Leiden clusters or cell types. Do not use batches for this.

number_of_noise_barcodes : Optional[int] (default: None)

Use this if you wish to change the number of barcodes used to create the noise distribution. The default is the number of cell hashes minus 2.

inplace : bool (default: True)

Whether to perform the operation in place.
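
The ordering of the priors list is easy to get wrong, so a small sketch may help. The helper `make_priors` below is hypothetical and not part of scanpy. It assumes the three priors are probabilities that should sum to 1, which holds for the default [0.01, 0.8, 0.19] but is not stated explicitly in this documentation.

```python
import math


def make_priors(negative, doublet):
    """Hypothetical helper (not part of scanpy).

    Builds a [negative, singlet, doublet] list in the order described
    for the `priors` parameter, assuming the three values sum to 1.
    """
    if not (0.0 <= negative <= 1.0 and 0.0 <= doublet <= 1.0):
        raise ValueError("each prior must lie in [0, 1]")
    singlet = 1.0 - negative - doublet
    if singlet < 0.0:
        raise ValueError("negative and doublet priors must sum to at most 1")
    return [negative, singlet, doublet]


# Reproduces the documented default up to floating-point rounding.
priors = make_priors(negative=0.01, doublet=0.19)
assert math.isclose(sum(priors), 1.0)
```

The resulting list could then be passed as `priors=priors`; lowering the negative prior further only makes sense if the cells have already been filtered as described above.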


adata : if inplace is False, returns an AnnData object with the demultiplexing results in its .obs attribute; otherwise the operation is done in place.


>>> import anndata
>>> import scanpy.external as sce
>>> data = anndata.read_h5ad("data.h5ad")
>>> sce.pp.hashsolo(data, ['Hash1', 'Hash2', 'Hash3'])
>>> data.obs.head()
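
The documented default for number_of_noise_barcodes (number of cell hashes minus 2) can be written out as a short sketch. The function `resolve_noise_barcodes` and the six column names are hypothetical and only illustrate the rule stated in the parameter description; this is not the library's internal code.

```python
from typing import Optional, Sequence


def resolve_noise_barcodes(
    cell_hashing_columns: Sequence[str],
    number_of_noise_barcodes: Optional[int] = None,
) -> int:
    """Hypothetical helper (not part of scanpy).

    Returns the explicit value if one is given; otherwise applies the
    documented default of (number of cell hashes - 2).
    """
    if number_of_noise_barcodes is None:
        return len(cell_hashing_columns) - 2
    return number_of_noise_barcodes


# Assumed panel of six hashing columns, for illustration only.
columns = ["Hash1", "Hash2", "Hash3", "Hash4", "Hash5", "Hash6"]

default_noise = resolve_noise_barcodes(columns)    # documented default
custom_noise = resolve_noise_barcodes(columns, 3)  # explicit override
```

With the three columns used in the example above, the same rule would give a default of 1 noise barcode.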