scanpy.external.tl.sandbag

scanpy.external.tl.sandbag(adata, annotation=None, *, fraction=0.65, filter_genes=None, filter_samples=None)

Calculate marker pairs of genes [Fechtner, 2018, Scialdone et al., 2015].

Calculates the pairs of genes serving as marker pairs for each phase, based on a matrix of gene counts and an annotation of known phases.

This reproduces the approach of Scialdone et al. [2015] in the implementation of Fechtner [2018].

More information and bug reports here.

Parameters:
adata AnnData

The annotated data matrix.

annotation Mapping[str, Collection[str | int | bool]] | None (default: None)

Mapping from category to genes, e.g. {'phase': [Gene1, ...]}. Defaults to data.vars['category'].

fraction float (default: 0.65)

Fraction of cells per category where marker criteria must be satisfied.

filter_genes Collection[str | int | bool] | None (default: None)

Genes for sampling the reference set. Defaults to all genes.

filter_samples Collection[str | int | bool] | None (default: None)

Cells for sampling the reference set. Defaults to all samples.

Return type:

dict[str, list[tuple[str, str]]]

Returns:

A dict mapping from category to lists of marker pairs, e.g.: {'Category_1': [(Gene_1, Gene_2), ...], ...}.

Examples

>>> from scanpy.external.tl import sandbag
>>> from pypairs import datasets
>>> adata = datasets.leng15()
>>> marker_pairs = sandbag(adata, fraction=0.5)
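
The role of fraction can be illustrated with a small pure-Python sketch. It is not the pypairs implementation; it assumes a simplified form of the Scialdone et al. [2015] criterion, in which a pair (g1, g2) marks a category when g1 is expressed above g2 in at least fraction of that category's cells and below g2 in at least fraction of the cells of every other category. The function name toy_marker_pairs, the gene names, and the counts are hypothetical.

```python
from itertools import permutations


def toy_marker_pairs(counts, annotation, fraction=0.65):
    """Toy marker-pair search (illustrative only, not the pypairs code).

    counts:     dict mapping gene name -> list of per-cell expression values
    annotation: dict mapping category -> list of cell indices in that category
    """

    def share_greater(g1, g2, cells):
        # Share of the given cells in which g1 is strictly above g2.
        hits = sum(1 for c in cells if counts[g1][c] > counts[g2][c])
        return hits / len(cells)

    result = {category: [] for category in annotation}
    for category, cells in annotation.items():
        others = [idx for name, idx in annotation.items() if name != category]
        for g1, g2 in permutations(counts, 2):
            # Assumed criterion: g1 > g2 in this category ...
            up_here = share_greater(g1, g2, cells) >= fraction
            # ... and g2 > g1 in every other category.
            down_elsewhere = all(
                share_greater(g2, g1, other) >= fraction for other in others
            )
            if up_here and down_elsewhere:
                result[category].append((g1, g2))
    return result


# Hypothetical data: six cells, the first three in G1 and the last three in S.
counts = {
    "GeneA": [9, 8, 7, 1, 2, 1],
    "GeneB": [1, 2, 1, 9, 8, 7],
    "GeneC": [5, 5, 5, 5, 5, 5],
}
annotation = {"G1": [0, 1, 2], "S": [3, 4, 5]}

pairs = toy_marker_pairs(counts, annotation, fraction=0.65)
print(pairs["G1"])  # [('GeneA', 'GeneB'), ('GeneA', 'GeneC'), ('GeneC', 'GeneB')]
```

The result has the same shape as the documented return value: a dict from category to a list of (gene, gene) tuples. In this sketch, raising fraction makes the criterion stricter and can only shrink each list.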