scanpy.external.tl.sandbag

scanpy.external.tl.sandbag(adata, annotation=None, *, fraction=0.65, filter_genes=None, filter_samples=None)

Calculate marker pairs of genes. [Scialdone15] [Fechtner18].

Calculates the pairs of genes serving as marker pairs for each phase, based on a matrix of gene counts and an annotation of known phases.

This reproduces the approach of [Scialdone15] in the implementation of [Fechtner18].

More information and bug reports here.
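
The selection criterion described above can be illustrated with a small, pure-Python sketch. It is not the [Fechtner18] implementation that this function wraps. It assumes a simplified reading of the [Scialdone15] criterion: a pair (g1, g2) marks a phase if g1 is expressed above g2 in at least `fraction` of that phase's cells, and below g2 in at least `fraction` of the cells of every other phase. The helper names and the toy data are hypothetical.

```python
from itertools import permutations


def _fraction_with_sign(counts, cells, g1, g2, sign):
    """Fraction of `cells` where sign * (g1 - g2) is strictly positive."""
    hits = sum(1 for i in cells if sign * (counts[g1][i] - counts[g2][i]) > 0)
    return hits / len(cells)


def sketch_marker_pairs(counts, phases, fraction=0.65):
    """Simplified illustration only (hypothetical helper).

    counts: dict mapping gene name -> list of per-cell expression values
    phases: list with one known phase label per cell
    """
    genes = sorted(counts)
    categories = sorted(set(phases))
    cells_by_phase = {
        cat: [i for i, p in enumerate(phases) if p == cat] for cat in categories
    }
    result = {cat: [] for cat in categories}
    for g1, g2 in permutations(genes, 2):
        for target in categories:
            # g1 > g2 in enough cells of the target phase ...
            up_in_target = (
                _fraction_with_sign(counts, cells_by_phase[target], g1, g2, 1)
                >= fraction
            )
            # ... and g1 < g2 in enough cells of every other phase.
            down_elsewhere = all(
                _fraction_with_sign(counts, cells_by_phase[other], g1, g2, -1)
                >= fraction
                for other in categories
                if other != target
            )
            if up_in_target and down_elsewhere:
                result[target].append((g1, g2))
    return result


# Toy data: three genes, six cells, two known phases.
toy_counts = {
    "A": [5, 6, 7, 1, 1, 2],
    "B": [1, 2, 1, 6, 5, 7],
    "C": [3, 3, 3, 3, 3, 3],
}
toy_phases = ["G1", "G1", "G1", "S", "S", "S"]
toy_pairs = sketch_marker_pairs(toy_counts, toy_phases)
```

In this toy data, gene A is high in G1 and low in S, so pairs such as ("A", "B") end up under "G1" and the reversed pair ("B", "A") under "S".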

Parameters
adata : AnnData

The annotated data matrix.

annotation : Optional[Mapping[str, Collection[Union[str, int, bool]]]] (default: None)

Mapping from category to genes, e.g. {'phase': [Gene1, ...]}. Defaults to data.vars['category'].

fraction : float (default: 0.65)

Fraction of cells per category where marker criteria must be satisfied.
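
As a rough illustration of how such a threshold behaves, the sketch below computes the fraction of cells in which one gene is expressed above another and compares it with two candidate values of `fraction`. The helper name and the expression values are hypothetical, and the comparison is a simplification of the actual marker criteria.

```python
def satisfied_fraction(expr_a, expr_b):
    """Fraction of cells where gene a is expressed strictly above gene b."""
    hits = sum(1 for a, b in zip(expr_a, expr_b) if a > b)
    return hits / len(expr_a)


# Hypothetical expression values for two genes in four cells of one category.
gene_a = [5, 7, 2, 9]
gene_b = [1, 3, 4, 2]

observed = satisfied_fraction(gene_a, gene_b)  # 3 of 4 cells
passes_default = observed >= 0.65  # the default threshold
passes_strict = observed >= 0.8    # a stricter threshold
```

Raising `fraction` therefore tends to yield fewer marker pairs, each supported by more cells.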

filter_genes : Optional[Collection[Union[str, int, bool]]] (default: None)

Genes for sampling the reference set. Defaults to all genes.

filter_samples : Optional[Collection[Union[str, int, bool]]] (default: None)

Cells for sampling the reference set. Defaults to all samples.

Return type

Dict[str, List[Tuple[str, str]]]

Returns

A dict mapping from category to lists of marker pairs, e.g.: {'Category_1': [(Gene_1, Gene_2), ...], ...}.
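
Because the result is a plain dict of lists of tuples, it can be inspected with ordinary Python. The sketch below uses a hand-written, hypothetical result of the documented shape (the category and gene names are made up) to count the pairs per category and to collect the genes involved.

```python
# Hypothetical result with the documented shape:
# {category: [(gene_1, gene_2), ...], ...}
marker_pairs = {
    "G1": [("GeneA", "GeneB"), ("GeneC", "GeneB")],
    "S": [("GeneB", "GeneA")],
}

# Number of marker pairs found for each category.
pairs_per_category = {cat: len(pairs) for cat, pairs in marker_pairs.items()}

# Unique genes that take part in at least one pair, per category.
genes_per_category = {
    cat: sorted({gene for pair in pairs for gene in pair})
    for cat, pairs in marker_pairs.items()
}
```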

Examples

>>> from scanpy.external.tl import sandbag
>>> from pypairs import datasets
>>> adata = datasets.leng15()
>>> marker_pairs = sandbag(adata, fraction=0.5)