scanpy.tl.score_genes(adata, gene_list, *, ctrl_size=50, gene_pool=None, n_bins=25, score_name='score', random_state=0, copy=False, use_raw=None)

Score a set of genes [Satija et al., 2015].

The score is the average expression of a set of genes minus the average expression of a reference set of genes. The reference set is randomly sampled from gene_pool, with genes binned by expression level and reference genes drawn from each bin.

This reproduces the approach in Seurat [Satija et al., 2015] and has been implemented for Scanpy by Davide Cittaro.
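The scoring idea described above can be illustrated with a minimal NumPy sketch. This is not Scanpy's implementation: the function name score_genes_sketch, the equal-size rank binning, and the choice to exclude the target genes from the reference set are assumptions made for illustration, and the input is assumed to be a dense cells-by-genes array.

```python
import numpy as np


def score_genes_sketch(X, gene_idx, ctrl_size=50, n_bins=25, seed=0):
    """Illustrative gene-set score (hypothetical helper, not Scanpy code).

    X is a dense (cells x genes) array; gene_idx holds the column
    indices of the genes to score.
    """
    rng = np.random.default_rng(seed)
    gene_idx = np.asarray(gene_idx)
    n_genes = X.shape[1]

    # Rank genes by mean expression and cut the ranks into n_bins
    # roughly equal-sized expression bins (labels 0 .. n_bins - 1).
    gene_means = X.mean(axis=0)
    ranks = gene_means.argsort().argsort()
    bins = ranks * n_bins // n_genes

    # For each bin that contains a target gene, sample up to ctrl_size
    # reference genes from the same bin (assumption: targets excluded).
    control = []
    for b in np.unique(bins[gene_idx]):
        candidates = np.setdiff1d(np.flatnonzero(bins == b), gene_idx)
        if candidates.size == 0:
            continue
        n_pick = min(ctrl_size, candidates.size)
        control.append(rng.choice(candidates, size=n_pick, replace=False))
    control = np.concatenate(control)

    # Score = mean expression of the gene set minus mean of the reference set.
    return X[:, gene_idx].mean(axis=1) - X[:, control].mean(axis=1)


# Synthetic data: the first 100 cells overexpress genes 0-9.
rng = np.random.default_rng(1)
X = rng.poisson(2.0, size=(200, 500)).astype(float)
X[:100, :10] += 5.0
scores = score_genes_sketch(X, np.arange(10), ctrl_size=10, n_bins=10)
```

In this toy setup the first 100 cells should receive clearly higher scores than the rest, because the reference genes are drawn from the same expression bin as the targets and so remove the shared baseline.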


adata

The annotated data matrix.


gene_list

The list of gene names used for score calculation.

ctrl_size default: 50

Number of reference genes to be sampled from each bin. If len(gene_list) is not too low, you can set ctrl_size=len(gene_list).

gene_pool default: None

Genes for sampling the reference set. Default is all genes.

n_bins default: 25

Number of expression level bins for sampling.

score_name default: 'score'

Name of the field to be added in .obs.

random_state default: 0

The random seed for sampling.

copy default: False

Whether to copy adata or modify it in place.

use_raw default: None

Whether to use the raw attribute of adata. Defaults to True if .raw is present.

Changed in version 1.4.5: Default value changed from False to None.


Returns None if copy=False, else returns an AnnData object. Sets the following field:

adata.obs[score_name] : numpy.ndarray (dtype float)

Scores of each cell.


See this notebook.