scanpy.api.pp.recipe_zheng17(adata, n_top_genes=1000, log=True, plot=False, copy=False)

Normalization and filtering as of [Zheng17].

Reproduces the preprocessing of [Zheng17] - the Cell Ranger R Kit of 10x Genomics.

Expects non-logarithmized data. If using logarithmized data, pass log=False.

The recipe runs the following steps:

sc.pp.filter_genes(adata, min_counts=1)  # keep only genes with at least 1 count
sc.pp.normalize_per_cell(                # normalize with total UMI count per cell
    adata, key_n_counts='n_counts_all')
filter_result = sc.pp.filter_genes_dispersion(  # select highly-variable genes
    adata.X, flavor='cell_ranger', n_top_genes=n_top_genes, log=False)
adata = adata[:, filter_result.gene_subset]     # subset the genes
sc.pp.normalize_per_cell(adata)          # renormalize after filtering
if log: sc.pp.log1p(adata)               # log transform: adata.X = log(adata.X + 1)
sc.pp.scale(adata)                       # scale to unit variance and shift to zero mean
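
The arithmetic behind the normalize, log and scale steps above can be sketched on a toy matrix. This is an illustration only, not scanpy's implementation: it skips gene filtering and the dispersion-based gene selection, it assumes each cell is rescaled to the median total count, and it assumes the sample standard deviation (ddof=1) for scaling. The helper names `normalize_rows_to_median`, `log1p_matrix` and `scale_columns` are hypothetical and are not part of scanpy.

```python
import math
import statistics

# Toy count matrix: rows are cells, columns are genes (values are made up).
counts = [
    [4.0, 0.0, 6.0],
    [1.0, 3.0, 1.0],
    [10.0, 5.0, 5.0],
]


def normalize_rows_to_median(matrix):
    """Rescale each cell (row) so its total equals the median total (assumed target)."""
    totals = [sum(row) for row in matrix]
    target = statistics.median(totals)
    return [
        [value * target / total for value in row]
        for row, total in zip(matrix, totals)
    ]


def log1p_matrix(matrix):
    """Apply log(x + 1) to every entry."""
    return [[math.log1p(value) for value in row] for row in matrix]


def scale_columns(matrix):
    """Shift each gene (column) to zero mean and scale it to unit variance."""
    columns = list(zip(*matrix))
    means = [statistics.mean(col) for col in columns]
    # Sample standard deviation (ddof=1); this choice is an assumption.
    stds = [statistics.stdev(col) for col in columns]
    return [
        [(value - mean) / std for value, mean, std in zip(row, means, stds)]
        for row in matrix
    ]


normalized = normalize_rows_to_median(counts)
scaled = scale_columns(log1p_matrix(normalized))
```

After normalization every row of `normalized` has the same total, and every column of `scaled` has zero mean and unit variance, which is the state the recipe leaves `adata.X` in when `log=True`.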

Parameters:

adata : AnnData

Annotated data matrix.

n_top_genes : int, optional (default: 1000)

Number of genes to keep.

log : bool, optional (default: True)

Log-transform the data (log1p) after normalization. Pass False if the data is already logarithmized.

plot : bool, optional (default: False)

Show a plot of the gene dispersion vs. mean relation.

copy : bool, optional (default: False)

Return a copy of adata instead of updating it.


Return type:

Returns a processed copy of adata if copy=True; otherwise updates adata in place and returns None.