- scanpy.pp.filter_genes_dispersion(data, flavor='seurat', min_disp=None, max_disp=None, min_mean=None, max_mean=None, n_bins=20, n_top_genes=None, log=True, subset=True, copy=False)
Deprecated since version 1.3.6: Use highly_variable_genes() instead. The new function is equivalent to the present function, except that
- the new function always expects logarithmized data
- subset=False is the default in the new function: it suffices to merely annotate the genes, since tools like pp.pca will detect the annotation
- you can now call: sc.pl.highly_variable_genes(adata)
- copy is replaced by inplace
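As a rough migration sketch, the deprecated call can be replaced by an annotate-only call to the new function. The helper name annotate_highly_variable is hypothetical, and the sketch assumes the subset and inplace parameters and the highly_variable column in adata.var as found in current scanpy releases:

```python
def annotate_highly_variable(adata, **cutoffs):
    """Annotate, rather than filter, highly variable genes (illustrative sketch)."""
    # Imported lazily so that defining this helper does not itself require scanpy.
    import scanpy as sc

    # The new function expects logarithmized data, e.g. after sc.pp.log1p(adata).
    sc.pp.highly_variable_genes(
        adata, flavor="seurat", subset=False, inplace=True, **cutoffs
    )
    # The selection is stored as a boolean column; all genes are kept.
    return adata.var["highly_variable"]
```

A call such as annotate_highly_variable(adata, min_mean=0.0125, max_mean=3, min_disp=0.5), with illustrative cutoffs, leaves the gene set untouched; sc.pl.highly_variable_genes(adata) can then be used to inspect the selection.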
If trying out parameters, pass the data matrix instead of AnnData.
The normalized dispersion is obtained by scaling with the mean and standard deviation of the dispersions for genes falling into a given bin for mean expression of genes. This means that for each bin of mean expression, highly variable genes are selected.
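The per-bin scaling described above can be sketched in plain Python. This is a simplified illustration and not the library's implementation: it assumes equal-width bins over the range of the means and the sample standard deviation, and the function name normalized_dispersions is hypothetical.

```python
from statistics import mean, stdev


def normalized_dispersions(means, dispersions, n_bins=20):
    """Z-score each gene's dispersion within its mean-expression bin."""
    lo, hi = min(means), max(means)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all means are equal
    # Assign every gene to an equal-width bin (assumption of this sketch).
    bins = [min(int((m - lo) / width), n_bins - 1) for m in means]

    by_bin = {}
    for b, d in zip(bins, dispersions):
        by_bin.setdefault(b, []).append(d)

    result = []
    for b, d in zip(bins, dispersions):
        members = by_bin[b]
        if len(members) == 1:
            # A lone gene has no within-bin spread; as documented for n_bins,
            # its normalized dispersion is set to 1.
            result.append(1.0)
            continue
        mu, sigma = mean(members), stdev(members)
        result.append((d - mu) / sigma if sigma > 0 else 0.0)
    return result


example = normalized_dispersions(
    means=[0.1, 0.2, 0.3, 5.0], dispersions=[1.0, 2.0, 3.0, 7.0], n_bins=2
)
```

In this toy input the first three genes share a bin and are scaled against each other, while the fourth gene sits alone in its bin and receives the value 1.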
Use flavor='cell_ranger' with care and in the same way as in recipe_zheng17().
- data :
The (annotated) data matrix of shape n_obs × n_vars. Rows correspond to cells and columns to genes.
- flavor : Literal['seurat', 'cell_ranger'] (default: 'seurat')
Choose the flavor for computing normalized dispersion. If choosing 'seurat', this expects non-logarithmized data; the logarithm of mean and dispersion is taken internally when log is at its default value True. For 'cell_ranger', this is usually called for logarithmized data; in this case you should set log to False. In their default workflows, Seurat passes the cutoffs whereas Cell Ranger passes n_top_genes.
- min_mean :
- max_mean :
- min_disp :
- max_disp :
If n_top_genes is not None, these cutoffs for the means and the normalized dispersions are ignored.
- n_bins :
Number of bins for binning the mean gene expression. Normalization is done with respect to each bin. If just a single gene falls into a bin, the normalized dispersion is artificially set to 1. You’ll be informed about this if you set settings.verbosity = 4.
- n_top_genes :
Number of highly-variable genes to keep.
- log :
Use the logarithm of the mean to variance ratio.
- subset :
Keep only highly variable genes if True; otherwise write a boolean array marking the highly variable genes while keeping all genes.
- copy :
If an AnnData is passed, determines whether a copy is returned.
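The interplay between n_top_genes and the mean and dispersion cutoffs can be sketched as follows. This is a minimal illustration under stated assumptions: the function name select_genes and its placeholder defaults are hypothetical and are not the library's defaults, and ties at the ranking boundary are not handled specially.

```python
import math


def select_genes(means, disp_norm, n_top_genes=None,
                 min_mean=0.0, max_mean=math.inf,
                 min_disp=-math.inf, max_disp=math.inf):
    """Return a boolean mask marking the genes to keep."""
    if n_top_genes is not None:
        # Rank by normalized dispersion; the cutoffs are ignored on this path.
        order = sorted(range(len(disp_norm)),
                       key=lambda i: disp_norm[i], reverse=True)
        keep = set(order[:n_top_genes])
        return [i in keep for i in range(len(disp_norm))]
    # Otherwise apply the cutoffs on the mean and the normalized dispersion.
    return [min_mean < m < max_mean and min_disp < d < max_disp
            for m, d in zip(means, disp_norm)]


means = [0.5, 1.0, 4.0]
disp_norm = [2.0, 0.1, 3.0]
by_cutoffs = select_genes(means, disp_norm, min_mean=0.1, max_mean=3, min_disp=0.5)
by_rank = select_genes(means, disp_norm, n_top_genes=2, max_mean=3)
```

Here by_cutoffs keeps only the first gene, whereas by_rank keeps the two genes with the largest normalized dispersion even though one of them exceeds max_mean.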
- data :
If an AnnData adata is passed, returns or updates adata depending on copy. It filters the adata and adds the annotations
- means :
Means per gene. Logarithmized when log is True.
- dispersions :
Dispersions per gene. Logarithmized when log is True.
- dispersions_norm :
Normalized dispersions per gene. Logarithmized when log is True.
If a data matrix X is passed, the annotation is returned as np.recarray with the same information stored in fields: