scanpy.pp.scale


scanpy.pp.scale(data, *, zero_center=True, max_value=None, copy=False, layer=None, obsm=None, mask_obs=None)

Scale data to unit variance and zero mean.

Note

Variables (genes) that do not display any variation (are constant across all observations) are retained and (for zero_center==True) set to 0 during this operation. In the future, they might be set to NaNs.
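The operation described above can be sketched in plain NumPy. This is an illustrative sketch only, not scanpy's implementation: the helper name `scale_columns`, the use of the sample standard deviation (`ddof=1`), and the way constant columns are handled (dividing by 1 so that centering maps them to 0) are assumptions made for this example.

```python
import numpy as np

def scale_columns(X, zero_center=True):
    """Hypothetical helper: standardize each column (gene) of a dense matrix."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    # Assumption: sample standard deviation (ddof=1).
    std = X.std(axis=0, ddof=1)
    # Assumption: constant columns get a divisor of 1, so centering maps them to 0.
    std_safe = np.where(std == 0, 1.0, std)
    if zero_center:
        return (X - mean) / std_safe
    return X / std_safe

# Three cells (rows) by three genes (columns); the middle gene is constant.
X = np.array([[1.0, 5.0, 2.0],
              [3.0, 5.0, 4.0],
              [5.0, 5.0, 9.0]])
scaled = scale_columns(X)
```

In this sketch the constant middle column ends up as all zeros, which mirrors the behaviour the note describes for zero_center=True.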

Parameters:

data Union[AnnData, TypeVar(_A, bound= csr_array | csc_array | csr_matrix | csc_matrix | ndarray | Array)]: The (annotated) data matrix of shape n_obs × n_vars. Rows correspond to cells and columns to genes.
zero_center bool (default: True): If False, omit zero-centering of variables, which allows sparse input to be handled efficiently.
max_value float | None (default: None): Clip (truncate) to this value after scaling. If None, do not clip.
copy bool (default: False): If False, the scaling is performed in place; if True, it is performed on a copy. If an AnnData object is passed, this also determines whether a copy is returned.
layer str | None (default: None): If provided, which element of layers to scale.
obsm str | None (default: None): If provided, which element of obsm to scale.
mask_obs ndarray[tuple[int, ...], dtype[bool]] | str | None (default: None): Restrict both the derivation of scaling parameters and the scaling itself to a certain set of observations. The mask is specified as a boolean array or a string referring to an array in obs. This will transform data from csc to csr format if issparse(data).


Returns:

Returns None if copy=False, else returns an updated AnnData object. Sets the following fields:

adata.X | adata.layers[layer] numpy.ndarray | scipy.sparse.csr_matrix (dtype float): Scaled count data matrix.
adata.var['mean'] pandas.Series (dtype float): Means per gene before scaling.
adata.var['std'] pandas.Series (dtype float): Standard deviations per gene before scaling.
adata.var['var'] pandas.Series (dtype float): Variances per gene before scaling.
