scanpy.pp.scale(data, *, zero_center=True, max_value=None, copy=False, layer=None, obsm=None, mask_obs=None)

Scale data to unit variance and zero mean.


Variables (genes) that show no variation (that is, are constant across all observations) are retained and, for zero_center=True, set to 0 during this operation. In the future, they might be set to NaN.
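As a rough illustration of what this description implies, the following NumPy sketch standardizes each column and leaves constant columns at 0. It is not scanpy's implementation: the helper name scale_sketch is hypothetical, and the use of the sample standard deviation (ddof=1) is an assumption made for the example.

```python
import numpy as np


def scale_sketch(X, zero_center=True):
    """Illustrative per-column scaling (hypothetical helper, not scanpy code)."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)  # assumption: sample standard deviation
    # Constant columns have std == 0; divide by 1 instead to avoid NaN/inf.
    std_safe = np.where(std == 0, 1.0, std)
    out = X - mean if zero_center else X.copy()
    return out / std_safe


# Rows are observations (cells), columns are variables (genes).
X = np.array([[1.0, 5.0, 2.0],
              [2.0, 5.0, 4.0],
              [3.0, 5.0, 9.0]])
scaled = scale_sketch(X)
```

In this sketch the second column is constant, so centering makes it all zeros and the guarded division leaves it that way.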

data AnnData | spmatrix | ndarray | Array

The (annotated) data matrix of shape n_obs × n_vars. Rows correspond to cells and columns to genes.

zero_center bool (default: True)

If False, omit zero-centering of variables, which allows sparse input to be handled efficiently.
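One way to see why skipping the centering step helps with sparse input: dividing by a per-column standard deviation maps zeros to zeros, whereas subtracting a nonzero mean turns them into nonzero values. The sketch below uses a small dense NumPy array as a stand-in for a sparse matrix; the array and variable names are illustrative only, and ddof=1 is an assumption.

```python
import numpy as np

# Mostly-zero matrix standing in for sparse count data (illustrative values).
X = np.array([[0.0, 2.0, 0.0],
              [0.0, 0.0, 3.0],
              [4.0, 0.0, 0.0],
              [0.0, 7.0, 0.0]])

mean = X.mean(axis=0)
std = X.std(axis=0, ddof=1)  # assumption: sample standard deviation

scaled_only = X / std        # analogous to zero_center=False
centered = (X - mean) / std  # analogous to zero_center=True

nnz_before = np.count_nonzero(X)
nnz_scaled = np.count_nonzero(scaled_only)
nnz_centered = np.count_nonzero(centered)
```

Here scaling alone keeps the number of nonzero entries unchanged, while centering fills every entry, which is what would force a sparse matrix to become dense.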

max_value float | None (default: None)

Clip (truncate) to this value after scaling. If None, do not clip.

copy bool (default: False)

Whether to operate on a copy of the data instead of modifying it in place. If an AnnData object is passed, this also determines whether a copy is returned.

layer str | None (default: None)

If provided, which element of layers to scale.

obsm str | None (default: None)

If provided, which element of obsm to scale.

mask_obs ndarray[Any, dtype[bool_]] | str | None (default: None)

Restrict both the derivation of scaling parameters and the scaling itself to a certain set of observations. The mask is specified as a boolean array or a string referring to an array in obs. This will transform data from csc to csr format if issparse(data).
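The masking behaviour described above can be sketched in plain NumPy: the mean and standard deviation come from the selected rows only, and only those rows are rewritten. The helper name scale_masked is hypothetical, and both the ddof=1 estimator and the handling of constant columns are assumptions of this sketch rather than a statement about the library's internals.

```python
import numpy as np


def scale_masked(X, mask):
    """Scale only the rows selected by a boolean mask (hypothetical helper)."""
    X = np.array(X, dtype=float)  # work on a copy
    mask = np.asarray(mask, dtype=bool)
    sub = X[mask]
    mean = sub.mean(axis=0)
    std = sub.std(axis=0, ddof=1)  # assumption: sample standard deviation
    std[std == 0] = 1.0            # keep constant columns finite
    X[mask] = (sub - mean) / std   # unmasked rows are left untouched
    return X


X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0],
              [100.0, 1000.0]])
mask = np.array([True, True, True, False])
result = scale_masked(X, mask)
```

The last row is excluded by the mask, so it neither influences the scaling parameters nor changes in the output.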

Return type:

AnnData | spmatrix | ndarray | Array | None


Returns None if copy=False, else returns an updated AnnData object. Sets the following fields:

adata.X | adata.layers[layer] numpy.ndarray | scipy.sparse._csr.csr_matrix (dtype float)

Scaled count data matrix.

adata.var['mean'] pandas.Series (dtype float)

Means per gene before scaling.

adata.var['std'] pandas.Series (dtype float)

Standard deviations per gene before scaling.

adata.var['var'] pandas.Series (dtype float)

Variances per gene before scaling.