scanpy.pp.sample

scanpy.pp.sample(data, fraction=None, *, n=None, rng=None, copy=False, replace=False, axis='obs', p=None)

Sample observations or variables with or without replacement.

Parameters:
data AnnData | ndarray | csr_matrix | csc_matrix | Array

The (annotated) data matrix of shape n_obs × n_vars. Rows correspond to cells and columns to genes.

fraction float | None (default: None)

Sample to this fraction of the number of observations or variables. The fraction is taken of all of them, even if p contains 0s or False values. This can be larger than 1.0 if replace=True. See axis and replace.

n int | None (default: None)

Sample to this number of observations or variables. See axis.

rng (default: None)

Seed for the random number generator. Change it to obtain a different subsample.

copy bool (default: False)

If an AnnData is passed, determines whether a copy is returned.

replace bool (default: False)

If True, samples are drawn with replacement.

axis Literal['obs', 0, 'var', 1] (default: 'obs')

Sample observations (axis 0) or variables (axis 1).

p str | ndarray[Any, dtype[bool]] | ndarray[Any, dtype[floating]] | None (default: None)

Drawing probabilities (floats) or mask (bools). Either an axis-sized array, or the name of a column. If p is an array of probabilities, it must sum to 1.
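
How fraction, replace, and p interact can be sketched with plain NumPy. This illustrates the documented semantics only and is not scanpy's implementation: the helper name draw_indices and its seed argument are hypothetical, and both the truncation of fraction × axis length to an integer and the conversion of a boolean mask to uniform probabilities over its True entries are assumptions.

```python
import numpy as np


def draw_indices(axis_len, fraction, *, replace=False, p=None, seed=0):
    """Hypothetical helper: pick indices along one axis using NumPy only."""
    if fraction > 1 and not replace:
        raise ValueError("fraction > 1 requires replace=True")
    # Assumption: the target count is fraction * axis_len, truncated.
    # The full axis length is used even when p excludes some entries.
    n = int(fraction * axis_len)
    if p is not None:
        p = np.asarray(p)
        if p.dtype == bool:
            # Assumption: a mask means "uniform over the True entries".
            p = p.astype(float) / p.sum()
    rng = np.random.default_rng(seed)
    return rng.choice(axis_len, size=n, replace=replace, p=p)


mask = np.array([True, True, True, False, False, False, True, True, True, True])
picked = draw_indices(10, 0.5, p=mask)             # 5 indices, never 3, 4 or 5
oversampled = draw_indices(10, 1.5, replace=True)  # 15 indices, repeats allowed
```

Note that without replacement the requested count cannot exceed the number of entries that p leaves eligible; NumPy raises a ValueError in that case.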

Return type:

AnnData | None | tuple[ndarray | csr_matrix | csc_matrix | Array, ndarray[Any, dtype[int64]]]

Returns:

If isinstance(data, AnnData) and copy=False, this function returns None. Otherwise:

data[indices, :] | data[:, indices] (depending on axis)

If data is array-like or copy=True, returns the subset.

indices numpy.ndarray

If data is array-like, also returns the indices into the original.
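
The array-like return contract described above (subset plus indices, with the subset taken along the chosen axis) might look like this in a NumPy-only stand-in. The function sample_array and its seed argument are hypothetical and only mirror the shape of the documented return value. They are not part of scanpy.

```python
import numpy as np


def sample_array(x, n, *, axis=0, seed=0):
    """Hypothetical stand-in mirroring the array-like return contract."""
    rng = np.random.default_rng(seed)
    indices = rng.choice(x.shape[axis], size=n, replace=False)
    # axis 0 ("obs") subsets rows, axis 1 ("var") subsets columns.
    subset = x[indices, :] if axis == 0 else x[:, indices]
    return subset, indices


x = np.arange(12).reshape(4, 3)  # 4 observations x 3 variables
rows, row_idx = sample_array(x, 2, axis=0)
cols, col_idx = sample_array(x, 2, axis=1)
```

The returned indices make it possible to map each sampled row or column back to its position in the original array. When an AnnData is passed with copy=False, by contrast, the documented return value is None and no tuple is produced.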