scanpy.tl.umap
- scanpy.tl.umap(adata, *, min_dist=0.5, spread=1.0, n_components=2, maxiter=None, alpha=1.0, gamma=1.0, negative_sample_rate=5, init_pos='spectral', random_state=0, a=None, b=None, method='umap', key_added=None, neighbors_key='neighbors', copy=False)
Embed the neighborhood graph using UMAP [McInnes et al., 2018].
UMAP (Uniform Manifold Approximation and Projection) is a manifold learning technique suitable for visualizing high-dimensional data. Besides tending to be faster than tSNE, it optimizes the embedding such that it best reflects the topology of the data, which we represent throughout Scanpy using a neighborhood graph. tSNE, by contrast, optimizes the distribution of nearest-neighbor distances in the embedding such that these best match the distribution of distances in the high-dimensional space. We use the implementation of umap-learn [McInnes et al., 2018]. For a few comparisons of UMAP with tSNE, see Becht et al. [2018].
- Parameters:
- adata
AnnData
Annotated data matrix.
- min_dist
float (default: 0.5)
The effective minimum distance between embedded points. Smaller values result in a more clustered/clumped embedding where nearby points on the manifold are drawn closer together, while larger values result in a more even dispersal of points. The value should be set relative to the spread value, which determines the scale at which embedded points will be spread out. The default in the umap-learn package is 0.1.
- spread
float (default: 1.0)
The effective scale of embedded points. In combination with min_dist, this determines how clustered/clumped the embedded points are.
- n_components
int (default: 2)
The number of dimensions of the embedding.
- maxiter
int | None (default: None)
The number of iterations (epochs) of the optimization. Called n_epochs in the original UMAP.
- alpha
float (default: 1.0)
The initial learning rate for the embedding optimization.
- gamma
float (default: 1.0)
Weighting applied to negative samples in low dimensional embedding optimization. Values higher than one will result in greater weight being given to negative samples.
- negative_sample_rate
int (default: 5)
The number of negative edge/1-simplex samples to use per positive edge/1-simplex sample in optimizing the low dimensional embedding.
- init_pos
Union[Literal['paga', 'spectral', 'random'], ndarray, None] (default: 'spectral')
How to initialize the low dimensional embedding. Called init in the original UMAP. Options are:
Any key for adata.obsm.
'paga': positions from paga().
'spectral': use a spectral embedding of the graph.
'random': assign initial embedding positions at random.
A numpy array of initial embedding positions.
- random_state
int | RandomState | None (default: 0)
If int, random_state is the seed used by the random number generator; if RandomState or Generator, random_state is the random number generator; if None, the random number generator is the RandomState instance used by np.random.
- a
float | None (default: None)
More specific parameters controlling the embedding. If None, these values are set automatically as determined by min_dist and spread.
- b
float | None (default: None)
More specific parameters controlling the embedding. If None, these values are set automatically as determined by min_dist and spread.
- method
Literal['umap', 'rapids'] (default: 'umap')
Chosen implementation.
'umap': UMAP's simplicial set embedding.
'rapids': GPU accelerated implementation.
Deprecated since version 1.10.0: Use rapids_singlecell.tl.umap() instead.
- key_added
str | None (default: None)
If not specified, the embedding is stored as obsm['X_umap'] and the parameters in uns['umap']. If specified, the embedding is stored as obsm[key_added] and the parameters in uns[key_added].
- neighbors_key
str (default: 'neighbors')
UMAP looks in adata.uns[neighbors_key] for neighbors settings and in adata.obsp[adata.uns[neighbors_key]['connectivities_key']] for connectivities.
- copy
bool (default: False)
Return a copy instead of writing to adata.
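As a rough illustration of how a and b can follow from min_dist and spread when both are left as None, the sketch below mirrors the curve fit that umap-learn performs: it fits 1 / (1 + a * x^(2b)) to a target that stays at 1 up to min_dist and then decays exponentially at a rate set by spread. The helper name fit_ab is hypothetical, and this is not Scanpy's own code; it assumes NumPy and SciPy are installed.

```python
import numpy as np
from scipy.optimize import curve_fit


def fit_ab(min_dist: float = 0.5, spread: float = 1.0) -> tuple[float, float]:
    """Fit a and b so that 1 / (1 + a * x**(2*b)) approximates the target curve.

    Hypothetical helper for illustration; modeled on umap-learn's approach.
    """

    def curve(x, a, b):
        return 1.0 / (1.0 + a * x ** (2 * b))

    # Sample distances from 0 up to three times the spread.
    x = np.linspace(0.0, 3.0 * spread, 300)
    # Target: 1 below min_dist, exponential decay scaled by spread beyond it.
    y = np.where(x < min_dist, 1.0, np.exp(-(x - min_dist) / spread))
    (a, b), _ = curve_fit(curve, x, y)
    return float(a), float(b)


# Compare Scanpy's default min_dist with umap-learn's default.
a_scanpy, b_scanpy = fit_ab(min_dist=0.5)
a_umap, b_umap = fit_ab(min_dist=0.1)
print(f"min_dist=0.5: a={a_scanpy:.3f}, b={b_scanpy:.3f}")
print(f"min_dist=0.1: a={a_umap:.3f}, b={b_umap:.3f}")
```

Passing a and b explicitly bypasses this fit, which is why they are described as more specific controls than min_dist and spread.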
- Return type:
AnnData | None
- Returns:
Returns None if copy=False, else returns an AnnData object. Sets the following fields:
adata.obsm['X_umap' | key_added]
numpy.ndarray (dtype float)
UMAP coordinates of data.
adata.uns['umap' | key_added]
dict
UMAP parameters.