scanpy.external.pp.harmony_integrate
- scanpy.external.pp.harmony_integrate(adata, key, *, basis='X_pca', adjusted_basis='X_pca_harmony', **kwargs)
Use harmonypy [Korsunsky et al., 2019] to integrate different experiments.
Harmony [Korsunsky et al., 2019] is an algorithm for integrating single-cell data from multiple experiments. This function uses the Python port of Harmony, harmonypy, to integrate single-cell data stored in an AnnData object. Because Harmony works by adjusting the principal components, run this function after performing PCA but before computing the neighbor graph, as illustrated in the example below.
- Parameters:
- adata
AnnData
The annotated data matrix.
- key
str | Sequence[str]
The name of the column in adata.obs that differentiates among experiments/batches. To integrate over two or more covariates, you can pass multiple column names as a list. See the vars_use parameter of the harmonypy package for more details.
- basis
str (default: 'X_pca')
The name of the field in adata.obsm where the PCA table is stored. Defaults to 'X_pca', which is the default for sc.pp.pca().
- adjusted_basis
str (default: 'X_pca_harmony')
The name of the field in adata.obsm where the adjusted PCA table will be stored after running this function. Defaults to 'X_pca_harmony'.
- kwargs
Any additional arguments will be passed to harmonypy.run_harmony().
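As a rough sketch of how key and kwargs fit together, the helper below passes several covariate columns at once and forwards extra keyword arguments. The helper name integrate_over_covariates and the 'donor' column in the usage note are hypothetical, and max_iter_harmony is assumed to be a keyword accepted by harmonypy.run_harmony() in the installed harmonypy version.

```python
def integrate_over_covariates(adata, covariates, **harmony_kwargs):
    """Run Harmony over several adata.obs columns (hypothetical helper)."""
    # Imported lazily so that defining the helper does not require scanpy.
    import scanpy.external as sce

    # Fail early with a clear message if a covariate column is absent.
    missing = [c for c in covariates if c not in adata.obs.columns]
    if missing:
        raise KeyError(f"columns not found in adata.obs: {missing}")

    # A list of column names is passed straight through as `key`;
    # any extra keyword arguments are forwarded to harmonypy.run_harmony().
    sce.pp.harmony_integrate(adata, list(covariates), **harmony_kwargs)
```

For example, integrate_over_covariates(adata, ['batch', 'donor'], max_iter_harmony=20) would integrate over both columns, assuming both exist in adata.obs.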
- Returns:
Updates adata in place with the field adata.obsm[adjusted_basis], containing principal components adjusted by Harmony such that different experiments are integrated.
Example
First, load libraries and example dataset, and preprocess.
>>> import scanpy as sc
>>> import scanpy.external as sce
>>> adata = sc.datasets.pbmc3k()
>>> sc.pp.recipe_zheng17(adata)
>>> sc.pp.pca(adata)
We now arbitrarily assign a batch metadata variable to each cell for the sake of example, but during real usage there would already be a column in adata.obs giving the experiment each cell came from.
>>> adata.obs['batch'] = 1350*['a'] + 1350*['b']
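The hard-coded 1350 above relies on the example dataset having exactly 2700 cells. For other datasets, a small helper along the following lines could build the same kind of placeholder labels for any cell count. The function name balanced_batch_labels is hypothetical and not part of scanpy.

```python
def balanced_batch_labels(n_cells, names=('a', 'b')):
    """Split n_cells into contiguous, near-equal blocks, one per batch name."""
    base, extra = divmod(n_cells, len(names))
    labels = []
    for i, name in enumerate(names):
        # The first `extra` batches each take one additional cell.
        labels.extend([name] * (base + (1 if i < extra else 0)))
    return labels


# Same labels as the hard-coded example for a 2700-cell dataset.
labels = balanced_batch_labels(2700)
```

With an AnnData object in hand, this would be used as adata.obs['batch'] = balanced_batch_labels(adata.n_obs).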
Finally, run Harmony. Afterwards, there will be a new table in adata.obsm containing the adjusted PCs.
>>> sce.pp.harmony_integrate(adata, 'batch')
>>> 'X_pca_harmony' in adata.obsm
True
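The adjusted components only affect downstream analysis if later steps are told to read them. A minimal sketch of the full ordering (PCA, then Harmony, then the neighbor graph) might look like this. The helper name integrate_and_embed is hypothetical, and the default output field 'X_pca_harmony' is assumed.

```python
def integrate_and_embed(adata, batch_key, n_pcs=50):
    """Run PCA, Harmony, and neighbors on adata in place (hypothetical helper)."""
    # Imported lazily so that defining the helper does not require scanpy.
    import scanpy as sc
    import scanpy.external as sce

    # 1. PCA writes the unadjusted components to adata.obsm['X_pca'].
    sc.pp.pca(adata, n_comps=n_pcs)
    # 2. Harmony reads 'X_pca' and writes adata.obsm['X_pca_harmony'].
    sce.pp.harmony_integrate(adata, batch_key)
    # 3. Build the neighbor graph on the adjusted components, not on 'X_pca'.
    sc.pp.neighbors(adata, use_rep='X_pca_harmony')
    return adata
```

Passing use_rep='X_pca_harmony' to sc.pp.neighbors() is the step that makes the neighbor graph use the integrated representation. Without it, the graph would be built on the unadjusted PCA.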