scanpy.pp.scrublet_simulate_doublets
- scanpy.pp.scrublet_simulate_doublets(adata, *, layer=None, sim_doublet_ratio=2.0, synthetic_doublet_umi_subsampling=1.0, random_seed=0)
Simulate doublets by adding the counts of random observed transcriptome pairs.
- Parameters:
- adata
AnnData
The annotated data matrix of shape n_obs × n_vars. Rows correspond to cells and columns to genes. Genes should have been filtered for expression and variability, and the object should contain raw expression of the same dimensions.
- layer
str | None (default: None)
Layer of adata where raw values are stored, or ‘X’ if values are in .X.
- sim_doublet_ratio
float (default: 2.0)
Number of doublets to simulate relative to the number of observed transcriptomes. If None, self.sim_doublet_ratio is used.
- synthetic_doublet_umi_subsampling
float (default: 1.0)
Rate for sampling UMIs when creating synthetic doublets. If 1.0, each doublet is created by simply adding the UMIs from two randomly sampled observed transcriptomes. For values less than 1, the UMI counts are added and then randomly sampled at the specified rate.
- Return type:
AnnData
- Returns:
adata : anndata.AnnData with simulated doublets in .X. Adds fields to adata:
.obsm['scrublet']['doublet_parents']
Pairs of .obs_names used to generate each simulated doublet transcriptome.
.uns['scrublet']['parameters']
Dictionary of Scrublet parameters.
See also
scrublet()
Main way of running Scrublet, runs preprocessing, doublet simulation (this function) and calling.
scrublet_score_distribution()
Plot histogram of doublet scores for observed transcriptomes and simulated doublets.
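Example
The parameter descriptions above can be illustrated with a small, standard-library-only sketch. It is not scanpy's implementation: the helper name simulate_doublets, the list-of-lists count format, sampling parents with replacement, and the per-UMI Bernoulli subsampling are all assumptions made for illustration. It only shows how sim_doublet_ratio sets the number of simulated doublets and how synthetic_doublet_umi_subsampling thins the summed counts.

```python
import random


def simulate_doublets(counts, sim_doublet_ratio=2.0,
                      synthetic_doublet_umi_subsampling=1.0, random_seed=0):
    """Toy doublet simulation on per-cell lists of raw UMI counts.

    Hypothetical helper for illustration only; not part of scanpy.
    """
    rng = random.Random(random_seed)
    n_obs = len(counts)
    # Number of doublets is set relative to the number of observed cells.
    n_sim = int(n_obs * sim_doublet_ratio)
    rate = synthetic_doublet_umi_subsampling

    doublets, parents = [], []
    for _ in range(n_sim):
        # Assumption: parents are drawn uniformly, with replacement.
        i, j = rng.randrange(n_obs), rng.randrange(n_obs)
        # Add the counts of the two parent transcriptomes gene by gene.
        summed = [a + b for a, b in zip(counts[i], counts[j])]
        if rate < 1.0:
            # Assumption: keep each UMI independently with probability `rate`.
            summed = [sum(rng.random() < rate for _umi in range(c))
                      for c in summed]
        doublets.append(summed)
        parents.append((i, j))
    return doublets, parents


# Four observed cells, three genes (made-up counts).
observed = [[3, 0, 1], [0, 5, 2], [4, 4, 0], [1, 0, 7]]

doublets, parents = simulate_doublets(observed, sim_doublet_ratio=2.0)
thinned, thinned_parents = simulate_doublets(
    observed, synthetic_doublet_umi_subsampling=0.5
)
```

With a ratio of 2.0 and four observed cells, the sketch produces eight simulated doublets, each paired with the indices of its two parents. This mirrors the role of the doublet_parents field described under Returns, although the real function records .obs_names rather than positional indices.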