scanpy.tl.dpt
- scanpy.tl.dpt(adata, n_dcs=10, *, n_branchings=0, min_group_size=0.01, allow_kendall_tau_shift=True, neighbors_key=None, copy=False)
Infer progression of cells through geodesic distance along the graph [Haghverdi et al., 2016, Wolf et al., 2019].
Reconstruct the progression of a biological process from snapshot data.
Diffusion Pseudotime was introduced by Haghverdi et al. [2016] and implemented within Scanpy [Wolf et al., 2018]. Here, we use a further developed version, which is able to deal with disconnected graphs [Wolf et al., 2019] and can be run in a hierarchical mode by setting the parameter n_branchings>1. We recommend, however, to only use dpt() for computing pseudotime (n_branchings=0) and to detect branchings via paga(). For pseudotime, you need to annotate your data with a root cell. For instance:
adata.uns["iroot"] = np.flatnonzero(adata.obs["cell_types"] == "Stem")[0]
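As a minimal sketch of how the root-cell index above is derived, the following uses a small, hypothetical pandas table in place of adata.obs. The cell names and annotations are invented for illustration; only the np.flatnonzero pattern comes from the text above. It also guards against the case where no cell carries the requested label, which would otherwise raise an IndexError.

```python
import numpy as np
import pandas as pd

# Hypothetical per-cell annotations standing in for adata.obs.
obs = pd.DataFrame(
    {"cell_types": ["Progenitor", "Stem", "Erythroid", "Stem"]},
    index=["cell0", "cell1", "cell2", "cell3"],
)

# Boolean mask of stem cells -> integer positions of the True entries.
stem_positions = np.flatnonzero(obs["cell_types"] == "Stem")

if stem_positions.size == 0:
    raise ValueError("no cell annotated as 'Stem'; cannot choose a root")

# The first match becomes the root; this integer position is what
# would be stored in adata.uns["iroot"].
iroot = int(stem_positions[0])
print(iroot, obs.index[iroot])
```

Note that the stored value is a positional index into the cells, not a cell name, so it should be recomputed after subsetting or reordering the data.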
This requires running neighbors() first. In order to reproduce the original implementation of DPT, use method=='gauss' in neighbors(). Using the default method=='umap' only leads to minor quantitative differences, though.
Added in version 1.1.
dpt() also requires running diffmap() first. Because dpt() previously came with a default of n_dcs=10 while diffmap() has a default of n_comps=15, you need to pass n_comps=10 to diffmap() in order to exactly reproduce previous dpt() results.
- Parameters:
- adata
AnnData Annotated data matrix.
- n_dcs
int (default: 10) The number of diffusion components to use.
- n_branchings
int (default: 0) Number of branchings to detect.
- min_group_size
float (default: 0.01) During recursive splitting of branches ('dpt groups') for n_branchings > 1, do not consider groups that contain fewer than min_group_size data points. If a float, min_group_size refers to a fraction of the total number of data points.
- allow_kendall_tau_shift
bool (default: True) If a very small branch is detected upon splitting, shift away from maximum correlation in Kendall tau criterion of Haghverdi et al. [2016] to stabilize the splitting.
- neighbors_key
str | None (default: None) If not specified, dpt looks in .uns['neighbors'] for neighbors settings and .obsp['connectivities'] and .obsp['distances'] for connectivities and distances, respectively (default storage places for pp.neighbors). If specified, dpt looks in .uns[neighbors_key] for neighbors settings and .obsp[.uns[neighbors_key]['connectivities_key']] and .obsp[.uns[neighbors_key]['distances_key']] for connectivities and distances, respectively.
- copy
bool (default: False) Copy instance before computation and return a copy. Otherwise, perform computation inplace and return None.
- Return type:
AnnData | None
- Returns:
Returns None if copy=False, else returns an AnnData object. Sets the following fields (if n_branchings==0, no field adata.obs['dpt_groups'] will be written):
adata.obs['dpt_pseudotime'] pandas.Series (dtype float) Array of dim (number of samples) that stores the pseudotime of each cell, that is, the DPT distance with respect to the root cell.
adata.obs['dpt_groups'] pandas.Series (dtype category) Array of dim (number of samples) that stores the subgroup id ('0', '1', …) for each cell. The groups typically correspond to 'progenitor cells', 'undecided cells' or 'branches' of a process.
Notes
The tool is similar to the R package destiny of Angerer et al. [2015].