scanpy.tl.dpt
- scanpy.tl.dpt(adata, n_dcs=10, *, n_branchings=0, min_group_size=0.01, allow_kendall_tau_shift=True, neighbors_key=None, copy=False)
Infer progression of cells through geodesic distance along the graph [Haghverdi et al., 2016, Wolf et al., 2019].
Reconstruct the progression of a biological process from snapshot data.
Diffusion Pseudotime was introduced by Haghverdi et al. [2016] and implemented within Scanpy [Wolf et al., 2018]. Here, we use a further developed version, which is able to deal with disconnected graphs [Wolf et al., 2019] and can be run in a hierarchical mode by setting the parameter n_branchings>1. We recommend, however, using dpt() only for computing pseudotime (n_branchings=0) and detecting branchings via paga(). For pseudotime, you need to annotate your data with a root cell. For instance:
adata.uns['iroot'] = np.flatnonzero(adata.obs['cell_types'] == 'Stem')[0]
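The root-cell expression above picks the first position at which the cell-type label equals 'Stem'. As a minimal sketch of that selection in plain Python (the label list and the helper first_index_of are hypothetical and not part of Scanpy), with an explicit error when the label is absent:

```python
def first_index_of(labels, target):
    """Return the position of the first entry in `labels` equal to `target`.

    Hypothetical helper for illustration only; it is not a Scanpy function.
    """
    for i, label in enumerate(labels):
        if label == target:
            return i
    raise ValueError(f"no cell is annotated as {target!r}")


# Made-up annotation standing in for adata.obs['cell_types'].
cell_types = ["Progenitor", "Stem", "Erythroid", "Stem", "Myeloid"]

# The first 'Stem' cell becomes the root; this integer is the kind of value
# that would be stored in adata.uns['iroot'].
iroot = first_index_of(cell_types, "Stem")
print(iroot)
```

If no cell carries the requested label, the NumPy one-liner raises an IndexError rather than a descriptive message, so checking the annotation beforehand can save some debugging.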
This requires running neighbors() first. To reproduce the original implementation of DPT, pass method='gauss' to neighbors(). Using the default method='umap' only leads to minor quantitative differences, though.
Added in version 1.1.
dpt() also requires running diffmap() first. Because dpt() previously came with a default of n_dcs=10 while diffmap() has a default of n_comps=15, you need to pass n_comps=10 to diffmap() in order to exactly reproduce previous dpt() results.
- Parameters:
- adata
AnnData
Annotated data matrix.
- n_dcs
int (default: 10)
The number of diffusion components to use.
- n_branchings
int (default: 0)
Number of branchings to detect.
- min_group_size
float (default: 0.01)
During recursive splitting of branches (‘dpt groups’) for n_branchings > 1, do not consider groups that contain fewer than min_group_size data points. If a float, min_group_size refers to a fraction of the total number of data points.
- allow_kendall_tau_shift
bool (default: True)
If a very small branch is detected upon splitting, shift away from maximum correlation in the Kendall tau criterion of Haghverdi et al. [2016] to stabilize the splitting.
- neighbors_key
str | None (default: None)
If not specified, dpt looks in .uns['neighbors'] for neighbors settings and in .obsp['connectivities'] and .obsp['distances'] for connectivities and distances, respectively (default storage places for pp.neighbors). If specified, dpt looks in .uns[neighbors_key] for neighbors settings and in .obsp[.uns[neighbors_key]['connectivities_key']] and .obsp[.uns[neighbors_key]['distances_key']] for connectivities and distances, respectively.
- copy
bool (default: False)
Copy instance before computation and return a copy. Otherwise, perform computation inplace and return None.
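The lookup rule described for neighbors_key can be sketched with plain dictionaries. This is only an illustration of the rule as documented here: the helper resolve_neighbor_keys, the key name neighbors_gauss, and the dictionary contents are hypothetical, and the dict merely stands in for adata.uns.

```python
def resolve_neighbor_keys(uns, neighbors_key=None):
    """Return (settings_key, connectivities_key, distances_key).

    Hypothetical helper mirroring the documented lookup rule; `uns` is a
    plain dict standing in for adata.uns.
    """
    if neighbors_key is None:
        # Default storage places used by pp.neighbors.
        return "neighbors", "connectivities", "distances"
    settings = uns[neighbors_key]
    # The settings entry names the .obsp slots holding the two matrices.
    return (
        neighbors_key,
        settings["connectivities_key"],
        settings["distances_key"],
    )


# Made-up contents: one default neighbors run and one stored under a custom key.
uns = {
    "neighbors": {"params": {"method": "umap"}},
    "neighbors_gauss": {
        "connectivities_key": "neighbors_gauss_connectivities",
        "distances_key": "neighbors_gauss_distances",
        "params": {"method": "gauss"},
    },
}

default_keys = resolve_neighbor_keys(uns)
custom_keys = resolve_neighbor_keys(uns, "neighbors_gauss")
print(default_keys)
print(custom_keys)
```

Keeping several neighbor graphs under different keys makes it possible to compare, for example, pseudotime computed on a 'gauss' graph with pseudotime computed on the default graph without recomputing either.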
- Return type:
AnnData | None
- Returns:
Returns None if copy=False, else returns an AnnData object. Sets the following fields (if n_branchings==0, no field adata.obs['dpt_groups'] will be written):
adata.obs['dpt_pseudotime']
pandas.Series (dtype float)
Array of dim (number of samples) that stores the pseudotime of each cell, that is, the DPT distance with respect to the root cell.
adata.obs['dpt_groups']
pandas.Series (dtype category)
Array of dim (number of samples) that stores the subgroup id ('0', '1', …) for each cell. The groups typically correspond to ‘progenitor cells’, ‘undecided cells’ or ‘branches’ of a process.
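Which groups are considered during splitting depends on the min_group_size threshold described under Parameters. The following is a rough sketch of how a fractional value might translate into a cell count, assuming values below 1 are read as fractions of the total and truncated to an integer. The helper resolve_min_group_size and the group sizes are hypothetical, and this is not Scanpy's own code.

```python
def resolve_min_group_size(min_group_size, n_obs):
    """Turn min_group_size into an absolute number of data points.

    Hypothetical helper. Assumption: values below 1 are fractions of n_obs,
    values of 1 or more are already absolute counts.
    """
    if min_group_size < 1:
        return int(min_group_size * n_obs)
    return int(min_group_size)


n_obs = 2000

# The default of 0.01 on 2000 cells corresponds to a 20-cell threshold.
threshold = resolve_min_group_size(0.01, n_obs)

# Made-up candidate group sizes, keyed by subgroup id as in dpt_groups.
candidate_sizes = {"0": 1500, "1": 485, "2": 15}

# Groups with fewer data points than the threshold are not considered.
considered = [gid for gid, size in candidate_sizes.items() if size >= threshold]
print(threshold, considered)
```

Under these assumptions, raising min_group_size is a simple way to suppress tiny spurious branches on noisy data, at the cost of possibly missing small but real populations.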
Notes
The tool is similar to the R package destiny of Angerer et al. [2015].