scanpy.pl.dotplot

scanpy.pl.dotplot(adata, var_names, groupby=None, use_raw=None, log=False, num_categories=7, expression_cutoff=0.0, mean_only_expressed=False, color_map='Reds', dot_max=None, dot_min=None, figsize=None, dendrogram=False, gene_symbols=None, var_group_positions=None, standard_scale=None, smallest_dot=0.0, var_group_labels=None, var_group_rotation=None, layer=None, show=None, save=None, **kwds)

Makes a dot plot of the expression values of var_names.

For each var_name and each groupby category a dot is plotted. Each dot represents two values: mean expression within each category (visualized by color) and fraction of cells expressing the var_name in the category (visualized by the size of the dot). If groupby is not given, the dotplot assumes that all data belongs to a single category.

Note: A gene is considered expressed if its expression value in adata (or adata.raw) is above the specified threshold, which is zero by default.
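As a rough illustration of the two values each dot encodes, the sketch below computes them with pandas on a toy table. The expression values, the gene names GeneA and GeneB, and the cluster labels are hypothetical, and this is not necessarily how dotplot computes the values internally.

```python
import pandas as pd

# Toy expression table: rows are cells, columns are genes (hypothetical values).
expr = pd.DataFrame(
    {
        "GeneA": [0.0, 2.0, 4.0, 0.0, 0.0, 3.0],
        "GeneB": [1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    }
)
# One group label per cell, playing the role of the groupby observation.
groups = pd.Series(["c1", "c1", "c1", "c2", "c2", "c2"], name="cluster")

expression_cutoff = 0.0

# Dot color: mean expression of each gene within each group.
mean_expr = expr.groupby(groups).mean()

# Dot size: fraction of cells in each group whose value exceeds the cutoff.
fraction_expressing = (expr > expression_cutoff).groupby(groups).mean()

print(mean_expr)
print(fraction_expressing)
```

Each (group, gene) cell of the two resulting tables corresponds to one dot: the first table drives the color and the second the size.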

An example of dotplot usage is to visualize, for multiple marker genes, the mean value and the percentage of cells expressing the gene across multiple clusters.

See also rank_genes_groups_dotplot() to plot marker genes identified using the rank_genes_groups() function.

Parameters
adata : AnnData

Annotated data matrix.

var_names : str, list of str, dict or OrderedDict

var_names should be a valid subset of adata.var_names. If var_names is a dict, each key is used as a label to group the values (see var_group_labels). The dict values should be a list or str of valid adata.var_names. In this case, either coloring or ‘brackets’ are used to group the var names, depending on the plot. When var_names is a dict, var_group_labels and var_group_positions are set automatically.
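To make the relationship between a dict of var_names and the derived group labels and positions more concrete, here is a minimal sketch. The helper flatten_var_names is hypothetical and not part of scanpy, and the zero-based, inclusive (start, end) convention is an assumption made for this illustration.

```python
def flatten_var_names(var_names):
    """Hypothetical helper: flatten a dict of marker groups into a flat gene
    list, the group labels, and one (start, end) position per group."""
    flat, labels, positions = [], [], []
    for label, genes in var_names.items():
        # A single gene may be given as a plain string.
        if isinstance(genes, str):
            genes = [genes]
        start = len(flat)
        flat.extend(genes)
        labels.append(label)
        # Assumed convention: zero-based indices, end position inclusive.
        positions.append((start, len(flat) - 1))
    return flat, labels, positions


markers = {"T-cell": ["CD3D", "CD3E"], "B-cell": "CD79A"}
flat, labels, positions = flatten_var_names(markers)
print(flat, labels, positions)
```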

groupby : str or None, optional (default: None)

The key of the observation grouping to consider.

log : bool, optional (default: False)

Plot on logarithmic axis.

use_raw : bool, optional (default: None)

Use raw attribute of adata if present.

num_categories : int, optional (default: 7)

Only used if groupby observation is not categorical. This value determines the number of groups into which the groupby observation should be subdivided.
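The effect of num_categories on a non-categorical observation can be pictured with pandas.cut. This is only a sketch: the use of equal-width bins is an assumption, and the values below are made up.

```python
import numpy as np
import pandas as pd

# Hypothetical continuous observation, e.g. a per-cell score.
values = pd.Series(np.linspace(0.0, 70.0, 15))

num_categories = 7

# Split the observed range into num_categories equal-width intervals
# (assumed binning strategy for this illustration).
binned = pd.cut(values, num_categories)

print(binned.cat.categories)
print(binned.value_counts(sort=False))
```

Each resulting interval then acts as one groupby category in the plot.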

figsize : (float, float), optional (default: None)

Figure size when multi_panel = True. Otherwise the rcParams['figure.figsize'] value is used. Format is (width, height).

dendrogram : bool or str, optional (default: False)

If True or a valid dendrogram key, a dendrogram based on the hierarchical clustering between the groupby categories is added. The dendrogram information is computed using scanpy.tl.dendrogram(). If tl.dendrogram has not been called previously, the function is called with default parameters.

gene_symbols : string, optional (default: None)

Column name in .var DataFrame that stores gene symbols. By default var_names refer to the index column of the .var DataFrame. Setting this option allows alternative names to be used.

var_group_positions : list of tuples

Use this parameter to highlight groups of var_names. This will draw a ‘bracket’ or a color block between the given start and end positions. If the parameter var_group_labels is set, the corresponding labels are added on top/left. E.g. var_group_positions = [(4,10)] will add a bracket between the fourth var_name and the tenth var_name. By giving more positions, more brackets/color blocks are drawn.

var_group_labels : list of str

Labels for each of the var_group_positions to be highlighted.

var_group_rotation : float (default: None)

Label rotation in degrees. By default, labels longer than 4 characters are rotated 90 degrees.

layer : str (default: None)

Name of the AnnData object layer to plot. By default adata.raw.X is plotted. If use_raw=False is set, then adata.X is plotted. If layer is set to a valid layer name, then the layer is plotted. layer takes precedence over use_raw.

expression_cutoff : float (default: 0.)

Expression cutoff that is used for binarizing the gene expression and determining the fraction of cells expressing given genes. A gene is expressed only if the expression value is greater than this threshold.

mean_only_expressed : bool (default: False)

If True, gene expression is averaged only over the cells expressing the given genes.
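The difference between the two averaging modes can be sketched with pandas as follows. The data is hypothetical, and filling empty groups with 0 is a choice made for this illustration only.

```python
import pandas as pd

# One gene measured in four cells of a single group (hypothetical values).
expr = pd.DataFrame({"GeneA": [0.0, 0.0, 4.0, 2.0]})
groups = pd.Series(["c1", "c1", "c1", "c1"], name="cluster")

expression_cutoff = 0.0

# mean_only_expressed=False: average over every cell in the group.
mean_all = expr.groupby(groups).mean()

# mean_only_expressed=True: mask non-expressing cells as NaN so that the
# group mean skips them; groups with no expressing cell are set to 0 here.
expressed_only = expr.where(expr > expression_cutoff)
mean_expressed = expressed_only.groupby(groups).mean().fillna(0)

print(mean_all)
print(mean_expressed)
```

With the values above, averaging over all cells dilutes the signal with the two zero entries, while averaging over expressing cells reflects only the two cells in which the gene was detected.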

color_map : str, optional (default: Reds)

String denoting matplotlib color map.

dot_max : float, optional (default: None)

If None, the maximum dot size is set to the maximum fraction value found (e.g. 0.6). If given, the value should be a number between 0 and 1. All fractions larger than dot_max are clipped to this value.

dot_min : float, optional (default: None)

If None, the minimum dot size is set to 0. If given, the value should be a number between 0 and 1. All fractions smaller than dot_min are clipped to this value.
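The clipping described for dot_max and dot_min can be sketched with numpy.clip. The fraction values are made up, and the helper clip_fractions is hypothetical rather than part of scanpy.

```python
import numpy as np


def clip_fractions(fractions, dot_min=None, dot_max=None):
    """Hypothetical helper: clip fractions of expressing cells to the
    [dot_min, dot_max] range, using the documented defaults for None."""
    fractions = np.asarray(fractions, dtype=float)
    if dot_max is None:
        dot_max = fractions.max()  # default: largest fraction found
    if dot_min is None:
        dot_min = 0.0  # default: no lower clipping
    return np.clip(fractions, dot_min, dot_max)


fractions = [0.05, 0.3, 0.6, 0.9]
clipped = clip_fractions(fractions, dot_min=0.1, dot_max=0.8)
unclipped = clip_fractions(fractions)
print(clipped)
print(unclipped)
```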

standard_scale : {'var', 'group'}, optional (default: None)

Whether or not to standardize that dimension between 0 and 1, meaning that for each variable or group, the minimum is subtracted and the result is divided by its maximum.
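A minimal sketch of this scaling on a toy table of mean expression values (rows are groups, columns are variables) might look as follows. The numbers are hypothetical, and dividing by the maximum of the shifted values is how the description is read here.

```python
import pandas as pd

# Hypothetical mean expression per group (rows) and gene (columns).
mean_expr = pd.DataFrame(
    {"GeneA": [1.0, 3.0, 5.0], "GeneB": [10.0, 10.0, 20.0]},
    index=["c1", "c2", "c3"],
)

# standard_scale='var': scale each gene (column) to the range [0, 1].
shifted_cols = mean_expr - mean_expr.min(axis=0)
scaled_var = shifted_cols / shifted_cols.max(axis=0)

# standard_scale='group': scale each group (row) to the range [0, 1].
shifted_rows = mean_expr.sub(mean_expr.min(axis=1), axis=0)
scaled_group = shifted_rows.div(shifted_rows.max(axis=1), axis=0)

print(scaled_var)
print(scaled_group)
```

Note that a variable or group with constant values would produce a division of zero by zero in this sketch, so a real implementation needs to handle that case.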

smallest_dot : float, optional (default: 0.)

If None, the smallest dot has size 0. All expression levels at dot_min are plotted with the smallest_dot dot size.

show

Show the plot, do not return axis.

save

If True or a str, save the figure. A string is appended to the default filename. Infer the filetype if ending on {‘.pdf’, ‘.png’, ‘.svg’}.

ax

A matplotlib axes object. Only works if plotting a single component.

**kwds : keyword arguments

Are passed to matplotlib.pyplot.scatter.

Returns

List of Axes

Examples

>>> import scanpy as sc
>>> adata = sc.datasets.pbmc68k_reduced()
>>> sc.pl.dotplot(adata, ['C1QA', 'PSAP', 'CD79A', 'CD79B', 'CST3', 'LYZ'],
...               groupby='bulk_labels', dendrogram=True)

Using var_names as dict:

>>> markers = {'T-cell': 'CD3D', 'B-cell': 'CD79A', 'myeloid': 'CST3'}
>>> sc.pl.dotplot(adata, markers, groupby='bulk_labels', dendrogram=True)