Plotting with Marsilea#

Marsilea is a visualization library that allows user to create composable visualization in a declarative way.

You can use it to create many scanpy plots with easy customization.

Let’s first load the PBMC datdaset

import numpy as np
import scanpy as sc

pbmc = sc.datasets.pbmc3k_processed().raw.to_adata()
pbmc
AnnData object with n_obs × n_vars = 2638 × 13714
    obs: 'n_genes', 'percent_mito', 'n_counts', 'louvain'
    var: 'n_cells'
    uns: 'draw_graph', 'louvain', 'louvain_colors', 'neighbors', 'pca', 'rank_genes_groups'
    obsm: 'X_pca', 'X_tsne', 'X_umap', 'X_draw_graph_fr'
    obsp: 'distances', 'connectivities'

Define the cells and markers that we want to draw

cell_markers = {
    "CD4 T cells": ["IL7R"],
    "CD14+ Monocytes": ["CD14", "LYZ"],
    "B cells": ["MS4A1"],
    "CD8 T cells": ["CD8A"],
    "NK cells": ["GNLY", "NKG7"],
    "FCGR3A+ Monocytes": ["FCGR3A", "MS4A7"],
    "Dendritic cells": ["FCER1A", "CST3"],
    "Megakaryocytes": ["PPBP"],
}

cells, markers = [], []
for c, ms in cell_markers.items():
    cells += [c] * len(ms)
    markers += ms

uni_cells = list(cell_markers.keys())
cell_colors = [
    "#568564",
    "#DC6B19",
    "#F72464",
    "#005585",
    "#9876DE",
    "#405559",
    "#58DADA",
    "#F85959",
]
cmapper = dict(zip(uni_cells, cell_colors))

Import Marsilea

import marsilea as ma
import marsilea.plotter as mp

Heatmap#

Here is the minimum example to create a heatmap with Marsilea, it does nothing besides create a heatmap. You can adjust the main plot size by setting height and width, the unit is inches.

We will start adding components to this main plot step by step.

exp = pbmc[:, markers].X.toarray()

m = ma.Heatmap(exp, cmap="viridis", height=3.5, width=3)
m.render()
../_images/ba1422e2b60f77f4c59fc07a3c5ba6e7f359b149f5b44d7efe832e736794d28e.png

To replicate the scanpy heatmap, we can first divide the heatmap by cell types.

The group_rows method can group heatmap by group labels, the first argument is used to label the row, the order defines the display order of each cell type from top to bottom.

m.group_rows(pbmc.obs["louvain"], order=uni_cells)
m.render()
../_images/2ad8c5624f6b1b9c504a9523579d15ea2f93a9fd5b609091b38786f953fe046b.png

Now we can label each chunks with cell types and the columns with marker names.

You can use add_left or add_* to add a plotter to anyside of your main plot. In Marsilea, a plot instance is called plotter.

When you add a plotter, you can easily adjust its size and the padding between adjcent plot using size and pad.

# Create plotters
chunk = mp.Chunk(uni_cells, rotation=0, align="center")
colors = mp.Colors(list(pbmc.obs["louvain"]), palette=cmapper)
label_markers = mp.Labels(markers)

# Add to the heatmap
m.add_left(colors, size=0.1, pad=0.1)
m.add_left(chunk)
m.add_top(label_markers, pad=0.1)
m.render()
../_images/da883131a1fe9eb22a8c860e07c36a34e00e64e4442b006d9dde552fc2a40ead.png

You may want to add dendrogram to display the similarity among cell types.

You can use add_dendrogram, we will add it to the right side, but you can also add it to the left side if you want.

m.add_dendrogram("right", add_base=False)
m.render()
../_images/d81354fd0bdf8a72378abcefbc182d8285486fa17d8bd53f0c1f47b6cd1d28c4.png

The legend is still mising, you can use add_legends to add all legends at once. Marsilea will automatically layout all the legends.

In the end, you can you add_title to add a title.

m.add_legends()
m.add_title("Expression Profile")
m.render()
../_images/2d213968ea3f453f2cae74ae49e17f1fec45a53a476ae046b14010f283eda1a7.png

OK, let’s wrap up all the code in below. There are few things you should notice in Marsilea:

  • Always call render() in the end to actually render the plot.

  • The order of add_* operation decides the order of plotters. But group_rows and group_cols can be called anytime.

m = ma.Heatmap(exp, cmap="viridis", height=4, width=3)
m.group_rows(pbmc.obs["louvain"], order=uni_cells)

m.add_left(
    mp.Colors(list(pbmc.obs["louvain"]), palette=cmapper),
    size=0.1,
    pad=0.1,
)
m.add_left(mp.Chunk(uni_cells, rotation=0, align="center"))
m.add_top(mp.Labels(markers), pad=0.1)
m.add_dendrogram("right", add_base=False)

m.add_legends()
m.add_title("Expression Profile")
m.render()
../_images/4d8511ed85ad86dbf378823408503c528a42a79cd98b0e04f7be0f949decd72d.png

Now that we’ve covered some basics of Marsilea, we’ll see how it can be used to create custom plots similar to scanpy’s existing methods:

agg = sc.get.aggregate(pbmc[:, markers], by="louvain", func=["mean", "count_nonzero"])
agg.obs["cell_counts"] = pbmc.obs["louvain"].value_counts()
agg
AnnData object with n_obs × n_vars = 8 × 12
    obs: 'louvain', 'cell_counts'
    var: 'n_cells'
    layers: 'mean', 'count_nonzero'
agg_exp = agg.layers["mean"]
agg_count = agg.layers["count_nonzero"]
agg_cell_counts = agg.obs["cell_counts"].to_numpy()

Matrixplot#

h, w = agg_exp.shape

m = ma.Heatmap(
    agg_exp,
    height=h / 3,
    width=w / 3,
    cmap="Blues",
    linewidth=0.5,
    linecolor="lightgray",
    label="Expression",
)
m.add_right(mp.Labels(agg.obs["louvain"], align="center"), pad=0.1)
m.add_top(mp.Labels(markers), pad=0.1)
m.group_cols(cells, order=uni_cells)
m.add_top(mp.Chunk(uni_cells, fill_colors=cell_colors, rotation=90))
m.add_left(mp.Numbers(agg_cell_counts, color="#EEB76B", label="Count"))
m.add_dendrogram("right", pad=0.1)
m.add_legends()
m.render()
../_images/4391834d507f58697345693f7e02b03afbb9057c3fe0d8386df4a8596d54803d.png

Dot plot#

size = agg_count / agg_cell_counts[:, np.newaxis]
m = ma.SizedHeatmap(
    size=size,
    color=agg_exp,
    cluster_data=size,
    height=h / 3,
    width=w / 3,
    edgecolor="lightgray",
    cmap="Blues",
    size_legend_kws=dict(
        colors="#538bbf",
        title="Fraction of cells\nin groups (%)",
        labels=["20%", "40%", "60%", "80%", "100%"],
        show_at=[0.2, 0.4, 0.6, 0.8, 1.0],
    ),
    color_legend_kws=dict(title="Mean expression\nin group"),
)

m.add_top(mp.Labels(markers), pad=0.1)
m.add_top(mp.Chunk(uni_cells, fill_colors=cell_colors, rotation=90))
m.group_cols(cells, order=uni_cells)

m.add_right(mp.Labels(agg.obs["louvain"], align="center"), pad=0.1)
m.add_left(
    mp.Numbers(agg_cell_counts, color="#EEB76B", label="Count"), size=0.5, pad=0.1
)
m.add_dendrogram("right", pad=0.1)
m.add_legends()
m.render()
../_images/e1ed4633b95cd6bec57e431a9b5349d46a96ac78dd24940d173cd3fca0428694.png

Tracksplot#

tp = ma.ZeroHeightCluster(exp.T, width=20)
tp_anno = ma.ZeroHeight(width=1)

tp.group_cols(pbmc.obs["louvain"], order=uni_cells, spacing=0.005)
tp.add_dendrogram("top", add_base=False, size=1)
for row, gene_name in zip(exp.T, markers):
    area = mp.Area(
        row,
        add_outline=False,
        alpha=1,
        group_kws={"color": cell_colors},
    )
    tp.add_bottom(area, size=0.4, pad=0.1)
tp.add_bottom(mp.Colors(pbmc.obs["louvain"], palette=cmapper), size=0.1, pad=0.1)
tp.add_bottom(mp.Chunk(uni_cells, rotation=90))

(tp + 0.1 + tp_anno).render()
../_images/a8fd5ce09328052fe68df76c87130f355b9a63a46d81cbfd679a2c87cb6ab4e7.png

Stacked Violin#

import pandas as pd
from matplotlib.cm import ScalarMappable
from matplotlib.colors import Normalize

gene_data = []
cdata = []
for row, gene_name in zip(exp.T, markers[:5]):
    # Transform data to wide-format, marsilea only supports wide-format
    pdata = (
        pd.DataFrame({"exp": row, "cell_type": pbmc.obs["louvain"]})
        .reset_index(drop=True)
        .pivot(columns="cell_type", values="exp")
    )
    gene_data.append([gene_name, pdata])
    cdata.append(pdata.median())

# Create color mappable
cdata = pd.concat(cdata, axis=1).T
vmin, vmax = np.asarray(cdata).min(), np.asarray(cdata).max()
sm = ScalarMappable(norm=Normalize(vmin=vmin, vmax=vmax), cmap="Blues")

sv = ma.ZeroHeightCluster(agg_exp.T, width=3)
sv_anno = ma.ZeroHeight(width=1)

for gene_name, pdata in gene_data:
    # Calculate the color to display
    palette = sm.to_rgba(pdata.median()).tolist()
    sv.add_bottom(
        mp.Violin(
            pdata,
            inner=None,
            linecolor=".7",
            linewidth=0.5,
            density_norm="width",
            palette=palette,
        ),
        size=0.5,
        pad=0.1,
        legend=False,
    )
    sv_anno.add_bottom(mp.Title(gene_name, align="left"), size=0.5, pad=0.1)

sv.add_bottom(mp.Labels(cdata.columns))
sv.add_dendrogram("top")

# To fake a legend
sv.add_bottom(
    mp.ColorMesh(
        cdata,
        cmap="Blues",
        cbar_kws={"title": "Median expression\nin group", "orientation": "horizontal"},
    ),
    size=0,
)
comp = sv + 0.1 + sv_anno
comp.add_legends()
comp.render()
../_images/28915039aa4b875685c8d73a7139b94924f2a96e9041916eaeb5d593c3bace11.png

More information#

Other plots are possible with Marsilea by creating new plotter. See Marsilea’s documention to learn how.

import session_info

session_info.show(dependencies=True)
Click to view session information
-----
anndata             0.10.6
marsilea            0.3.5
matplotlib          3.8.3
numpy               1.26.4
pandas              2.2.0
scanpy              1.10.1
session_info        1.0.0
-----
Click to view modules imported as dependencies
PIL                         10.2.0
anyio                       NA
appnope                     0.1.3
arrow                       1.3.0
asttokens                   NA
attr                        23.2.0
attrs                       23.2.0
babel                       2.14.0
certifi                     2024.02.02
cffi                        1.16.0
charset_normalizer          3.3.2
colorama                    0.4.6
comm                        0.2.1
cycler                      0.12.1
cython_runtime              NA
dateutil                    2.8.2
debugpy                     1.8.0
decorator                   5.1.1
defusedxml                  0.7.1
exceptiongroup              1.2.0
executing                   2.0.1
fastjsonschema              NA
fqdn                        NA
h5py                        3.10.0
idna                        3.6
ipykernel                   6.29.0
isoduration                 NA
jedi                        0.19.1
jinja2                      3.1.3
joblib                      1.3.2
json5                       NA
jsonpointer                 2.4
jsonschema                  4.21.1
jsonschema_specifications   NA
jupyter_events              0.9.0
jupyter_server              2.12.5
jupyterlab_server           2.25.2
kiwisolver                  1.4.5
legacy_api_wrap             NA
legendkit                   0.3.4
llvmlite                    0.42.0
markupsafe                  2.1.5
matplotlib_inline           0.1.6
mpl_toolkits                NA
natsort                     8.4.0
nbformat                    5.9.2
numba                       0.59.1
overrides                   NA
packaging                   23.2
parso                       0.8.3
patsy                       0.5.6
pexpect                     4.9.0
platformdirs                4.1.0
prometheus_client           NA
prompt_toolkit              3.0.43
psutil                      5.9.8
ptyprocess                  0.7.0
pure_eval                   0.2.2
pyarrow                     15.0.0
pydev_ipython               NA
pydevconsole                NA
pydevd                      2.9.5
pydevd_file_utils           NA
pydevd_plugins              NA
pydevd_tracing              NA
pygments                    2.17.2
pyparsing                   3.1.1
pythonjsonlogger            NA
pytz                        2023.3.post1
referencing                 NA
requests                    2.31.0
rfc3339_validator           0.1.4
rfc3986_validator           0.1.1
rpds                        NA
scipy                       1.12.0
seaborn                     0.13.2
send2trash                  NA
six                         1.16.0
sklearn                     1.4.0
sniffio                     1.3.0
sphinxcontrib               NA
stack_data                  0.6.3
statsmodels                 0.14.1
threadpoolctl               3.2.0
tornado                     6.4
traitlets                   5.14.1
typing_extensions           NA
uri_template                NA
urllib3                     2.1.0
wcwidth                     0.2.13
webcolors                   1.13
websocket                   1.7.0
yaml                        6.0.1
zmq                         25.1.2
-----
IPython             8.20.0
jupyter_client      8.6.0
jupyter_core        5.7.1
jupyterlab          4.0.11
-----
Python 3.10.0 | packaged by conda-forge | (default, Nov 20 2021, 02:27:15) [Clang 11.1.0 ]
macOS-14.2.1-arm64-arm-64bit
-----
Session information updated at 2024-04-19 15:23