scanpy.datasets.pbmc3k_processed
- scanpy.datasets.pbmc3k_processed()
Processed 3k PBMCs from 10x Genomics.
Processed using the basic tutorial Preprocessing and clustering 3k PBMCs (legacy workflow).
For preprocessing, cells with few gene counts or too high a percent_mito are filtered out. The counts are logarithmized and only genes marked by highly_variable_genes() are retained. The obs variables n_counts and percent_mito are corrected for using regress_out(), and values are scaled and clipped by scale(). Finally, pca() and neighbors() are calculated.

As analysis steps, the embeddings tsne() and umap() are computed. Communities are identified using louvain() and marker genes using rank_genes_groups().
- Return type:
AnnData
- Returns:
Annotated data matrix.
Examples
>>> import scanpy as sc
>>> sc.datasets.pbmc3k_processed()
AnnData object with n_obs × n_vars = 2638 × 1838
    obs: 'n_genes', 'percent_mito', 'n_counts', 'louvain'
    var: 'n_cells'
    uns: 'draw_graph', 'louvain', 'louvain_colors', 'neighbors', 'pca', 'rank_genes_groups'
    obsm: 'X_pca', 'X_tsne', 'X_umap', 'X_draw_graph_fr'
    varm: 'PCs'
    obsp: 'distances', 'connectivities'
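
The preprocessing described above (filtering cells, logarithmizing counts, then scaling and clipping) can be illustrated with a minimal pure-Python sketch on a toy count matrix. This is not scanpy's implementation: the helper names, the thresholds (min_genes, max_percent_mito, max_value), the use of the sample standard deviation, and the toy data are all assumptions made for illustration only.

```python
import math
import statistics


def filter_cells(matrix, mito_gene_idx, min_genes=2, max_percent_mito=0.05):
    """Keep cells (rows) that express enough genes and have a low
    mitochondrial fraction. Thresholds are illustrative assumptions."""
    kept = []
    for row in matrix:
        n_genes = sum(1 for v in row if v > 0)
        n_counts = sum(row)
        mito = sum(row[j] for j in mito_gene_idx)
        percent_mito = mito / n_counts if n_counts else 0.0
        if n_genes >= min_genes and percent_mito < max_percent_mito:
            kept.append(row)
    return kept


def log1p_counts(matrix):
    """Apply log(1 + x) to every entry of a cells x genes matrix."""
    return [[math.log1p(v) for v in row] for row in matrix]


def scale_and_clip(matrix, max_value=10.0):
    """Z-score each gene (column), then clip values above max_value."""
    n_cells, n_genes = len(matrix), len(matrix[0])
    scaled = [[0.0] * n_genes for _ in range(n_cells)]
    for j in range(n_genes):
        col = [row[j] for row in matrix]
        mean = statistics.fmean(col)
        std = statistics.stdev(col)  # sample standard deviation (assumption)
        if std == 0:
            std = 1.0  # leave constant genes at zero instead of dividing by zero
        for i, v in enumerate(col):
            scaled[i][j] = min((v - mean) / std, max_value)
    return scaled


# Toy data: 5 cells x 3 genes; the last gene plays the role of a mitochondrial gene.
counts = [
    [5, 3, 0],
    [0, 4, 6],  # dropped: high mitochondrial fraction
    [2, 0, 0],  # dropped: too few expressed genes
    [1, 7, 0],
    [9, 1, 0],
]

kept = filter_cells(counts, mito_gene_idx=[2])
logged = log1p_counts(kept)
scaled = scale_and_clip(logged)
```

In scanpy itself these steps operate on an AnnData object and are followed by the regression, PCA, and neighbor-graph steps listed above; the sketch covers only the elementwise arithmetic.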