3k PBMCs from 10x Genomics.
The exact same data is also used in Seurat’s basic clustering tutorial.
This downloads 5.9 MB of data upon the first call of the function and stores it in
The following code was run to produce the file.
import scanpy as sc

adata = sc.read_10x_mtx(
    './data/filtered_gene_bc_matrices/hg19/',  # the directory with the `.mtx` file
    var_names='gene_symbols',  # use gene symbols for the variable names (variables-axis index)
    cache=True,  # write a cache file for faster subsequent reading
)
adata.var_names_make_unique()  # this is unnecessary if using 'gene_ids'
adata.write('write/pbmc3k_raw.h5ad', compression='gzip')
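The call to `var_names_make_unique()` is needed because several gene IDs can share one gene symbol, so using symbols as the index can produce duplicate variable names. The sketch below illustrates the idea in plain Python. It is a simplified approximation, not the library's implementation: the helper `make_unique` and the toy gene names are hypothetical, and the `-1`, `-2` suffix scheme is assumed to match the default behaviour.

```python
def make_unique(names, join="-"):
    """Hypothetical helper: keep the first occurrence of each name and
    suffix later repeats with join + a running counter."""
    counts = {}          # how many times each base name has been seen
    taken = set(names)   # names that must not be reused as a suffixed form
    unique = []
    for name in names:
        n = counts.get(name, 0)
        if n == 0:
            # First occurrence keeps its original name.
            unique.append(name)
            counts[name] = 1
            continue
        # Later occurrences get a suffix; skip any suffix already in use.
        candidate = f"{name}{join}{n}"
        while candidate in taken:
            n += 1
            candidate = f"{name}{join}{n}"
        taken.add(candidate)
        counts[name] = n + 1
        unique.append(candidate)
    return unique


# Toy gene symbols (illustrative only, not taken from the dataset).
symbols = ["GENE_A", "GENE_B", "GENE_A", "GENE_A"]
unique_symbols = make_unique(symbols)
print(unique_symbols)
```

With `var_names='gene_ids'` the identifiers are already unique, which is why the step is unnecessary in that case.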