# scanpy.pp.pca

scanpy.pp.pca(data, n_comps=50, zero_center=True, svd_solver='auto', random_state=0, return_info=False, use_highly_variable=None, dtype='float32', copy=False, chunked=False, chunk_size=None)

Principal component analysis [Pedregosa11].

Computes PCA coordinates, loadings and variance decomposition. Uses the implementation of scikit-learn [Pedregosa11].
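The three outputs named above (coordinates, loadings and variance decomposition) can be illustrated with a minimal NumPy sketch. This is not scanpy's or scikit-learn's implementation. It computes a plain zero-centered PCA through a full SVD, and the names `coords`, `loadings`, `variance` and `variance_ratio` are chosen for illustration only.

```python
import numpy as np

# Toy data: 100 observations (rows) x 20 variables (columns).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
n_comps = 5

# Zero-center each variable, then factorize with a full SVD.
X_centered = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

# PCA coordinates: projection of the observations onto the components.
coords = U[:, :n_comps] * S[:n_comps]            # shape (100, 5)

# Loadings: one column per principal component.
loadings = Vt[:n_comps].T                        # shape (20, 5)

# Explained variance per component and its share of the total variance.
all_variance = S ** 2 / (X.shape[0] - 1)
variance = all_variance[:n_comps]
variance_ratio = variance / all_variance.sum()

# The explained variances match the largest eigenvalues of the covariance matrix.
eigvals = np.sort(np.linalg.eigvalsh(np.cov(X_centered, rowvar=False)))[::-1]
```

In this sketch, `coords` equals `X_centered @ loadings`, and `variance` agrees with the leading entries of `eigvals`.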

Parameters:

- data
  The (annotated) data matrix of shape n_obs × n_vars. Rows correspond to cells and columns to genes.
- n_comps : int
  Number of principal components to compute.
- zero_center : bool, None
  If True, compute standard PCA from the covariance matrix. If False, omit zero-centering of the variables (uses TruncatedSVD), which allows sparse input to be handled efficiently. Passing None decides automatically based on the sparseness of the data.
- svd_solver : str
  SVD solver to use:
  - 'arpack' for the ARPACK wrapper in SciPy (svds()).
  - 'randomized' for the randomized algorithm due to Halko (2009).
  - 'auto' (the default) chooses automatically depending on the size of the problem.
- random_state : int
  Change to use different initial states for the optimization.
- return_info : bool
  Only relevant when not passing an AnnData: see “Returns”.
- use_highly_variable : bool, None
  Whether to use highly variable genes only, stored in .var['highly_variable']. By default, uses them if they have been determined beforehand.
- dtype : str
  Numpy data type string to which to convert the result.
- copy : bool
  If an AnnData is passed, determines whether a copy is returned. Ignored otherwise.
- chunked : bool
  If True, perform an incremental PCA on segments of chunk_size. The incremental PCA automatically zero-centers and ignores the settings of random_state and svd_solver. If False, perform a full PCA.
- chunk_size : int, None
  Number of observations to include in each chunk. Required if chunked=True was passed.

Returns:

- X_pca : scipy.sparse.spmatrix or numpy.ndarray
  If data is array-like and return_info=False was passed, this function returns only X_pca.
- adata : AnnData
  Otherwise, if copy=True, it returns a copy of adata with the following fields; if copy=False, it adds them to adata in place:
  - .obsm['X_pca']: PCA representation of data.
  - .varm['PCs']: The principal components containing the loadings.
  - .uns['pca']['variance_ratio']: Ratio of explained variance.
  - .uns['pca']['variance']: Explained variance, equivalent to the eigenvalues of the covariance matrix.