Scanpy tutorial runs end-to-end PBMC‑3k single‑cell RNA‑seq workflow with clustering, annotation and trajectory tools

Wren Ashcroft

5/8/2026, 10:50:26 PM

A new tutorial demonstrates a complete Scanpy workflow on the PBMC‑3k benchmark dataset, carrying raw counts through quality control, preprocessing, clustering, annotation and simple trajectory analysis. The notebook packages each step into a runnable pipeline so users can reproduce results, preserve raw counts for later reanalysis, and adapt the sequence of operations to other single‑cell projects.

The guide begins with data import and structural inspection, then computes standard QC metrics (n_genes_by_counts, total_counts, and the share of counts from mitochondrial and ribosomal genes) and visualizes them with violin and scatter plots to assess data quality. The preprocessing steps are concrete: filtering with filter_cells(min_genes=200) and filter_genes(min_cells=3), flagging mitochondrial (MT‑) and ribosomal (RPS/RPL) genes by name prefix, and predicting doublets with Scrublet. The notebook stores the raw counts before normalization, log‑transformation and selection of highly variable genes for downstream analysis.

For dimensionality reduction and clustering, the workflow runs PCA followed by UMAP and t‑SNE, then clusters cells using the Leiden algorithm and examines marker genes to support cell‑type annotation with canonical PBMC markers. It also explores trajectory structure using PAGA and diffusion pseudotime and computes biological signal scores, including cell‑cycle phase and a custom interferon‑response metric, to aid interpretation of cluster identity and lineage relationships.

The notebook installs the open‑source tools needed to reproduce the analysis (Scanpy plus dependencies such as leidenalg, python‑igraph and Scrublet) and saves the fully processed AnnData object with documented parameter choices. By preserving raw counts and recording key settings, the tutorial provides a reusable, transparent starting point for researchers working on PBMC datasets or adapting these steps to other single‑cell RNA‑seq studies.

Sources

  1. MarkTechPost AI · 5/8/2026