Normalize and compute highly variable genes

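In this notebook, we will load doublets precomputed by solo, normalize and scale the data, add cell-cycle scores, remove doublets, and compute highly variable genes.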

Input data

Load doublets precomputed by solo

We don't run solo as part of the pipeline because its results are not reproducible across different systems. Instead, we load precomputed results from the repository.

How solo was originally run is described in main.nf.
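As a rough illustration of this loading step, the sketch below aligns a table of per-cell doublet calls to the cells of a dataset. The column names (cell_id, is_doublet), the inline CSV, and the barcodes are hypothetical stand-ins; the actual file layout in the repository may differ.

```python
import io

import pandas as pd

# Hypothetical stand-in for a precomputed doublet table shipped with the repository.
DOUBLET_CSV = io.StringIO(
    "cell_id,is_doublet\n"
    "AAAC-1,False\n"
    "AAAG-1,True\n"
    "AACT-1,False\n"
)

# Hypothetical cell barcodes of the dataset (in practice: adata.obs_names).
obs_names = pd.Index(["AAAG-1", "AAAC-1", "AACT-1", "AAGG-1"], name="cell_id")

doublets = pd.read_csv(DOUBLET_CSV, index_col="cell_id")

# Count cells that have no precomputed call, so that gaps do not go unnoticed.
n_missing = int((~obs_names.isin(doublets.index)).sum())

# Reorder the calls to match the dataset. Cells without a call are treated as
# singlets here (an assumption; flagging or dropping them is equally defensible).
is_doublet = doublets["is_doublet"].reindex(obs_names, fill_value=False)
```

The resulting boolean series has the same order as the cells, so it could be stored as a column of adata.obs and used later as a filter mask.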

Normalize and scale

The raw data object will contain normalized, log-transformed values for visualization. The original, raw (UMI) counts are stored in adata.obsm["raw_counts"].

We use the straightforward normalization by library size as implemented in scanpy.
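To make the computation concrete, here is a minimal NumPy sketch of library-size normalization followed by a log transform. In scanpy, this roughly corresponds to calling sc.pp.normalize_total and then sc.pp.log1p. The toy matrix and the target_sum of 10,000 are assumptions for illustration; the notebook does not state which target it uses.

```python
import numpy as np

# Toy UMI count matrix: 3 cells (rows) x 4 genes (columns).
counts = np.array(
    [
        [1, 0, 3, 6],
        [2, 2, 0, 16],
        [0, 5, 5, 0],
    ],
    dtype=float,
)


def normalize_library_size(counts, target_sum=None):
    """Scale each cell so its counts sum to target_sum, then apply log(1 + x).

    Assumes every cell has at least one count (no empty rows).
    """
    library_size = counts.sum(axis=1, keepdims=True)
    if target_sum is None:
        # Fall back to the median library size across cells.
        target_sum = np.median(library_size)
    scaled = counts / library_size * target_sum
    return np.log1p(scaled)


log_norm = normalize_library_size(counts, target_sum=1e4)
```

After this step, every cell has the same total on the un-logged scale, so differences between cells reflect relative expression rather than sequencing depth.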

Add cell-cycle scores

Remove doublets

Compute highly variable genes
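The text does not say which selection method is used here. scanpy offers several via sc.pp.highly_variable_genes. As a sketch of the general idea only, the example below ranks genes by their variance-to-mean ratio (dispersion), which is a simplified, unbinned variant of dispersion-based selection and not necessarily what this notebook does. The function name and toy data are hypothetical.

```python
import numpy as np


def top_dispersion_genes(log_norm, n_top):
    """Return indices of the n_top genes with the highest variance-to-mean ratio."""
    # Undo log1p so that mean and variance are computed on the normalized count scale.
    expr = np.expm1(log_norm)
    mean = expr.mean(axis=0)
    var = expr.var(axis=0, ddof=1)
    # Genes that are never expressed get dispersion 0 instead of a division by zero.
    dispersion = np.divide(var, mean, out=np.zeros_like(var), where=mean > 0)
    # argsort is ascending: take the last n_top entries and reverse them.
    return np.argsort(dispersion)[-n_top:][::-1]


# Toy normalized expression: 4 cells (rows) x 3 genes (columns).
expr = np.array(
    [
        [1.0, 0.0, 1.0],
        [1.0, 0.0, 2.0],
        [1.0, 0.0, 3.0],
        [1.0, 8.0, 2.0],
    ]
)
hvg = top_dispersion_genes(np.log1p(expr), n_top=2)
```

In this toy example the constant gene (column 0) is never selected, while the gene expressed in a single cell (column 1) ranks first.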