Enhancing Cancer Genomics: A Pipeline for Spatial Transcriptomics Analysis on the CGC

Miona Ranković, Nevena Vukojičić*, Nevena Ilić Raičević, Vida Matović and Ana Mijalković Lazić

Velsera, Belgrade, Serbia

nevena.vukojicic [at] velsera.com

Abstract

The field of spatial transcriptomics has grown significantly in recent years. This hybrid approach, which draws on in situ hybridization and next-generation sequencing, particularly single-cell RNA sequencing (scRNA-seq), enables whole-transcriptome profiling while preserving spatial context at high resolution, offering new insights into cancer biology.

We present a highly configurable pipeline for comprehensive analysis of data from sequencing-based spatial transcriptomics technologies. The pipeline is available on the Cancer Genomics Cloud (CGC), an NCI-funded platform by Seven Bridges that provides a collaborative cloud infrastructure. The CGC integrates computation, over 1000 bioinformatics workflows, and 4+ PB of data, making Cancer Research Data Commons (CRDC) datasets accessible from any environment.

Built on widely adopted packages, the pipeline processes datasets from two leading technologies, 10x and Slide-seq. It includes quality control, data preprocessing, dimensionality reduction, cluster identification, detection of spatially variable features, and integration with scRNA-seq references. Settings for each step can be tuned to improve results, and individual components can be run selectively. Key steps produce visualizations that support closer inspection of the results.
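
As an illustration of what selectively executed, configurable steps can look like, the following is a minimal sketch in plain Python. It is not the pipeline's implementation: the names PipelineConfig, qc, normalize, and run, the threshold min_counts, and the scale factor are all hypothetical and chosen only for this example. It covers just two of the listed steps (quality control and normalization) on a toy dictionary of per-spot gene counts.

```python
import math
from dataclasses import dataclass


@dataclass
class PipelineConfig:
    # All fields are hypothetical settings used only for this sketch.
    min_counts: int = 10            # QC: minimum total counts per spot
    scale_factor: float = 10_000.0  # normalization target per spot
    steps: tuple = ("qc", "normalize")  # which steps to run, in order


def qc(spots, cfg):
    """Keep only spots whose total counts reach the QC threshold."""
    return {s: g for s, g in spots.items() if sum(g.values()) >= cfg.min_counts}


def normalize(spots, cfg):
    """Scale each spot to a common library size, then apply log1p."""
    out = {}
    for spot, genes in spots.items():
        total = sum(genes.values())
        if total == 0:
            # Avoid division by zero for empty spots (possible if QC is skipped).
            out[spot] = {g: 0.0 for g in genes}
        else:
            out[spot] = {
                g: math.log1p(c / total * cfg.scale_factor) for g, c in genes.items()
            }
    return out


# Registry mapping step names to functions, so steps can be toggled by name.
STEPS = {"qc": qc, "normalize": normalize}


def run(spots, cfg):
    """Run the configured steps in order on a {spot: {gene: count}} mapping."""
    for name in cfg.steps:
        spots = STEPS[name](spots, cfg)
    return spots


toy = {"spot_a": {"g1": 5, "g2": 15}, "spot_b": {"g1": 1, "g2": 2}}
filtered = run(toy, PipelineConfig())                    # QC, then normalize
unfiltered = run(toy, PipelineConfig(steps=("normalize",)))  # skip QC
```

With the default configuration the low-count spot is removed before normalization; passing steps=("normalize",) skips the filter and keeps both spots, which mirrors the idea of running only selected components.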

Here, we demonstrate a spatial transcriptomics analysis workflow on publicly available datasets using this pipeline and show how different settings affect the analysis outcomes. We identify spatially variable genes with distinct tissue localization and integrate the spatial data with scRNA-seq references to predict cell type composition within spatial domains.
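
The abstract does not state which statistic the pipeline uses to detect spatially variable genes. One commonly used measure of spatial autocorrelation is Moran's I, and the sketch below assumes it purely for illustration. The function name morans_i, the binary distance-based neighbor weights, and the radius parameter are assumptions of this example, not details of the pipeline.

```python
def morans_i(values, coords, radius=1.0):
    """Moran's I for one gene, using binary weights: two spots are
    neighbors if their Euclidean distance is at most `radius`."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    if denom == 0:
        return 0.0  # constant expression: no spatial pattern to measure

    num = 0.0    # sum over neighbor pairs of dev[i] * dev[j]
    w_sum = 0.0  # total weight (number of directed neighbor pairs)
    r2 = radius * radius
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = coords[i][0] - coords[j][0]
            dy = coords[i][1] - coords[j][1]
            if dx * dx + dy * dy <= r2:
                num += dev[i] * dev[j]
                w_sum += 1.0
    if w_sum == 0:
        return 0.0  # no neighbors within the radius
    return (n / w_sum) * (num / denom)


# Toy 4 x 4 grid of spots with unit spacing.
grid = [(x, y) for y in range(4) for x in range(4)]
# Gene expressed only in the left half of the tissue (spatially structured).
left_half = [1.0 if x < 2 else 0.0 for x, y in grid]
# Gene alternating between adjacent spots (checkerboard pattern).
checker = [float((x + y) % 2) for x, y in grid]

i_left = morans_i(left_half, grid)
i_checker = morans_i(checker, grid)
```

Values well above zero indicate that neighboring spots have similar expression, as for the gene restricted to one half of the grid, while negative values indicate that neighbors tend to differ, as in the checkerboard case. Ranking genes by such a score is one way to shortlist candidates with distinct tissue localization.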

Spatial transcriptomics analysis advances cancer research by helping to characterize tumor microenvironments, discover novel biomarkers, and clarify drug resistance mechanisms. We expect this CGC-hosted workflow to support a better understanding of complex spatial relationships within tissues.

Keywords: spatial transcriptomics, single-cell omics, cloud computing