Detection of isoforms and genomic alterations by high-throughput full-length single-cell RNA sequencing for personalized oncology

Arthur Dondi, Ulrike Lischetti, Francis Jacob, Franziska Singer, Nico Borgsmüller, Tumor Profiler Consortium, Viola Heinzelmann-Schwarz, Christian Beisel, Niko Beerenwinkel

Abstract

Understanding the complex background of cancer requires genotype-phenotype information at single-cell resolution. Long-read single-cell RNA sequencing (scRNA-seq) captures full-length transcripts but has so far lacked the depth to provide this information. Here, we increased the PacBio sequencing depth to 12,000 reads per cell by leveraging multiple strategies, including artifact removal and transcript concatenation, and applied the technology to samples from three human ovarian cancer patients. Our approach captured 152,000 isoforms, of which over 52,000 were novel, detected cell type- and cell-specific isoform usage, and revealed differential isoform expression in tumor and mesothelial cells. Furthermore, we identified gene fusions, including a novel IGF2BP2::TESPA1 fusion that was validated by scDNA sequencing and had been misclassified as high TESPA1 expression in matched short-read data. We also called somatic and germline mutations, confirming targeted NGS cancer gene panel results. With multiple new opportunities, especially for cancer biology, we envision long-read scRNA-seq becoming increasingly relevant in oncology and personalized medicine.

Open access data

SAM file for patient 2: scDNA reads mapped to a custom reference covering the IGF2BP2::TESPA1 fusion breakpoint sequence, as well as the wild-type (wt) TESPA1 and IGF2BP2 sequences surrounding the breakpoint [Download]
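
As a minimal sketch of how a file like this might be inspected, the snippet below counts primary mapped reads per reference sequence in SAM-formatted text, using only the Python standard library. The reference names (`fusion_breakpoint`, `wt_TESPA1`, `wt_IGF2BP2`) and the toy records are hypothetical placeholders, not the names or data in the deposited file, and this is not the authors' validation procedure.

```python
from collections import Counter


def count_reads_per_reference(sam_lines):
    """Count primary mapped reads per reference name (RNAME, SAM column 3)."""
    counts = Counter()
    for line in sam_lines:
        # Skip blank lines and header lines, which start with "@".
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        # A SAM alignment record has at least 11 mandatory fields.
        if len(fields) < 11:
            continue
        flag = int(fields[1])
        rname = fields[2]
        # 0x4 marks an unmapped read; "*" means no reference assigned.
        if flag & 0x4 or rname == "*":
            continue
        # Skip secondary (0x100) and supplementary (0x800) alignments
        # so that each read is counted at most once.
        if flag & 0x900:
            continue
        counts[rname] += 1
    return counts


# Toy records with hypothetical reference names (assumption, for illustration).
example_sam = [
    "@SQ\tSN:fusion_breakpoint\tLN:200",
    "r1\t0\tfusion_breakpoint\t10\t60\t5M\t*\t0\t0\tACGTA\tIIIII",
    "r2\t16\twt_TESPA1\t20\t60\t5M\t*\t0\t0\tACGTA\tIIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGTA\tIIIII",
    "r4\t0\tfusion_breakpoint\t12\t60\t5M\t*\t0\t0\tACGTA\tIIIII",
    "r5\t256\twt_IGF2BP2\t30\t0\t5M\t*\t0\t0\tACGTA\tIIIII",
]

counts = count_reads_per_reference(example_sam)
for name, n in sorted(counts.items()):
    print(name, n)
```

For a real file, the same function can be given an open file handle, for example `count_reads_per_reference(open(path))`, where `path` points to the downloaded SAM file. Comparing the count on the fusion reference with the counts on the two wild-type references gives a rough first look at read support for the breakpoint.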