PAN-CANCER LANDSCAPE OF NON-CODING MUTATIONS USING DEEP MUTATIONAL SCANNING AND GRAPH-BASED GENOMIC FEATURE LEARNING

Muhammad Inam Farooq; Shazia Khalid

doi:10.66380/chre.01.22

Authors

Muhammad Inam Farooq, Gomal Medical College, MTI, Dera Ismail Khan 29050, Khyber Pakhtunkhwa, Pakistan
Shazia Khalid, Allama Iqbal Medical College, Lahore, Pakistan

Keywords:

Non-Coding Mutations, Graph Neural Networks, Deep Mutational Scanning, Cancer Biomarkers, Precision Oncology, Regulatory Genomics

Abstract

The non-coding regions of the human genome remain a largely untapped source of oncogenic insight, particularly with respect to cancer pathogenesis and therapy resistance. This study introduces an integrative framework that combines deep mutational scanning in patient-derived organoids with graph-based genomic feature learning to characterize the functional landscape of non-coding mutations across multiple cancer types. By integrating high-throughput sequencing data (RNA-seq, ATAC-seq, ChIP-seq) and applying Graph Attention Networks trained on multi-modal datasets from TCGA, ICGC, and COSMIC, we systematically identified regulatory hotspots and functionally impactful non-coding variants. Experimental validation linked specific enhancer and promoter mutations to significant dysregulation of gene expression and to altered survival outcomes. Nine subgroup analyses (Tables 1–9) revealed diverse mutation burdens and expression profiles, and twelve visualizations (Figures 1–12) showed strong associations between mutation effects, pathway scores, and clinical phenotypes. Model explainability was enhanced through SHAP analysis, which pinpointed the key contributors to predicted variant pathogenicity. This multi-step workflow not only provides a biologically meaningful interpretation of non-coding variants but also proposes novel biomarker candidates with therapeutic relevance. By bridging functional genomics and AI, the study offers a scalable methodology for advancing personalized oncology and expanding current understanding of the regulatory genome’s role in cancer.
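
The abstract names Graph Attention Networks but does not describe the architecture, so the following is only a minimal sketch of a generic single-head graph attention layer, not the authors' model. The node names, feature vectors, weight matrix `W`, and attention vector `a` are all hypothetical values chosen for illustration. Each node scores itself and its neighbours, normalizes the scores with a softmax, and takes the weighted sum of the transformed neighbour features.

```python
import math


def leaky_relu(x, slope=0.2):
    """LeakyReLU, commonly used for graph attention scores."""
    return x if x > 0 else slope * x


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def gat_layer(features, neighbors, W, a):
    """Generic single-head graph attention layer (illustrative only).

    features:  dict mapping node -> input feature vector
    neighbors: dict mapping node -> list of adjacent nodes
    W:         weight matrix, shape (out_dim x in_dim)
    a:         attention vector of length 2 * out_dim
    Returns (new_features, attention), both dicts keyed by node.
    """
    # Shared linear transform applied to every node.
    z = {n: matvec(W, h) for n, h in features.items()}
    out, attn = {}, {}
    for i in features:
        # Attend over the node itself (self-loop) plus its neighbours.
        nbrs = [i] + [j for j in neighbors.get(i, []) if j != i]
        # Unnormalized score: LeakyReLU(a . [z_i || z_j]).
        scores = [
            leaky_relu(sum(w * x for w, x in zip(a, z[i] + z[j])))
            for j in nbrs
        ]
        # Numerically stable softmax over the neighbourhood.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        attn[i] = dict(zip(nbrs, alphas))
        # New embedding: attention-weighted sum of transformed features.
        out[i] = [
            sum(al * z[j][k] for al, j in zip(alphas, nbrs))
            for k in range(len(z[i]))
        ]
    return out, attn


# Hypothetical toy graph: a variant in an enhancer linked to a promoter,
# which in turn is linked to a gene. All numbers are made up.
toy_features = {
    "enhancer_variant": [1.0, 0.0],
    "promoter": [0.0, 1.0],
    "gene": [1.0, 1.0],
}
toy_neighbors = {
    "enhancer_variant": ["promoter"],
    "promoter": ["enhancer_variant", "gene"],
    "gene": ["promoter"],
}
toy_W = [[1.0, 0.0], [0.0, 1.0]]
toy_a = [0.5, -0.5, 1.0, 0.25]

embeddings, attention = gat_layer(toy_features, toy_neighbors, toy_W, toy_a)
for node, weights in attention.items():
    print(node, weights)
```

The printed attention weights for each node sum to 1 and indicate how strongly each neighbour contributes to that node's updated embedding. In a trained model, `W` and `a` would be learned rather than fixed by hand.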