Peer-Reviewed Abstracts & Posters

 

Multimodal Learning on Pan-Squamous Cell Carcinomas for Improved Survival Outcomes

Asim Waqas*, Aakash Tripathi, Ashwin Mukund, Paul Stewart, Mia Naeini, Ghulam Rasool

(Accepted at National Comprehensive Cancer Network, April 5-7, 2024, Orlando, FL)

 

Cancer clinics capture disease information at varying spatial (genetic to molecular to tissue to organ and beyond) and temporal (disease progression) scales. Existing bioinformatic approaches do not encapsulate the full spectrum of this heterogeneous data, especially under the challenge of missing data modalities. We propose a Graph Neural Network (GNN)-based hierarchical relational model that can efficiently learn from multimodal, heterogeneous datasets for improved prediction of clinical outcomes. Our framework (1) generates embeddings from multi-resolution datasets, (2) learns and refines relational embeddings at various stages of the data hierarchy using GNNs, (3) fuses the learned embeddings into a unified graph representation, and (4) provides improved performance metrics for downstream tasks such as survival analysis. Our solution fuses unobserved but interrelated cancer variables in non-Euclidean space through a scalable framework. We train GNNs on the survival prediction task using pan-Squamous Cell Carcinomas (SCC) in the head and neck (HNSC), bladder (BLCA), lung (LUSC), cervical (CESC), and esophageal (ESCA) cancers, with 476, 399, 355, 291, and 154 patients, respectively. We fine-tune and validate our approach on lung SCC data collected at Moffitt Cancer Center, comprising 103 patients. The data modalities for training and evaluation include EHR data (age at diagnosis, gender, ethnicity, race, smoking status, etc.), pathology images, and molecular data (gene expression, miRNA expression, and DNA methylation). Model evaluation used the concordance index (C-index) of patient survival prediction with 10-fold cross-validation. We compared the multimodal GNN with unimodal and multimodal baselines, such as Transformers, and observed improved predictions over these methods. We observe that the convergence of individual data modalities and integration across varying scales creates a unified view of the disease that is more insightful than any individual view or modality. Our solution aims to converge the entire spectrum of the disease and understand the patient’s genetic, physiological, and psychosocial circumstances in a unified framework. The proposed method can help the community by providing useful insights on heterogeneous data integration and showing that converging the maximum number of data views across varying scales and sources can yield remarkable discoveries about the disease.
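For illustration, the sketch below shows one way the patient-graph stage could be realized in PyTorch Geometric: fused patient embeddings become graph nodes, similarity edges connect patients, and a small GNN predicts a survival risk score per node. The class name, dimensions, and edges are placeholder assumptions, not the implementation reported above.

import torch
import torch.nn as nn
from torch_geometric.data import Data
from torch_geometric.nn import GCNConv

class PatientGraphSurvivalGNN(nn.Module):
    def __init__(self, in_dim=256, hidden_dim=128):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, hidden_dim)
        self.risk_head = nn.Linear(hidden_dim, 1)  # one survival risk score per patient node

    def forward(self, x, edge_index):
        h = torch.relu(self.conv1(x, edge_index))
        h = torch.relu(self.conv2(h, edge_index))
        return self.risk_head(h).squeeze(-1)

# Each row of x is a fused patient embedding (e.g., projected EHR, pathology,
# and molecular embeddings); edge_index holds patient-patient similarity edges.
x = torch.randn(476, 256)                       # e.g., the HNSC cohort size
edge_index = torch.randint(0, 476, (2, 2000))   # placeholder edges for illustration
graph = Data(x=x, edge_index=edge_index)

model = PatientGraphSurvivalGNN()
risk_scores = model(graph.x, graph.edge_index)  # higher score = higher predicted risk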

 


 

SeNMo: A Self-normalizing Deep Learning Model for Enhanced Multi-Omics Data Analysis in Oncology

Asim Waqas*, Aakash Tripathi, Sabeen Ahmed, Ashwin Mukund, Paul Stewart, Mia Naeini, Hamza Farooq, Ghulam Rasool

(Accepted at American Association for Cancer Research, April 5-10, 2024, San Diego, CA.)

 

Multi-omics research has enhanced our understanding of cancer heterogeneity and progression. Investigating molecular data through multi-omics approaches is crucial for unraveling the complex biological mechanisms underlying cancer, thereby enabling more effective diagnosis, treatment, and prevention strategies. However, predicting patient outcomes by integrating all available multi-omics data remains an understudied research direction. Here, we present SeNMo (Self-normalizing Network for Multi-omics), a deep neural network that maintains zero mean and unit variance of activations across network layers using the self-normalizing technique. Such normalization is critical for stable and robust training of deep learning models. SeNMo is particularly efficient at handling multi-omics data characterized by many features and comparatively few samples (high width, low length). We trained SeNMo to predict overall survival using pan-cancer multi-omics data covering 28 cancer sites from the Genomic Data Commons (GDC). The training data includes gene expression, DNA methylation, miRNA expression, and protein expression modalities. We tested the model's performance on Moffitt Cancer Center's internal data comprising RNA expression and protein expression. We evaluated the model’s ability to predict patients’ overall survival using the concordance index (C-Index), which provides a robust measure of predictive capability. SeNMo performed consistently well during training, reflected by a validation C-Index ≥ 0.6 on GDC's public data. In testing on Moffitt's private data, SeNMo achieved a C-Index of 0.68. Performance improved further, to a C-Index of 0.7, when the model was tested on low-dimensional or single-omic data such as RNA or protein expression. SeNMo proved to be a mini-foundation model for multi-omics oncology data, demonstrating robust performance, adaptability across molecular data types, and universal-approximator capabilities at the scale of molecular data it was trained on. SeNMo can be further scaled to any cancer site and molecular data type, and it can be fine-tuned for other downstream tasks such as treatment response prediction, risk stratification, and patient subgroup identification. Its ability to accurately predict patient outcomes and adapt to various downstream tasks points toward a new era in cancer research and treatment. For future research, SeNMo offers a powerful tool for uncovering deeper insights into the complex nature of cancer and sets a precedent for how artificial intelligence can be leveraged to handle the vast and intricate data in the biomedical field. We believe SeNMo and similar models are poised to transform the oncology landscape, offering hope for more effective, efficient, and patient-centric cancer care.
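As a rough sketch of the self-normalizing technique referenced above, the network below combines SELU activations, AlphaDropout, and LeCun-normal initialization, the standard recipe for keeping activations near zero mean and unit variance. Layer sizes, dropout rate, and the scalar risk head are illustrative assumptions, not SeNMo's released configuration.

import torch
import torch.nn as nn

class SelfNormalizingSurvivalNet(nn.Module):
    def __init__(self, in_features, hidden_sizes=(1024, 512, 256), dropout=0.1):
        super().__init__()
        layers, prev = [], in_features
        for h in hidden_sizes:
            layers += [nn.Linear(prev, h), nn.SELU(), nn.AlphaDropout(dropout)]
            prev = h
        self.backbone = nn.Sequential(*layers)
        self.risk_head = nn.Linear(prev, 1)  # scalar risk score for survival prediction
        self.reset_parameters()

    def reset_parameters(self):
        # LeCun-normal initialization, required for the self-normalizing property
        for m in self.modules():
            if isinstance(m, nn.Linear):
                nn.init.kaiming_normal_(m.weight, nonlinearity='linear')
                nn.init.zeros_(m.bias)

    def forward(self, x):
        return self.risk_head(self.backbone(x)).squeeze(-1)

# x: one row per patient, columns = concatenated multi-omics features (size is a placeholder)
model = SelfNormalizingSurvivalNet(in_features=20000)
risk = model(torch.randn(8, 20000))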

 


 

Pan-cancer Learning for Survival Prediction

Asim Waqas*, Aakash Tripathi, Ashwin Mukund, Paul Stewart, Mia Naeini, Ghulam Rasool

(Presented at USF AI+X Symposium, September 29, 2023)

 

1. Motivation: Disease-related information resides across varying scales, modalities, and resolutions of cancer data. Cancer clinics capture disease information at varying spatial (genetic to molecular to tissue to organ and beyond) and temporal (disease progression) scales. The resulting data includes (1) -omics information from the genome, proteome, transcriptome, epigenome, and microbiome; (2) radiological images from CT, PET, MRI, or ultrasound scanners; (3) digitized histopathology, immunohistochemistry, and immunofluorescence slides created from tumor tissue samples and stored as gigapixel whole slide images (WSI); and (4) electronic health records (EHR) that house structured data (demographics, sex, race, etc.) and unstructured data such as visit notes and radiology/pathology/surgery reports. Clinicians routinely fuse such diverse information mentally for decision-making and develop informed treatment plans. However, such analyses are prone to overlooking variations in the data that are too subtle to be caught by manual human processing.


2. Challenges and Opportunities: With more collection sensors and the associated hardware and software, the amount and diversity of data are increasing exponentially. The challenge is to consume and coherently learn from datasets of varying scales and types. Tumor-related spatial information may be lost as the observation magnification changes. Existing technologies enable us to record the individual states of disease development, from a normal cell to a pre-malignant lesion and, ultimately, a malignant tumor. Jointly learning from such multimodal, multiscale, heterogeneous information is challenging but crucial for understanding and tackling complex diseases like cancer. Traditional bioinformatics approaches and artificial intelligence (AI) and machine learning (ML) models struggle when ingesting such heterogeneous, multiscale information for critical decision-making. Analysis of diverse, heterogeneous, and incoherent multi-view, multi-modality data demands innovation in AI/ML techniques well beyond traditional bioinformatics or statistical analytic tools. ML methods that achieve better predictive power through multimodal data integration offer potential clues to the cancer data integration problem. Integrating these heterogeneous, multi-resolution views of cancer data may provide a synergistic effect that is greater than the sum of the knowledge of the individual parts.


3. Hypothesis: We hypothesize that graph neural networks (GNNs) can efficiently learn from heterogeneous multimodal datasets and perform better than the baseline unimodal models in clinical outcome prediction tasks (e.g., patient survival or cancer recurrence). The feasibility of the proposed work is based on the fact that GNNs can inherently handle data heterogeneity and learn patterns from datasets with missing samples.


4. Proposed Solution: We propose a new class of AI/ML frameworks with the inherent capability of synergistic, hierarchical learning from multi-resolution, heterogeneous oncology data for improved prediction of cancer patient survival. Specifically, we propose a GNN-based hierarchical relational model that can efficiently learn from multimodal, heterogeneous datasets for improved prediction of clinical outcomes. Our framework (1) generates patient embeddings from heterogeneous, multi-resolution datasets, (2) learns and refines relational embeddings at various stages of the data hierarchy using GNNs, (3) fuses the learned embeddings into a unified graph representation, and (4) provides improved performance metrics for downstream tasks such as survival analysis and recurrence prediction. Our solution fuses unobserved but interrelated cancer variables in non-Euclidean space through a scalable framework that builds on pre-trained AI/ML models, hierarchical GNNs, and supervised learning. The unimodal embeddings from pre-trained, state-of-the-art AI/ML models provide learned features within the context of each modality. The hierarchical component allows multiscale data integration, from genetic to molecular to tissue to organ and beyond. The fused embeddings enable multimodal learning that combines information from various modalities to improve accuracy and reliability. (A minimal sketch of this hierarchical fusion follows this item.)
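The following sketch illustrates the hierarchical idea under stated assumptions: one small GNN per modality-graph produces a pooled modality embedding, and the pooled embeddings are averaged into a single patient-level representation that feeds a prediction head. Module names, dimensions, and the simple mean fusion are illustrative choices, not the exact framework described above.

import torch
import torch.nn as nn
from torch_geometric.nn import GraphConv, global_mean_pool

class ModalityEncoder(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.conv = GraphConv(in_dim, out_dim)

    def forward(self, x, edge_index, batch):
        h = torch.relu(self.conv(x, edge_index))
        return global_mean_pool(h, batch)   # one pooled embedding per patient per modality

class HierarchicalFusion(nn.Module):
    def __init__(self, modality_dims, fused_dim=128):
        super().__init__()
        self.encoders = nn.ModuleList(
            [ModalityEncoder(d, fused_dim) for d in modality_dims])
        self.head = nn.Linear(fused_dim, 1)  # e.g., survival risk

    def forward(self, modality_graphs):
        # modality_graphs: one (x, edge_index, batch) triple per modality, where
        # batch maps each node to its patient so pooling yields one vector per patient
        pooled = [enc(x, ei, b) for enc, (x, ei, b) in zip(self.encoders, modality_graphs)]
        fused = torch.stack(pooled, dim=0).mean(dim=0)   # simple mean fusion across modalities
        return self.head(fused).squeeze(-1)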


5. Results: We have developed a Hierarchical Graph Neural Network (Hi-GNN) framework that can efficiently learn from multimodal, multiscale, heterogeneous datasets and predict clinical outcomes accurately. We undertook the task of predicting overall survival (OS) and progression-free survival (PFS) for pan-Squamous Cell Carcinomas in the lung (TCGA-LUSC), head and neck (TCGA-HNSC), and cervical (TCGA-CESC) cancer patients. TCGA-LUSC has 489 patients, TCGA-HNSC has 522, and TCGA-CESC has 256. We also used lung Squamous Cell Carcinoma data collected at Moffitt Cancer Center, comprising 103 patients. The modalities for these tasks include EHR data (age at diagnosis, gender, ethnicity, race, smoking status, etc.), pathology images, and -omics data (DNA copy number and mutations, mRNA and miRNA expression, and protein expression). To generate individual modality embeddings, we used the following pre-trained ML models: (1) Robust and Efficient MEDical Imaging with Self-supervision (REMEDIS) for histopathology images, (2) GatorTron for EHR data, and (3) Self-Normalizing Networks (SNNs) for molecular -omics data. Our criterion for evaluating the representativeness of these modality-specific models was their predictive performance against ground-truth OS and PFS data. Our pre-training experiments showed that REMEDIS and GatorTron worked well without fine-tuning; however, SNNs always required transfer learning with larger datasets, potentially linked to the complexity and variability of the -omics data. We used the same embedding size for all modalities. The resulting embeddings were then used to generate modality-graphs and patient-graphs, and different GNN models were used to learn the intra-modality (modality-graph) and inter-modality (patient-graph) features hierarchically. Finally, we created unified patient-graphs from the learned hierarchical features and used another GNN model for the node-prediction task. Model evaluation used the concordance index (C-index) of patient survival prediction with 10-fold cross-validation. We compared the multimodal GNN results with multiple multimodal and unimodal MLPs and Transformers. We observe that the convergence of individual data modalities and integration across varying scales using GNNs creates a unified view of the disease that is more prognostic, predictive, and insightful than any individual view or modality. Our results show improved predictions for the survival task compared to the unimodal and multimodal baselines. Our solution aims to converge the entire disease spectrum and capture the patient’s genetic, physiological, and psychosocial circumstances in a unified framework. The proposed method can help the community by providing useful insights on heterogeneous data integration and showing that converging the maximum number of data views can yield discoveries about the disease that are not apparent from any individual view.
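A minimal sketch of the evaluation protocol described above, assuming NumPy arrays, scikit-learn, and lifelines are available: 10-fold cross-validation scored with the concordance index, where fit_model is a hypothetical stand-in for training the fold-specific Hi-GNN.

import numpy as np
from sklearn.model_selection import KFold
from lifelines.utils import concordance_index

def cross_validated_cindex(features, surv_time, event, fit_model, n_splits=10):
    # features: (n_patients, dim) fused embeddings; surv_time, event: survival labels
    scores = []
    for train_idx, test_idx in KFold(n_splits=n_splits, shuffle=True, random_state=0).split(features):
        model = fit_model(features[train_idx], surv_time[train_idx], event[train_idx])
        risk = model.predict(features[test_idx])   # higher risk -> shorter predicted survival
        # concordance_index expects scores concordant with survival time, so negate the risk
        scores.append(concordance_index(surv_time[test_idx], -risk, event[test_idx]))
    return float(np.mean(scores)), float(np.std(scores))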


 

Integrative Relational Learning on Multimodal Cancer Data for Improved Clinical Predictions

Asim Waqas*, Paul Stewart, Hamza Farooq, Ghulam Rasool

(Presented at the Mid-South Computational Biology and Bioinformatics Society (MCBIOS), University of Dallas, March 15-17, 2023.)

 

The disease-related information resides across varying scales, modalities, and resolutions of cancer data, for example, radiology, pathology, genomics, proteomics, electronic health records (EHR), etc. Bioinformatic mechanisms need to encapsulate the complete spectrum of this heterogeneous data, especially under the challenge of missing data. We propose a GNN-based hierarchical relational model that can efficiently learn from multimodal, heterogeneous datasets for improved prediction of clinical outcomes. Our framework (1) generates graphs from heterogeneous, multi-resolution datasets, (2) learns and refines relational embeddings at various stages of the data hierarchy using GNNs, (3) fuses the learned embeddings into a unified graph representation, and (4) provides improved performance metrics for downstream tasks such as survival analysis, recurrence prediction, and distant metastasis occurrence. Our solution fuses unobserved but interrelated cancer variables in non-Euclidean space through a scalable framework. We train GNNs on the survival prediction task using two datasets: Squamous Cell Lung Cancer and the merged Glioblastoma and Low-grade Glioma dataset. We fine-tune and validate our approach on the distant metastasis prediction task using Head and Neck Cancer data collected at Moffitt Cancer Center. The data modalities for these tasks include histopathology, genomics, proteomics, radiology, and EHRs. We observe that the convergence of individual data modalities and integration across varying scales creates a unified view of the disease that is more prognostic, predictive, and insightful than any individual view or modality. Our results show improved predictions for each task compared to existing methods. Our solution aims to converge the entire spectrum of the disease and understand the patient’s genetic, physiological, and psychosocial circumstances in a unified framework. The proposed method can help the community by providing useful insights on heterogeneous data integration and showing that converging the maximum number of data views can yield discoveries about the disease that are not apparent from any individual view.
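As a concrete, hypothetical illustration of step (1), the helper below converts a matrix of patient embeddings into a k-nearest-neighbor graph in the edge-index format consumed by common GNN libraries; the choice of k and the connectivity criterion are assumptions for illustration, not details from the abstract.

import numpy as np
import torch
from sklearn.neighbors import kneighbors_graph

def embeddings_to_edge_index(embeddings, k=10):
    # embeddings: (num_patients, dim) array of unimodal or fused patient embeddings
    adj = kneighbors_graph(embeddings, n_neighbors=k, mode='connectivity', include_self=False)
    rows, cols = adj.nonzero()                 # sparse adjacency -> edge lists
    return torch.tensor(np.vstack([rows, cols]), dtype=torch.long)  # shape (2, num_edges)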

 


 

Unifying Multimodal Data, Time Series Analytics, and Contextual Medical Memory: Introducing MINDS as an Oncology-Centric Cloud-Based Platform

Aakash Tripathi*, Asim Waqas, Ghulam Rasool

(Accepted at National Comprehensive Cancer Network, April 5-7, 2024, Orlando, FL)

 

Introduction: The exponential growth of heterogeneous data types and sources in oncology necessitates an integrated approach for predictive modeling and advanced analytics. The Multimodal Integration of Oncology Data System (MINDS) platform is designed to tackle three primary challenges: 1) Creating benchmark datasets from multimodal data for developing and evaluating machine learning models, 2) Handling large-scale multivariate time series medical data for building temporal predictive models, and 3) Serving as a contextual medical memory to generate embeddings for enhancing foundational and downstream natural language processing models in the clinical domain.


Methods: MINDS implements a cloud-based architecture using Amazon Web Services (AWS). The platform acquires and processes multimodal oncology data, including clinical notes, imaging studies, omics assays, diagnostic reports, and treatment timelines, from sources such as TCGA, TCIA, and institutional EHRs. Specialized extract, transform, load (ETL) processes integrate and transform the heterogeneous data into analysis-ready formats. MINDS employs both traditional machine learning techniques, such as logistic regression, random forests, and neural networks, and large language models, including GatorTron and ClinicalT5, for diverse analytical capabilities.
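The sketch below shows the general shape such an ETL step could take on AWS, assuming boto3 and pandas; bucket names, file paths, and the transformation rules are placeholders rather than MINDS internals.

import boto3
import pandas as pd

s3 = boto3.client("s3")

def etl_clinical_table(bucket, raw_key, curated_key):
    # Extract: pull the raw export from object storage
    s3.download_file(bucket, raw_key, "/tmp/raw.csv")
    # Transform: normalize column names and parse dates (placeholder rules)
    df = pd.read_csv("/tmp/raw.csv")
    df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]
    if "date_of_diagnosis" in df.columns:
        df["date_of_diagnosis"] = pd.to_datetime(df["date_of_diagnosis"], errors="coerce")
    # Load: write an analysis-ready Parquet file back to the curated zone
    df.to_parquet("/tmp/curated.parquet", index=False)  # requires pyarrow or fastparquet
    s3.upload_file("/tmp/curated.parquet", bucket, curated_key)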


Results: MINDS is being developed to standardize and fuse structured and unstructured data modalities into a unified benchmark dataset. This will enable us to develop and evaluate tailored machine learning techniques for oncology research, such as our foundational multimodal oncology model. We are working to create curated benchmark datasets spanning over 20 cancer types to improve model generalization, with capabilities to ingest large-scale multivariate time series data including clinical events such as diagnoses, medications, procedures, and lab tests. This will allow dynamic patient trajectory modeling and real-time predictive analytics through temporal, time-variant modeling techniques. Additionally, as part of MINDS, we developed a method to use the data as contextual memory for pre-training our own foundational multimodal transformer model. Using pretrained foundational models such as GatorTron, we generate embeddings that serve as contextual knowledge for large language models, significantly improving the accuracy, robustness, and interpretability of medical NLP models, especially for tasks like named entity recognition and relation extraction from clinical notes.
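A minimal sketch of the embedding-generation step, assuming the Hugging Face Transformers API; the GatorTron checkpoint identifier below is an assumption for illustration and should be replaced with whatever checkpoint is actually deployed.

import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "UFNLP/gatortron-base"  # assumed checkpoint name, for illustration only
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME)

def embed_notes(notes, max_length=512):
    # notes: list of clinical-note strings; returns one embedding vector per note
    batch = tokenizer(notes, padding=True, truncation=True,
                      max_length=max_length, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state      # (batch, tokens, dim)
    mask = batch["attention_mask"].unsqueeze(-1)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)  # mean-pooled note embeddings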


Conclusion: As an oncology-focused platform unifying multimodal data at scale for ML benchmarks, facilitating granular time series analytics, and serving as a medical memory for language models, MINDS offers a transformative solution for accelerating precision medicine and AI-augmented decision making.

 


 

Multimodal Transformer Model Improves Survival Prediction in Lung Cancer Compared to Unimodal Approaches

Aakash Tripathi*, Asim Waqas, Yasin Yilmaz, Ghulam Rasool

(Accepted at American Association for Cancer Research, April 5-10, 2024, San Diego, CA.)

 

Integrating multimodal lung cancer data, including clinical notes, medical images, and molecular data, is critical for predictive modeling tasks like survival prediction, yet effectively aligning these disparate data types remains challenging. We present a novel method to integrate heterogeneous lung data modalities by first thoroughly analyzing various domain-specific models and selecting the optimal model for embedding extraction per data type based on performance on representative pretrained tasks. For clinical notes, the GatorTron models showed the lowest regression loss on an initial evaluation set, with the larger GatorTron-medium model achieving a loss of 12.9. After selecting the top performers, we extracted robust embeddings from the full lung dataset built using the Multimodal Integration of Oncology Data System (MINDS) framework. MINDS provides an end-to-end platform for aggregating and normalizing multimodal patient data. We aligned the multimodal embeddings to a central pre-trained language model using contrastive representation learning based on a cosine similarity loss function. To adapt the language model to the new modalities, we employed a parameter-efficient method called adapter tuning, which introduces small trainable adapter layers while leaving the base model weights frozen, thereby avoiding catastrophic forgetting of the pretrained weights. We evaluated our multimodal model on prognostic prediction tasks, including survival regression and subtype classification, using both public and internal lung cancer datasets spanning multiple histologic subtypes and stages. Our aligned multimodal model demonstrated improved performance over models utilizing only single modalities, highlighting the benefits of integrating complementary information across diverse lung data types. This work illustrates the potential of flexible multimodal modeling for critical lung cancer prediction problems using heterogeneous real-world patient data. Our model provides a strong foundation for incorporating emerging data types, modalities, and predictive tasks in the future.
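A minimal sketch of a cosine-similarity contrastive alignment objective of the kind described above, written in PyTorch; the symmetric InfoNCE form and the temperature value are illustrative assumptions rather than the exact loss used.

import torch
import torch.nn.functional as F

def contrastive_alignment_loss(modality_emb, text_emb, temperature=0.07):
    # Row i of each tensor corresponds to the same patient/sample.
    m = F.normalize(modality_emb, dim=-1)
    t = F.normalize(text_emb, dim=-1)
    logits = m @ t.t() / temperature                       # pairwise cosine similarities
    targets = torch.arange(m.size(0), device=m.device)     # matched pairs lie on the diagonal
    return 0.5 * (F.cross_entropy(logits, targets)         # modality -> text direction
                  + F.cross_entropy(logits.t(), targets))  # text -> modality direction

# Example: align 16 image-derived embeddings with 16 clinical-note embeddings of size 768
loss = contrastive_alignment_loss(torch.randn(16, 768), torch.randn(16, 768))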