Predicting Variant Fitness of SARS-CoV-2 from Full Viral Genome Sequences

Document Type

Conference Paper

Publication Date

2025

DOI

10.1609/aaaiss.v7i1.36915

Publication Title

Proceedings of the AAAI Symposium Series

Volume

7

Issue

1

Pages

428-437

Conference Name

2025 AAAI Fall Symposium Series, November 6-8, 2025, Arlington, Virginia

Abstract

Accurate prediction of the transmission fitness of emerging SARS-CoV-2 variants is vital for timely public health responses. In this study, we present a deep learning framework that predicts variant fitness from raw genomic sequences using a convolutional neural network (CNN) trained to regress Differential Population Growth Rate (DPGR) values. Our approach achieves high predictive accuracy, with an R-squared value of 0.92, on genomic sequences sampled from the USA and Europe. To interpret the model’s predictions, we apply SHapley Additive exPlanations (SHAP) to identify nucleotide-level contributions to predicted fitness. Our analysis highlights key mutations in ORF9 (nucleocapsid), ORF2 (spike), ORF5 (membrane), and ORF8 that either enhance or reduce predicted DPGR. Notably, we identify amino acid–altering mutations such as D3L, E484K, N501Y, and V97I as strong positive contributors to fitness, while synonymous or non-coding mutations have more subtle or regulatory effects. These findings validate the potential of sequence-based modeling and interpretable AI to support early detection and prioritization of high-risk variants.
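
As an illustration of two ideas named in the abstract, the sketch below shows one common way to turn a raw nucleotide sequence into numeric input for a CNN (one-hot encoding) and how an R-squared score is computed for a regression target such as DPGR. This is not the authors' code: the abstract does not state the encoding used, so one-hot encoding is an assumption, and the names `one_hot_encode` and `r_squared` are hypothetical helpers chosen for this example.

```python
# Illustrative sketch only; not the authors' implementation.
# Assumption: sequences are one-hot encoded over the alphabet A, C, G, T.
NUCLEOTIDES = "ACGT"


def one_hot_encode(sequence):
    """Map each base to a 4-element indicator vector.

    Ambiguous or unknown bases (for example N) become all zeros.
    """
    encoded = []
    for base in sequence.upper():
        encoded.append([1.0 if base == n else 0.0 for n in NUCLEOTIDES])
    return encoded


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot
```

In this sketch, a model that always predicts the mean of the observed values scores 0, and perfect predictions score 1, which gives a sense of scale for the reported 0.92.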

Rights

© 2023, Association for the Advancement of Artificial Intelligence. All rights reserved.

"In the Returned Rights section of the AAAI copyright form, authors are specifically granted back the right to use their own papers for noncommercial uses, such as inclusion in their dissertations or the right to deposit their own papers in their institutional repositories, provided there is proper attribution. The published version is not available for posting outside the AAAI Digital Library."

Original Publication Citation

Annan, R., Nkonu, U., Hatami, P., Pantho, M. J., Qingge, L., & Qin, H. (2025). Predicting variant fitness of SARS-CoV-2 from full viral genome sequences. Proceedings of the AAAI Symposium Series, 7(1), 428-437. https://doi.org/10.1609/aaaiss.v7i1.36915

ORCID

0000-0002-1060-6722 (Qin)
