ORCID

0000-0002-1476-113X (Chakraborty), 0000-0002-8991-1737 (Pant)

Document Type

Article

Publication Date

2025

DOI

10.3390/jcm14134686

Publication Title

Journal of Clinical Medicine

Volume

14

Issue

13

Pages

4686 (1-14)

Abstract

Background: Pancreatic cancer is among the most lethal malignancies, with poor prognosis and limited survival despite treatment advances. Accurate survival modeling is critical for prognostication and clinical decision-making. This study had three primary aims: (1) to determine the best-fitting survival distribution among patients diagnosed with, and deceased from, pancreatic cancer across stages and treatment types; (2) to construct and compare predictive risk classification models; and (3) to evaluate survival probabilities using parametric, semi-parametric, non-parametric, machine learning, and deep learning methods for Stage IV patients receiving both chemotherapy and radiation.

Methods: Using data from the SEER database, parametric models (Generalized Extreme Value, Generalized Pareto, Log-Pearson 3), semi-parametric (Cox), and non-parametric (Kaplan–Meier) methods were compared with four machine learning models (gradient boosting, neural network, elastic net, and random forest). Survival probability heatmaps were constructed, and six classification models were developed for risk stratification. ROC curves, accuracy, and goodness-of-fit tests were used for model validation. Statistical tests included Kruskal–Wallis, pairwise Wilcoxon, and chi-square.

Results: Generalized Extreme Value (GEV) was the best-fitting distribution in most scenarios. Stage-specific survival differences were statistically significant. The highest predictive accuracy (AUC: 0.947; accuracy: 56.8%) was observed in patients receiving both chemotherapy and radiation. The gradient boosting model predicted the most optimistic survival, while random forest showed a sharp decline after 15 months.

Conclusions: This study emphasizes the importance of selecting appropriate analytical models for survival prediction and risk classification. Adopting these innovations, with the help of advanced machine learning and deep learning models, can enhance patient outcomes and advance precision medicine initiatives.

Rights

© 2025 by the authors.

This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution 4.0 International (CC BY 4.0) License.

Data Availability

Article states: "The study data are available from https://seer.cancer.gov/seerstat/, accessed on 25 January 2025."

Original Publication Citation

Chakraborty, A., & Pant, M. D. (2025). Machine learning models for pancreatic cancer survival prediction: A multi-model analysis across stages and treatments using the Surveillance, Epidemiology, and End Results (SEER) database. Journal of Clinical Medicine, 14(13), 1-14, Article 4686. https://doi.org/10.3390/jcm14134686
