Human-Centric Intelligent Systems
The thyroid gland is a crucial organ in the human body, secreting two hormones that help regulate the body's metabolism. Thyroid disease is a serious medical condition that can develop from high Thyroid Stimulating Hormone (TSH) levels or an infection in the thyroid tissues. Hypothyroidism and hyperthyroidism are two critical conditions caused by insufficient and excessive thyroid hormone production, respectively. Machine learning can be used to process the data generated across different medical sectors and to build models that predict various diseases. In this paper, we use several machine learning algorithms to predict hypothyroidism and hyperthyroidism. We also identify the most significant features, which can be used to detect thyroid diseases more precisely. After completing the pre-processing and feature selection steps, we applied both our modified and our original data to several classification models to predict hypothyroidism and hyperthyroidism. We found that Random Forest (RF) consistently achieves the highest evaluation scores on our dataset, while Naive Bayes performs very poorly. Moreover, when features are selected using the feature importance method, RF provides the best accuracy of 91.42%, with precision, recall, and F1-score of 92% each. Further, by analyzing the characteristics and behavior of the dataset, we identified its most important features (TSH, T3, TT4, and FTI). In terms of accuracy and other performance evaluation criteria, this study supports the use of effective classifiers and features backed by machine learning algorithms to detect and diagnose thyroid disease. Finally, we performed an explainability analysis of our best classifier to better understand the otherwise black-box behavior of our machine learning model and dataset. This study could further pave the way for researchers and healthcare professionals to analyze thyroid disease in real-time applications.
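The evaluation criteria named in the abstract (accuracy, precision, recall and F1-score) can be illustrated with a small sketch. The abstract does not say how per-class scores are aggregated, so the macro averaging used here is an assumption, and the function name `classification_scores_macro`, the class names and the toy labels are all hypothetical, not taken from the article or its dataset.

```python
from collections import Counter


def classification_scores_macro(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall and F1.

    Macro averaging (unweighted mean over classes) is an assumption;
    the abstract does not state which averaging scheme was used.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1      # correct prediction for this class
        else:
            fp[pred] += 1       # predicted class gains a false positive
            fn[truth] += 1      # true class gains a false negative

    def safe_div(num, den):
        # Score an undefined ratio as 0.0 instead of raising.
        return num / den if den else 0.0

    precisions, recalls, f1s = [], [], []
    for label in labels:
        prec = safe_div(tp[label], tp[label] + fp[label])
        rec = safe_div(tp[label], tp[label] + fn[label])
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(safe_div(2 * prec * rec, prec + rec))

    n_classes = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n_classes,
        "recall": sum(recalls) / n_classes,
        "f1": sum(f1s) / n_classes,
    }


# Hypothetical toy labels, not drawn from the thyroid dataset.
y_true = ["hypo", "hypo", "hyper", "hyper", "normal", "normal", "normal", "normal"]
y_pred = ["hypo", "normal", "hyper", "hyper", "normal", "normal", "hyper", "normal"]

scores = classification_scores_macro(y_true, y_pred)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

With imbalanced classes, macro and support-weighted averages can differ noticeably, so the averaging scheme should be checked in the full article before comparing against the reported figures.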
© 2023 The Authors.
This article is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original authors and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Article states: "This research study is based on an open-source dataset. The dataset can be accessed from this link: https://archive.ics.uci.edu/ml/datasets/Thyroid+Disease."
Original Publication Citation
Hossain, M. B., Shama, A., Adhikary, A., Raha, A. D., Uddin, K. M. A., Hossain, M. A., Islam, I., Murad, S. A., Munir, M. S., & Bairagi, A. K. (2023). An explainable artificial intelligence framework for the predictive analysis of hypo and hyper thyroidism using machine learning algorithms. Human-Centric Intelligent Systems, 3, 211-231. https://doi.org/10.1007/s44230-023-00027-1
Hossain, Md. Bipul; Shama, Anika; Adhikary, Apurba; Raha, Avi Deb; Aslam Uddin, K. M.; Hossain, Mohammad Amzad; Islam, Imtia; Murad, Saydul Akbar; Munir, Md. Shirajum; and Bairagi, Anupam Kumur, "An Explainable Artificial Intelligence Framework for the Predictive Analysis of Hypo and Hyper Thyroidism Using Machine Learning Algorithms" (2023). Electrical & Computer Engineering Faculty Publications. 431.