Optimization of the Naïve Bayes Algorithm Using Particle Swarm Optimization (PSO) for Predicting Heart Disease Symptoms
Abstract
Heart disease is one of the leading causes of mortality in Indonesia, yet early detection remains difficult because data are limited and classification methods are often suboptimal. This study aims to improve the accuracy of heart disease prediction by combining the Naïve Bayes algorithm with Particle Swarm Optimization (PSO) for feature selection. A dataset of 303 patient records was processed in RapidMiner under three configurations: Naïve Bayes with split validation (80:20), Naïve Bayes with 10-fold cross-validation, and Naïve Bayes with PSO-based feature selection. Incorporating PSO increased accuracy from 87.60% to 89.26% and improved precision and recall, while the AUC remained high at 0.933. These findings indicate that PSO can identify the most relevant features and improve the performance of heart disease prediction models. The study also underscores the importance of validation methods and model interpretability when artificial intelligence is applied in healthcare.
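The PSO-based feature selection described above can be illustrated with a minimal sketch. It is not the RapidMiner process used in the study: the Gaussian Naïve Bayes, the binary PSO with a sigmoid transfer function, the swarm parameters, and the synthetic data (303 rows to match the record count, with an arbitrary 3 informative and 5 noise features) are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math
import random

def fit_gaussian_nb(X, y):
    """Return {label: (prior, means, variances)} for Gaussian Naive Bayes."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(zip(*rows), means)]
        model[label] = (n / len(X), means, variances)
    return model

def predict(model, x):
    """Pick the label with the highest log posterior under independence."""
    def log_posterior(params):
        prior, means, variances = params
        return math.log(prior) + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances))
    return max(model, key=lambda label: log_posterior(model[label]))

def cv_accuracy(X, y, mask, k=10):
    """k-fold cross-validated accuracy using only features where mask is 1."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    Xs = [[row[j] for j in cols] for row in X]
    correct = 0
    for fold in range(k):
        train = [i for i in range(len(Xs)) if i % k != fold]
        test = [i for i in range(len(Xs)) if i % k == fold]
        model = fit_gaussian_nb([Xs[i] for i in train], [y[i] for i in train])
        correct += sum(predict(model, Xs[i]) == y[i] for i in test)
    return correct / len(Xs)

def pso_select(X, y, n_particles=8, n_iter=10, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO: each particle is a 0/1 feature mask, fitness is CV accuracy."""
    rng = random.Random(seed)
    d = len(X[0])
    pos = [[rng.randint(0, 1) for _ in range(d)] for _ in range(n_particles)]
    pos[0] = [1] * d  # seed one particle with all features (the baseline)
    vel = [[0.0] * d for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [cv_accuracy(X, y, p) for p in pos]
    g = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for j in range(d):
                vel[i][j] = (w * vel[i][j]
                             + c1 * rng.random() * (pbest[i][j] - pos[i][j])
                             + c2 * rng.random() * (gbest[j] - pos[i][j]))
                # Sigmoid transfer: velocity becomes the probability of a 1 bit.
                prob = 1 / (1 + math.exp(-vel[i][j]))
                pos[i][j] = 1 if rng.random() < prob else 0
            fit = cv_accuracy(X, y, pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit

def make_toy_data(n=303, n_informative=3, n_noise=5, seed=1):
    """Synthetic stand-in data; NOT the study's patient records."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = [rng.gauss(1.5 * label, 1.0) for _ in range(n_informative)]
        row += [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        X.append(row)
        y.append(label)
    return X, y

X, y = make_toy_data()
baseline = cv_accuracy(X, y, [1] * len(X[0]))  # Naive Bayes, all features
mask, best = pso_select(X, y)                  # Naive Bayes + PSO selection
print(f"all features: {baseline:.4f}  PSO subset {mask}: {best:.4f}")
```

Because one particle starts from the full feature set, the accuracy of the selected subset in this sketch can never fall below the all-features baseline on the same folds. The same folds are used for selection and for scoring, so the reported gain is optimistic; a held-out test set would be needed for an unbiased estimate.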
DOI: https://doi.org/10.33394/j-ps.v13i3.15573
Copyright (c) 2025 Sita Muharni, Sigit Andriyanto, Supardi Supardi

This work is licensed under a Creative Commons Attribution 4.0 International License.
J-PS (Prisma Sains: Jurnal Pengkajian Ilmu dan Pembelajaran Matematika dan IPA IKIP Mataram) p-ISSN (print) 2338-4530, e-ISSN (online) 2540-7899 is licensed under a Creative Commons Attribution 4.0 International License.