Application of PCA and K-Means Clustering Methods to Identify Diabetes Mellitus Patient Groups Based on Risk Factors

Anisa Simanjuntak, Muhammad Siddik Hasibuan

Abstract


Diabetes mellitus is a chronic disease characterized by persistently high levels of glucose (sugar) in the blood. Identification is the process of recognizing and determining the characteristics of a particular object or entity. Risk factors such as hypertension (high blood pressure), smoking, and lack of physical activity can affect the condition of diabetes mellitus patients. An approach is therefore needed that identifies groups of diabetic patients based on their risk factors, so that appropriate management and treatment can be carried out. This study applies two methods: Principal Component Analysis (PCA), which reduces the dimension of the data to identify the linear combinations of risk factors that contribute most in diabetes mellitus patient data, and K-Means Clustering, which groups patients with similar risk factors. The research is quantitative and can be categorized as analytic research; the variables are risk factors for diabetes mellitus. PCA yielded 9 principal components (PCs) with a cumulative contribution of 86.9275%. To describe the correlation between attributes and principal components, a component matrix of loading values was formed: the larger the absolute loading value, the stronger the correlation with the corresponding principal component, with a cut-off of |loading| > 0.4 regardless of sign. K-Means Clustering divided the diabetes patients into 3 groups based on the existing risk factors. Centroid C1 represents patients whose condition is mild, centroid C2 represents patients at a moderate level, and centroid C3 represents patients with severe or dangerous diabetes mellitus.
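The two-step procedure described in the abstract (PCA, then K-Means on the component scores) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The data are synthetic, and the 12 hypothetical risk-factor columns, the 0.85 variance target, the random seeds, and the function names `pca_correlation` and `kmeans` are all assumptions made for illustration. Only the 0.4 loading cut-off and the choice of 3 clusters come from the abstract.

```python
import numpy as np


def pca_correlation(X, var_target=0.85):
    """PCA on the correlation matrix; keeps enough components to reach var_target."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardize each attribute
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
    order = np.argsort(eigvals)[::-1]                  # largest eigenvalue first
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]
    cum = np.cumsum(eigvals) / eigvals.sum()           # cumulative explained variance
    k = min(int(np.searchsorted(cum, var_target)) + 1, len(eigvals))
    # Loadings: correlation between each standardized attribute and each component.
    loadings = eigvecs[:, :k] * np.sqrt(eigvals[:k])
    scores = Z @ eigvecs[:, :k]
    return scores, loadings, cum[:k]


def assign(X, centroids):
    """Index of the nearest centroid (Euclidean distance) for every row of X."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(X, k=3, n_iter=100, seed=0):
    """Plain Lloyd's algorithm with random initial centroids drawn from the data."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(X, centroids)
        new = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new, centroids):
            break
        centroids = new
    return assign(X, centroids), centroids


# Synthetic stand-in for patient data (ASSUMPTION): 150 patients, 12 risk factors,
# three groups of 50 whose means are shifted by 0, 2, and 4.
rng = np.random.default_rng(42)
shifts = np.repeat([0.0, 2.0, 4.0], 50)
X = rng.normal(size=(150, 12)) + shifts[:, None]

scores, loadings, cum_var = pca_correlation(X, var_target=0.85)
strong = np.abs(loadings) > 0.4        # cut-off from the abstract, sign ignored
labels, centroids = kmeans(scores, k=3)

print("components kept:", scores.shape[1])
print("cumulative variance:", np.round(cum_var, 4))
print("cluster sizes:", np.bincount(labels, minlength=3))
```

Cluster indices returned by K-Means are arbitrary, so mapping them to labels such as mild, moderate, or severe requires inspecting the centroids afterwards; the sketch does not attempt that step.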


Keywords


Diabetes Mellitus, Principal Component Analysis, K-Means Clustering


DOI: https://doi.org/10.33394/j-ps.v11i4.9263

Copyright (c) 2023 Anisa Simanjuntak, Muhammad Siddik Hasibuan

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.

J-PS (Prisma Sains: Jurnal Pengkajian Ilmu dan Pembelajaran Matematika dan IPA IKIP Mataram) p-ISSN (print) 2338-4530, e-ISSN (online) 2540-7899 is licensed under a Creative Commons Attribution 4.0 International License.
