Early-Stage Diabetes Likelihood Prediction Using the Random Forest Classification Algorithm

Research, 28 Sep 2022

Diabetes is one of the fastest-growing life-threatening chronic diseases, affecting 422 million people worldwide according to a 2018 World Health Organization (WHO) report. It is considered among the deadliest chronic diseases and is characterized by elevated blood sugar. Many complications arise when diabetes goes undiagnosed and untreated. Advances in machine learning offer a way to address this problem. The aim of this study is to design a model that predicts the likelihood of diabetes in patients as accurately as possible. Classification is a data mining technique that assigns categories to the records in a dataset to support more accurate prediction and analysis. Three machine learning classification algorithms, Support Vector Machine, Naive Bayes, and Random Forest, are therefore used in this experiment to detect diabetes at an early stage. The experiments use the dataset from the Diabetes Hospital in Sylhet, Bangladesh, obtained from the UCI repository. The performance of the three algorithms is evaluated with several measures: Precision, Accuracy, F-Measure, and Recall. Accuracy is measured from the numbers of correctly and incorrectly classified instances. The results show that Random Forest outperforms the other algorithms, with the highest accuracy of 97.88%. These results are verified systematically using Receiver Operating Characteristic (ROC) curves.
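The evaluation measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the label encoding (1 = diabetic, 0 = non-diabetic), and the ten sample labels and scores are all assumptions made up for illustration, and the sketch uses only the Python standard library rather than whatever tool the study used. It derives Accuracy, Precision, Recall, and F-Measure from the counts of correctly and incorrectly classified instances, and computes the area under the ROC curve as the fraction of positive/negative pairs that the classifier ranks correctly.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F-measure from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise ranking; tied scores count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative instance")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels, predictions, and scores (not taken from the study's dataset).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.6, 0.95, 0.05]

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Applying the same functions to the predictions of each of the three classifiers on a shared test split would yield the kind of side-by-side comparison the abstract describes.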

