Diabetes is one of the fastest-growing life-threatening chronic diseases and, according to a World Health Organization (WHO) report in 2018, has affected 422 million people worldwide. It is considered one of the deadliest chronic diseases and causes elevated blood sugar. Many complications arise when diabetes goes unidentified and untreated. The rise of machine learning approaches, however, offers a way to address this critical problem. The aim of this study is to design a model that predicts the likelihood of diabetes in a patient with the highest possible accuracy. Classification is a data mining technique that assigns categories to a data set to support more accurate prediction and analysis. Accordingly, three machine learning classification algorithms, namely Support Vector Machine, Naive Bayes, and Random Forest, are used in this experiment to detect diabetes at an early stage. The experiments use the Diabetes Hospital in Sylhet, Bangladesh dataset, sourced from the UCI repository. The performance of the three algorithms is evaluated on several measures: Precision, Accuracy, F-Measure, and Recall. Accuracy is measured from the numbers of correctly and incorrectly classified instances. The results show that Random Forest outperforms the other algorithms, achieving the highest accuracy of 97.88%. These results are verified precisely and systematically using Receiver Operating Characteristic (ROC) curves.
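As an illustration of the four evaluation measures named above, the following minimal sketch computes Accuracy, Precision, Recall, and F-Measure from counts of correctly and incorrectly classified instances. It is not the study's actual evaluation code: the function name `evaluate`, the `BinaryMetrics` container, and the ten toy labels are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f_measure: float


def evaluate(y_true, y_pred, positive=1):
    """Compute the four measures from paired true and predicted labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one instance is required")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts for the positive (diabetic) class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    # Accuracy: share of all instances that were classified correctly.
    accuracy = (tp + tn) / len(pairs)
    # Precision: share of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of true positives that the classifier found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-Measure: harmonic mean of precision and recall.
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return BinaryMetrics(accuracy, precision, recall, f_measure)


# Hypothetical labels for ten patients (1 = diabetic, 0 = non-diabetic).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
metrics = evaluate(y_true, y_pred)
print(metrics)
```

With these toy labels there are four true positives, four true negatives, one false positive, and one false negative, so each of the four measures comes out at approximately 0.8. In practice, the same function could be applied to the predictions of each of the three classifiers to compare them on equal terms.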
Predicting the Likelihood of Diabetes at an Early Stage Using the Random Forest Classification Algorithm