Integration of Bagging and Greedy Forward Selection on Image Pap Smear Classification using Naïve Bayes

Research, 06 May 2019

Abstract: This study considers the Herlev dataset, which consists of seven cervical cell classes: superficial squamous, intermediate squamous, columnar, mild dysplasia, moderate dysplasia, severe dysplasia, and carcinoma in situ. The dataset is tested on two tasks: a two-class classification into normal and abnormal cells, and a seven-class classification into three normal and four abnormal cell classes. Classifying the dataset into seven classes remains difficult. The classes in this Pap smear image dataset differ in size and are imbalanced. In addition, some features are suspected to be irrelevant, which makes classification difficult, especially for the abnormal classes. To handle the class imbalance, this study used an ensemble method (Bagging). To handle features that make no contribution, we applied Greedy Forward Selection for feature selection. Naïve Bayes was used as the learning algorithm. The highest accuracy for the two-class (normal vs. abnormal) classification, 92.15%, was obtained with the Naïve Bayes model combined with Greedy Forward Selection. For the seven-class classification, the Naïve Bayes model with Greedy Forward Selection and Bagging reached 63.25%, which is reasonably good but still needs improvement.
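The combination described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`fit_nb`, `greedy_forward_selection`, `bagged_nb`, `make_toy_data`) are hypothetical, the data is a synthetic two-feature toy set rather than the Herlev dataset, and scoring candidate features by accuracy on a held-out split is an assumption, since the abstract does not say how the selection was evaluated or how it was combined with Bagging.

```python
import math
import random
from collections import Counter, defaultdict

def fit_nb(X, y, feats):
    """Fit a Gaussian Naive Bayes model on the chosen feature indices."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append([row[f] for f in feats])
    model = {}
    for label, rows in by_class.items():
        stats = []
        for col in zip(*rows):
            mean = sum(col) / len(col)
            # Small constant keeps the variance strictly positive.
            var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-6
            stats.append((mean, var))
        model[label] = (math.log(len(rows) / len(y)), stats)
    return model

def predict_nb(model, row, feats):
    """Return the class with the highest log posterior."""
    def score(label):
        log_prior, stats = model[label]
        total = log_prior
        for f, (mean, var) in zip(feats, stats):
            total += -0.5 * math.log(2 * math.pi * var) - (row[f] - mean) ** 2 / (2 * var)
        return total
    return max(model, key=score)

def accuracy(predict, X, y):
    return sum(predict(r) == t for r, t in zip(X, y)) / len(y)

def greedy_forward_selection(X_tr, y_tr, X_val, y_val):
    """Add one feature at a time while validation accuracy keeps improving."""
    selected, best = [], 0.0
    remaining = set(range(len(X_tr[0])))
    while remaining:
        scored = []
        for f in sorted(remaining):
            feats = selected + [f]
            model = fit_nb(X_tr, y_tr, feats)
            acc = accuracy(lambda r: predict_nb(model, r, feats), X_val, y_val)
            scored.append((acc, f))
        # Highest accuracy wins; ties go to the lowest feature index.
        acc, f = max(scored, key=lambda t: (t[0], -t[1]))
        if acc <= best:
            break
        selected.append(f)
        remaining.remove(f)
        best = acc
    return selected

def bagged_nb(X, y, feats, n_models=10, seed=0):
    """Train Naive Bayes models on bootstrap samples and combine by majority vote."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        models.append(fit_nb([X[i] for i in idx], [y[i] for i in idx], feats))
    def predict(row):
        votes = Counter(predict_nb(m, row, feats) for m in models)
        return votes.most_common(1)[0][0]
    return predict

def make_toy_data(n=200, seed=1):
    """Synthetic, imbalanced data: feature 0 is informative, feature 1 is noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = "abnormal" if i % 4 == 0 else "normal"
        center = 4.0 if label == "abnormal" else 0.0
        X.append([rng.gauss(center, 1.0), rng.gauss(0.0, 1.0)])
        y.append(label)
    return X, y
```

A typical use would split the data, call `greedy_forward_selection` on the training and validation parts, and pass the selected feature indices to `bagged_nb`. Selecting features once and then bagging on the reduced feature set is one plausible ordering; running the selection inside each bootstrap sample is an equally reasonable alternative.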

