FEATURE SELECTION BASED ON GENETIC ALGORITHM, PARTICLE SWARM OPTIMIZATION AND PRINCIPAL COMPONENT ANALYSIS FOR OPINION MINING COSMETIC PRODUCT REVIEW

research • 22 Feb 2024

Opinion mining, also called sentiment analysis, is a technique for automatically extracting sentiment information from opinion sentences in textual data. It involves building a system that collects opinions from product reviews and classifies them as positive, negative, or neutral by understanding, extracting, and processing the text of each opinion sentence. One of the most widely used classification techniques is the Support Vector Machine (SVM), which identifies the separating hyperplane that maximizes the margin between two classes. However, SVM is weak at selecting suitable parameters or features. This research improves on previous work by combining SVM with a feature selection method and comparing three such methods (Genetic Algorithm, Particle Swarm Optimization, and Principal Component Analysis) to determine which one best improves the classification accuracy of SVM. The dataset consists of cosmetic product reviews downloaded from www.amazon.com. Performance is measured as the accuracy of SVM with each feature selection method added. Evaluation uses 10-fold cross-validation, and accuracy is measured with the confusion matrix and the ROC curve. SVM alone obtains an average accuracy of 82.00% and an average AUC of 0.988. After feature selection is integrated with SVM, Genetic Algorithm obtains an average accuracy of 94.00% and an average AUC of 0.984, Particle Swarm Optimization obtains an average accuracy of 97.00% and an average AUC of 0.988, and Principal Component Analysis obtains an average accuracy of 83.00% and an average AUC of 0.809. In conclusion, SVM shows the greatest accuracy improvement when integrated with Particle Swarm Optimization feature selection, with accuracy rising from 82.00% to 97.00%.
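The evaluation protocol named in the abstract (10-fold cross-validation, accuracy from a confusion matrix, and AUC from the ROC curve) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`confusion_matrix`, `accuracy`, `auc`, `k_fold_indices`) are hypothetical, the fold split is a plain contiguous split without shuffling or stratification, the task is assumed to be binary (positive vs. negative), and the toy labels and scores at the end are made up for illustration rather than taken from the study.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, positive=1):
    """Count (TP, FP, TN, FN) for a binary task."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts["tp"], counts["fp"], counts["tn"], counts["fn"]


def accuracy(tp, fp, tn, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + tn + fn)


def auc(y_true, scores, positive=1):
    """Area under the ROC curve, computed as the probability that a
    randomly chosen positive is scored above a randomly chosen negative
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for k contiguous folds.
    Assumption: no shuffling or stratification, for simplicity."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Toy example (made-up labels and classifier scores, not the paper's data).
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.5]

tp, fp, tn, fn = confusion_matrix(y_true, y_pred)
print("accuracy:", accuracy(tp, fp, tn, fn))
print("AUC:", auc(y_true, scores))
```

In a full experiment, each of the ten folds would train the classifier (with or without feature selection) on the training indices, score the held-out indices, and the per-fold accuracy and AUC would then be averaged, which is presumably how the reported averages were obtained.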
