Feature Selection in Sentiment Analysis of Online Travel Reviews Using the Naïve Bayes Algorithm with Mutual Information and Particle Swarm Optimization (PSO)
Abstract
The rapid growth of Internet applications in tourism has produced a large volume of personal reviews containing travel-related information on the Web. Among the major categories of products and services, travel has the highest percentage of online pre-purchase research: 73% of travelers search online before making their travel decisions. The Naïve Bayes algorithm can be applied to the prediction of online travel reviews either without feature selection or with mutual information and particle swarm optimization (PSO) as feature selection. The data processing in this study combines two feature selection methods: a filter based on mutual information and a wrapper based on particle swarm optimization. Both produce good classification accuracy. Predicting online travel reviews with Naïve Bayes without feature selection gives an accuracy of 81.50% with an AUC of 0.500, whereas Naïve Bayes with mutual information and particle swarm optimization as feature selection gives an accuracy of 93.50% with an AUC of 0.965. This improvement of 12 percentage points in accuracy was obtained on 200 reviews. Predicting online travel reviews using Naïve Bayes with feature selection is therefore superior to using Naïve Bayes without feature selection.
Keywords: sentiment analysis, text classification, naïve Bayes
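The filter stage described in the abstract can be illustrated with a minimal sketch: rank terms by their mutual information with the class label, keep the top k, and train a multinomial Naïve Bayes classifier on the retained terms. This is not the authors' implementation. The toy reviews, the function names, the choice k = 2, and the use of Laplace smoothing are all assumptions made for illustration, and the PSO wrapper stage is omitted.

```python
import math
from collections import Counter

def mutual_information(docs, labels, term):
    """Mutual information (in nats) between presence of `term` and the class."""
    n = len(docs)
    joint = Counter((term in doc, label) for doc, label in zip(docs, labels))
    present = Counter(term in doc for doc in docs)
    classes = Counter(labels)
    mi = 0.0
    for (t, c), n_tc in joint.items():
        # p(t, c) * log(p(t, c) / (p(t) * p(c))), written with raw counts
        mi += (n_tc / n) * math.log(n_tc * n / (present[t] * classes[c]))
    return mi

def select_features(docs, labels, k):
    """Filter step: keep the k terms with the highest mutual information."""
    vocab = sorted({w for doc in docs for w in doc})
    ranked = sorted(vocab, key=lambda w: mutual_information(docs, labels, w),
                    reverse=True)
    return set(ranked[:k])

def train_nb(docs, labels, features):
    """Multinomial Naive Bayes with Laplace smoothing (an assumption here)."""
    classes = Counter(labels)
    counts = {c: Counter() for c in classes}
    for doc, label in zip(docs, labels):
        counts[label].update(w for w in doc if w in features)
    model = {}
    for c in classes:
        total = sum(counts[c].values())
        log_prior = math.log(classes[c] / len(labels))
        log_like = {w: math.log((counts[c][w] + 1) / (total + len(features)))
                    for w in features}
        model[c] = (log_prior, log_like)
    return model

def predict(model, doc):
    """Return the class with the highest log posterior; unseen terms are ignored."""
    def score(c):
        log_prior, log_like = model[c]
        return log_prior + sum(log_like[w] for w in doc if w in log_like)
    return max(model, key=score)

# Hypothetical toy reviews, already tokenized; not data from the study.
docs = [
    ["hotel", "clean", "great", "staff"],
    ["great", "view", "friendly", "staff"],
    ["service", "dirty", "rude", "staff"],
    ["hotel", "dirty", "noisy", "room"],
]
labels = ["pos", "pos", "neg", "neg"]

selected = select_features(docs, labels, k=2)
model = train_nb(docs, labels, selected)
print(sorted(selected))
print(predict(model, ["great", "location"]))
print(predict(model, ["dirty", "bathroom"]))
```

In a filter-plus-wrapper setup such as the one the abstract describes, the subset kept by the filter would then serve as the search space for the wrapper, which evaluates candidate subsets by the classifier's accuracy.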
DOI: https://doi.org/10.31294/ijcit.v3i1.3763
P-ISSN: 2527-449X E-ISSN: 2549-7421