Feature Selection in Sentiment Analysis of Online Travel Reviews Using the Naïve Bayes Algorithm with Mutual Information and Particle Swarm Optimization (PSO)

Lisda Widiastuti

Abstract
The rapid growth of Internet applications in tourism has produced a large volume of personal reviews of travel-related information on the Web. Among the major categories of products and services, travel has the highest rate of online research before purchase: 73% of travelers search online before making their travel decisions. The Naïve Bayes algorithm can be applied to predicting online travel reviews in two configurations: with mutual information and particle swarm optimization (PSO) as feature selection, and without feature selection. The data processing in this study combines two feature selection methods, a filter based on mutual information and a wrapper based on PSO. Both yield good classification accuracy. Naïve Bayes without feature selection predicts online travel reviews with an accuracy of 81.50% and an AUC of 0.500, whereas Naïve Bayes with mutual information and PSO as feature selection reaches an accuracy of 93.50% and an AUC of 0.965. On a dataset of 200 reviews, this is an accuracy improvement of 12 percentage points. Predicting online travel reviews with Naïve Bayes combined with feature selection therefore outperforms Naïve Bayes without feature selection.
Keywords: sentiment analysis, text classification, naïve Bayes
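
The abstract names the pipeline (a mutual information filter, a PSO wrapper, and a Naïve Bayes classifier) but gives no implementation details. The following is only a minimal sketch of how such a pipeline could be wired together, not the author's setup. The toy reviews, the holdout split used as the wrapper's fitness, the cutoff `top_k`, the binary PSO variant, and its constants (inertia 0.7, acceleration 1.5, velocity clamp 4.0) are all assumptions chosen for illustration.

```python
import math
import random
from collections import Counter

def mutual_information(docs, labels, term):
    """Mutual information (in nats) between a term's presence and the class label."""
    n = len(docs)
    joint = Counter((term in d, y) for d, y in zip(docs, labels))
    mi = 0.0
    for (present, y), n_ty in joint.items():
        n_t = sum(c for (p, _), c in joint.items() if p == present)
        n_y = sum(c for (_, lab), c in joint.items() if lab == y)
        mi += (n_ty / n) * math.log(n_ty * n / (n_t * n_y))
    return mi

def train_nb(docs, labels, vocab):
    """Naive Bayes with Laplace smoothing over token sets, restricted to `vocab`."""
    classes = sorted(set(labels))
    prior = {c: math.log(labels.count(c) / len(labels)) for c in classes}
    counts = {c: Counter() for c in classes}
    for d, y in zip(docs, labels):
        counts[y].update(t for t in d if t in vocab)
    loglik = {}
    for c in classes:
        total = sum(counts[c].values()) + len(vocab)
        loglik[c] = {t: math.log((counts[c][t] + 1) / total) for t in vocab}
    return prior, loglik

def predict_nb(model, doc):
    prior, loglik = model
    score = lambda c: prior[c] + sum(loglik[c][t] for t in doc if t in loglik[c])
    return max(prior, key=score)

def accuracy(train, valid, vocab):
    """Holdout accuracy of Naive Bayes trained on `train` using only `vocab`."""
    model = train_nb(train[0], train[1], vocab)
    hits = sum(predict_nb(model, d) == y for d, y in zip(*valid))
    return hits / len(valid[1])

def pso_select(features, fitness, n_particles=8, n_iter=15, seed=0):
    """Binary PSO wrapper: each particle is a keep/drop mask over `features`."""
    rng = random.Random(seed)
    dim = len(features)
    subset = lambda mask: {f for f, keep in zip(features, mask) if keep}
    pos = [[rng.random() < 0.5 for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(subset(p)) for p in pos]
    g = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for j in range(dim):
                v = (0.7 * vel[i][j]
                     + 1.5 * rng.random() * (pbest[i][j] - pos[i][j])
                     + 1.5 * rng.random() * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-4.0, min(4.0, v))
                # A sigmoid turns the velocity into the probability of keeping feature j.
                pos[i][j] = rng.random() < 1 / (1 + math.exp(-vel[i][j]))
            fit = fitness(subset(pos[i]))
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return subset(gbest), gbest_fit

# Hypothetical toy reviews; the study itself used 200 real travel reviews.
train_docs = [set(s.split()) for s in [
    "great hotel clean room", "friendly staff great view", "lovely clean beach",
    "dirty room rude staff", "terrible food dirty beach", "rude service terrible hotel"]]
train_y = ["pos", "pos", "pos", "neg", "neg", "neg"]
valid_docs = [set(s.split()) for s in ["great clean hotel", "rude staff dirty room"]]
valid_y = ["pos", "neg"]

# Step 1 (filter): rank terms by mutual information and keep the top_k.
top_k = 5
vocab = sorted(set().union(*train_docs))
ranked = sorted(vocab, key=lambda t: mutual_information(train_docs, train_y, t), reverse=True)
filtered = ranked[:top_k]

# Step 2 (wrapper): PSO searches subsets of the filtered terms by Naive Bayes accuracy.
fitness = lambda feats: accuracy((train_docs, train_y), (valid_docs, valid_y), feats)
selected, score = pso_select(filtered, fitness)
print(sorted(selected), score)
```

The filter step is cheap and classifier-independent, so it is used here to shrink the vocabulary before the more expensive wrapper search, which retrains the classifier for every candidate subset. The toy data is far too small to reproduce or test the accuracy figures reported in the abstract; it only shows how the two selection stages fit together.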


DOI: https://doi.org/10.31294/ijcit.v3i1.3763

P-ISSN: 2527-449X E-ISSN: 2549-7421
Published by LPPM Universitas Bina Sarana Informatika

Jl. Kramat Raya No.98, Kwitang, Kec. Senen, Kota Jakarta Pusat, DKI Jakarta 10450
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License