Dinda Ayu Muthia


Abstract: With internet use now widespread, the number of consumers
who post their opinions and experiences online continues to grow.
Reading every review is time consuming, but reading only a few
yields a biased evaluation. Sentiment analysis addresses this
problem by automatically classifying user reviews as positive or
negative. The Naïve Bayes classifier is a popular machine learning
technique for text classification because it is simple, efficient,
and performs well in many domains. However, Naïve Bayes is very
sensitive to an excessive number of features, which lowers
classification accuracy. This study therefore integrates two feature
selection methods, information gain and a genetic algorithm, to
improve the accuracy of the Naïve Bayes classifier. The result is a
classification of book review texts as positive or negative. The
accuracy of Naïve Bayes is compared before and after adding the
feature selection methods. Evaluation uses 10-fold cross-validation,
and accuracy is measured with a confusion matrix and ROC curves.
The results show that the accuracy of Naïve Bayes increased from
78.50% to 84.50%.
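
The information gain step named in the abstract can be illustrated with a minimal sketch. It is not the study's implementation: the toy reviews, the labels, and the helper names (entropy, information_gain, top_k_terms) are assumptions made for illustration, and the genetic algorithm stage and the Naïve Bayes classifier are omitted. Each review is treated as a set of terms, and a term is scored by how much knowing whether it is present reduces the entropy of the positive/negative label.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(docs, labels, term):
    """Entropy reduction obtained by splitting on presence of `term`."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    total = len(labels)
    conditional = ((len(with_term) / total) * entropy(with_term)
                   + (len(without_term) / total) * entropy(without_term))
    return entropy(labels) - conditional


def top_k_terms(docs, labels, k):
    """Return the k terms with the highest information gain.

    Ties are broken alphabetically so the result is deterministic.
    """
    vocab = set().union(*docs)
    ranked = sorted(vocab,
                    key=lambda t: (-information_gain(docs, labels, t), t))
    return ranked[:k]


# Hypothetical toy data: each review is a set of terms.
docs = [
    {"great", "story", "book"},
    {"great", "characters", "book"},
    {"boring", "story", "book"},
    {"boring", "plot", "book"},
]
labels = ["pos", "pos", "neg", "neg"]

selected = top_k_terms(docs, labels, k=2)
print(selected)
```

On this toy data, "great" and "boring" separate the two classes perfectly and are selected, while "book" appears in every review and carries no information. Discarding low-scoring terms in this way shrinks the feature set that the classifier has to handle.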


DOI: https://doi.org/10.31294/jtk.v2i1.357

Copyright (c) 2016 Dinda Ayu Muthia

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

ISSN: 2442-2436 (print), and 2550-0120

Published by LPPM Universitas Bina Sarana Informatika Jakarta

Jl. Kramat Raya No.98, Kwitang, Kec. Senen, Kota Jakarta Pusat, DKI Jakarta 10450