Combining Likes-Retweet Analysis and Naive Bayes Classifier within Twitter for Sentiment Analysis

Rizal Setya Perdana, Aryo Pinandito

Abstract


Sentiment analysis is a field of research that aims to extract the subjectivity of opinions. With the massive growth of user-generated content on social media, Twitter has become one of the most popular microblogging applications, where users freely discuss and share opinions about specific topics or entities. Twitter has several features, such as like and retweet, that can potentially be used to improve sentiment analysis. Like and retweet are the mechanisms Twitter provides for propagating or sharing another user's post and for showing appreciation of it. This paper proposes a combination of textual and non-textual features to improve the performance of sentiment prediction. In this research we apply Naïve Bayes for textual classification and the Fisher score to determine the non-textual (like and retweet) features. The two kinds of features are combined using weights α and β, whose optimal values we determine experimentally. Evaluated with the F1-measure, the combined approach achieves a score of 0.838 with α = 0.6 and β = 0.4.
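
As an illustration only, the weighted combination described above might be sketched as follows. The abstract does not give the exact formula, so everything here is an assumption: the function names `fisher_score` and `combine_scores` are hypothetical, the combination is taken to be a linear weighted sum with α applied to the textual score and β to the like/retweet score, and the Fisher score uses its common textbook form (between-class scatter divided by within-class scatter for a single feature).

```python
def fisher_score(values, labels):
    """Fisher score of one numeric feature (e.g. like or retweet counts).

    Textbook form: sum_k n_k * (mean_k - mean)^2 / sum_k n_k * var_k.
    This is an assumed formulation, not necessarily the one used in the paper.
    """
    overall_mean = sum(values) / len(values)
    between = 0.0  # between-class scatter
    within = 0.0   # within-class scatter
    for cls in set(labels):
        group = [v for v, y in zip(values, labels) if y == cls]
        mean_c = sum(group) / len(group)
        var_c = sum((v - mean_c) ** 2 for v in group) / len(group)
        between += len(group) * (mean_c - overall_mean) ** 2
        within += len(group) * var_c
    return between / within if within > 0 else float("inf")


def combine_scores(text_score, engagement_score, alpha=0.6, beta=0.4):
    """Hypothetical linear combination of a textual and a non-textual score.

    Assumes both scores lie in [0, 1] and that alpha + beta == 1; which
    weight belongs to which feature group is an assumption as well.
    """
    if abs(alpha + beta - 1.0) > 1e-9:
        raise ValueError("alpha and beta are assumed to sum to 1")
    return alpha * text_score + beta * engagement_score


if __name__ == "__main__":
    # Toy data: retweet counts for two negative (0) and two positive (1) tweets.
    print(fisher_score([1, 2, 3, 4], [0, 0, 1, 1]))
    # Toy scores: a Naive Bayes posterior of 0.9 and an engagement score of 0.5.
    print(combine_scores(0.9, 0.5))
```

In this sketch a higher Fisher score means the like or retweet feature separates the sentiment classes more cleanly; the default weights simply reuse the reported values of α and β.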

Keywords


Sentiment Analysis; Twitter; Naïve Bayes; Retweet-Like

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.

ISSN: 2180-1843

eISSN: 2289-8131