Emotion Detection in Twitter Social Media Using Long Short-Term Memory (LSTM) and Fast Text
Abstract
Emotion detection is important in fields such as education, business, and employee recruitment. This study detects emotions from Twitter text, because social media users tend to express emotions through text posts and Twitter is one of the social media platforms with the highest user growth rate in Indonesia. The study uses the Long Short-Term Memory (LSTM) method because it performed better than other methods in previous studies. It also uses fastText word embeddings as an attempt to improve on Word2Vec and GloVe, which cannot handle out-of-vocabulary (OOV) words. The best accuracy obtained with each word embedding was 73.15% for Word2Vec, 60.10% for GloVe, and 73.15% for fastText. The best accuracy was therefore shared by Word2Vec and fastText. Although fastText has the advantage of handling OOV words, in this study it did not improve on the accuracy of Word2Vec. The accuracy achieved is not yet very good, which is attributed to the data used. To obtain better results, future work is expected to apply other deep learning methods, such as CNN and BiLSTM, and to use more data.
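The OOV advantage mentioned above comes from building a word vector out of character n-grams rather than from a single whole-word lookup. The following is a minimal pure-Python sketch of that idea, not the authors' pipeline and not the fastText library itself. The class name `SubwordEmbedding`, the table size, the vector dimension, and the randomly initialized (untrained) vectors are all assumptions made for illustration. Real fastText additionally learns the n-gram vectors during training and adds a whole-word vector for in-vocabulary words.

```python
import random


def char_ngrams(word, n_min=3, n_max=6):
    """Return the character n-grams of a word wrapped in '<' and '>' markers."""
    wrapped = f"<{word}>"
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    return grams


def fnv1a_32(text):
    """32-bit FNV-1a hash of the UTF-8 bytes of text."""
    h = 2166136261
    for byte in text.encode("utf-8"):
        h ^= byte
        h = (h * 16777619) & 0xFFFFFFFF
    return h


class SubwordEmbedding:
    """Hypothetical toy embedding: one random vector per hash bucket."""

    def __init__(self, dim=8, buckets=1000, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        self.buckets = buckets
        # Assumption: random values stand in for trained n-gram vectors.
        self.table = [
            [rng.uniform(-1.0, 1.0) for _ in range(dim)]
            for _ in range(buckets)
        ]

    def vector(self, word):
        """Average the bucket vectors of all n-grams of the word."""
        grams = char_ngrams(word)
        if not grams:
            return [0.0] * self.dim
        total = [0.0] * self.dim
        for gram in grams:
            row = self.table[fnv1a_32(gram) % self.buckets]
            for k in range(self.dim):
                total[k] += row[k]
        return [x / len(grams) for x in total]


# With n fixed at 3, "where" decomposes into five trigrams.
print(char_ngrams("where", 3, 3))  # ['<wh', 'whe', 'her', 'ere', 're>']

# A word never seen before still receives a vector from its n-grams.
emb = SubwordEmbedding()
print(len(emb.vector("senaaaang")))  # 8
```

Because every string decomposes into n-grams that hash into a fixed table, a misspelled or elongated word in a tweet still maps to a vector, whereas a plain Word2Vec or GloVe lookup table has no entry for it.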
Copyright (c) 2021 M Alfa Riza, Novrido Charibaldi
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with International Journal of Artificial Intelligence & Robotics (IJAIR) agree to the following terms:
- Authors retain copyright and grant the journal right of first publication, with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike License (CC BY-SA 4.0) that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.