Sentimental analysis of COVID-19 twitter data using deep learning and machine learning models

Simran Darad; Sridhar Krishnan

doi:10.17163/ings.n29.2023.10


Published: Jan 24, 2023

Updated: 2023-01-24

Versions:

2023-01-24 (3)

2023-01-24 (2)

2023-01-01 (1)

DOI: https://doi.org/10.17163/ings.n29.2023.10

Keywords:

COVID-19, coronavirus, Twitter, tweets, sentiment analysis, tweepy, text classification

Simran Darad

https://orcid.org/0000-0003-4629-3980

Sridhar Krishnan

https://orcid.org/0000-0002-4659-564X

Abstract

The novel coronavirus disease (COVID-19) is an ongoing pandemic that has drawn wide global attention. However, fake news spreading on social media sites such as Twitter is creating unnecessary anxiety and panic about the disease. In this paper, we applied machine learning (ML) techniques to predict the sentiment of people using Twitter during the COVID-19 peak in April 2021. The data consist of tweets collected between 16 April 2021 and 26 April 2021. The tweet text was labelled as positive, negative, or neutral by models trained on an already labelled dataset of coronavirus tweets. Sentiment analysis was conducted with a deep learning model, Bidirectional Encoder Representations from Transformers (BERT), and with several ML models for text analysis, and their performance was then compared. The ML models used were Naïve Bayes, Logistic Regression, Random Forest, Support Vector Machines, Stochastic Gradient Descent, and Extreme Gradient Boosting. Accuracy was calculated separately for each sentiment. The classification accuracies of the ML models were 66.4%, 77.7%, 74.5%, 74.7%, 78.6%, and 75.5%, respectively, and the BERT model reached 84.2%. Each sentiment-classified model has an accuracy around or above 75%, which is a considerable value for text mining algorithms. We infer that most people tweeting are taking positive and neutral approaches.
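As a minimal sketch of the evaluation step, the Python snippet below computes an overall accuracy and a separate accuracy for each sentiment. It assumes that "accuracy for every sentiment" means the fraction of tweets with a given true label that were classified correctly; the abstract does not state the exact definition. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter

# The three sentiment classes named in the abstract.
LABELS = ("positive", "negative", "neutral")


def overall_accuracy(y_true, y_pred):
    """Fraction of all tweets whose predicted label matches the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_accuracy(y_true, y_pred, labels=LABELS):
    """Per-sentiment accuracy (assumed definition: correct / total per true label).

    Returns None for a label that never occurs in y_true.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("inputs must be of equal length")
    totals = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {
        lab: (correct[lab] / totals[lab] if totals[lab] else None)
        for lab in labels
    }


# Hypothetical toy data, for illustration only (not results from the paper).
y_true = ["positive", "positive", "negative", "neutral", "neutral", "negative"]
y_pred = ["positive", "neutral", "negative", "neutral", "positive", "negative"]

scores = per_class_accuracy(y_true, y_pred)
total = overall_accuracy(y_true, y_pred)
```

With these toy labels, the positive and neutral classes each have one of two tweets classified correctly, and both negative tweets are classified correctly.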

Issue

No. 29 (2023): January-June

Section

IIoT and artificial intelligence

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

The Universidad Politécnica Salesiana of Ecuador preserves the copyrights of the published works and will favor the reuse of the works. The works are published in the electronic edition of the journal under a Creative Commons Attribution/Noncommercial-No Derivative Works 4.0 Ecuador license: they can be copied, used, disseminated, transmitted and publicly displayed.

The undersigned author partially transfers the copyrights of this work to the Universidad Politécnica Salesiana of Ecuador for printed editions.

It is also stated that they have respected the ethical principles of research and are free from any conflict of interest. The author(s) certify that this work has not been published, nor is it under consideration for publication in any other journal or editorial work.

The author(s) are responsible for the content and have contributed to the conception, design, and completion of the work and to the analysis and interpretation of data. They have also participated in writing and revising the text and in approving the version that is finally submitted as an attachment.

