Sentimental analysis of COVID-19 twitter data using deep learning and machine learning models

Abstract

The novel coronavirus disease (COVID-19) is an ongoing pandemic that has drawn wide global attention. However, fake news spreading on social media sites such as Twitter is creating unnecessary anxiety and panic about the disease. In this paper, we apply machine learning (ML) techniques to predict the sentiment of people using Twitter during the COVID-19 peak in April 2021. The data consist of tweets collected between 16 April 2021 and 26 April 2021. The tweet text was labelled as positive, negative, or neutral by models trained on an already labelled dataset of coronavirus tweets. Sentiment analysis was performed with a deep learning model, Bidirectional Encoder Representations from Transformers (BERT), and with several ML models, and their performance was then compared. The ML models used were Naïve Bayes, Logistic Regression, Random Forest, Support Vector Machines, Stochastic Gradient Descent, and Extreme Gradient Boosting. Accuracy was calculated separately for each sentiment. The classification accuracies of the ML models were 66.4%, 77.7%, 74.5%, 74.7%, 78.6%, and 75.5%, respectively, and the BERT model achieved 84.2%. Each sentiment-classified model has an accuracy around or above 75%, which is a quite significant value for text mining algorithms. We infer that most people tweeting take a positive or neutral stance.
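The abstract reports both an overall accuracy per model and an accuracy calculated separately for each sentiment. The sketch below illustrates one way such figures could be computed from reference and predicted labels. It is not the authors' code: the function names and the toy labels are hypothetical, and it assumes that per-sentiment accuracy means the fraction of tweets with a given reference label that the model predicts correctly, which the abstract does not specify.

```python
from collections import Counter

# The three sentiment classes named in the abstract.
LABELS = ("positive", "negative", "neutral")


def overall_accuracy(y_true, y_pred):
    """Fraction of tweets whose predicted label equals the reference label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_sentiment_accuracy(y_true, y_pred, labels=LABELS):
    """For each label, the fraction of tweets with that reference label
    that were predicted correctly (an assumed definition).

    Labels that never occur in y_true map to None.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {
        label: (hits[label] / totals[label] if totals[label] else None)
        for label in labels
    }


# Toy labels for illustration only; these are not data from the study.
y_true = ["positive", "positive", "neutral", "negative", "neutral", "negative"]
y_pred = ["positive", "neutral", "neutral", "negative", "neutral", "positive"]

print(overall_accuracy(y_true, y_pred))
print(per_sentiment_accuracy(y_true, y_pred))
```

Under this definition a model can have a high overall accuracy while one sentiment class is predicted poorly, which is one reason to report the two figures separately.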

Article Details

Section
IIoT and artificial intelligence
