Deciphering COVID-19 Narratives: A Comparative Study of ML Models (RF, MNB, GB, LR, SVM) and DL Models (CNN, Bi-LSTM) for News Article Classification

Kana Das; Md. Asadullah; Md. Murad Hossain; Annita Siddeka Tanni; Shahidul Islam; Masudul Islam; Mst Sharmin Akter Sumy

doi:10.31449/inf.v49i14.6494

Abstract

The COVID-19 pandemic has generated an unprecedented volume of news coverage, spanning scientific, health-related, political, economic, and social narratives. This study compares the effectiveness of machine learning (ML) and deep learning (DL) algorithms for classifying text data, with particular emphasis on how well the former handle COVID-19 news narratives. The dataset consists of news articles about COVID-19; to serve the primary purpose of this research, classifying COVID-19-related news, we integrate multiple datasets. The analysis reveals that machine learning models exhibit superior performance in text data classification. In particular, the Random Forest model reaches a 98% accuracy rate. Among the deep learning models, the Bidirectional Long Short-Term Memory (Bi-LSTM) model with FastText integration turns out to be the best option. Exploratory data techniques such as topic modeling and word clouds are incorporated to uncover hidden patterns in the data. Pre-trained (e.g., deep learning) and non-pre-trained ML models are implemented, highlighting the versatility of ML in text classification tasks. The specific purpose is to compare deep learning and machine learning algorithms for classifying news articles. Notably, a predictive model employing Bi-LSTM with the FastText pre-trained model achieved 94% accuracy in classifying COVID-19 news reports.
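The abstract does not give implementation details. As a rough illustration of one of the compared classifiers, Multinomial Naive Bayes (MNB), the following is a minimal pure-Python sketch. The toy headlines, the three category labels (`health`, `economy`, `politics`), the whitespace tokenizer, and the smoothing value `alpha=1.0` are all assumptions made for illustration; they are not the study's data, label set, preprocessing, or settings.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


class ToyMultinomialNB:
    """Multinomial Naive Bayes over bag-of-words counts with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing strength (assumed value)

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)            # documents per class
        self.word_counts = {c: Counter() for c in self.classes}
        self.vocab = set()
        for doc, label in zip(docs, labels):
            tokens = tokenize(doc)
            self.word_counts[label].update(tokens)   # word counts per class
            self.vocab.update(tokens)
        self.total_docs = len(docs)
        return self

    def predict(self, doc):
        # Out-of-vocabulary tokens are ignored at prediction time.
        tokens = [t for t in tokenize(doc) if t in self.vocab]
        best_class, best_score = None, -math.inf
        for c in self.classes:
            # log P(c) + sum over tokens of log P(token | c)
            score = math.log(self.doc_counts[c] / self.total_docs)
            denom = sum(self.word_counts[c].values()) + self.alpha * len(self.vocab)
            for t in tokens:
                score += math.log((self.word_counts[c][t] + self.alpha) / denom)
            if score > best_score:
                best_class, best_score = c, score
        return best_class


# Hypothetical toy corpus; not taken from the study's datasets.
train_docs = [
    "vaccine trial shows strong immune response",
    "hospital reports rise in covid patients",
    "stock markets fall amid lockdown fears",
    "unemployment rises as businesses close during lockdown",
    "parliament debates emergency lockdown law",
    "president announces election delay",
]
train_labels = ["health", "health", "economy", "economy", "politics", "politics"]

test_docs = [
    "new vaccine approved for hospital patients",
    "markets and businesses recover",
    "parliament votes on election law",
]
test_labels = ["health", "economy", "politics"]

model = ToyMultinomialNB().fit(train_docs, train_labels)
predictions = [model.predict(d) for d in test_docs]
accuracy = sum(p == y for p, y in zip(predictions, test_labels)) / len(test_labels)
print(predictions, accuracy)
```

Accuracy on a held-out set, computed as in the last lines, is the kind of metric the abstract reports for each model. A real comparison would need a far larger corpus and proper feature extraction, and the figure obtained on this toy corpus says nothing about the study's results.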

This work is licensed under a Creative Commons Attribution 3.0 License.