• Name: Analysis of Covid-19 Sentiments on Social Media
  • Role: Data visualisation, Data processing, Model creation
  • Year: Sep 2020
  • More Details: Full Report

Exploring the correlation between sentiments about Covid-19 and the infection rate.

Our hypothesis is that public sentiment toward Covid-19 is negatively correlated with the infection rate: as infections rise, sentiment falls. I created a few data visualisations (left) using Flourish to present our results and test this hypothesis.
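As a rough illustration of how such a hypothesis could be checked numerically, the sketch below computes a Pearson correlation coefficient between a daily sentiment score and a daily case count. The function name `pearson_r`, the variable names, and all the numbers are made up for this example; they are not data or code from the project, and the report may have used a different measure.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for illustration only (not project data):
# mean sentiment score per day and new infections per day.
daily_sentiment = [0.40, 0.25, 0.10, -0.05, -0.20]
daily_new_cases = [100, 180, 260, 390, 520]

r = pearson_r(daily_sentiment, daily_new_cases)
print(f"Pearson r = {r:.3f}")  # a negative value is consistent with the hypothesis
```

A coefficient near -1 would indicate a strong negative linear relationship. A value near 0 would not support the hypothesis.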

We scraped data from Twitter and Reddit, pre-processed it, and compared several models for multi-class classification: Logistic Regression, Naive Bayes, Support Vector Machine, and fine-tuned BERT. Our model reached 82% accuracy on evaluation, and we visualised and summarised the results using Flourish.
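To give a feel for what one of the simpler baselines involves, here is a minimal sketch of a multinomial Naive Bayes text classifier with Laplace smoothing, plus an accuracy check. It is written from scratch for illustration and is an assumption about the general approach, not the project's code. The helper names (`train_naive_bayes`, `predict`, `accuracy`), the three class labels, and the toy posts are all hypothetical, and the project most likely relied on library implementations.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(texts, labels):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(texts, labels):
        tokens = text.lower().split()  # naive whitespace tokenisation
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log-posterior for the text."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocab:
                continue  # skip words never seen in training
            # Laplace (add-one) smoothed log likelihood.
            count = word_counts[label][token]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, texts, labels):
    """Fraction of texts whose predicted label matches the true label."""
    correct = sum(predict(model, t) == y for t, y in zip(texts, labels))
    return correct / len(labels)


# Hypothetical toy posts for illustration only (not project data).
train_texts = [
    "vaccine news is great hope",
    "so scared cases rising again",
    "lockdown rules updated today",
    "feeling hopeful and great",
    "scared and worried about cases",
    "new rules announced today",
]
train_labels = ["positive", "negative", "neutral",
                "positive", "negative", "neutral"]

model = train_naive_bayes(train_texts, train_labels)
print(predict(model, "great hope for the vaccine"))
print(accuracy(model, train_texts, train_labels))
```

In practice, accuracy should be measured on held-out posts rather than on the training set, so that models are compared on data they have not seen.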