Public Perception of the COVID-19 Pandemic on Twitter: Sentiment Analysis and Topic Modeling Study

JMIR Public Health Surveill. 2020 Nov 11;6(4):e21978. doi: 10.2196/21978.

Abstract

Background: COVID-19 is a scientifically and medically novel disease that is not yet fully understood because it has not been studied consistently or in depth. Among the gaps in research on the COVID-19 outbreak is a lack of sufficient infoveillance data.

Objective: The aim of this study was to increase understanding of public awareness of COVID-19 pandemic trends and uncover meaningful themes of concern posted by Twitter users in the English language during the pandemic.

Methods: Data mining was conducted on Twitter to collect a total of 107,990 tweets related to COVID-19 posted between December 13, 2019, and March 9, 2020. The analyses included frequency of keywords, sentiment analysis, and topic modeling to identify and explore discussion topics over time. A natural language processing approach and the latent Dirichlet allocation algorithm were used to identify the most common tweet topics as well as to categorize clusters and identify themes based on the keyword analysis.
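
The three analyses named in the Methods (keyword frequency, sentiment analysis, and latent Dirichlet allocation topic modeling) can be illustrated with a minimal sketch. The abstract does not state which tools or settings were used, so everything here is an assumption for illustration only: the four sample tweets, the stopword list, and the tiny polarity lexicon are made up; the lexicon-based scoring is a stand-in for whatever sentiment method the study applied; and collapsed Gibbs sampling is one standard way to fit latent Dirichlet allocation, not necessarily the inference procedure the authors used.

```python
import random
import re
from collections import Counter

# All data below is made up for illustration; none of it comes from the study.
TWEETS = [
    "Worried about the coronavirus outbreak and rising deaths",
    "Wash your hands and wear a mask to stay safe",
    "New coronavirus cases reported in the outbreak today",
    "Stay home, wash hands, stay safe",
]
STOPWORDS = {"the", "and", "a", "to", "in", "your", "about"}
# Hypothetical toy lexicon mapping a word to a polarity value.
LEXICON = {"worried": -1, "deaths": -1, "outbreak": -1, "safe": 1}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def keyword_frequency(docs):
    """Count how often each token appears across all documents."""
    return Counter(w for doc in docs for w in doc)


def sentiment_score(doc):
    """Sum lexicon polarities: below 0 is negative, above 0 is positive."""
    return sum(LEXICON.get(w, 0) for w in doc)


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return per-topic word counts."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    topic_word = [Counter() for _ in range(n_topics)]
    topic_total = [0] * n_topics
    doc_topic = [[0] * n_topics for _ in docs]
    assign = []
    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            topic_word[z][w] += 1
            topic_total[z] += 1
            doc_topic[d][z] += 1
        assign.append(zs)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                doc_topic[d][z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + beta * vocab_size)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                topic_word[z][w] += 1
                topic_total[z] += 1
                doc_topic[d][z] += 1
    return topic_word


docs = [tokenize(t) for t in TWEETS]
freq = keyword_frequency(docs)
scores = [sentiment_score(d) for d in docs]
topics = lda_gibbs(docs, n_topics=2)

print(freq.most_common(3))
print(scores)
for k, counts in enumerate(topics):
    print(k, [w for w, _ in counts.most_common(3)])
```

On a corpus this small the topics are not meaningful; the sketch only shows how the three steps fit together. A real analysis at the scale described would more plausibly rely on an established library implementation and a validated sentiment resource.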

Results: The results indicate three main aspects of public awareness and concern regarding the COVID-19 pandemic. First, the trend of the spread and symptoms of COVID-19 can be divided into three stages. Second, the results of the sentiment analysis showed that people have a negative outlook toward COVID-19. Third, based on topic modeling, the themes relating to COVID-19 and the outbreak were divided into three categories: the COVID-19 pandemic emergency, how to control COVID-19, and reports on COVID-19.

Conclusions: Sentiment analysis and topic modeling can produce useful information about the trends in the discussion of the COVID-19 pandemic on social media as well as alternative perspectives to investigate the COVID-19 crisis, which has created considerable public awareness. This study shows that Twitter is a good communication channel for understanding both public concern and public awareness about COVID-19. These findings can help health departments communicate information to alleviate specific public concerns about the disease.

Keywords: COVID-19; Twitter; data; health informatics; infodemic; infodemiology; infoveillance; mining; perception; social media; topic modeling.

MeSH terms

  • COVID-19 / psychology*
  • Data Mining / methods*
  • Health Knowledge, Attitudes, Practice*
  • Humans
  • Models, Psychological
  • Natural Language Processing*
  • Pandemics
  • Social Media / statistics & numerical data*