Mining long-COVID symptoms from Reddit: characterizing post-COVID syndrome from patient reports

JAMIA Open. 2021 Sep 2;4(3):ooab075. doi: 10.1093/jamiaopen/ooab075. eCollection 2021 Jul.

Abstract

Our objective was to mine Reddit to discover long-COVID symptoms self-reported by users, compare symptom distributions across studies, and create a symptom lexicon. We retrieved posts from the /r/covidlonghaulers subreddit and extracted symptoms via approximate matching using an expanded meta-lexicon. We mapped the extracted symptoms to standard concept IDs, compared their distributions with those reported in recent literature and analyzed their distributions over time. From 42 995 posts by 4249 users, we identified 1744 users who expressed at least 1 symptom. The most frequently reported long-COVID symptoms were mental health-related symptoms (55.2%), fatigue (51.2%), general ache/pain (48.4%), brain fog/confusion (32.8%), and dyspnea (28.9%) among users reporting at least 1 symptom. Comparison with recent literature revealed a large variance in reported symptoms across studies. Temporal analysis showed several persistent symptoms up to 15 months after infection. The spectrum of symptoms identified from Reddit may provide early insights about long-COVID.

Keywords: COVID-19; natural language processing; post-acute COVID-19 syndrome; social media; virus diseases.