Long COVID data in SQL

Data analytics projects
SQL
Long COVID
Author

Jennifer HY Lin

Published

June 5, 2022

Introduction

For this SQL project1, I’ve used the same set of data as the Tableau project in order to see if there will be any other new insights when using SQL for data analysis. Dataset is from this paper – Michelen M, Manoharan L, Elkheir N, et al. Characterising long COVID: a living systematic review. BMJ Global Health 2021;6:e005427, which was discovered through PubMed. One other thing of note was that the paper only collected long COVID-related data up until 17th March 2021. All other more recent developments of long COVID will likely require more time before further data are more readily available, for example, the long COVID impact from Omicron variants.

Image: Rawpixel.com

The process

MySQL server was installed with DBeaver used as the GUI. Four tables (Continents, Countries, Risk factors and Hospitalisation) in .csv file formats were imported into the newly created database named LongCovid. A series of SQL queries were written and performed. Two views were created so that selected data were stored for future use, such as for data visualisations in Tableau.

Footnotes

  1. The published date reflected the most recent date I worked on associated file with the project, prior to the blog move. This work is under CC BY-SA 4.0 International License for anyone interested in exploring the area further.↩︎