Government of Canada Data Conference 2023: Identifying Pandemic Hubs (Health Regions) (DDN3-V10)


A project to help identify and predict health regions at risk for higher rates of infection during the COVID-19 pandemic.

Duration: 00:04:15

Published: February 15, 2023

Event: GC Data Conference 2023: About the conference




Transcript: Government of Canada Data Conference 2023: Identifying Pandemic Hubs (Health Regions)

I have the pleasure of introducing the Identifying Pandemic Hubs presentation. This work curated valuable data and frameworks to help identify and predict health regions that are at risk for higher rates of infection.

Before we start with the presentation, I would like to acknowledge that this work was a collaboration with other data scientists from the Data Science Division; in particular, thanks to Denise Chen, Bilan Gill and Zachary Zanussi.

The main goal of our project was to identify and predict the risk of becoming overwhelmed with COVID-19 infections at the health region level. Health regions, also called health authorities, are a governance model used by Canada's provincial governments to administer and deliver public health care to all Canadian residents. We accomplished this by developing a pseudo spatio-temporal forecast model to predict the health-region-level infection risk. The data available to us for this project were health indicators, such as the proportion of Canadians with chronic diseases, the proportion of Canadians with access to a regular health care provider, and the types of health care facilities. We also had socio-economic indicators such as GDP and population density. The COVID-19 data we had were cumulative cases, new cases and death counts at the health region level, and also Google and Apple mobility data. These were changes from the mobility baseline for different categories such as walking, public transportation and driving. These data were provided at the provincial, territorial and, in some cases, city levels. We gathered all this information in a historical and dynamic relational database that was available to the whole team.

Let's talk a little bit about our architecture. Let's first remark that we developed this project during the first COVID wave, so there was not much readily available data at the health region level. This was a big challenge, and we needed to get as much data as possible so we could model it. We developed a cloud architecture with a dynamic workflow that extracted, transformed and loaded all data sources live while making them available for analysis in a shared environment. This was a collaborative project that made the latest public data available to our team in real time.
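The extract, transform and load workflow described above can be illustrated with a minimal sketch. It is not the project's actual pipeline: the sample records, the `covid_daily` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for the team's cloud-hosted relational database.

```python
import sqlite3

# Hypothetical raw records standing in for one public data source.
RAW_ROWS = [
    {"region": " Toronto ", "date": "2020-04-01", "cumulative_cases": "100"},
    {"region": "Toronto", "date": "2020-04-02", "cumulative_cases": "130"},
    {"region": "Ottawa", "date": "2020-04-01", "cumulative_cases": "40"},
    {"region": "Ottawa", "date": "2020-04-02", "cumulative_cases": "45"},
]


def extract():
    """Return raw records; a real pipeline would fetch them from a live source."""
    return list(RAW_ROWS)


def transform(rows):
    """Clean region names, cast counts to int, and derive new daily cases."""
    cleaned = sorted(
        (
            {
                "region": r["region"].strip(),
                "date": r["date"],
                "cumulative_cases": int(r["cumulative_cases"]),
            }
            for r in rows
        ),
        key=lambda r: (r["region"], r["date"]),
    )
    previous = {}  # last cumulative count seen per region
    for row in cleaned:
        last = previous.get(row["region"])
        # The first day of a region has no earlier value to difference against.
        row["new_cases"] = None if last is None else row["cumulative_cases"] - last
        previous[row["region"]] = row["cumulative_cases"]
    return cleaned


def load(conn, rows):
    """Upsert rows so that re-running the workflow refreshes the table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS covid_daily ("
        "region TEXT, date TEXT, cumulative_cases INTEGER, new_cases INTEGER, "
        "PRIMARY KEY (region, date))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO covid_daily "
        "VALUES (:region, :date, :cumulative_cases, :new_cases)",
        rows,
    )
    conn.commit()


def run_pipeline(conn):
    """Run one extract, transform, load pass."""
    load(conn, transform(extract()))
```

Keying the table on region and date means a scheduled re-run replaces existing rows instead of duplicating them, which is one simple way to keep a shared table current.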

For Ontario you will see the actual values, the orange curves, versus the model's predictions, the blue curves. These are the death counts on the test set for the top five health regions in Ontario. And again, as you can see, the model was able to capture the trends. You will see the same pathological issues in terms of performance degradation as time goes by, and some over- or underestimation in the model's predictions.
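One simple way to quantify the over- or underestimation seen in such curves is to compare actual and predicted counts directly. The sketch below is an illustration only; the function name `forecast_errors` and the choice of metrics are assumptions, not the evaluation used in the project.

```python
def forecast_errors(actual, predicted):
    """Return (mean absolute error, mean signed error) for two equal-length series.

    A positive signed error means the model overestimates on average;
    a negative one means it underestimates.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    diffs = [p - a for a, p in zip(actual, predicted)]
    mae = sum(abs(d) for d in diffs) / len(diffs)
    bias = sum(diffs) / len(diffs)
    return mae, bias
```

Computing these two numbers separately for early and late parts of the test period would show whether performance degrades as time goes by.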

So, in conclusion, we developed a general framework to identify and predict vulnerable, high-risk health regions during the spread and possible new waves of COVID-19. These time series deep learning regression and supervised classification models allow us to do short-term local risk prediction, so a one-day-only look-ahead, based on the cumulative cases and deaths at the health region level.

The forecasting range and the sliding window can be adjusted based on input from experts and epidemiologists. I have to mention that these are not the only knobs that we can turn. These approaches are highly customizable and can be adjusted based on input from experts. For instance, there is the geographical level: we can zoom out, or go to a higher or lower geographical level, based on feedback. Or there is the labeling system for the classification model, for which we came up with the risk level thresholds.
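The adjustable knobs mentioned here (sliding window, forecasting range, risk level thresholds) can be made concrete with a small sketch. Everything in it is an assumption for illustration: the window length, the threshold values and the function names are hypothetical, and the actual models were deep learning regressors and classifiers, not shown here.

```python
def make_windows(series, window=7, horizon=1):
    """Turn a time series into (features, target) pairs.

    Each sample uses `window` consecutive values as features and the value
    `horizon` steps after the window as the target; horizon=1 is a one-day
    look-ahead.
    """
    samples = []
    for start in range(len(series) - window - horizon + 1):
        features = series[start:start + window]
        target = series[start + window + horizon - 1]
        samples.append((features, target))
    return samples


# Hypothetical thresholds: (exclusive upper bound, label), in ascending order.
RISK_THRESHOLDS = [(50, "low"), (200, "medium")]


def risk_label(value, thresholds=RISK_THRESHOLDS, top_label="high"):
    """Map a count to a risk label using ascending upper bounds."""
    for upper, label in thresholds:
        if value < upper:
            return label
    return top_label
```

For example, `make_windows(cases, window=3, horizon=1)` pairs every three consecutive days with the following day, and applying `risk_label` to each target yields class labels for a classifier. Changing `window`, `horizon` or the threshold list corresponds to turning the knobs described above.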
