Cause and effect of total cases

Hello everyone!

I have a question that just came to my mind as I was looking at the data for the first time.

When you look at the total cases reported per week in the training files, can we say that those cases are the result of the meteorological conditions recorded in previous weeks? What would be the relationship between the reported cases for a given week and the data provided for the same week (if any)?

As far as I can tell, the infections reported in a given week are not directly caused by the meteorological conditions recorded in that same week. The disease has an incubation period of 4-7 days, so the cases reported in a given week mostly reflect the conditions in previous weeks.

When analyzing the data, would it make sense to shift the data back one week? Or should we assume that the cases reported within the same week as the infection (if any) will eventually cancel out the error produced by the cases reported a week after the infection occurred?



Definitely makes sense to shift the climate variables! I’d look at the timing of both the mosquito lifecycle and dengue incubation. It would be really cool to see a chart with training accuracies for different shifts (i.e., x axis is -1 week, -2 weeks, -3 weeks, and so on).
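
As a rough starting point for that kind of comparison, here is a minimal sketch that scores each shift with a plain Pearson correlation instead of a full model's training accuracy (that swap is my simplification). The column names `precip_mm` and `total_cases`, the helper `lag_correlations`, and the toy numbers are all made up for illustration; substitute the real columns from the training files.

```python
import pandas as pd

def lag_correlations(df, feature, target="total_cases", max_lag=4):
    """Correlate the target with the feature taken 0..max_lag weeks earlier.

    Assumes df has one row per week, sorted in time order.
    """
    scores = {}
    for lag in range(max_lag + 1):
        shifted = df[feature].shift(lag)        # value from `lag` weeks before
        scores[lag] = shifted.corr(df[target])  # NaN rows are ignored pairwise
    return pd.Series(scores, name=f"corr({feature}, {target})")

# Toy data (hypothetical): cases track rainfall from two weeks earlier.
rain = [0, 5, 1, 8, 2, 9, 3, 7, 1, 6, 0, 4]
toy = pd.DataFrame({"precip_mm": rain})
toy["total_cases"] = toy["precip_mm"].shift(2).fillna(0) * 2

result = lag_correlations(toy, "precip_mm")
print(result)
```

Calling `result.plot(kind="bar")` would give the chart with the shift on the x axis. If the file covers more than one location, the shift should probably be done within each location (for example with `groupby(...)[feature].shift(lag)`) so values do not leak across series.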

Maybe try to use all the shifted variables as new ones in your model as well :slight_smile:
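
Adding the shifted variables as extra columns could look something like the sketch below. It assumes weekly rows sorted in time order; the helper `add_lagged_features`, the `_lagN` naming, and the `precip_mm` column are hypothetical, not something defined by the dataset.

```python
import pandas as pd

def add_lagged_features(df, columns, lags=(1, 2, 3)):
    """Return a copy of df with a `<col>_lag<k>` column for each column and lag."""
    out = df.copy()
    for col in columns:
        for lag in lags:
            # Row t receives the value recorded at week t - lag.
            out[f"{col}_lag{lag}"] = out[col].shift(lag)
    return out

# Toy weekly data with made-up values.
weekly = pd.DataFrame({
    "precip_mm": [10.0, 0.0, 25.0, 5.0],
    "total_cases": [3, 4, 2, 6],
})
features = add_lagged_features(weekly, ["precip_mm"], lags=(1, 2))
print(features)
```

The first rows of each lagged column are NaN because there is no earlier week to pull from, so they need to be dropped or filled before fitting a model.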