You can find these baseline models at the GitHub link above. The solutions are in IPython notebooks, most of which are well commented. Comment below if you need any help understanding the code, or connect with me on any of my social media channels (links are on my DrivenData profile) with any feedback!
Thank you so much for sharing these baseline models. They have helped me tremendously, and I have learned a lot in the process as well.
Two questions:
Something is wrong with the XGBoost model in your GitHub repo, and it will not open.
Why do you say that the Negative Binomial Regression model "is by far the best model we have come across" when its mean absolute error on the training dataset (for San Juan) is around 17, compared to only 12 for a simple linear regression?
The problem with the XGBoost model is easy to solve. Just open the file in any editor that flags errors automatically, and you will see that a left curly brace is missing.
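In case you don't have such an editor handy, here is a minimal sketch of the same check in Python. It relies on the fact that a notebook file is JSON, so the standard `json` module can report where parsing fails. The helper name `find_json_error` and the tiny notebook fragment are made up for illustration; they are not taken from the repo.

```python
import json


def find_json_error(text):
    """Return None if text is valid JSON, else (line, column, message)."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # lineno and colno point at the place where the parser gave up,
        # which is usually at or just after the real mistake.
        return err.lineno, err.colno, err.msg
    return None


# Hypothetical fragment with the opening "{" of a cell missing on line 2.
broken = '{"cells": [\n "cell_type": "code"}\n]}'
# The same fragment with the brace restored.
fixed = '{"cells": [\n {"cell_type": "code"}\n]}'

print(find_json_error(broken))
print(find_json_error(fixed))
```

For a real notebook, you would read the file contents into a string first (for example with `open(path, encoding="utf-8").read()`, where `path` is wherever you saved the notebook) and pass that to the helper. The reported line is only a starting point, so look a few characters before it for the missing brace.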