An interesting question is whether weather indicators measured in areas outside New Haven can be used to predict the weather type in New Haven. Weather types include rain, fog, thunderstorm, snow, or any combination of them. Weather indicators include temperature, dew point, humidity, sea level pressure, etc. I used three cities in PA as predictors and fit a logistic regression to make the predictions. Some key points:
(1) Used data visualization and correlation analysis to determine that the time lag between predictors and response is one day.
(2) Used undersampling, oversampling, boosting, and adaptive bagging to address the class imbalance problem.
(3) Predicted the different data types separately, then combined the results to assess overall performance.
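As a rough illustration of points (1) and (2), here is a minimal sketch in plain Python. It is not the project's actual code: the function names (pearson, best_lag, oversample), the max_lag parameter, and the synthetic temperature series are all assumptions made for the example. It picks the lag at which a predictor series correlates most strongly with the response, and balances classes by random oversampling, one simple variant of the resampling techniques listed above.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def best_lag(predictor, response, max_lag=3):
    """Return (lag, correlation) with the largest absolute correlation.

    A lag of k pairs predictor[t] with response[t + k], i.e. the
    predictor leads the response by k days.
    """
    scores = {}
    for k in range(max_lag + 1):
        xs = predictor[: len(predictor) - k]
        ys = response[k:]
        scores[k] = pearson(xs, ys)
    lag = max(scores, key=lambda k: abs(scores[k]))
    return lag, scores[lag]


def oversample(rows, labels, seed=0):
    """Randomly duplicate minority-class rows until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Synthetic demo (made-up numbers): the "New Haven" series is the
# "PA" series delayed by one day, so the best lag should be 1.
pa_temp = [10, 12, 9, 15, 14, 8, 11, 16, 13, 7, 12, 18]
new_haven_temp = [11] + pa_temp[:-1]
lag, corr = best_lag(pa_temp, new_haven_temp)

# Synthetic imbalanced labels: 2 "rain" days versus 8 "no rain" days.
rows = list(range(10))
labels = ["rain"] * 2 + ["no rain"] * 8
balanced_rows, balanced_labels = oversample(rows, labels)
```

In a real pipeline the lagged predictors and the resampled training set would then be passed to the logistic regression; resampling should be applied to the training split only, so that the evaluation data keeps its natural class proportions.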
Data Source: WeatherUnderground.
Code: click here.
Report: download here.