8th place solution - CNNs+LightGBM

Hi there! Huge congrats to all the winners, and to everyone who participated!
Here's an overview of our solution. We guess we started on LSTMs and similar approaches a bit late in the game and struggled to get good results from them.

Some side notes:
- Could the models actually be giving better answers than the provided ground truth for some spots? The curated ground truth seems to be affected by interpolation and smoothing techniques.
- We wonder whether follow-up work could explore how good these models are at forecasting, e.g. rapid intensification. That would probably require a different competition setup to avoid peeking at test/future information. :slight_smile:

One last note: thanks to the organizers, and big thanks also to the teams, creators, and contributors of the amazing tools that made our work much easier: Jupyter, VS Code, Git, Docker, scikit-learn, PyTorch, timm, PyTorch Lightning, MLflow, Power BI, pandas, NumPy, Matplotlib, tqdm, fastai, and the many others these build on. :slight_smile:

best
DevScope team

Solution overview

• Best submission (last one)- 8th place
	○ Public:  7.1096 (our best public)
	○ Private: 6.7598 (our best private)
• 1 - CNN-level models
	○ 10 folds, to better control early stopping and later use the validation-fold predictions
		• Splits by storm:
			○ ~40% of storms have 100% of their images in the test part of the fold
			○ for the remaining ~60%, a uniform 1-99% of each storm's images go to test and the rest to training
	○ Simple average over folds when needed (e.g. for the final score)
	○ Stacked the current and the two previous images on the 3 RGB channels
	○ (added a small square mark to encode ocean and relative_time, but it didn't improve things much)
	○ 256px center crop (seemed to work better than any other)
	○ 3 best CNN architectures found: efnet_b2, efnet_b1_ap, efnet_b1_ap-cont
	○ (30 models, 3 archs x 10 folds)
	○ y = the current value plus the 3 next steps (4 values)
		§ (current + 3 worked better than current only)
	○ Saved both predictions and CNN features
		§ (enables fast experiments on 2nd-level models, no further CNN scoring needed)
	○ CNN CV scores:
		§ b2: 8.281
		§ b1_ap: 8.235
		§ b1_ap-cont: 8.271
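As a rough illustration, the storm-aware split described above (full storms vs. 1-99% tails in validation) could be sketched like this; `storm_id`, `relative_time`, and the function name are our own for this sketch, not from the actual pipeline:

```python
import numpy as np
import pandas as pd

def storm_fold_split(df, val_storms, seed=0):
    """Mark validation rows so the fold mimics the test setup: ~40% of the
    chosen storms go fully to validation; for the rest, only a uniform
    1-99% tail (the latest images) does, the head staying in training."""
    rng = np.random.default_rng(seed)
    mask = pd.Series(False, index=df.index)
    for storm in val_storms:
        idx = df.index[df["storm_id"] == storm]
        # order this storm's rows chronologically
        idx = idx[np.argsort(df.loc[idx, "relative_time"].to_numpy())]
        if rng.random() < 0.4:
            frac = 1.0                      # whole storm in validation
        else:
            frac = rng.uniform(0.01, 0.99)  # only the last 1-99%
        n_val = max(1, round(frac * len(idx)))
        mask.loc[idx[-n_val:]] = True
    return mask
```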
		
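The frame stacking can be sketched minimally as below (one grayscale band per RGB channel, ordered oldest to newest; the fallback at a storm's first steps is our assumption, the original may pad differently):

```python
import numpy as np

def stack_frames(frames, t):
    """Build a 3-channel CNN input for time step t from single-band
    satellite images: channels are [t-2, t-1, t]. Steps before the start
    of the storm fall back to the earliest available frame."""
    chans = [frames[max(t - 2, 0)], frames[max(t - 1, 0)], frames[t]]
    return np.stack(chans, axis=0)  # shape: (3, H, W)
```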
• 2 - Smooth/historical model
	○ LightGBM, 10 folds (simple group fold by storm) to better control early stopping
		§ Uses the validation-fold CNN predictions (current + 3 next values) and the top CNN features
			○ (taken from before the last FC layer and selected by lasso, which saves some time vs. running the full dataset)
		§ Adds temporal features, past predictions, ocean, train ground truth, relative_time
			○ (past lags, diffs)
		§ (note: adding future-image predictions improves the score noticeably, but that is not allowed by the competition setup)
	○ CV score (a bit optimistic/overfitted):
		§ 7.03
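A rough sketch of the per-storm lag/diff feature building for the 2nd-level model; the column names and lag choices here are illustrative, not the exact ones used:

```python
import pandas as pd

def add_temporal_features(df, pred_col="cnn_pred", lags=(1, 2, 3)):
    """Add per-storm lag and diff features over the CNN predictions,
    as tabular inputs for a second-level gradient-boosting model."""
    df = df.sort_values(["storm_id", "relative_time"]).copy()
    g = df.groupby("storm_id")[pred_col]
    for k in lags:
        df[f"{pred_col}_lag{k}"] = g.shift(k)  # value k steps back, NaN at storm start
        df[f"{pred_col}_diff{k}"] = df[pred_col] - df[f"{pred_col}_lag{k}"]
    return df
```

These columns, plus the four CNN outputs and the lasso-selected embedding features, would then presumably feed a `lightgbm.LGBMRegressor` trained with a group fold by storm.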
• 3 - Post-processing
	○ Further smooth the train-to-test transition (for some storms)
		§ using a simple linear regression over the last 5 train steps
		§ (blended with the 2nd-level model predictions, with a decaying weight over the first 18 test steps)
		§ parameters found by a simple grid search, evaluating the RMSE drop
		§ CV score: 6.976
	○ Final exponential smoothing average (interpolating relative_time to account for missing steps)
		§ CV score: 6.972
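The train-to-test transition smoothing might look roughly like this; the linear decay shape is our guess, since the post only says the parameters came from a grid search:

```python
import numpy as np

def smooth_transition(train_y, train_t, test_pred, test_t, n_fit=5, n_blend=18):
    """Fit a line to the last `n_fit` train points and blend its
    extrapolation into the first `n_blend` test predictions with a
    decaying weight."""
    slope, intercept = np.polyfit(train_t[-n_fit:], train_y[-n_fit:], 1)
    out = np.asarray(test_pred, dtype=float).copy()
    for i in range(min(n_blend, len(out))):
        w = 1.0 - i / n_blend              # weight decays from 1 toward 0
        extrap = slope * test_t[i] + intercept
        out[i] = w * extrap + (1 - w) * out[i]
    return out
```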
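The final exponential smoothing step could be sketched as follows; scaling the carry-over by the time delta is just one possible reading of "interpolate relative_time to account for missing steps", and `alpha` is a placeholder, not the grid-searched value:

```python
import numpy as np

def exp_smooth_irregular(preds, times, alpha=0.3):
    """Exponential smoothing over predictions whose time stamps may have
    gaps: the carry-over weight is scaled by the time delta, so missing
    steps discount the past more."""
    out = np.asarray(preds, dtype=float).copy()
    for i in range(1, len(out)):
        dt = times[i] - times[i - 1]       # usually 1; larger across gaps
        w = (1 - alpha) ** dt              # carry-over weight of the past
        out[i] = w * out[i - 1] + (1 - w) * out[i]
    return out
```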