Please be thoughtful about the metrics you choose to score people by. An F-beta metric would have been more practical. With the WPR heavily biased towards precision, people focused on optimizing the metric in ways that had little real-world relevance.
Giving 80% of the score to precision alone meant people optimized for selecting only about 50-300 anomalies per site, out of the tens of thousands of anomalies that Schneider Electric identified (it's easy to estimate these counts from an all-True submission, as sketched below).
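As a rough illustration of that estimate, here is a minimal sketch. It assumes the leaderboard metric was simply 0.8 * precision + 0.2 * recall (an assumption on my part, not the confirmed formula), so an all-True submission has recall = 1 and its score reveals the base rate of labelled anomalies:

```python
# Sketch: back out the number of labelled anomalies from an all-True submission.
# Assumes the public metric is 0.8 * precision + 0.2 * recall (not confirmed),
# so an all-True submission has recall = 1 and the score reveals precision.

def estimate_labelled_anomalies(all_true_score: float, n_rows: int) -> int:
    """Estimate how many rows were labelled as anomalous."""
    precision = (all_true_score - 0.2) / 0.8   # invert 0.8*p + 0.2*1
    return round(precision * n_rows)

# Hypothetical numbers, for illustration only:
print(estimate_labelled_anomalies(all_true_score=0.25, n_rows=500_000))
```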
With an F-beta metric, the score could still reward precision on the flagged points while also forcing competitors to identify as many of the anomalies as possible, with beta controlling the trade-off.
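For concreteness, a quick sketch of how that trade-off behaves using scikit-learn's fbeta_score (the labels and beta values here are made up purely for illustration):

```python
import numpy as np
from sklearn.metrics import fbeta_score, precision_score, recall_score

# Toy ground truth and predictions: the model is very precise
# but only recovers half of the labelled anomalies.
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
y_pred = np.array([1, 1, 0, 0, 0, 0, 0, 0, 0, 0])

print("precision:", precision_score(y_true, y_pred))       # 1.00
print("recall:   ", recall_score(y_true, y_pred))           # 0.50
print("F0.5:     ", fbeta_score(y_true, y_pred, beta=0.5))  # precision-leaning, barely penalized
print("F2:       ", fbeta_score(y_true, y_pred, beta=2.0))  # recall-leaning, punishes the missed anomalies
```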
It was unclear exactly what counted as an anomaly. The problem description talked about overconsumption of electricity, but the challenge call to action simply said 'find anomalies in the data'. Does that mean we should ignore anomalies that show underconsumption? And does it mean we should only find anomalies in electricity consumption, and not flag anomalous meter readings?
In any real-world case these goals would be made clear, or we could go back and ask the client more about the scope of the problem.
Spreading the prize money among the top 5, or even top 10, would get more people excited. The people ranked 5-10 would get only small amounts, but it would be a great gesture, since they usually worked nearly as hard as the top 5. Don't assume Kaggle's traditional top-3 reward structure is the best way.
One more thing: the competition was too short. Extending the time frame would have let more people get involved.