Anomaly detection - evaluation anomaly?

patbaa · March 1, 2018, 10:57am

I have been thinking about the low scores. The evaluation contains a 0.8*TP/(TP+FP) term.
If we identify 1 real anomaly and predict only that one, then we get a score of 0.8.
OK, we need to do this for all 3 sites, but that is not much extra effort.
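
To make the arithmetic concrete, here is a minimal sketch of only the 0.8*TP/(TP+FP) term quoted above. The function name `precision_term` and the choice to return 0 when nothing is predicted are assumptions for illustration; the rest of the official metric is not stated in this thread and is not modelled here.

```python
def precision_term(tp: int, fp: int, weight: float = 0.8) -> float:
    """Weighted precision term: weight * TP / (TP + FP).

    Only the term quoted in the post; not the full competition metric.
    """
    predicted = tp + fp
    if predicted == 0:
        # Assumption: with no predictions the term contributes nothing.
        return 0.0
    return weight * tp / predicted


# One correct prediction and no false positives already maxes out the term.
single_hit = precision_term(tp=1, fp=0)

# Predicting more anomalies only helps this term if precision stays high.
many_hits = precision_term(tp=20, fp=5)

print(single_hit, many_hits)
```

With these inputs the single correct prediction yields the full 0.8, while 20 true positives with 5 false positives yield less, which is the asymmetry the question is about.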

There is a public/private split, so it can happen that all the predicted TPs fall in the private part, but if we can identify 20-50 anomalies then some of them should fall in the public part.

What do you think? Is there an anomaly here, or is it simply that hard to identify 20-50 anomalies without having many FPs among them?

viana · March 2, 2018, 6:01am

I bet that somebody will try that unless the organizers come up with a better metric.

bull · March 2, 2018, 6:51pm

As with all DrivenData competitions, competitors will be best served by focusing on generalizable methods since there are both public and private leaderboards. It’s unlikely that trying to work around the metric will result in a good outcome on both, and that’s especially true if you only focus on either precision or recall. Those have been intentionally weighted in terms of how important they are in context, so the metric is a good fit for this task.

Good luck, and have fun!
