Anomaly detection - quite a big difference between #1 and #2!

The score for #1 (0.75x) is almost double that of #2 (0.38x)! That is a huge gap for any data science competition. I wonder if #1 (viana) is using some radically new technique…

Theoretically, according to the scoring, if we declared just one point an anomaly and it turned out to be correct, our score would be 0.8! I wonder how the judges would view such a model. By the competition's scoring it would be a great model, but it is clearly not practical.
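To make the arithmetic concrete: the thread doesn't quote the official metric, but a weighted combination like 0.8 · precision + 0.2 · recall (purely my assumption here) would behave exactly this way for a one-point submission. A minimal sketch:

```python
# Hypothetical illustration only: the actual competition metric is not
# quoted in this thread. Assume a score of the form
#   score = 0.8 * precision + 0.2 * recall
# which reproduces the ~0.8 figure mentioned above.

def weighted_score(true_positives, predicted_positives, actual_positives,
                   w_precision=0.8, w_recall=0.2):
    """Weighted precision/recall score (assumed form, not the official metric)."""
    precision = true_positives / predicted_positives if predicted_positives else 0.0
    recall = true_positives / actual_positives if actual_positives else 0.0
    return w_precision * precision + w_recall * recall

# Flag exactly one point as an anomaly, and suppose it is correct.
# With (say) 100 true anomalies: precision = 1.0, recall = 0.01.
print(weighted_score(true_positives=1, predicted_positives=1,
                     actual_positives=100))
# -> 0.802
```

Under any metric with this shape, precision dominates, so one lucky guess scores nearly as well as a genuinely good detector.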

My guess is that the top model is severely overfitting, and the predictions might even be hand-labeled. Again, this is just a hypothesis, but I suspect second place has a more robust model.

How can we know which records are anomalies when there is no column indicating whether a particular record is an anomaly or not?

The top two scorers have almost the same nickname; the difference is a single letter: @viana and @lviana. Maybe it is the same person.

I guess that’s how to get more than two submissions per day.

@vikas79 This is an unsupervised learning problem! We don't have the answers; we have to find them ourselves.
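For anyone new to this kind of setup, a typical unsupervised baseline is something like scikit-learn's IsolationForest, which flags anomalies without any ground-truth labels. The file name and feature selection below are placeholders, not the actual competition data:

```python
# A common unsupervised baseline: no labels needed.
# File name and feature columns are placeholders for illustration.
import pandas as pd
from sklearn.ensemble import IsolationForest

df = pd.read_csv("train.csv")            # hypothetical file name
X = df.select_dtypes("number").values    # use numeric columns as features

model = IsolationForest(contamination=0.01, random_state=0)
model.fit(X)

# predict() returns -1 for anomalies and 1 for normal points
df["is_anomaly"] = (model.predict(X) == -1).astype(int)
print(df["is_anomaly"].sum(), "points flagged as anomalies")
```

The `contamination` parameter encodes your guess at the anomaly rate, which is exactly the part you have to estimate yourself in an unsupervised problem.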

My guess, again, is that the top scorer(s) are hand-labeling individual data points as anomalies. Given the way the scoring is set up, an "optimal" approach would be to find a single anomalous data point and declare all the others normal, as in the sketch below. I would like to hear from the organizers how they plan to handle a situation where the top entry is hand-labeling data points.
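Concretely, the degenerate strategy is just this (the submission format, column name, and file names are assumed for illustration only, not the real spec):

```python
# Sketch of the degenerate one-point strategy described above.
# Submission format (columns, index, file names) is assumed.
import pandas as pd

sub = pd.read_csv("submission_format.csv", index_col=0)  # hypothetical file
sub["anomaly"] = 0                        # declare everything normal...
sub.loc[sub.index[0], "anomaly"] = 1      # ...except one hand-picked point
sub.to_csv("one_point_submission.csv")
```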

If you ever suspect that someone is using more than one account for submissions, please email info@drivendata.org directly.

Hello! You can be 100% sure that I'm not @viana! It turns out that Viana is a pretty common surname in Brazil and Portugal.

@bull, if you ever need proof that I'm not @viana, don't hesitate to contact me.
