About the metric: (R)MSE is unstable to outliers

Why was RMSE chosen, when it is very unstable on data with many outliers? If you want to detect emissions more stably, the metric should use a log scale or some similar scaling operation.
For example, if you predict 49,999 steps with perfect accuracy but miss once on an outlier by 100, your RMSE is (100**2 / 50000)**0.5 ≈ 0.447. That is a huge influence from one single value.
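To make the arithmetic concrete, here is a minimal sketch (the error values are made up for illustration) of how one large miss among 50,000 otherwise-perfect predictions dominates RMSE but barely moves MAE:

```python
# Sketch: one large error among 50,000 otherwise-perfect predictions.
import math

n = 50_000
errors = [0.0] * (n - 1) + [100.0]  # 49,999 perfect steps, one miss by 100

rmse = math.sqrt(sum(e**2 for e in errors) / n)  # dominated by the single miss
mae = sum(abs(e) for e in errors) / n            # averages the miss away

print(round(rmse, 3))  # 0.447
print(round(mae, 3))   # 0.002
```

The squaring step is what gives the single miss its outsized weight: its contribution to the sum is 10,000 while every other term is 0.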


+1, I also agree with you. This score seems weird, seeing that the benchmark post explicitly says there are outliers.

A better score would be median absolute error, or at least mean absolute error, or a median squared error if we are really interested in the mean. All of these would be more robust, and I think the median-based option is even better than a log scale (at least mathematically speaking).
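To illustrate the robustness ranking claimed above, here is a small sketch comparing the three metrics on a hypothetical residual series with two outliers (the numbers are invented for illustration, not taken from the competition data):

```python
# Sketch: robust vs non-robust error metrics on residuals with outliers.
import math
import statistics

residuals = [1, -2, 1, 0, -1, 2, 150, -1, 0, 1, -120, 2]  # two outliers

rmse = math.sqrt(statistics.fmean(e**2 for e in residuals))  # blows up
mae = statistics.fmean(abs(e) for e in residuals)            # less affected
med_ae = statistics.median(abs(e) for e in residuals)        # nearly ignores them

print(f"RMSE:  {rmse:.2f}")
print(f"MAE:   {mae:.2f}")
print(f"MedAE: {med_ae:.2f}")
```

Here MedAE stays at the typical error level (1.0) while RMSE jumps above 50, which is exactly the sensitivity-vs-robustness trade-off being debated.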

One of the reasons for this score, I think, is that it is very commonly used in regression. Apart from that, I would also vote for a more robust score considering the nature of the problem (there are outliers in the test set).

You raised a great point. Most of the time, Dst fluctuates in the range +10 to -30 nT. The energetic solar events that produce large (< -100 nT) negative deflections in Dst are rare, but it is important to model them accurately, because they severely degrade magnetic referencing. These large, negative deflections are not “outliers”, and they usually last several hours. However, the input solar-wind data (RTSW) does contain some outliers, owing to sensor malfunctions and outages. We use RMSE for two reasons. First, we want to make sure the model is also sensitive to the rare, large events. Second, RMSE is widely used in the geophysics literature, so it is useful for comparison with previously published models. Thanks for your question and good luck.