Preprocessing Data to smooth outliers

AmirH · October 19, 2018, 7:10pm

Hi,
I did some preprocessing on the data to smooth out outliers of various kinds.
Attached are two CSV files, one for test and one for train, each with two new columns: New_Consumption and New_HourlyRate.

I’m curious whether using these two columns instead of the original consumption gives you better results. I used a non-ML solution, so I’d like to see whether a NN responds to them the same way.

It’s too complicated to list the exact calculations, since I did them in Power BI rather than in Python code that I could easily share.
If it produces good results, I can take the time to document the steps as a few bullet points.

Test Data:
https://drive.google.com/file/d/1jnD0c81LHEYeG5etaxNEDgAypWxWCe_w/view?usp=sharing

Train Data:
https://drive.google.com/file/d/1mBzezT6FFf3P1DvsFSYrdB8YhJSFNtX6/view?usp=sharing
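
If you want a quick look at how much the smoothing changed before training on it, something like the sketch below might help. It only counts the rows where New_Consumption differs from the original value and averages the size of the change; it does not reproduce the Power BI calculations. The original column name `consumption` and the inline sample rows are assumptions for illustration, so swap in the real file and column names.

```python
import csv
import io


def summarize_changes(rows, original="consumption", smoothed="New_Consumption"):
    """Count rows where the smoothed value differs from the original and
    report the mean absolute change over those changed rows."""
    total_rows = 0
    changed = 0
    total_abs_change = 0.0
    for row in rows:
        original_value = float(row[original])
        smoothed_value = float(row[smoothed])
        total_rows += 1
        if original_value != smoothed_value:
            changed += 1
            total_abs_change += abs(original_value - smoothed_value)
    mean_abs_change = total_abs_change / changed if changed else 0.0
    return {
        "rows": total_rows,
        "changed": changed,
        "mean_abs_change": mean_abs_change,
    }


# Made-up sample standing in for the attached train CSV (values are not real data).
sample = io.StringIO(
    "consumption,New_Consumption\n"
    "100.0,100.0\n"
    "5000.0,120.0\n"
    "110.0,110.0\n"
)
summary = summarize_changes(csv.DictReader(sample))
print(summary)
```

To run it on the downloaded file, replace the `io.StringIO` sample with `open(path, newline="")` and pass the resulting `csv.DictReader` to the same function.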
