About the Optimizing Demand-side Strategies competition:
I have a concern about the test phase: will we be provided with the same site_ids (the 11 from the train/submit data) or brand-new site_ids? In the former case, we should learn site features; in the latter, we should learn to generalize.
Or maybe we will not know which site_ids appear in the test phase (and so should learn both)?
Moreover, in the first case, may we assume that price_buy and price_sell will behave (roughly) the same way?
Thank you for your clarification.
I have the same concern; I don’t understand why no one has answered you yet.
Great question: Evaluation will be done using the same site ids as the available training data, and it is acceptable to build site-specific models provided that:
- The corresponding model training is automated and can be replicated for new sites.
- Only past data has been used to build the model that is used for predictions at time t.
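The two conditions above can be sketched as follows. Everything here is an illustrative assumption (the row layout, the historical-mean "model"), not part of the challenge API; the point is only that training is automated per site and filters strictly to data before time t.

```python
# Sketch of automated, time-respecting per-site training.
# Data layout is hypothetical: a list of (site_id, timestamp, value) rows.

def fit_site_model(rows, site_id, t):
    """Return a trivial "model" (the historical mean) for one site,
    built strictly from observations before time t."""
    past = [v for (s, ts, v) in rows if s == site_id and ts < t]
    if not past:
        return None  # no history yet: caller falls back to a default
    mean = sum(past) / len(past)
    return lambda: mean  # predict the historical mean

rows = [(1, 0, 10.0), (1, 1, 14.0), (2, 0, 5.0), (1, 2, 99.0)]
model = fit_site_model(rows, site_id=1, t=2)  # the row at t=2 is excluded
print(model())  # 12.0
```

Because the fitting function takes the site and the cutoff time as arguments, the same code can be re-run for a new site or a later timestamp without manual work, which is what the two rules ask for.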
Thanks for your clear answer, bull. I also have a question regarding the number of simulations that will be evaluated at the end of the competition.
There is a constraint on execution time: 30 minutes.
It is also said that submissions will be evaluated against 10-day periods.
However, the number of simulations is unknown, and that number is needed to verify that a solution satisfies the time constraint.
We’re still finalizing exactly how many simulations there will be, but you can use the provided “submit data” as an upper bound on the number of simulations. Currently that file takes ~10 minutes to execute with no battery controller.
We’ll update when we have more clarity.
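Taking the numbers in this thread at face value (the 30-minute cap, ~10 minutes of simulator overhead, and the submit file's 240 ten-day periods as the upper bound), a back-of-the-envelope controller budget per simulation is:

```python
# Rough per-simulation time budget, using numbers quoted in this thread.
# These are thread figures, not official specs.

TIME_CAP_S = 30 * 60    # execution-time constraint (30 minutes)
OVERHEAD_S = 10 * 60    # runtime of the submit file with no battery controller
N_SIMULATIONS = 240     # upper bound: periods in the submit data

controller_budget_s = (TIME_CAP_S - OVERHEAD_S) / N_SIMULATIONS
print(f"{controller_budget_s:.1f} s of controller time per simulation")  # 5.0 s
```

So under these assumptions a controller has roughly 5 seconds of compute per 10-day simulation before it risks the cap.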
“Only past data has been used to build the model that is used for predictions at time t.”
Does that mean we need to build a separate model for each timestamp to make sure we never use future data?
Wouldn’t it be easier to test against data collected/simulated after the public data?
Answered here in the problem description:
But that is a different challenge.
Can you confirm that the simulation with no battery takes 10 minutes on the Xeon 8175?
I’m trying to find information about that processor to compare it with mine, but it seems to be a custom processor made for AWS, so there is not much information available.
Yes, that is more or less right.
That said, we are planning to release additional information about how many simulations we will use for the final evaluation. As of right now, we expect that competitors will have more time per simulation than what we currently ask.
My current solution takes 27 minutes on my computer, which has an i7 processor, so I’m worried about whether those 3 spare minutes will be enough on the evaluation processor.
We’ve finalized the evaluation set, so you shouldn’t have to worry. Here’s the announcement:
Thank you very much @bull for this information.
I would just like to express strong concerns about the choice of only 34 simulation periods for the evaluation.
Consider for a minute each participant’s score on a given simulation as a random variable with some expected value and standard deviation. Since the standard error of the mean scales as 1/sqrt(N) for N independent simulations, choosing 34 simulations reduces the per-simulation standard deviation by a factor of only sqrt(34) ≈ 6. That is extremely low!
Even 240 simulations, as in the submit data, would already be high-variance. But with only 34 simulations, you will not select the best solution but rather the one that had the best luck.
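The 1/sqrt(N) point can be made concrete with a quick Monte Carlo sketch. The score distributions here are made up for illustration (a 0.05 mean edge between two solutions and unit per-simulation noise are assumptions, not real competition figures):

```python
# How often does the truly better solution win, as a function of the
# number of evaluation simulations? Illustrative distributions only.
import random

random.seed(0)

def win_rate(n_sims, trials=2000, edge=0.05, sigma=1.0):
    """Fraction of trials in which solution A (true mean higher by `edge`)
    beats solution B when both are averaged over n_sims simulations."""
    wins = 0
    for _ in range(trials):
        a = sum(random.gauss(edge, sigma) for _ in range(n_sims)) / n_sims
        b = sum(random.gauss(0.0, sigma) for _ in range(n_sims)) / n_sims
        wins += a > b
    return wins / trials

print(win_rate(34))   # the better solution loses a sizable share of runs...
print(win_rate(240))  # ...and wins noticeably more often with 240 periods
```

With numbers like these, a genuinely better solution still loses a substantial fraction of 34-simulation evaluations, which is exactly the "winner by luck" concern.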
It is all the more regrettable because I imagine this restricted selection was motivated by fairness concerns. In reality, such a small evaluation set leads to the opposite.
I ask you to please consider this matter seriously and add more data to the evaluation, so that the final score truly reveals the best solution rather than random flukes. Some participants have invested a lot of time and would like to be evaluated fairly on sufficient data.
I feel the same: changing the rules one week before the end of the challenge is not fair. I put a lot of effort into optimizing the execution time of my solution to fit the time requirement.
And suddenly time is no longer an important constraint, so I may have lost my competitive advantage against someone who optimized the score instead of the execution time.
Thanks for your thoughts, as always we try to be fair to competitors and to change things as infrequently as possible during the competition.
As part of our evaluation, we will ensure that the winning solutions are consistently the best performing, if necessary by running additional simulations. This will be done at the sole discretion of the judges, but with an eye to fairness and to ensuring that there is not a winner by chance.
Thanks again for participating and for your work and thoughtful questions throughout!