Issue with make test-submission

I’ve downloaded the data for this competition and put each file in the data/ folder as suggested, but when I run `make test-submission` I get the following error:

Unpacking submission…
Archive: ./submission/submission.zip
inflating: ./predict.py
Copying main.py
… main.py copied a1f2a88f61bc340855a1ae966d8652b8 ./main.py
Running submission with Python
ERROR conda.cli.main_run:execute(34): Subprocess for 'conda run ['python', 'main.py']' command failed. (See above for error)
2020-12-17 21:27:52.862 | INFO | main:main:140 - reading raw Dst data from /codeexecution/data/dst_labels.csv …
2020-12-17 21:27:53.302 | INFO | main:main:144 - calculating submission format and ground truth …
2020-12-17 21:27:56.241 | INFO | main:main:151 - calculated submission format and ground truth dataframes with 139,365 rows
2020-12-17 21:27:56.242 | INFO | main:main:156 - reading in solar wind data …
/home/appuser/miniconda/envs/py/.tmpvi5sfnr8: line 3: 26 Killed python main.py

Exporting submission.csv result…
ERROR: Script did not produce a submission.csv file in the main directory.
================ END ================

Tracing back the error, it looks like the script cannot read the solar_wind.csv file. Has anybody else encountered this issue?

Hi,
I was facing the same issue. It turned out that RAM was filling up because of the large size of the solar_wind.csv file. Increasing the available RAM solved it. Alternatively, you can read the file in chunks instead of loading the whole thing at once.
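The chunked-reading suggestion above can be sketched with pandas. This is a minimal example, assuming a generic CSV layout; the function name, column dtypes, and the float32 downcast are illustrative choices, not part of the competition code:

```python
import pandas as pd

def read_csv_in_chunks(path, chunksize=1_000_000):
    """Read a large CSV in pieces so the raw file never has to fit
    in memory all at once, downcasting floats to save RAM."""
    pieces = []
    for chunk in pd.read_csv(path, chunksize=chunksize):
        # float64 -> float32 roughly halves memory for numeric columns
        float_cols = chunk.select_dtypes("float64").columns
        chunk[float_cols] = chunk[float_cols].astype("float32")
        pieces.append(chunk)
    return pd.concat(pieces, ignore_index=True)
```

Note that concatenating all chunks still materializes the full table; the real win comes from downcasting dtypes or aggregating each chunk before concatenating.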


It worked! I simply opened the Docker Preferences and increased the memory limit from 2 GB to 4 GB.
Thank you!

That’s great, good luck :tada: Let me know if you face any other issues. You can also downsample the data, since it is very high frequency (one sample per minute) while the target is hourly.
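The downsampling idea can be sketched with pandas resampling on a timedelta index. The column name `bt` and the toy values here are placeholders, not the actual competition schema:

```python
import numpy as np
import pandas as pd

# Three hours of fake per-minute measurements on a timedelta index
minutes = pd.timedelta_range("0 days", periods=180, freq="T")
df = pd.DataFrame({"bt": np.arange(180, dtype=float)}, index=minutes)

# Collapse the per-minute samples to hourly means: 180 rows -> 3 rows,
# matching the hourly cadence of the Dst target
hourly = df.resample("H").mean()
```

This cuts the working set by a factor of 60 before any feature engineering happens.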

I got a similar issue with a different error:
(base) iMac-di-Andrea:noaa-runtime andrea$ make test-submission
chmod -R 0777 submission/
docker run \
-it \
--network none \
--mount type=bind,source="/Users/andrea/Desktop/WQU Course/Drivendata comp/magnet/noaa-runtime"/data,target=/codeexecution/data,readonly \
--mount type=bind,source="/Users/andrea/Desktop/WQU Course/Drivendata comp/magnet/noaa-runtime"/submission,target=/codeexecution/submission \
--shm-size 8g \
a7e818734188
Unpacking submission…
Archive: ./submission/submission.zip
inflating: ./config.json
inflating: ./predict.py
creating: ./model/
creating: ./model/variables/
inflating: ./model/variables/variables.data-00000-of-00001
inflating: ./model/variables/variables.index
inflating: ./model/saved_model.pb
creating: ./model/assets/
inflating: ./scaler.pck
Copying main.py
… main.py copied bd6ff34c39b399463c4a1c2b4bd2b302 ./main.py
Running submission with Python
2021-01-09 21:07:10.750 | INFO | main:main:140 - reading raw Dst data from /codeexecution/data/dst_labels.csv …
2021-01-09 21:07:11.417 | INFO | main:main:144 - calculating submission format and ground truth …
2021-01-09 21:07:15.681 | INFO | main:main:151 - calculated submission format and ground truth dataframes with 139,365 rows
2021-01-09 21:07:15.681 | INFO | main:main:156 - reading in solar wind data …
2021-01-09 21:08:15.356 | INFO | main:main:160 - reading in satellite positions data …
2021-01-09 21:08:15.630 | INFO | main:main:164 - reading in sunspots data …
2021-01-09 21:08:15.642 | INFO | main:main:168 - entering main loop
2021-01-09 21:08:31.743705: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-01-09 21:08:31.947253: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 3489280000 Hz
2021-01-09 21:08:31.947948: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x559a31758dc0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-01-09 21:08:31.947965: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2021-01-09 21:08:37.513 | INFO | main:main_loop:78 - making predictions for period train_a
Traceback (most recent call last):
  File "main.py", line 181, in <module>
    typer.run(main)
  File "/home/appuser/miniconda/envs/py/lib/python3.8/site-packages/typer/main.py", line 855, in run
    app()
  File "/home/appuser/miniconda/envs/py/lib/python3.8/site-packages/typer/main.py", line 214, in __call__
    return get_command(self)(*args, **kwargs)
  File "/home/appuser/miniconda/envs/py/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/appuser/miniconda/envs/py/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/appuser/miniconda/envs/py/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/appuser/miniconda/envs/py/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/appuser/miniconda/envs/py/lib/python3.8/site-packages/typer/main.py", line 497, in wrapper
    return callback(**use_params)  # type: ignore
  File "main.py", line 169, in main
    submission = main_loop(
  File "main.py", line 97, in main_loop
    dst0, dst1 = predict_dst(
  File "/codeexecution/predict.py", line 129, in predict_dst
    features, s = preprocess_features(
  File "/codeexecution/predict.py", line 81, in preprocess_features
    hourly_features = aggregate_hourly(solar_wind).join(sunspots)
  File "/codeexecution/predict.py", line 60, in aggregate_hourly
    ["period", feature_df.index.get_level_values(1).floor("H")]
  File "/home/appuser/miniconda/envs/py/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 1469, in _get_level_values
    self._validate_index_level(level)
  File "/home/appuser/miniconda/envs/py/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 1402, in _validate_index_level
    raise IndexError(
IndexError: Too many levels: Index has only 1 level, not 2
ERROR conda.cli.main_run:execute(34): Subprocess for 'conda run ['python', 'main.py']' command failed. (See above for error)
Exporting submission.csv result…
ERROR: Script did not produce a submission.csv file in the main directory.
================ END ================

It looks like you have a problem with the index. The testing dataframes only have a single index level (the timedelta); they don’t have the “period” index level. I hit the same issue and solved it by adapting my code to work with a single-level index.
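One defensive way to adapt, assuming the runtime passes a frame with a plain timedelta index while local data may carry a (“period”, timedelta) MultiIndex, is to branch on the number of index levels. The helper name `hour_floor` is just an illustration:

```python
import pandas as pd

def hour_floor(feature_df):
    """Return the timedelta index floored to the hour, whether the
    frame has a ("period", timedelta) MultiIndex or a plain
    timedelta index."""
    if feature_df.index.nlevels == 2:
        # training-style frame: timedelta is the second level
        return feature_df.index.get_level_values(1).floor("H")
    # runtime-style frame: the index itself is the timedelta
    return feature_df.index.floor("H")
```

The result can then be passed straight to `groupby` in either case.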


Solved, it seems, thanks to the advice here in the forum. I had to change the code in predict.py:

    def aggregate_hourly(feature_df, aggs=["mean", "std"]):
        """Aggregates features to the floor of each hour using mean and standard deviation.
        e.g. All values from "11:00:00" to "11:59:00" will be aggregated to "11:00:00".
        """
        # group by the floor of each hour, using the timedelta index
        agged = feature_df.groupby(
            [feature_df.index.get_level_values(0).floor("H")]
        ).agg(aggs)
        # flatten the hierarchical column index
        agged.columns = ["_".join(x) for x in agged.columns]
        return agged

Let’s see the score now. I also had an error at first because I forgot to import numpy.
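For anyone wanting to sanity-check the fixed `aggregate_hourly` locally before submitting, here is a self-contained run on a toy frame with a plain timedelta index (the column name `bt` and the values are made up):

```python
import numpy as np
import pandas as pd

def aggregate_hourly(feature_df, aggs=["mean", "std"]):
    """Aggregate features to the floor of each hour (fixed single-index version)."""
    agged = feature_df.groupby(
        [feature_df.index.get_level_values(0).floor("H")]
    ).agg(aggs)
    agged.columns = ["_".join(x) for x in agged.columns]
    return agged

# two hours of fake per-minute data
idx = pd.timedelta_range("0 days", periods=120, freq="T")
df = pd.DataFrame({"bt": np.arange(120, dtype=float)}, index=idx)

out = aggregate_hourly(df)
# expect 2 hourly rows with flattened columns "bt_mean" and "bt_std"
```

If this runs cleanly on a single-level timedelta index, the `IndexError: Too many levels` from the runtime should be gone.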