Cannot run local eval for fl solutions, fincrime

panheng · January 23, 2023, 2:32am

I was using the following command.

SUBMISSION_TRACK=fincrime SUBMISSION_TYPE=federated GITHUB_ACTIONS_NO_TTY=true make test-submission

The submission under test was the example source code. After configure_fit completed, it got stuck at "fit_round 1: strategy sampled 5 clients (out of 5)", and then it threw Ray errors after dozens of minutes.

How can I fix this? I am using a Windows 10 machine and running the make commands in Git Bash.

jayqi · January 23, 2023, 3:09am

Hi @panheng,

Taking the error messages at face value, it seems like Ray is unable to find the requested client resources to spawn the client processes.

Earlier in the logs, at the start of federated training, you should see some logs that look something like this:

INFO flower 2023-01-23 02:56:31,321 | app.py:140 | Starting Flower simulation, config: ServerConfig(num_rounds=2, round_timeout=None)
2023-01-23 02:56:33,139 INFO worker.py:1518 -- Started a local Ray instance.
INFO flower 2023-01-23 02:56:34,302 | app.py:174 | Flower VCE: Ray initialized with resources: {'memory': 3547435008.0, 'node:127.0.0.1': 1.0, 'CPU': 4.0, 'object_store_memory': 1773717504.0}

What does that line say about how many CPU and GPU resources Ray is starting with on your machine?
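If it helps, here is a rough sketch of pulling the resources out of that log line with a small Python script. The function name parse_ray_resources is just something made up for illustration, and it assumes the line has the same "Ray initialized with resources: {...}" shape as the example above.

```python
import ast
import re


def parse_ray_resources(log_line: str) -> dict:
    """Extract the resources dict from a 'Ray initialized with resources' log line.

    Hypothetical helper: assumes the dict is printed as a Python literal,
    as in the example log above.
    """
    match = re.search(r"Ray initialized with resources: (\{.*\})", log_line)
    if match is None:
        raise ValueError("no Ray resources dict found in log line")
    # literal_eval only parses Python literals, so it is safe on log text.
    return ast.literal_eval(match.group(1))


line = (
    "INFO flower 2023-01-23 02:56:34,302 | app.py:174 | Flower VCE: "
    "Ray initialized with resources: {'memory': 3547435008.0, "
    "'node:127.0.0.1': 1.0, 'CPU': 4.0, 'object_store_memory': 1773717504.0}"
)
resources = parse_ray_resources(line)

# A missing 'GPU' key means Ray did not detect any GPU.
print("CPU:", resources.get("CPU", 0), "GPU:", resources.get("GPU", 0))
```

In the example log there is no 'GPU' key at all, which is what you would expect to see if Ray could not find a GPU.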

For reference, here is the relevant source code that sets the resource requirements for clients. This is set to (available - 1) CPUs for your system, and 1 GPU if you are running the GPU version of the image. It seems like you’ve allocated 16 cores to the Docker container, and you have a GPU on your system?
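To make the diagnosis concrete, here is a simplified sketch of the check I have in mind. It is not the runtime's actual code or Ray's real scheduler: the function names are hypothetical, and it simply assumes each client asks for (available - 1) CPUs plus 1 GPU on the GPU image, then compares that request with what Ray reported.

```python
def client_resources(available_cpus: int, use_gpu: bool) -> dict:
    """Hypothetical helper mirroring the rule described above:
    (available - 1) CPUs, plus 1 GPU when using the GPU image."""
    return {
        "num_cpus": available_cpus - 1,
        "num_gpus": 1 if use_gpu else 0,
    }


def can_schedule(requested: dict, ray_resources: dict) -> bool:
    """Simplified check: every requested amount must fit in what Ray reports.

    Assumes ray_resources uses the 'CPU' / 'GPU' keys seen in the Ray log line.
    """
    cpus_ok = requested["num_cpus"] <= ray_resources.get("CPU", 0)
    gpus_ok = requested["num_gpus"] <= ray_resources.get("GPU", 0)
    return cpus_ok and gpus_ok


# Resources as reported in the example log line (4 CPUs, no GPU detected).
ray_resources = {"CPU": 4.0}

gpu_request = client_resources(available_cpus=4, use_gpu=True)
cpu_request = client_resources(available_cpus=4, use_gpu=False)

print(can_schedule(gpu_request, ray_resources))  # False: 1 GPU requested, none found
print(can_schedule(cpu_request, ray_resources))  # True: 3 CPUs fit within 4
```

If the request can never be satisfied, the clients are never spawned, which would match training appearing to hang after the strategy samples its clients.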

If you don’t need a GPU for your code, and if it’s the case that Ray is having trouble finding your GPU, then you can try using the CPU version of the image by setting the variable CPU_OR_GPU=cpu when running all Makefile commands (pull, build, test-submission, etc.).

Unfortunately, I’m not all that familiar with Windows, and I’m not sure if anything here involves any specific Windows gotchas.
