Super-scaling from giving an esmvaltool run more tasks on slurm #3818
Replies: 2 comments 6 replies
-
Details of individual tasks: ntasks=2
ntasks=1
The grading tasks are NCL diagnostics.
-
Apologies, I accidentally hit comment midway through writing this. This piqued my interest, as I would have expected no difference in performance because of a lazy assumption I had made about the SLURM configuration at the MO. That assumption was based on this line describing
So if the only change between two jobs is
But it must be the case that even though you are allocated 2 CPUs when using
As @NParsonsMO pointed out, the grading tasks are running concurrently so I would hazard a guess that because the
It would be interesting to rerun the
One other thing I've noticed is how ESMValTool/Dask is configuring the number of processes to run. You'll notice when running on SPICE, if you don't specify
n_processes = session["max_parallel_tasks"] or os.cpu_count()
The problem with this is it sees all of the CPUs on a SPICE node.
Which is why you end up with this.
Not a big deal in this case, as only two variables are being processed, but I guess it might cause issues if you had, say, eight variables and Dask was trying to run eight grading tasks concurrently but you only requested
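To illustrate the point about the `os.cpu_count()` fallback seeing every CPU on the node, here is a minimal sketch of an allocation-aware alternative. It is not ESMValTool's implementation: `allocated_cpus` and `choose_n_processes` are hypothetical helper names, and it assumes that the cluster either restricts the process's CPU affinity to the allocation or exports `SLURM_CPUS_ON_NODE`, both of which depend on how SLURM is configured.

```python
import os


def allocated_cpus(environ=None):
    """Best-effort count of CPUs this process may use (hypothetical helper)."""
    environ = os.environ if environ is None else environ

    # os.cpu_count() reports every CPU on the node, so it is only an upper bound.
    candidates = [os.cpu_count() or 1]

    # Linux only: the affinity mask reflects the allocation when SLURM
    # is configured to bind or confine tasks (an assumption about the cluster).
    if hasattr(os, "sched_getaffinity"):
        candidates.append(len(os.sched_getaffinity(0)))

    # SLURM normally exports the number of CPUs allocated on this node.
    slurm_cpus = environ.get("SLURM_CPUS_ON_NODE", "")
    if slurm_cpus.isdigit() and int(slurm_cpus) > 0:
        candidates.append(int(slurm_cpus))

    # The smallest hint is the safest estimate of what was actually granted.
    return min(candidates)


def choose_n_processes(max_parallel_tasks=None, environ=None):
    """Mirror the quoted fallback, but cap the default at the allocation."""
    return max_parallel_tasks or allocated_cpus(environ)
```

With a fallback like this, leaving `max_parallel_tasks` unset on a two-CPU allocation would give at most two worker processes rather than one per CPU on the node.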
-
Hi all,
I'm testing a cut-down version of recipe_smpi.yml with either 1 or 2 cores on the Met Office analysis cluster, and using 2 cores with the same total memory available gives a 3+ factor speedup (details below). I would have expected less than a factor of two. Is there an obvious explanation for why it might be so much faster? I'm only asking out of curiosity, so this is not a priority if deeper investigation would be needed.
With ntasks=1 and mem=40GB it completed in 29 mins and used 34 GB.
With ntasks=2 and mem=40GB it completed in 8 mins and used 34 GB.
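As a quick check of the two timings above, the speedup and parallel efficiency can be worked out as follows. This is only a sketch of the arithmetic; `speedup_and_efficiency` is a hypothetical helper name, and an efficiency above 1 is what "super-scaling" means here.

```python
def speedup_and_efficiency(t_base_min, t_parallel_min, n_tasks):
    """Return (speedup, efficiency) for a run scaled from 1 to n_tasks."""
    speedup = t_base_min / t_parallel_min  # how many times faster the run was
    efficiency = speedup / n_tasks         # 1.0 would be ideal linear scaling
    return speedup, efficiency


# Timings reported above: 29 min with ntasks=1, 8 min with ntasks=2.
speedup, efficiency = speedup_and_efficiency(29, 8, 2)
print(f"speedup: {speedup:.1f}x, efficiency: {efficiency:.1f}")
```

The speedup comes out at roughly 3.6, well above the factor of 2 that ideal linear scaling would give.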
Recipe: recipe_smpi_filled.yml.txt
Log file: main_log.txt
Slurm directives used: