Underutilized: E130 - Unused Nodes Message

An E130 error message indicates that a job has several allocated nodes that appear to be idle. This may not be a problem if the job was examined while it was setting up for a new round of processing. In general, though, it suggests a job specification issue (e.g. a mismatch between the number of MPI processes requested by mpirun and the number of nodes/cores available). If the majority of the assigned nodes are never used, you may get an E131 error message instead.
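
As a rough illustration of such a mismatch, the sketch below estimates how many allocated nodes would receive no MPI processes. The function name `unused_nodes` is hypothetical, and the block placement it assumes (each node is filled before the next one is used) is only an assumption; actual placement depends on the MPI launcher and its options.

```python
import math


def unused_nodes(total_nodes: int, ppn: int, num_procs: int) -> int:
    """Estimate how many allocated nodes get no MPI process.

    Assumption: block placement, i.e. each node receives up to `ppn`
    processes before the next node is used. Real launchers may place
    processes differently.
    """
    if total_nodes <= 0 or ppn <= 0 or num_procs < 0:
        raise ValueError("total_nodes and ppn must be positive, num_procs non-negative")
    # Nodes needed to host num_procs processes, capped at the allocation size.
    nodes_used = min(total_nodes, math.ceil(num_procs / ppn))
    return total_nodes - nodes_used


if __name__ == "__main__":
    # Hypothetical case: 4 nodes with 16 cores each, but only 32 processes started.
    print(unused_nodes(total_nodes=4, ppn=16, num_procs=32))  # prints 2
```

Under this assumption, requesting 4 nodes at 16 cores per node while starting only 32 processes would leave 2 nodes idle; starting 64 processes would use all of them.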

The following example shows a job that requested 4 nodes, 2 of which appear to be idle.

    E130 - Unused nodes
    Job 432234 has 2 unused nodes.
    Please correct this problem.

    Node statistics::
    Number of nodes: 4
    Number of cores: 64
    Total physical memory per node: 32046mb
    Average memory usage per node: 3392mb, 10%
    Average memory usage per core: 212mb
    Average virtual memory usage per node: 4914mb
    Average virtual memory usage per core: 307mb
    Average CPU percent per node: 616%
    Average CPU percent per core: 38%
    Average load per node: 6.28
    Reverified average load per node: 6.35
    Effective maximum load on a node: 16.23


    PBS_job=432234.mike3 user=flast allocation=hpc_alloc02
    queue=checkpt total_load=25.15 cpu_hours=190.05 wall_hours=8.90
    unused_nodes=2 total_nodes=4 ppn=16 avg_load=6.28 avg_cpu=616%
    avg_mem=3392mb avg_vmem=4914mb
    top_proc=flast:d_hydro:mike032:672M:433M:7.8hr:100%
    toppm=flast:wave.exe:mike032:2946M:2880M node_processes=0
    avg_avail_mem=26911mb min_avail_mem=20687mb
    reverified_avg_load=6.35

    Name:  First Last
    Mail:  flast@somewhere.lsu.edu
    Affil: First Last
    Category:
    Name:  First Last
    Mail:  flast@somewhere.lsu.edu
    Affil: First Last
    Category: validation:current:09/03/2013
    Allocations:
    hpc_alloc02,flast,383311.16,
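
The per-node and per-core figures in the message can be cross-checked against the `key=value` summary lines. The sketch below is one way to do so, not the monitoring tool's own code: the helper `parse_summary` is hypothetical, and the arithmetic assumes that the averages are simple divisions by the node count and by `ppn`.

```python
def parse_summary(text: str) -> dict:
    """Collect key=value tokens from the job summary lines into a dict."""
    fields = {}
    for token in text.split():
        key, sep, value = token.partition("=")
        if sep:  # skip tokens that carry no '='
            fields[key] = value
    return fields


# Three summary lines copied from the example message above.
SAMPLE = """\
PBS_job=432234.mike3 user=flast allocation=hpc_alloc02
queue=checkpt total_load=25.15 cpu_hours=190.05 wall_hours=8.90
unused_nodes=2 total_nodes=4 ppn=16 avg_load=6.28 avg_cpu=616%
"""

fields = parse_summary(SAMPLE)
total_nodes = int(fields["total_nodes"])
ppn = int(fields["ppn"])
unused = int(fields["unused_nodes"])

total_cores = total_nodes * ppn                             # 4 * 16 = 64
avg_load = float(fields["total_load"]) / total_nodes        # 25.15 / 4
cpu_per_core = float(fields["avg_cpu"].rstrip("%")) / ppn   # 616 / 16 = 38.5
unused_fraction = unused / total_nodes                      # 2 / 4 = 0.5

print(f"cores={total_cores} avg_load={avg_load:.4f} "
      f"cpu_per_core={cpu_per_core:.1f}% unused={unused_fraction:.0%}")
```

The computed values (64 cores, a load of about 6.2875 per node, 38.5% CPU per core) are consistent with the reported 64, 6.28, and 38% if the report truncates rather than rounds. Half of the nodes carrying no load also fits the picture of a job that started fewer processes than the allocation can hold.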

Users may direct questions to sys-help@loni.org.