@smoors smoors commented Jan 3, 2026

this runs a short test in a prerun cmd to get the process binding, which is checked with the check_process_binding.py script.

fixes #307

the results are written into the job error file.

the test currently doesn't fail on binding error, as we don't yet have a bullet-proof solution for setting the binding in all cases (see also the discussion in #305).

so for now, both the errors and the warnings are printed as warnings on screen; sanity checks that fail the test can be added in a follow-up PR.

example:

PROCESS BINDING ERROR: wrong number of processes: expected 3, found 5
PROCESS BINDING ERROR: wrong number of cpus per process: expected 4, found Counter({2: 5})
PROCESS BINDING WARNING: processes spanning multiple packages: Counter({2: 1})
PROCESS BINDING WARNING: processes spanning multiple numanodes: Counter({2: 3})
PROCESS BINDING WARNING: processes with cores shared by processing units, indicating hyperthreading: Counter({2: 2}),
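for readers who want a feel for what such checks involve, here is a minimal sketch that produces messages of the same shape as the example above. it is not the code of check_process_binding.py: the `PU` record, the `check_binding` function and the reading of each Counter as "value -> number of processes" are assumptions made for illustration only.

```python
from collections import Counter
from typing import NamedTuple


class PU(NamedTuple):
    """Hypothetical record for one processing unit a process is bound to."""
    package: int
    numanode: int
    core: int
    pu: int


def check_binding(procs, expected_procs, expected_cpus):
    """Return (errors, warnings) for a list of per-process PU lists."""
    errors, warnings = [], []

    if len(procs) != expected_procs:
        errors.append(
            f"wrong number of processes: expected {expected_procs}, "
            f"found {len(procs)}"
        )

    # Histogram: number of bound PUs -> number of processes with that width.
    cpus = Counter(len(pus) for pus in procs)
    if any(n != expected_cpus for n in cpus):
        errors.append(
            f"wrong number of cpus per process: expected {expected_cpus}, "
            f"found {cpus}"
        )

    # Histogram: number of distinct packages/numanodes -> number of processes,
    # keeping only processes that span more than one.
    for attr, label in (("package", "packages"), ("numanode", "numanodes")):
        distinct = (len({getattr(pu, attr) for pu in pus}) for pus in procs)
        spanning = Counter(n for n in distinct if n > 1)
        if spanning:
            warnings.append(f"processes spanning multiple {label}: {spanning}")

    # A core that hosts more than one bound PU suggests hyperthreading.
    shared = Counter()
    for pus in procs:
        per_core = Counter((pu.package, pu.core) for pu in pus)
        most = max(per_core.values(), default=0)
        if most > 1:
            shared[most] += 1
    if shared:
        warnings.append(
            "processes with cores shared by processing units, "
            f"indicating hyperthreading: {shared}"
        )

    return errors, warnings


if __name__ == "__main__":
    # Two processes; the second one straddles two packages and numanodes.
    demo = [
        [PU(0, 0, 0, 0), PU(0, 0, 1, 1)],
        [PU(0, 0, 2, 2), PU(1, 1, 0, 8)],
    ]
    errs, warns = check_binding(demo, expected_procs=2, expected_cpus=2)
    for msg in errs:
        print(f"PROCESS BINDING ERROR: {msg}")
    for msg in warns:
        print(f"PROCESS BINDING WARNING: {msg}")
```

returning the errors and warnings as separate lists keeps the policy (warn only for now, fail later) out of the checking logic, so a follow-up could turn the errors into sanity failures without touching the checks themselves.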

note: i was able to get the correct launcher run command by updating the job resources in the assign_tasks_per_compute_unit function. this also allowed simplifying the openfoam test and making it more robust.

Development

Successfully merging this pull request may close these issues.

Amend mixin class to run with mpirun ... --report-bindings and do a sanity check on the binding
