Description
TL;DR
I am using the valid_systems = [r'%scheduler=slurm', r'%scheduler=squeue'] syntax described in the documentation to select which tests to run. However, ReFrame selects only the tests that match the first partition in the partitions list defined in site_configuration.
ReFrame version
4.8.0-dev.3+cf670fe1 (latest as of today)
Steps to reproduce the error
1. Create system configuration files
Create a configuration file for a given cluster that defines two partitions with different schedulers. In my case, I have used local and slurm.
Copy the file to a second one and, in the copy, swap the order of the partitions inside the partitions key of the systems list.
I have used the two configuration files below.
File daint-local-partition-first-config.py:
site_configuration = {
    'systems': [
        {
            'name': 'daint',
            'descr': 'Piz Daint vCluster',
            'hostnames': ['daint'],
            'partitions': [
                {
                    'name': 'login',
                    'scheduler': 'local',
                    'time_limit': '10m',
                    'environs': [
                        'builtin',
                    ],
                    'descr': 'Login nodes',
                    'max_jobs': 4,
                    'launcher': 'local'
                },
                {
                    'name': 'normal',
                    'descr': 'GH200',
                    'scheduler': 'slurm',
                    'time_limit': '10m',
                    'environs': [
                        'builtin',
                    ],
                    'max_jobs': 100,
                    'launcher': 'srun',
                },
            ]
        }
    ],
}
File daint-slurm-partition-first-conf.py:
site_configuration = {
    'systems': [
        {
            'name': 'daint',
            'descr': 'Piz Daint vCluster',
            'hostnames': ['daint'],
            'partitions': [
                {
                    'name': 'normal',
                    'descr': 'GH200',
                    'scheduler': 'slurm',
                    'time_limit': '10m',
                    'environs': [
                        'builtin',
                    ],
                    'max_jobs': 100,
                    'launcher': 'srun',
                },
                {
                    'name': 'login',
                    'scheduler': 'local',
                    'time_limit': '10m',
                    'environs': [
                        'builtin',
                    ],
                    'descr': 'Login nodes',
                    'max_jobs': 4,
                    'launcher': 'local'
                },
            ]
        }
    ],
}
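Since the two files differ only in the order of the partitions list, the second configuration could also be derived from the first one programmatically. The following is a minimal sketch under that assumption; the helper name reverse_partitions is hypothetical (it is not part of ReFrame), and the dictionary is a trimmed copy of the configuration above that keeps only the keys needed for the illustration.

```python
import copy

# Trimmed-down copy of the configuration above (illustration only).
site_configuration = {
    'systems': [
        {
            'name': 'daint',
            'hostnames': ['daint'],
            'partitions': [
                {'name': 'login', 'scheduler': 'local',
                 'launcher': 'local', 'environs': ['builtin']},
                {'name': 'normal', 'scheduler': 'slurm',
                 'launcher': 'srun', 'environs': ['builtin']},
            ],
        }
    ],
}


def reverse_partitions(config):
    """Hypothetical helper: return a deep copy of `config` in which the
    partitions of every system appear in reverse order."""
    swapped = copy.deepcopy(config)
    for system in swapped['systems']:
        system['partitions'].reverse()
    return swapped


slurm_first = reverse_partitions(site_configuration)
print([p['name'] for p in slurm_first['systems'][0]['partitions']])
```

The deep copy leaves the original dictionary untouched, so both orderings can be kept side by side when reproducing the problem.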
2. Define two tests
One test should set valid_systems = [r'%scheduler=slurm'] and the other valid_systems = [r'%scheduler=local'].
I am using the following two tests.
import os

import reframe as rfm
import reframe.utility.sanity as sn

SLEEPCMD = '/bin/sleep'


@rfm.simple_test
class sleep_submit_job_check(rfm.RunOnlyRegressionTest):
    executable = SLEEPCMD
    # run only when slurm is the workload manager
    valid_systems = [r'%scheduler=slurm']
    valid_prog_environs = ['builtin']
    executable_opts = ['1']

    @sanity_function
    def assert_sanity(self):
        return True


@rfm.simple_test
class sleep_local_job_check(rfm.RunOnlyRegressionTest):
    executable = SLEEPCMD
    # run in the local scheduler
    valid_systems = [r'%scheduler=local']
    valid_prog_environs = ['builtin']
    executable_opts = ['1']

    @sanity_function
    def assert_sanity(self):
        return sn.all([
            sn.assert_eq(os.stat(sn.evaluate(self.stdout)).st_size, 0,
                         msg=f'file {self.stdout} is not empty'),
            sn.assert_eq(os.stat(sn.evaluate(self.stderr)).st_size, 0,
                         msg=f'file {self.stderr} is not empty'),
        ])
3. The output
When the local partition is defined as the first entry, ReFrame selects only the test that sets valid_systems = [r'%scheduler=local'].
$ reframe -C daint-local-partition-first-config.py -c mini-reproducer.py -l
[ReFrame Setup]
version: 4.8.0-dev.3+cf670fe1
...
[List of matched checks]
- sleep_local_job_check /7370cc85
Found 1 check(s)
...
When the slurm partition is defined as the first entry, ReFrame selects only the test that sets valid_systems = [r'%scheduler=slurm'].
$ reframe -C daint-slurm-partition-first-conf.py -c mini-reproducer.py -l
[ReFrame Setup]
version: 4.8.0-dev.3+cf670fe1
...
[List of matched checks]
- sleep_submit_job_check /4d2777d3
Found 1 check(s)
....
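To compare the two listings without reading them by eye, the check names can be extracted from the output of reframe -l. This is a rough sketch that assumes the listing format shown above (one "- name /hash" line per check); the function name listed_checks is hypothetical, and the two strings are the relevant lines copied from the outputs in this report.

```python
import re

# One "- <check name> /<hash>" line per listed check, as in the output above.
CHECK_LINE = re.compile(r'^\s*-\s+(\S+)\s+/([0-9a-f]+)\s*$', re.MULTILINE)


def listed_checks(output):
    """Hypothetical helper: return the set of check names found in the
    text printed by `reframe -l`."""
    return {name for name, _hash in CHECK_LINE.findall(output)}


local_first_output = """\
[List of matched checks]
- sleep_local_job_check /7370cc85
Found 1 check(s)
"""

slurm_first_output = """\
[List of matched checks]
- sleep_submit_job_check /4d2777d3
Found 1 check(s)
"""

local_first = listed_checks(local_first_output)
slurm_first = listed_checks(slurm_first_output)

# The two configurations describe the same partitions, so the sets
# should be equal; the symmetric difference shows what differs.
print(local_first == slurm_first)
print(sorted(local_first ^ slurm_first))
```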
4. The expected output
ReFrame should select both tests regardless of the order of the partitions in the site_configuration variable.
Thus, this command
$ reframe -C daint-local-partition-first-config.py -c mini-reproducer.py -l
[ReFrame Setup]
version: 4.8.0-dev.3+cf670fe1
...
[List of matched checks]
- sleep_local_job_check /7370cc85
- sleep_submit_job_check /4d2777d3
Found 2 check(s)
...
should have the same output as below.
$ reframe -C daint-slurm-partition-first-conf.py -c mini-reproducer.py -l
[ReFrame Setup]
version: 4.8.0-dev.3+cf670fe1
...
[List of matched checks]
- sleep_local_job_check /7370cc85
- sleep_submit_job_check /4d2777d3
Found 2 check(s)
....
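The behaviour I expect can be summarised as "a test is selected if any partition satisfies its valid_systems entry". The sketch below is a simplified model of that rule, not ReFrame's actual implementation: it only handles the single %key=value form used in this report, and the names parse_feature_spec and is_selected are hypothetical.

```python
def parse_feature_spec(spec):
    """Split a '%key=value' entry into a (key, value) pair.

    Simplified on purpose: only the form used in this report is handled.
    """
    if not spec.startswith('%') or '=' not in spec:
        raise ValueError(f'unsupported spec: {spec!r}')
    key, value = spec[1:].split('=', 1)
    return key, value


def is_selected(valid_systems, partitions):
    """Return True if any partition satisfies any valid_systems entry."""
    for spec in valid_systems:
        key, value = parse_feature_spec(spec)
        if any(str(part.get(key)) == value for part in partitions):
            return True
    return False


# Only the partition keys relevant to the selection are kept here.
local_first = [
    {'name': 'login', 'scheduler': 'local'},
    {'name': 'normal', 'scheduler': 'slurm'},
]
slurm_first = list(reversed(local_first))

for partitions in (local_first, slurm_first):
    print(is_selected([r'%scheduler=slurm'], partitions),
          is_selected([r'%scheduler=local'], partitions))
```

Under this model both tests are selected for either ordering, which is the order-independent result described above.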