
Test queues #1172

Open
matt-chan opened this issue Oct 28, 2022 · 6 comments · May be fixed by #1680
@matt-chan
Contributor

In what area(s)?

/area administration
/area ansible
/area autoscaling
/area configuration
/area cyclecloud
/area documentation
/area image
/area job-scheduling
/area monitoring
/area ood
/area remote-visualization
/area user-management

Describe the feature

Hi Xavier,

It would be great if we could set a few test queues in azhop. This would let our users run quick jobs without having to wait for node spinup time.

Currently, I'm approximating the behavior by setting a large idle time on some queues, but it would be nice to have a setting that actually keeps the nodes alive forever, using the Slurm setting here: https://learn.microsoft.com/en-us/azure/cyclecloud/slurm?view=cyclecloud-8#excluding-a-partition. Another common feature of these test queues is a short job time limit. I don't see a way to set this from CycleCloud right now, even though it is in /etc/slurm/cyclecloud.conf.
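
For illustration, the kind of per-partition limit I mean looks like this in plain slurm.conf (azhop generates the real partition lines in /etc/slurm/cyclecloud.conf, so this is just a sketch with made-up names):

    # Illustrative only: a test partition whose jobs are capped at 30 minutes.
    PartitionName=test Nodes=test-[1-4] MaxTime=00:30:00 State=UP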

Thanks!
Matt

@matt-chan matt-chan added the kind/feature New feature request label Oct 28, 2022
@xpillons
Collaborator

@matt-chan instead of excluding nodes from the partition, I'm thinking of having a parameter to define how many cores/VMs should always be on for each queue/partition.

@matt-chan
Contributor Author

Hi Xavier, yes, I think that behavior would be best if we could achieve it, but I'm not certain it's possible. I originally tried to make a PR before making this feature request, but I couldn't figure out how to do it. I'm not sure CycleCloud and Slurm have that functionality.

Your team is definitely better at this stuff than I am. If you can figure it out, it would be a great feature! Just to make sure we're on the same page: it's the number of idle VMs we want to keep in each queue, right? So if there are 5 jobs and the idle setting is 2 VMs, there should be 7 VMs running in total?

@xpillons
Collaborator

xpillons commented Nov 2, 2022

@matt-chan the way it works is that it will always keep x nodes running. If they are filled by jobs, then new nodes will be added up to the quota defined for that queue/partition.
I'm afraid that always having y extra nodes above the allocated ones is not possible today.

@ltalirz
Contributor

ltalirz commented Mar 16, 2023

@xpillons I have now implemented a simple solution for this.
The following script is run as a cron job every 5 minutes on weekdays (I have it on the ondemand VM, but I guess it should move to the scheduler VM). I think it is self-explanatory:

#!/bin/bash
# Usage: ./warmup-queues.sh viz hb2la
set -e

# SLURM node states & state flags on AZ-HOP

# idle   VM allocated and idling
# idle~  VM not allocated from Azure
# idle#  VM being allocated from Azure
# idle%  VM being powered down
# mix    Some CPUs allocated but not all

for queue in "$@"; do
  # Count usable nodes, excluding the ~/#/% power-state flags.
  # wc -l (rather than grep -c) keeps the exit status 0 under set -e
  # when the count is zero.
  available=$(sinfo -p "$queue" --states=mix,idle --noheader | grep -Ev 'idle[~#%]' | wc -l)
  allocating=$(sinfo -p "$queue" --states=idle --noheader | grep 'idle#' | wc -l)

  if [[ $available == 0 && $allocating == 0 ]]; then
    echo "Allocating 1 node on queue $queue"
    # Request an interactive job to trigger node allocation, then kill the
    # srun client; the node keeps being provisioned regardless.
    srun --partition "$queue" bash > /dev/null 2>&1 &
    PID=$!
    sleep 2
    set +e
    kill $PID
    set -e
  elif [[ $available -gt 0 ]]; then
    # "touch" one available node so that it won't be deallocated by slurm
    # after the idle timeout ("exit" is a shell builtin, so run a real
    # command that srun can exec).
    set +e
    srun --partition "$queue" true > /dev/null 2>&1 &
    set -e
  fi
done

The admin can set a warmup field on any queue in config.yml. These queues are passed as arguments to the cronjob.
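
For illustration, a queue entry might then look like this (the warmup flag is the new field; the rest of the queue definition is elided):

    queues:
      - name: viz
        warmup: true   # keep one node of this queue warm at all times
      - name: hb2la
        warmup: true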

Let me know if you are interested in a PR for this

P.S. This creates one extra job every 5 minutes per queue. There may be more "official" ways of doing this via the Slurm power-save config (https://slurm.schedmd.com/power_save.html#config), but this already does the job.
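
For reference, the power-save knobs described there look roughly like this in slurm.conf (the partition/node names are just placeholders; untested on azhop):

    SuspendTime=600           # seconds a node may idle before being powered down
    SuspendExcParts=viz       # partitions whose nodes are never powered down
    SuspendExcNodes=hb2la-1   # individual nodes that are never powered down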

@xpillons
Collaborator

@ltalirz sounds like a great start. It needs to run on the scheduler. Also, ideally it should read the config file and pick up the partition names and the number of nodes to allocate.

@ltalirz
Contributor

ltalirz commented Mar 16, 2023

Also ideally it should read the config file and pickup partition names

This is already how it works; the cronjob is

    - name: set up cronjob for queue warmup
      cron:
        name: "queue-warmup"
        job: "/usr/local/sbin/queue-warmup.sh {{ warmup_queues | map(attribute='name') | join(' ') }}"
        minute: "*/5"
        weekday: 1-5
        user: "root"
        state: "present"
      vars:
        warmup_queues: "{{ queues | selectattr('warmup', 'defined') | selectattr('warmup', 'equalto', true) }}"

Keeping more than one warm node will require some modifications (more nodes need to be touched), but should be doable I guess. In practice, one idling node (at all times) is already a great improvement in user experience and often all you need.
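
Untested, but the multi-node "touch" might be as simple as asking srun for several nodes at once (warm_count being a hypothetical new config field):

    # Hypothetical sketch: srun runs one task per node when only --nodes is
    # given, so this resets the idle timer on $warm_count nodes in one job.
    srun --partition "$queue" --nodes "$warm_count" true > /dev/null 2>&1 &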

@ltalirz ltalirz linked a pull request Sep 12, 2023 that will close this issue