This repository has been archived by the owner on Oct 11, 2024. It is now read-only.

update workflows to use generated whls #204

Merged
merged 40 commits on May 3, 2024
Commits (40)
062a8c5  iteration 1 (Apr 23, 2024)
0d5a9e6  here goes (Apr 23, 2024)
e22b518  gg (Apr 23, 2024)
453e6f6  here we go again (Apr 23, 2024)
e697076  okay (Apr 23, 2024)
765f56d  clean up some cruft (Apr 23, 2024)
f566a67  making pyenv virutalenv robust (Apr 24, 2024)
9d0b2cb  Merge remote-tracking branch 'origin/main' into redo-nightly (Apr 30, 2024)
84a3dd5  update build-test (Apr 30, 2024)
eab4ee3  Add benchmarking input to build-test workflow (dbarbuzzi, Apr 30, 2024)
73ed882  Update remote-push workflow to call build-test (dbarbuzzi, Apr 30, 2024)
2adff16  Update TEST-MULTI to only run on nightly/release (dbarbuzzi, Apr 30, 2024)
c984364  Add missing params in remote-push (dbarbuzzi, Apr 30, 2024)
2376553  Add default for "push_benchmark[…]" input (dbarbuzzi, Apr 30, 2024)
fa9a69c  Update nightly workflow to call build-test (dbarbuzzi, Apr 30, 2024)
036362d  Add temp debug info (dbarbuzzi, Apr 30, 2024)
90773e1  Try alternate expression for conditional (dbarbuzzi, Apr 30, 2024)
4d06537  Update conditional expression (dbarbuzzi, Apr 30, 2024)
e51df10  Remove temp DEBUG job (dbarbuzzi, Apr 30, 2024)
b09b653  Update `uses` in nightly job (dbarbuzzi, Apr 30, 2024)
c1f4ff3  Add publish job to build-test workflow (dbarbuzzi, Apr 30, 2024)
cc2b70f  Update build-test to use benchmark config input (dbarbuzzi, May 1, 2024)
0dd7c07  Drop lm-eval accuracy job (dbarbuzzi, May 1, 2024)
4bc5d11  Rename 'vllm' folder in benchmark script (dbarbuzzi, May 1, 2024)
a7bda4f  Increase logging in benchmark script (dbarbuzzi, May 1, 2024)
4beb1ab  Move folder rename to correct location (dbarbuzzi, May 1, 2024)
a126dd7  Remove conditional (dbarbuzzi, May 1, 2024)
a4993be  Don't fail if folder is already moved (dbarbuzzi, May 1, 2024)
4246779  Restore correct skip list for remote-push (dbarbuzzi, May 2, 2024)
3b99796  List vllm package files (dbarbuzzi, May 2, 2024)
5a30cd2  fix benchmark ls (dbarbuzzi, May 2, 2024)
db1aea9  Install magic wand after wheel (dbarbuzzi, May 2, 2024)
31ba163  Show python binaries being used (dbarbuzzi, May 2, 2024)
9f1ecfd  Merge branch 'main' into redo-nightly (dbarbuzzi, May 2, 2024)
6cae967  Install nm-magic-wand as an extra (dbarbuzzi, May 2, 2024)
b30e6df  some minor patches (May 2, 2024)
2391249  Update benchmark scripts for upstream changes (dbarbuzzi, May 2, 2024)
443bd15  Remove debug logging (dbarbuzzi, May 2, 2024)
df6e969  Restore correct nightly skip list (dbarbuzzi, May 2, 2024)
b6889fd  Fix style issue (dbarbuzzi, May 2, 2024)
3 changes: 3 additions & 0 deletions .github/actions/nm-benchmark/action.yml
@@ -19,6 +19,9 @@ runs:
- id: benchmark
run: |
mkdir -p ${{ inputs.output_directory }}
# move source directories
mv vllm vllm-ignore || echo "no 'vllm' folder to move"
mv csrc csrc-ignore || echo "no 'csrc' folder to move"
COMMIT=${{ github.sha }}
VENV="${{ inputs.venv }}-${COMMIT:0:7}"
source $(pyenv root)/versions/${{ inputs.python }}/envs/${VENV}/bin/activate
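The renames above exist because a `vllm/` directory in the working tree can shadow the wheel-installed package when Python runs from the repo root. A minimal sketch of how one might confirm which copy gets imported; this heredoc check is an illustration, not part of the workflow:

```bash
# Run from the repo root after the wheel is installed into the active venv.
# If the printed path points into ./vllm rather than site-packages, the
# source tree is shadowing the installed wheel.
python3 - <<'EOF'
import vllm
print(vllm.__file__)
EOF
```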
6 changes: 2 additions & 4 deletions .github/actions/nm-install-test-whl/action.yml
@@ -44,14 +44,12 @@ runs:
pip3 install coverage
pip3 install pytest-cov
pip3 install pytest-xdist
pip3 install --index-url http://${{ inputs.pypi }}:8080/ --trusted-host ${{ inputs.pypi }} nm-magic-wand-nightly
pip3 list
pip3 install -r requirements-dev.txt
BASE=$(./.github/scripts/convert-version ${{ inputs.python }})
WHL=$(find . -type f -iname "*${BASE}*.whl")
WHL_BASENAME=$(basename ${WHL})
echo "whl=${WHL_BASENAME}" >> "$GITHUB_OUTPUT"
pip3 install ${WHL}
pip3 install -r requirements-dev.txt
pip3 install ${WHL}[sparse]
# report magic_wand version
MAGIC_WAND=$(pip3 show nm-magic-wand-nightly | grep "Version" | cut -d' ' -f2)
echo "magic_wand=${MAGIC_WAND}" >> "$GITHUB_OUTPUT"
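Note the removed `--index-url` install of `nm-magic-wand-nightly`: the wheel is now installed with a `[sparse]` extra, so the dependency resolves from the wheel's own metadata (see the "Install nm-magic-wand as an extra" commit). A sketch, assuming the built wheel declares a `sparse` extra that requires `nm-magic-wand-nightly`; the `cp310` fragment is illustrative:

```bash
# Assumption: the vllm wheel's metadata defines an extra named 'sparse'
# that pulls in nm-magic-wand-nightly.
WHL=$(find . -type f -iname "*cp310*.whl")
pip3 install "${WHL}[sparse]"    # quotes keep the brackets from shell globbing
pip3 show nm-magic-wand-nightly | grep "Version"   # confirm the extra resolved
```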
27 changes: 27 additions & 0 deletions .github/actions/nm-install-whl/action.yml
@@ -0,0 +1,27 @@
name: install whl
description: 'installs found whl based on python version into specified venv'
inputs:
python:
description: 'python version, e.g. 3.10.12'
required: true
venv:
description: 'name for python virtual environment'
required: true
runs:
using: composite
steps:
- id: install_whl
run: |
# move source directories
mv vllm vllm-ignore
mv csrc csrc-ignore
# activate and install
COMMIT=${{ github.sha }}
VENV="${{ env.VENV_BASE }}-${COMMIT:0:7}"
source $(pyenv root)/versions/${{ inputs.python }}/envs/${VENV}/bin/activate
pip3 install -r requirements-dev.txt
BASE=$(./.github/scripts/convert-version ${{ inputs.python }})
WHL=$(find . -type f -iname "*${BASE}*.whl")
WHL_BASENAME=$(basename ${WHL})
pip3 install ${WHL}[sparse]
shell: bash
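Both install actions derive the wheel-matching fragment from the Python version via `./.github/scripts/convert-version`, which this diff does not include. A hypothetical stand-in, assuming the script maps e.g. `3.10.12` to the CPython tag `cp310`:

```bash
# Hypothetical equivalent of ./.github/scripts/convert-version (the real
# script is not shown in this PR): drop the patch level, join major.minor,
# and prefix the CPython ABI tag.
python_version="3.10.12"
BASE="cp$(echo "${python_version}" | cut -d. -f1-2 | tr -d '.')"
echo "${BASE}"                          # cp310
find . -type f -iname "*${BASE}*.whl"   # e.g. vllm-*-cp310-cp310-linux_x86_64.whl
```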
2 changes: 1 addition & 1 deletion .github/actions/nm-set-python/action.yml
@@ -20,7 +20,7 @@ runs:
pyenv local ${{ inputs.python }}
COMMIT=${{ github.sha }}
VENV="${{ inputs.venv }}-${COMMIT:0:7}"
pyenv virtualenv ${VENV} || true
pyenv virtualenv --force ${VENV}
source $(pyenv root)/versions/${{ inputs.python }}/envs/${VENV}/bin/activate
VERSION=$(python --version)
echo "version=${VERSION}" >> "$GITHUB_OUTPUT"
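The `|| true` fallback tolerated an existing virtualenv (and masked real failures), while `--force` recreates it so each run starts clean. A small sketch of the difference, assuming pyenv with the pyenv-virtualenv plugin; the env name and version here are illustrative, and the version is passed explicitly to keep the example self-contained:

```bash
# Old behavior: creation fails if 'demo-venv' already exists, and '|| true'
# swallows the error, leaving whatever stale environment was there.
pyenv virtualenv 3.10.12 demo-venv || true

# New behavior: '--force' proceeds even when 'demo-venv' exists, rebuilding
# it so the job does not inherit packages from a previous run.
pyenv virtualenv --force 3.10.12 demo-venv
```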
6 changes: 3 additions & 3 deletions .github/scripts/nm-run-benchmarks.sh
@@ -3,18 +3,18 @@

set -e
set -u

if [ $# -ne 2 ];
then
echo "run_benchmarks needs exactly 2 arguments: "
echo " 1. Path to a .txt file containing the list of benchmark config paths"
echo " 2. The output path to store the benchmark results"
exit 1
fi

benchmark_config_list_file=$1
output_directory=$2

for bench_config in `cat $benchmark_config_list_file`
do
echo "Running benchmarks for config " $bench_config
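For reference, the runner takes a config-list file and an output directory, which is how the nm-benchmark action invokes it. An illustrative call using the nightly config list referenced later in this diff; the results directory name is a placeholder:

```bash
# Illustrative invocation of the benchmark runner. The config list path
# appears in the nightly workflow; './benchmark-results' is made up here.
./.github/scripts/nm-run-benchmarks.sh \
  ./.github/data/nm_benchmark_nightly_configs_list.txt \
  ./benchmark-results
```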
143 changes: 121 additions & 22 deletions .github/workflows/build-test.yml
@@ -3,34 +3,69 @@ on:
# makes workflow reusable
workflow_call:
inputs:
build_label:
description: "requested runner label (specifies instance)"
wf_category:
description: "categories: REMOTE, NIGHTLY, RELEASE"
type: string
required: true
timeout:
description: "time limit for run in minutes "
default: "REMOTE"
python:
description: "python version, e.g. 3.10.12"
type: string
required: true
gitref:
description: "git commit hash or branch name"
# build related parameters
build_label:
description: "requested runner label (specifies instance)"
type: string
required: true
default: "gcp-build-static"
build_timeout:
description: "time limit for build in minutes "
type: string
default: "60"
Gi_per_thread:
description: 'requested GiB to reserve per thread'
type: string
required: true
default: "1"
nvcc_threads:
description: "number of threads nvcc build threads"
type: string
default: "4"
# test related parameters
test_label_solo:
description: "requested runner label (specifies instance)"
type: string
required: true
python:
description: "python version, e.g. 3.10.12"
test_label_multi:
description: "requested runner label (specifies instance)"
type: string
required: true
test_timeout:
description: "time limit for test run in minutes "
type: string
required: true
gitref:
description: "git commit hash or branch name"
type: string
required: true
test_skip_list:
description: 'file containing tests to skip'
type: string
required: true
# benchmark related parameters
benchmark_label:
description: "requested benchmark label (specifies instance)"
type: string
default: ""
benchmark_config_list_file:
description: "benchmark configs file, e.g. 'nm_benchmark_nightly_configs_list.txt'"
type: string
required: true
benchmark_timeout:
description: "time limit for benchmarking"
type: string
default: "720"
push_benchmark_results_to_gh_pages:
description: "When set to true, the workflow pushes all benchmarking results to gh-pages UI"
type: string
default: "false"

# makes workflow manually callable
workflow_dispatch:
@@ -39,8 +74,20 @@ on:
description: "requested runner label (specifies instance)"
type: string
required: true
timeout:
description: "time limit for run in minutes "
build_timeout:
description: "time limit for build in minutes "
type: string
required: true
test_label_solo:
description: "requested runner label (specifies instance)"
type: string
required: true
test_label_multi:
description: "requested runner label (specifies instance)"
type: string
required: true
test_timeout:
description: "time limit for test run in minutes "
type: string
required: true
gitref:
@@ -70,25 +117,77 @@ jobs:
uses: ./.github/workflows/build.yml
with:
build_label: ${{ inputs.build_label }}
timeout: ${{ inputs.timeout }}
gitref: ${{ inputs.gitref }}
timeout: ${{ inputs.build_timeout }}
gitref: ${{ github.ref }}
Gi_per_thread: ${{ inputs.Gi_per_thread }}
nvcc_threads: ${{ inputs.nvcc_threads }}
python: ${{ inputs.python }}
secrets: inherit

TEST:
TEST-SOLO:
needs: [BUILD]
if: success()
strategy:
matrix:
test_label: [aws-avx2-192G-4-a10g-96G]
uses: ./.github/workflows/test.yml
with:
test_label: ${{ matrix.test_label }}
timeout: ${{ inputs.timeout }}
gitref: ${{ inputs.gitref }}
test_label: ${{ inputs.test_label_solo }}
timeout: ${{ inputs.test_timeout }}
gitref: ${{ github.ref }}
python: ${{ inputs.python }}
whl: ${{ needs.BUILD.outputs.whl }}
test_skip_list: ${{ inputs.test_skip_list }}
secrets: inherit

TEST-MULTI:
needs: [BUILD]
if: success() && contains(fromJSON('["NIGHTLY", "RELEASE"]'), inputs.wf_category)
uses: ./.github/workflows/test.yml
with:
test_label: ${{ inputs.test_label_multi }}
timeout: ${{ inputs.test_timeout }}
gitref: ${{ github.ref }}
python: ${{ inputs.python }}
whl: ${{ needs.BUILD.outputs.whl }}
test_skip_list: ${{ inputs.test_skip_list }}
secrets: inherit

PUBLISH:
needs: [TEST-SOLO, TEST-MULTI]
uses: ./.github/workflows/nm-publish.yml
with:
label: ${{ inputs.build_label }}
timeout: ${{ inputs.build_timeout }}
gitref: ${{ github.ref }}
python: ${{ inputs.python }}
whl: ${{ needs.BUILD.outputs.whl }}
tarfile: ${{ needs.BUILD.outputs.tarfile }}
secrets: inherit

BENCHMARK:
needs: [BUILD]
if: success()
uses: ./.github/workflows/nm-benchmark.yml
with:
label: ${{ inputs.test_label_solo }}
benchmark_config_list_file: ${{ inputs.benchmark_config_list_file }}
timeout: ${{ inputs.benchmark_timeout }}
gitref: ${{ github.ref }}
python: ${{ inputs.python }}
whl: ${{ needs.BUILD.outputs.whl }}
# Always push if it is a scheduled job
push_benchmark_results_to_gh_pages: "${{ github.event_name == 'schedule' || inputs.push_benchmark_results_to_gh_pages }}"
secrets: inherit

# TODO: decide if this should build or use the whl
# single gpu
# TODO: this should only run if doing a NIGHTLY or RELEASE
# Accuracy-Smoke-AWS-AVX2-32G-A10G-24G:
# if: ${{ inputs.wf_category == 'NIGHTLY' || inputs.wf_category == 'RELEASE' }}
# uses: ./.github/workflows/nm-lm-eval-smoke.yml
# with:
# label: ${{ inputs.test_label_solo }}
# timeout: ${{ inputs.benchmark_timeout }}
# gitref: ${{ github.ref }}
# Gi_per_thread: ${{ inputs.Gi_per_thread }}
# nvcc_threads: ${{ inputs.nvcc_threads }}
# python: ${{ inputs.python }}
# secrets: inherit
1 change: 1 addition & 0 deletions .github/workflows/build.yml
@@ -67,6 +67,7 @@ jobs:
timeout-minutes: ${{ fromJson(inputs.timeout) }}
outputs:
whl: ${{ steps.build.outputs.whl }}
tarfile: ${{ steps.build.outputs.tarfile }}

steps:

67 changes: 17 additions & 50 deletions .github/workflows/nightly.yml
@@ -6,64 +6,31 @@ on:
- cron: '0 1 * * *'

workflow_dispatch:
inputs:
push_benchmark_results_to_gh_pages:
description: "When set to true, the workflow pushes all benchmarking results to gh-pages UI "
type: choice
options:
- 'true'
- 'false'
default: 'false'
inputs:
push_benchmark_results_to_gh_pages:
description: "When set to true, the workflow pushes all benchmarking results to gh-pages UI "
type: choice
options:
- 'true'
- 'false'
default: 'false'

jobs:

NIGHTLY-MULTI:
BUILD-TEST:
uses: ./.github/workflows/build-test.yml
with:
build_label: aws-avx2-192G-4-a10g-96G
timeout: 480
gitref: ${{ github.ref }}
Gi_per_thread: 4
nvcc_threads: 8
wf_category: NIGHTLY
python: 3.10.12
test_skip_list:
secrets: inherit

NIGHTLY-SOLO:
uses: ./.github/workflows/build-test.yml
with:
build_label: aws-avx2-32G-a10g-24G
timeout: 480
gitref: ${{ github.ref }}
Gi_per_thread: 12
nvcc_threads: 1
python: 3.11.4

test_label_solo: aws-avx2-32G-a10g-24G
test_label_multi: aws-avx2-192G-4-a10g-96G
test_timeout: 480
test_skip_list:
secrets: inherit

# single gpu
AWS-AVX2-32G-A10G-24G-Benchmark:
uses: ./.github/workflows/nm-benchmark.yml
with:
label: aws-avx2-32G-a10g-24G
benchmark_config_list_file: ./.github/data/nm_benchmark_nightly_configs_list.txt
timeout: 720
gitref: '${{ github.ref }}'
Gi_per_thread: 12
nvcc_threads: 1
python: "3.10.12"
# Always push if it is a scheduled job
benchmark_label: aws-avx2-32G-a10g-24G
benchmark_config_list_file: ./.github/data/nm_benchmark_nightly_configs_list.txt
benchmark_timeout: 720
push_benchmark_results_to_gh_pages: "${{ github.event_name == 'schedule' || inputs.push_benchmark_results_to_gh_pages }}"
secrets: inherit

# single gpu
Accuracy-Smoke-AWS-AVX2-32G-A10G-24G:
uses: ./.github/workflows/nm-lm-eval-smoke.yml
with:
label: aws-avx2-32G-a10g-24G
timeout: 240
gitref: '${{ github.ref }}'
Gi_per_thread: 12
nvcc_threads: 1
python: "3.10.12"
secrets: inherit