Merge branch 'main' into recordset-docstrings-fix
Showing 46 changed files with 1,266 additions and 513 deletions.
@@ -0,0 +1,69 @@
name: Build Docker Images Main Branch

on:
  push:
    branches:
      - 'main'

jobs:
  parameters:
    if: github.repository == 'adap/flower'
    name: Collect docker build parameters
    runs-on: ubuntu-22.04
    timeout-minutes: 10
    outputs:
      pip-version: ${{ steps.versions.outputs.pip-version }}
      setuptools-version: ${{ steps.versions.outputs.setuptools-version }}
      flwr-version-ref: ${{ steps.versions.outputs.flwr-version-ref }}
    steps:
      - uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1

      - uses: ./.github/actions/bootstrap
        id: bootstrap

      - id: versions
        run: |
          echo "pip-version=${{ steps.bootstrap.outputs.pip-version }}" >> "$GITHUB_OUTPUT"
          echo "setuptools-version=${{ steps.bootstrap.outputs.setuptools-version }}" >> "$GITHUB_OUTPUT"
          echo "flwr-version-ref=git+${{ github.server_url }}/${{ github.repository }}.git@${{ github.sha }}" >> "$GITHUB_OUTPUT"
  build-docker-base-images:
    name: Build base images
    if: github.repository == 'adap/flower'
    uses: ./.github/workflows/_docker-build.yml
    needs: parameters
    with:
      namespace-repository: flwr/base
      file-dir: src/docker/base/ubuntu
      build-args: |
        PIP_VERSION=${{ needs.parameters.outputs.pip-version }}
        SETUPTOOLS_VERSION=${{ needs.parameters.outputs.setuptools-version }}
        FLWR_VERSION_REF=${{ needs.parameters.outputs.flwr-version-ref }}
      tags: unstable
    secrets:
      dockerhub-user: ${{ secrets.DOCKERHUB_USERNAME }}
      dockerhub-token: ${{ secrets.DOCKERHUB_TOKEN }}

  build-docker-binary-images:
    name: Build binary images
    if: github.repository == 'adap/flower'
    uses: ./.github/workflows/_docker-build.yml
    needs: build-docker-base-images
    strategy:
      fail-fast: false
      matrix:
        images: [
          { repository: "flwr/superlink", file_dir: "src/docker/superlink" },
          { repository: "flwr/supernode", file_dir: "src/docker/supernode" },
          { repository: "flwr/serverapp", file_dir: "src/docker/serverapp" },
          { repository: "flwr/superexec", file_dir: "src/docker/superexec" },
          { repository: "flwr/clientapp", file_dir: "src/docker/clientapp" }
        ]
    with:
      namespace-repository: ${{ matrix.images.repository }}
      file-dir: ${{ matrix.images.file_dir }}
      build-args: BASE_IMAGE=unstable
      tags: unstable
    secrets:
      dockerhub-user: ${{ secrets.DOCKERHUB_USERNAME }}
      dockerhub-token: ${{ secrets.DOCKERHUB_TOKEN }}
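
The `flwr-version-ref` output in the workflow above has the shape of a pip VCS reference pinned to the pushed commit, which is then passed to the base image build as `FLWR_VERSION_REF`. A minimal sketch of the same string composition in Python may make the format easier to see. The function name `flwr_version_ref` is hypothetical, and the server URL and all-zero SHA below are placeholder assumptions rather than values from this commit.

```python
def flwr_version_ref(server_url: str, repository: str, sha: str) -> str:
    """Compose a reference in the same shape as the `versions` step output."""
    # Mirrors: git+${{ github.server_url }}/${{ github.repository }}.git@${{ github.sha }}
    return f"git+{server_url}/{repository}.git@{sha}"


if __name__ == "__main__":
    # Placeholder SHA for illustration only; not a real commit.
    placeholder_sha = "0" * 40
    print(flwr_version_ref("https://github.com", "adap/flower", placeholder_sha))
```

How the Dockerfile consumes this build argument is not shown in this diff, so the sketch covers only the string format.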
@@ -0,0 +1,38 @@
# Evaluation for Medical challenge

We build a medical question answering (QA) pipeline to evaluate our fine-tuned LLMs.
Three datasets have been selected for this evaluation: [PubMedQA](https://huggingface.co/datasets/bigbio/pubmed_qa), [MedMCQA](https://huggingface.co/datasets/medmcqa), and [MedQA](https://huggingface.co/datasets/bigbio/med_qa).


## Environment Setup

```shell
git clone --depth=1 https://github.com/adap/flower.git && mv flower/benchmarks/flowertune-llm/evaluation/medical ./flowertune-eval-medical && rm -rf flower && cd flowertune-eval-medical
```

Create a new Python environment (we recommend Python 3.10), activate it, then install dependencies with:

```shell
# From a new Python environment, run:
pip install -r requirements.txt

# Log in to your Hugging Face account
huggingface-cli login
```

## Generate model decision & calculate accuracy

```bash
# --peft-path: directory of the fine-tuned PEFT model, e.g., ./peft_1
# --run-name: specified name for this run
python eval.py \
  --peft-path=/path/to/fine-tuned-peft-model-dir/ \
  --run-name=fl \
  --batch-size=16 \
  --quantization=4 \
  --datasets=pubmedqa,medmcqa,medqa
```

The model answers and accuracy values will be saved to `benchmarks/generation_{dataset_name}_{run_name}.jsonl` and `benchmarks/acc_{dataset_name}_{run_name}.txt`, respectively.
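
As a rough illustration of this naming scheme, the sketch below builds the expected output paths for one run and reports which accuracy files are present. The helper name `expected_outputs` and its `base_dir` parameter are assumptions made for this example; `eval.py` may construct the paths differently, and the contents of the accuracy files are not assumed here.

```python
from pathlib import Path


def expected_outputs(
    run_name: str, datasets: list[str], base_dir: str = "benchmarks"
) -> dict[str, dict[str, Path]]:
    """Map each dataset name to its generation and accuracy file paths."""
    base = Path(base_dir)
    return {
        name: {
            "generation": base / f"generation_{name}_{run_name}.jsonl",
            "accuracy": base / f"acc_{name}_{run_name}.txt",
        }
        for name in datasets
    }


if __name__ == "__main__":
    # Same run name and datasets as in the eval.py command above.
    outputs = expected_outputs("fl", ["pubmedqa", "medmcqa", "medqa"])
    for dataset, paths in outputs.items():
        status = "found" if paths["accuracy"].exists() else "missing"
        print(f"{dataset}: {paths['accuracy']} ({status})")
```

Running the script from the evaluation directory after `eval.py` finishes gives a quick check that an accuracy file exists for each of the three datasets.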

> [!NOTE]
> Please ensure that you provide all **three accuracy values (PubMedQA, MedMCQA, MedQA)** for the three evaluation datasets when submitting to the LLM Leaderboard (see the [`Make Submission`](https://github.com/adap/flower/tree/main/benchmarks/flowertune-llm/evaluation#make-submission-on-flowertune-llm-leaderboard) section).