Commit

Merge branch 'main' into int8
matthewdouglas authored Oct 30, 2024
2 parents 521da0c + 9568735 commit c75eecd
Showing 9 changed files with 220 additions and 86 deletions.
41 changes: 16 additions & 25 deletions .github/workflows/python-package.yml
@@ -163,45 +163,36 @@ jobs:
needs:
- build-wheels
steps:
- name: Download artifacts to tmp directory
- name: Download and rename artifacts
uses: actions/download-artifact@v4
with:
path: tmp/
pattern: "bdist_wheel_*"
merge-multiple: true
- name: Inspect tmp directory after downloading artifacts
run: ls -alFR tmp/
- name: Move and rename wheel files
- name: Move and rename wheel files with pattern replacement
run: |
mkdir -p wheels/
find tmp/ -type f -name '*.whl' -print0 | while IFS= read -r -d '' wheel; do
# exclude macos wheels for now
find tmp/ -type f -name '*.whl' ! -name '*macos*' -print0 | while IFS= read -r -d '' wheel; do
wheel_filename=$(basename "$wheel")
if [[ $wheel_filename == *linux*x86_64* ]]; then
mv "$wheel" wheels/bnb-linux-x86_64.whl
elif [[ $wheel_filename == *linux*aarch64* ]]; then
mv "$wheel" wheels/bnb-linux-aarch64.whl
elif [[ $wheel_filename == *macosx*x86_64* ]]; then
mv "$wheel" wheels/bnb-macos-x86_64.whl
elif [[ $wheel_filename == *macosx*arm64* ]]; then
mv "$wheel" wheels/bnb-macos-arm64.whl
elif [[ $wheel_filename == *win*amd64* ]]; then
mv "$wheel" wheels/bnb-windows-x86_64.whl
else
echo "Unknown wheel format: $wheel_filename"
exit 1
fi
# Remove the git hash, e.g. `+1234567`, for a stable download link on the multi-backend pre-release
cleaned_filename=$(echo "$wheel_filename" | sed -E 's/\+[0-9a-f]{7}-/-/g')
mv "$wheel" "wheels/$cleaned_filename"
done
- name: Inspect wheels directory after renaming files
run: ls -alFR wheels/
- name: Create release and upload artifacts
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
GITHUB_CONTINUOUS_RELEASE_TYPE: prerelease
GITHUB_CONTINUOUS_RELEASE_TAG: continuous-release_main
run: |
wget -q https://github.com/TheAssassin/pyuploadtool/releases/download/continuous/pyuploadtool-x86_64.AppImage
chmod +x pyuploadtool-x86_64.AppImage
./pyuploadtool-x86_64.AppImage --appimage-extract-and-run wheels/*.whl
uses: softprops/[email protected]
with:
files: wheels/*.whl
prerelease: true
name: Latest `main` wheel
tag_name: continuous-release_main
make_latest: false
draft: false
target_commitish: ${{ github.sha }}
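
The new renaming step can be sketched outside the workflow. The snippet below is a minimal illustration, not the workflow's implementation: it mirrors the `find` filter (`*.whl`, excluding `*macos*`) and the `sed` expression in Python. The function names and the example filenames are hypothetical.

```python
import fnmatch
import re

# Same pattern as the sed expression in the workflow:
# a literal "+", exactly seven lowercase hex digits, then "-".
GIT_HASH_SUFFIX = re.compile(r"\+[0-9a-f]{7}-")


def clean_wheel_filename(filename: str) -> str:
    """Drop a `+<7 hex digits>` local version segment that precedes a dash."""
    return GIT_HASH_SUFFIX.sub("-", filename)


def select_wheels(filenames: list[str]) -> list[str]:
    """Keep *.whl files and skip anything matching *macos*, like the find call."""
    return [
        name
        for name in filenames
        if fnmatch.fnmatchcase(name, "*.whl")
        and not fnmatch.fnmatchcase(name, "*macos*")
    ]


if __name__ == "__main__":
    # Hypothetical filenames, for illustration only.
    candidates = [
        "example_pkg-1.0.0.dev0+1234567-py3-none-manylinux_2_24_x86_64.whl",
        "example_pkg-1.0.0.dev0+1234567-py3-none-macosx_13_1_arm64.whl",
        "notes.txt",
    ]
    for name in select_wheels(candidates):
        print(clean_wheel_filename(name))
```

Because the pattern requires exactly seven lowercase hex digits followed by a dash, filenames without such a segment pass through unchanged.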

audit-wheels:
needs: build-wheels
6 changes: 3 additions & 3 deletions .pre-commit-config.yaml
@@ -1,13 +1,13 @@
repos:
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.3.2
rev: v0.6.9
hooks:
- id: ruff
args:
- --fix
- id: ruff-format
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.5.0
rev: v5.0.0
hooks:
- id: check-merge-conflict
- id: check-yaml
@@ -18,6 +18,6 @@ repos:
args:
- --fix=lf
- repo: https://github.com/crate-ci/typos
rev: v1.18.2
rev: v1.26.0
hooks:
- id: typos
14 changes: 9 additions & 5 deletions README.md
@@ -12,15 +12,19 @@ There are ongoing efforts to support further hardware backends, i.e. Intel CPU +

**[https://huggingface.co/docs/bitsandbytes/main](https://huggingface.co/docs/bitsandbytes/main)**

## ALPHA TESTERS WANTED: `multi-backend-refactor` AMD GPU + Intel CPU/GPU specific BNB backend implementations
## `bitsandbytes` multi-backend _alpha_ release is out!

We're in the process of a complex refactor in order to allow the support of additional hardware backends, other than CUDA, in BNB. The efforts around this are already quite far along and there's plenty of functionality already in place that is in need for users to take a hands-on approach! Mac support will likely soon also see progress. However, I recommend waiting 2 weeks until the device abstraction has further consolidated (**breaking changes upcoming**).
🚀 Big news! After months of hard work and incredible community contributions, we're thrilled to announce the **bitsandbytes multi-backend _alpha_ release**! 💥

Currently, you still need to compile from source, after checking out the `multi-backend-refactor` branch (instructions WIP, but [the current docs on the compilation from source](https://huggingface.co/docs/bitsandbytes/main/en/installation#compile-from-source) are a good starting point; [feel free to share tips / input in this Github discussion](https://github.com/TimDettmers/bitsandbytes/discussions/1219). We'll soon enable nightly releases to make this much easier for you!
Now supporting:
- 🔥 **AMD GPUs** (ROCm)
- **Intel CPUs** & **GPUs**

Please give feedback to us in [this dedicated Github Discussion space](https://github.com/TimDettmers/bitsandbytes/discussions/categories/catch-all-alpha-testing-the-multi-backend-refactor)!
We’d love your early feedback! 🙏

We're super excited about these recent developments and grateful for any constructive input or support that you can give to help us make this a reality. BNB is a community project and we're excited for your collaboration 🤗
👉 [Instructions for your `pip install` here](https://huggingface.co/docs/bitsandbytes/main/en/installation#multi-backend)

We're super excited about these recent developments and grateful for any constructive input or support that you can give to help us make this a reality (e.g. helping us with the upcoming Apple Silicon backend or reporting bugs). BNB is a community project and we're excited for your collaboration 🤗

## License

6 changes: 4 additions & 2 deletions _typos.toml
@@ -4,8 +4,10 @@
extend-ignore-re = [
"@Ther-nul", # valid Github user
]

[default.extend-identifiers]
extend-ignore-identifiers-re = [
".*arange.*",
".*ARANGE.*",
]

[type.py.extend-words]
"BA" = "BA" # used as a commented-out variable in tests
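
To see which identifiers the two ignore patterns above would cover, here is a rough sketch that applies them with Python's `re` module. It assumes the patterns behave the same way in Python as in the regex engine used by `typos`, which should hold for patterns this simple. The helper name `is_ignored` is hypothetical.

```python
import re

# Patterns copied from the config hunk above.
IGNORED_IDENTIFIER_PATTERNS = [r".*arange.*", r".*ARANGE.*"]


def is_ignored(identifier: str) -> bool:
    """Return True if any ignore pattern matches the whole identifier."""
    # With leading and trailing ".*", a full match and a substring search
    # give the same answer, so the exact matching mode does not matter here.
    return any(
        re.fullmatch(pattern, identifier)
        for pattern in IGNORED_IDENTIFIER_PATTERNS
    )


if __name__ == "__main__":
    for name in ["arange", "torch_arange_like", "ARANGE_SIZE", "arrange"]:
        print(name, is_ignored(name))
```

The patterns are case-sensitive, which is presumably why the lowercase and uppercase spellings are listed separately.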
2 changes: 1 addition & 1 deletion bitsandbytes/functional.py
@@ -1860,7 +1860,7 @@ def percentile_clipping(grad: Tensor, gnorm_vec: Tensor, step: int, percentile:
gnorm_vec: torch.Tensor
Vector of gradient norms. 100 elements expected.
step: int
The current optimiation steps (number of past gradient norms).
The current optimization steps (number of past gradient norms).
"""
prev_device = pre_call(grad.device)
6 changes: 3 additions & 3 deletions csrc/kernels.cu
@@ -2703,7 +2703,7 @@ template <int THREADS, int ITEMS_PER_THREAD, int TILE_ROWS, int TILE_COLS, int T
//const int global_col = base_row; // block offset for col
if((base_col + subrow_loop_row + jrow + warp_id < outRows) && (base_row+warp_lane < rows))
{
// each row hae 32 columns and is offset by 1 to prevent bank conflict during storage into smem
// each row has 32 columns and is offset by 1 to prevent bank conflict during storage into smem
char data = smem_data[(subrow_loop_row + jrow + warp_id)*33 + warp_lane];

// each 32 columns we have new tile
@@ -2742,7 +2742,7 @@ template <int THREADS, int ITEMS_PER_THREAD, int TILE_ROWS, int TILE_COLS, int T
//const int global_col = base_row; // block offset for col
if((base_col + subrow_loop_row + jrow + warp_id < outRows) && (base_row+warp_lane < rows))
{
// each row hae 32 columns and is offset by 1 to prevent bank conflict during storage into smem
// each row has 32 columns and is offset by 1 to prevent bank conflict during storage into smem
char data = smem_data[(subrow_loop_row + jrow + warp_id)*33 + warp_lane];

// each 32 columns we have new tile
@@ -2819,7 +2819,7 @@ template <int THREADS, int ITEMS_PER_THREAD, int TILE_ROWS, int TILE_COLS, int T
//const int global_col = base_row; // block offset for col
if((base_col + subrow_loop_row + jrow + warp_id < outRows) && (base_row+warp_lane < rows))
{
// each row hae 32 columns and is offset by 1 to prevent bank conflict during storage into smem
// each row has 32 columns and is offset by 1 to prevent bank conflict during storage into smem
char data = smem_data[(subrow_loop_row + jrow + warp_id)*33 + warp_lane];

// each 32 columns we have new tile
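
The corrected comment refers to padding each 32-column row to a stride of 33 (`*33` in the index). As a rough model of why that helps, the sketch below counts shared-memory bank conflicts for one warp. It assumes 32 banks of 4 bytes each, 1-byte `char` elements, and an access pattern in which each of the 32 lanes touches a different row of the same column. That pattern is an assumption chosen for illustration, not something read off the kernel, and the function name is hypothetical.

```python
from collections import defaultdict

NUM_BANKS = 32        # assumed: 32 shared-memory banks
BANK_WIDTH_BYTES = 4  # assumed: each bank serves one 4-byte word


def conflict_degree(row_stride: int, column: int, lanes: int = 32) -> int:
    """Largest number of distinct words one bank must serve for a column access.

    Lane i reads the 1-byte element at (row i, `column`) of a tile whose rows
    are `row_stride` bytes apart. A result of 1 means no bank conflict.
    """
    words_per_bank = defaultdict(set)
    for lane in range(lanes):
        byte_offset = lane * row_stride + column
        word = byte_offset // BANK_WIDTH_BYTES
        # Lanes that hit the same word do not conflict, so count distinct words.
        words_per_bank[word % NUM_BANKS].add(word)
    return max(len(words) for words in words_per_bank.values())


if __name__ == "__main__":
    for stride in (32, 33):
        worst = max(conflict_degree(stride, col) for col in range(32))
        print(f"stride {stride}: worst-case conflict degree {worst}")
```

Under this model an unpadded stride of 32 gives an 8-way conflict for every column, while the padded stride of 33 is conflict-free for column 0 and at most 2-way across all columns.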