add deepcopy and copy for Param4bit #1060

Merged

Conversation

@SunMarc (Contributor) commented Feb 12, 2024:

What does this PR do?

This PR makes it possible to deepcopy and copy the Params4bit class. With this feature, you can deepcopy/copy a 4-bit model.
The tests related to 4-bit in transformers have also passed successfully.

Fixes huggingface/accelerate#2248

Deepcopy/copy Params4bit

from bitsandbytes.nn import Params4bit
import torch
import copy


t = torch.tensor([1.0, 2.0, 3.0, 4.0])
# Moving the parameter to CUDA triggers the 4-bit quantization.
param = Params4bit(data=t, requires_grad=False).cuda(0)

# Deep copy: the quant_state and the underlying storage are duplicated.
copy_param = copy.deepcopy(param)
assert param.quant_state is not copy_param.quant_state
assert param.data.data_ptr() != copy_param.data.data_ptr()

# Shallow copy: the quant_state and storage are shared with the original.
shallow_copy_param = copy.copy(param)
assert param.quant_state is shallow_copy_param.quant_state
assert param.data.data_ptr() == shallow_copy_param.data.data_ptr()

Deepcopy/copy a 4-bit model

import copy

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "meta-llama/Llama-2-7b-hf"

tokenizer = AutoTokenizer.from_pretrained(model_name)

fp4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=False,
    bnb_4bit_quant_type="fp4",
    bnb_4bit_compute_dtype=torch.float16
)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=fp4_config,
    torch_dtype=torch.float16,
)

def generate(model):
    prompts = ["I would like to"]
    token_dict = tokenizer(prompts, return_tensors="pt").to(0)
    output_ids = model.generate(**token_dict, max_new_tokens=10)
    print(tokenizer.batch_decode(output_ids))
 
generate(model)

# deepcopy -> memory increases (the quantized weights are duplicated)
model_copy = copy.deepcopy(model)
generate(model_copy)

# shallow copy -> no memory increase (the weights are shared)
model_shallow_copy = copy.copy(model)
generate(model_shallow_copy)

Cc @Titus-von-Koeller


@Titus-von-Koeller (Collaborator) commented:
Thanks @SunMarc for taking the lead on this, greatly appreciated! I'll take a look and get back to you. Looks really good at a first glance 🤗

Resolved review thread on tests/test_functional.py (outdated).
Review comment on lines 218 to 244 of the diff:
def __getstate__(self):
    state = self.__dict__
    state["data"] = self.data
    state["requires_grad"] = self.requires_grad
    return state

def __setstate__(self, state):
    self.requires_grad = state["requires_grad"]
    self.blocksize = state["blocksize"]
    self.compress_statistics = state["compress_statistics"]
    self.quant_type = state["quant_type"]
    self.quant_state = state["quant_state"]
    self.data = state["data"]

def __deepcopy__(self, memo):
    new_instance = type(self).__new__(type(self))
    state = self.__getstate__()
    new_instance.__setstate__(state)
    new_instance.quant_state = copy.deepcopy(state["quant_state"])
    new_instance.data = copy.deepcopy(state["data"])
    return new_instance

def __copy__(self):
    new_instance = type(self).__new__(type(self))
    state = self.__getstate__()
    new_instance.__setstate__(state)
    return new_instance
@akx (Contributor) commented Feb 15, 2024:
Is having to do this dance common in Torch world? 🤔

I'm a little worried that someone adding a new field in __init__ will inevitably miss adding it here...

@SunMarc (Contributor, Author) commented Feb 15, 2024:

> Is having to do this dance common in Torch world? 🤔

I don't think so, but I wasn't able to find a better solution. I based my solution on this specific code from torch.

> I'm a little worried that someone adding a new field in __init__ will inevitably miss adding it here...

Yeah, that's true :/. I tried modifying __setstate__ so that we update the instance's __dict__ from the state, but some attributes were not copied properly.
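For reference, a minimal sketch of the generic __dict__-based approach alluded to above (an assumed shape, not the actual attempt). On a torch.Tensor subclass, a plain __dict__ round-trip silently drops data and requires_grad, because those live on the C-level tensor base rather than in the Python-level __dict__, which would explain why some attributes were not copied properly:

# Hypothetical sketch of the generic __dict__-based approach; not the merged code.
def __getstate__(self):
    # Captures only Python-level attributes (blocksize, quant_state, ...).
    return self.__dict__.copy()

def __setstate__(self, state):
    # Restores Python-level attributes, but misses `data` and `requires_grad`,
    # which are stored on the C-level torch.Tensor base, not in __dict__.
    self.__dict__.update(state)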

Resolved review threads (outdated): tests/test_linear8bitlt.py, tests/test_linear4bit.py (two threads).
@prathikr commented:
@SunMarc can this be merged soon?

@SunMarc (Contributor, Author) commented Feb 20, 2024:

> @SunMarc can this be merged soon?

Yes, I'm waiting for the review from @Titus-von-Koeller.

@Titus-von-Koeller (Collaborator) commented:
I'll review this tomorrow, but the release will be next week at the earliest. The PR looks great as is, and there's really no reason not to merge it other than that I need to verify it first. As a side note, we currently have a lot in the pipeline, and the release process is still in transition and under review. Also, there's no PR-based CI pipeline yet, which would make this easier to validate and green-light. Other than that, the cross-platform effort and FSDP come first priority-wise this week.

@prathikr commented:
@Titus-von-Koeller I appreciate the transparency on prioritization of this task.

However, I would like to point out that I raised the original issue over 2 months ago and several people on the ONNX Runtime team have encountered this problem. If there is truly no harm in merging it, please do so ASAP. Thank you.

@SunMarc (Contributor, Author) commented Feb 21, 2024:

As Titus said, we will most probably merge it tomorrow. @prathikr, can you also double-check that this PR actually solves the ONNX Runtime team's issue without surfacing other problems? Thank you for your patience 🤗

@Titus-von-Koeller self-assigned this on Feb 21, 2024.
@Titus-von-Koeller (Collaborator) commented Feb 21, 2024:

Alright, I took a deep look this afternoon and also reran all the tests. Great work on this PR, and thanks again for taking the initiative. Overall, everything looks perfect and is very polished; really appreciate this! The only thing that needs fixing is the last comment I made in the review: __getstate__ and __setstate__ should match. Then it's ready to merge.

@SunMarc (Contributor, Author) commented Feb 21, 2024:

> Alright, I took a deep look this afternoon and also reran all the tests. Great work on this PR, and thanks again for taking the initiative. Overall, everything looks perfect and is very polished; really appreciate this! The only thing that needs fixing is the last comment I made in the review: __getstate__ and __setstate__ should match. Then it's ready to merge.

Fixed! Thanks again for your review!
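For illustration, a matching pair might look like the following sketch (this mirrors the diff above rather than quoting the exact merged commit): __getstate__ returns precisely the keys that __setstate__ consumes.

# Sketch of a matching __getstate__/__setstate__ pair; details may differ from the merged code.
def __getstate__(self):
    state = self.__dict__.copy()
    # Tensor-level attributes are not part of __dict__, so add them explicitly.
    state["data"] = self.data
    state["requires_grad"] = self.requires_grad
    return state

def __setstate__(self, state):
    self.requires_grad = state["requires_grad"]
    self.blocksize = state["blocksize"]
    self.compress_statistics = state["compress_statistics"]
    self.quant_type = state["quant_type"]
    self.quant_state = state["quant_state"]
    self.data = state["data"]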

@prathikr commented:
> As Titus said, we will most probably merge it tomorrow. @prathikr, can you also double-check that this PR actually solves the ONNX Runtime team's issue without surfacing other problems? Thank you for your patience 🤗

Yes @SunMarc this indeed resolves the issue, thank you for the efforts to merge this ASAP.

@Titus-von-Koeller merged commit cfd6ac7 into bitsandbytes-foundation:main on Feb 21, 2024 (9 of 10 checks passed).
@Titus-von-Koeller (Collaborator) commented Feb 21, 2024:

Ok, added another test round-tripping the serialization. For that, I added the capability to compare QuantState instances with each other. I also reran the tests, including the Transformers BNB integration ones.

Happy to have this sorted now. Thanks @prathikr for raising it. Maybe you could check if things work for you now with BNB installed from source? Would be good to become aware of potential issues, given your particular use-case, before doing the release.
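For reference, such a round-trip test could look roughly like this sketch (assuming pickle-based serialization and the QuantState comparison mentioned above; not necessarily the exact committed test):

import pickle

import torch
from bitsandbytes.nn import Params4bit

original = Params4bit(data=torch.tensor([1.0, 2.0, 3.0, 4.0]), requires_grad=False).cuda(0)

# A pickle round trip should restore both the data and the quant_state.
restored = pickle.loads(pickle.dumps(original))

# Relies on QuantState instances being comparable, as added in this follow-up.
assert restored.quant_state == original.quant_state
assert torch.equal(restored.data, original.data)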

@akx (Contributor) commented Feb 22, 2024:

@Titus-von-Koeller FWIW, no point adding a commit to .git-blame-ignore-revs if you do a squash merge... 😅

Merging this pull request closes: deepcopy fails after accelerate==0.23.0