
Add quantization parameter for lmi_dist rolling batch backend for HF #888

Merged 7 commits into deepjavalibrary:master on Jul 6, 2023

Conversation

@maaquib (Contributor) commented Jun 30, 2023

Description

Enables setting the quantize parameter from the model properties for the lmi_dist rolling batch backend.

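As context, a sketch of how a user might turn this on, assuming the usual serving.properties convention where option.-prefixed keys surface in the Python handler's properties dict (the values below are illustrative, not prescribed by this PR):

```properties
# Hypothetical serving.properties snippet: option.quantize maps to
# properties["quantize"], which the rolling batch handler now reads.
option.rolling_batch=lmi-dist
option.quantize=bitsandbytes
```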

@maaquib marked this pull request as ready for review June 30, 2023 20:51
@maaquib requested review from zachgk, frankfliu and a team as code owners June 30, 2023 20:51
@maaquib requested a review from xyang16 June 30, 2023 20:53
@lanking520 (Contributor) commented:

Can you also add this bitsandbytes flag to the FasterTransformer container?

@maaquib (Contributor, Author) commented Jul 3, 2023

> Can you also add this bitsandbytes flag to the FasterTransformer container?

@lanking520 Done

@lanking520 previously approved these changes Jul 3, 2023
Review thread on the diff:

    sharded=sharded,
    quantize=None,
    trust_remote_code=kwargs.get("trust_remote_code"))
    quantize = self.properties.get("quantize", None)
@lanking520 (Contributor):

In the original properties we have option.load_in_8bit; can we reuse this param?

@maaquib (Contributor, Author):

load_in_8bit is a boolean. Assuming we add gptq support in the next release, we need a parameter that can take the quantization algorithm name instead of just a boolean. @lanking520, thoughts?
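To make the trade-off concrete, here is a minimal sketch of how a string-valued quantize property could subsume the boolean load_in_8bit while leaving room for more algorithms; resolve_quantize and SUPPORTED_QUANTIZATION are hypothetical names, not the PR's actual code:

```python
from typing import Optional

# Hypothetical allow-list: this PR wires up bitsandbytes; gptq is anticipated.
SUPPORTED_QUANTIZATION = {"bitsandbytes", "gptq"}

def resolve_quantize(properties: dict) -> Optional[str]:
    """Resolve the quantization algorithm name from handler properties."""
    quantize = properties.get("quantize")  # e.g. "bitsandbytes", or None
    if quantize is None and str(properties.get("load_in_8bit", "false")).lower() == "true":
        # Back-compat shim: map the old boolean flag onto the int8 path.
        quantize = "bitsandbytes"
    if quantize is not None and quantize not in SUPPORTED_QUANTIZATION:
        raise ValueError(f"Unsupported quantization algorithm: {quantize}")
    return quantize

print(resolve_quantize({"load_in_8bit": "true"}))  # -> bitsandbytes
```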

@lanking520 dismissed their stale review July 3, 2023 18:06, pending discussion

@lanking520 merged commit c737451 into deepjavalibrary:master Jul 6, 2023
8 checks passed
@maaquib deleted the quantize branch July 7, 2023 16:11
KexinFeng pushed a commit to KexinFeng/djl-serving-forked that referenced this pull request Aug 16, 2023:

Add quantization parameter for lmi_dist rolling batch backend for HF (deepjavalibrary#888)

* Set quantization param from properties file
* Format python
* Set quantize if dtype==int8
* Address review comments
* Adding BITSANDBYTES_NOWELCOME flag to fastertransformer
* Add back
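On the BITSANDBYTES_NOWELCOME commit: bitsandbytes prints a welcome banner when it is imported, and the BITSANDBYTES_NOWELCOME environment variable suppresses it. In the FasterTransformer container this would normally be baked in as an image ENV; a minimal Python illustration:

```python
import os

# Must be set before bitsandbytes is imported; in a container image this
# would typically be an ENV instruction rather than code.
os.environ["BITSANDBYTES_NOWELCOME"] = "1"

import bitsandbytes  # noqa: E402  (imports quietly, banner suppressed)
```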