
Quantized Models Chunking into unequal sizes #2320

Open
nighting0le01 opened this issue Aug 22, 2024 · 4 comments
Labels
bug Unexpected behaviour that should be corrected (type)

Comments

@nighting0le01

nighting0le01 commented Aug 22, 2024

🐞 Describing the bug

With reference to issue apple/ml-stable-diffusion#353, I used the bisect_model() function to split a quantized model into 2 chunks. I tried with coremltools 7.1 and 7.0, following this file: https://github.com/apple/ml-stable-diffusion/blob/cf16df8207dfcba685a9391bad04f7402ea87b73/python_coreml_stable_diffusion/chunk_mlprogram.py#L123, but faced the same issue.

prog = _load_prog_from_mlmodel(model)

# Compute the incision point by bisecting the program based on weights size
op_idx, first_chunk_weights_size, total_weights_size = _get_op_idx_split_location(prog)
main_block = prog.functions["main"]

print(f"First  chunk size = {first_chunk_weights_size:.2f} MB")  # 152.67 MB
print(f"Second chunk size = {total_weights_size - first_chunk_weights_size:.2f} MB")  # 0.42 MB
print(f"index = {op_idx}/{len(main_block.operations)}")  # 587/2720

prog_chunk1 = _make_first_chunk_prog(prog, op_idx)
prog_chunk2 = _make_second_chunk_prog(_load_prog_from_mlmodel(model), op_idx)
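For intuition, here is a minimal sketch of bisecting an op list by cumulative weight size (hypothetical; the names and logic are illustrative, not the actual `_get_op_idx_split_location` implementation). If per-op const sizes are mis-measured for a quantized/palettized model (e.g. some counted at their dense dtype while others at their packed storage size), the halfway point lands at the wrong op, producing lopsided chunks like the 152.67 MB / 0.42 MB above:

```python
# Hypothetical sketch: pick the split index where the running total of
# per-op weight sizes first reaches half of the overall weight size.
# `op_weight_sizes` stands in for the per-const-op sizes accumulated
# by the real split-location routine.

def split_by_weight_size(op_weight_sizes):
    total = sum(op_weight_sizes)
    running = 0
    for idx, size in enumerate(op_weight_sizes):
        running += size
        if running >= total / 2:
            return idx, running, total
    return len(op_weight_sizes) - 1, running, total

# Evenly distributed weights -> split lands near the middle:
print(split_by_weight_size([10, 10, 10, 10]))       # (1, 20, 40)

# If early ops' sizes dominate the accounting (e.g. over-counted as
# dense while later ops are counted as packed 4-bit), the split index
# is early by op count yet the first chunk holds nearly all the weight:
print(split_by_weight_size([100, 100, 1, 1, 1, 1]))  # (1, 200, 204)
```

The second call mirrors the symptom reported here: the op index looks reasonable, but almost the entire weight budget ends up in the first chunk.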

System environment (please complete the following information):

  • coremltools version: 8.0b2

cc: @aseemw

@nighting0le01 nighting0le01 added the bug Unexpected behaviour that should be corrected (type) label Aug 22, 2024
@jakesabathia2
Collaborator

jakesabathia2 commented Aug 22, 2024

@nighting0le01 would you mind providing a standalone script for us to reproduce?

@nighting0le01
Author

nighting0le01 commented Aug 26, 2024

Hi @jakesabathia2! Here is the code to reproduce, with coremltools version 7.1. I know that with 8.0b2 the chunking has moved into coremltools, but I think it has the same issue when chunking a quantized or palettized model.

The model is a simple MobileNetV2 that can be downloaded from the coremltools tutorial: https://apple.github.io/coremltools/docs-guides/source/opt-palettization-perf.html#:~:text=0.47-,MobileNetv2%2D1.0,-4%20bit

import coremltools as ct
from python_coreml_stable_diffusion.chunk_mlprogram import (
    _load_prog_from_mlmodel,
    _get_op_idx_split_location,
    _make_second_chunk_prog,
    _make_first_chunk_prog,
)
# link to get model:https://apple.github.io/coremltools/docs-guides/source/opt-palettization-perf.html#:~:text=0.47-,MobileNetv2%2D1.0,-4%20bit
model = ct.models.MLModel('MobileNetV2Alpha1ScalarPalettization4Bit.mlpackage')
# Load the MIL Program from MLModel
prog = _load_prog_from_mlmodel(model)

# Compute the incision point by bisecting the program based on weights size
op_idx, first_chunk_weights_size, total_weights_size = _get_op_idx_split_location(
    prog)
main_block = prog.functions["main"]
incision_op = main_block.operations[op_idx]

print(f"op_idx = {op_idx}")
print(f"First  chunk size = {first_chunk_weights_size:.2f} MB")
print(f"Second chunk size = {total_weights_size - first_chunk_weights_size:.2f} MB")
Output:

INFO:python_coreml_stable_diffusion.chunk_mlprogram:Loading MLModel object into a MIL Program object (including the weights)..
INFO:python_coreml_stable_diffusion.chunk_mlprogram:Program loaded in 0.1 seconds
op_idx = 187
First  chunk size = 1.68 MB
Second chunk size = 0.15 MB

@nighting0le01
Author

nighting0le01 commented Aug 26, 2024

Hi @jakesabathia2, below is the same repro with the 8.0b2 version of coremltools.
cc @aseemw: apple/ml-stable-diffusion#353

import coremltools as ct

# link to get model: https://apple.github.io/coremltools/docs-guides/source/opt-palettization-perf.html#:~:text=0.47-,MobileNetv2%2D1.0,-4%20bit
model_path = './MobileNetV2Alpha1ScalarPalettization4Bit.mlpackage'
output_dir = "./output/"

# Split the model into two chunks by bisecting the program based on weights size
ct.models.utils.bisect_model(
    model_path,
    output_dir,
    merge_chunks_to_pipeline=False,
)
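To confirm the imbalance independently of the library's own size reporting, the on-disk size of each produced chunk can be measured. This is a sketch; the exact .mlpackage names that bisect_model writes into output_dir are an assumption to adapt:

```python
import os

def dir_size_mb(path):
    """Total size of all files under `path` (an .mlpackage is a directory)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for f in files:
            total += os.path.getsize(os.path.join(root, f))
    return total / (1024 * 1024)

output_dir = "./output/"  # same directory passed to bisect_model
if os.path.isdir(output_dir):
    for entry in sorted(os.listdir(output_dir)):
        if entry.endswith(".mlpackage"):
            full = os.path.join(output_dir, entry)
            print(f"{entry}: {dir_size_mb(full):.2f} MB")
```

If the split were balanced, the two chunks should come out roughly equal in size on disk; here the second chunk is a tiny fraction of the first.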

@nighting0le01
Author

nighting0le01 commented Aug 28, 2024

@jakesabathia2 @DawerG @aseemw @atiorh @TobyRoseman any help is appreciated, thank you 🙏
