
Multifunction consumes linearly increasing memory in the number of functions, even with all the weights shared #2351

Open
0seba opened this issue Sep 27, 2024 · 1 comment
Labels
feature request Functionality does not currently exist, would need to be created as a new feature (type)

Comments


0seba commented Sep 27, 2024

🌱 Describe your Feature Request

Hi, when creating a multifunction model in which every function reuses the same weights, memory usage grows linearly with the number of functions. Reducing this memory usage would make multifunction models faster to load and more practical to use.

Example code to reproduce the problem

This code creates a simple model with roughly 1 billion parameters and builds several functions with different input lengths for the same model:

import os
import shutil
import numpy as np
import coremltools as ct
from coremltools.converters.mil import Builder as mb
import coremltools.converters.mil as mil


w1 = np.random.normal(loc=0.01, size=(16_384, 16_384, 1)).astype(np.float16)
w2 = np.random.normal(loc=0.01, size=(16_384, 16_384, 1)).astype(np.float16)
w3 = np.random.normal(loc=0.01, size=(16_384, 16_384, 1)).astype(np.float16)
w4 = np.random.normal(loc=0.01, size=(16_384, 16_384, 1)).astype(np.float16)


def make_model(length):
    @mb.program(
        input_specs=[
            mb.TensorSpec(
                (1, 16_384, length),
                dtype=mil.input_types.types.fp16,
            ),
        ],
        opset_version=mil.builder.AvailableTarget.iOS18,
    )
    def program(x):
        x = mb.conv(x=x, weight=w1)
        x = mb.conv(x=x, weight=w2)
        x = mb.conv(x=x, weight=w3)
        return mb.conv(x=x, weight=w4)

    cml_converted = ct.convert(
        program,
        compute_units=ct.ComputeUnit.CPU_AND_NE,
        compute_precision=ct.precision.FLOAT16,
        minimum_deployment_target=ct.target.iOS18,
        skip_model_load=True,
    )

    # ML program models must be saved with the .mlpackage extension
    cml_converted.save(f"./model_{length}.mlpackage")


def merge_mfs(mf_filename, new_model, length):
    if os.path.isdir(mf_filename):
        desc = ct.utils.MultiFunctionDescriptor(mf_filename)
    else:
        desc = ct.utils.MultiFunctionDescriptor(None)

    print(f"Adding length {length}, already created lengths: {desc._functions()}")
    desc.add_function(
        new_model,
        src_function_name="main",
        target_function_name=f"length_{length}",
    )
    desc.default_function_name = "length_1"
    ct.utils.save_multifunction(desc, mf_filename)

    shutil.rmtree(new_model)


if __name__ == "__main__":
    mf_name = "mf.mlpackage"
    for i in [1, 2, 4, 6, 8]:
        make_model(i)
        merge_mfs(mf_name, f"model_{i}.mlpackage", i)

I included a video showing the RAM usage:

  • At 1:15, peak RAM usage of 4.74 GB with 1 function
  • At 2:06, peak RAM usage of 7.74 GB with 2 functions
  • At 4:07, peak RAM usage of 9.74 GB with 3 functions
  • At 6:31, peak RAM usage of 11.74 GB with 4 functions
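The roughly 2 GB step per added function matches the size of one full copy of the weights. A quick back-of-the-envelope check, using only the tensor shapes from the reproduction code above:

```python
# Four fp16 conv weights of shape (16_384, 16_384, 1), 2 bytes per element.
params_per_layer = 16_384 * 16_384 * 1
num_layers = 4
bytes_per_fp16 = 2

total_params = params_per_layer * num_layers      # ~1.07 billion parameters
weight_bytes = total_params * bytes_per_fp16      # one full copy of the weights

print(f"{total_params / 1e9:.2f}B params, {weight_bytes / 2**30:.1f} GiB per copy")
# → 1.07B params, 2.0 GiB per copy
```

So each additional function appears to account for another full in-memory copy of the weights, which is consistent with the observed 2 GB increments.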
0seba added the feature request label on Sep 27, 2024
TobyRoseman (Collaborator) commented:

This is expected. Based on my understanding, there are a couple of relevant things here.

First, our Python bindings don't use MLModelAsset, which would (in some cases) share the weights among MLModel instances created from the same .mlmodelc. However, I'm not sure how common this sharing would be.

Second, it also depends on which compute unit is being used. Even without MLModelAsset, the ANE should still share the weights in some cases.
