Add a `disable_mmap` option to the `from_single_file` loader to improve load performance on network mounts #10305

danhipke · 2024-12-19T17:49:43Z

What does this PR do?

This PR adds a disable_mmap option (originally named no_mmap) to the from_single_file loader to disable the mmap-based loading behavior of safetensors.

This provides a large performance benefit when loading from a file on a network mount, which does not handle the seek-heavy access pattern of mmap-based loading well: load time for a 7.2 GB model dropped from 16 minutes to under 1 minute. Examples demonstrating this issue:

Safetensors loading uses mmap with multiple processes sharing the same fd cause slow gcsfuse performance #10280
Option for disabling mmap for safetensors loading for network storage users comfyanonymous/ComfyUI#2288
Speed up the first run. comfyanonymous/ComfyUI#1992 (comment)
[Bug]: slow loading .safetensors when switching to a new model AUTOMATIC1111/stable-diffusion-webui#11216

Fixes #10280
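
To make the access-pattern difference concrete, here is a minimal standard-library sketch. It is not the diffusers or safetensors implementation: the helper names `read_with_mmap` and `read_without_mmap`, the temporary file, and the span list are all assumptions made for illustration. It only contrasts on-demand slicing of a memory-mapped file with one sequential read followed by in-memory slicing.

```python
import mmap
import os
import tempfile


def read_with_mmap(path, spans):
    """Map the file and copy out each (offset, length) span on demand."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Each slice may fault pages in from storage; on a network
            # mount this can turn into many small, scattered reads.
            return [bytes(mm[off:off + n]) for off, n in spans]


def read_without_mmap(path, spans):
    """Read the whole file sequentially, then slice from memory."""
    with open(path, "rb") as f:
        data = f.read()  # one large sequential read
    return [data[off:off + n] for off, n in spans]


# Small stand-in file (16 KiB); a real checkpoint would be gigabytes.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(range(256)) * 64)
    demo_path = tmp.name

# Hypothetical tensor locations, deliberately out of order.
demo_spans = [(4096, 16), (0, 16), (8192, 16)]
via_mmap = read_with_mmap(demo_path, demo_spans)
via_read = read_without_mmap(demo_path, demo_spans)
os.remove(demo_path)
```

Both helpers return identical bytes. The difference is only in how the file is touched, which is what matters on storage where random access is expensive. The non-mmap path needs enough host memory to hold the whole file at once.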

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline?
Did you read our philosophy doc (important for complex PRs)?
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

@DN6 @yiyixuxu @asomoza

sayakpaul · 2024-12-20T03:10:08Z

@DN6 I think the slow loading issue is affecting the CI quite a bit. So, maybe this could be prioritized.

HuggingFaceDocBuilderDev · 2024-12-20T03:17:12Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

DN6 · 2024-12-20T04:59:20Z

Thanks @danhipke. The change looks good. But load_state_dict is also used in from_pretrained so we would need to add the option there as well.

diffusers/src/diffusers/models/modeling_utils.py

Line 589 in 41ba8c0

And then pass it to the subsequent load_state_dict calls.

diffusers/src/diffusers/models/modeling_utils.py

Line 868 in 41ba8c0

state_dict = load_state_dict(model_file, variant=variant)

and here

diffusers/src/diffusers/models/modeling_utils.py

Line 868 in 41ba8c0

state_dict = load_state_dict(model_file, variant=variant)

And a small nit. I would prefer naming the flag disable_mmap

cc: @yiyixuxu for awareness.
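
A rough sketch of the plumbing requested in this comment, assuming the flag arrives as a keyword argument and is forwarded to the state-dict loader. The functions below are hypothetical stand-ins, not the real `from_pretrained` or `load_state_dict` from `modeling_utils.py`, and the returned dictionary only reports which read strategy would be chosen.

```python
def load_state_dict(checkpoint_file, disable_mmap=False):
    # Hypothetical stand-in: reports the read strategy instead of loading.
    strategy = "full_read" if disable_mmap else "mmap"
    return {"file": checkpoint_file, "strategy": strategy}


def from_pretrained_sketch(checkpoint_file, **kwargs):
    # Pop the flag so it is not forwarded to unrelated code paths,
    # and default to the existing mmap behavior when it is absent.
    disable_mmap = kwargs.pop("disable_mmap", False)
    return load_state_dict(checkpoint_file, disable_mmap=disable_mmap)


default_result = from_pretrained_sketch("model.safetensors")
no_mmap_result = from_pretrained_sketch("model.safetensors", disable_mmap=True)
```

Defaulting the flag to `False` keeps current behavior for callers who never pass it, so only users on network storage need to opt in.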

danhipke · 2024-12-20T21:33:25Z

@DN6 Added it to from_pretrained and renamed.

src/diffusers/loaders/single_file.py

src/diffusers/loaders/single_file_model.py

DN6 · 2024-12-23T03:37:41Z

cc: @yiyixuxu to take a look here too

Related issues:
AUTOMATIC1111/stable-diffusion-webui#11216 (comment)
huggingface/safetensors#562

Internal discussion:
https://huggingface.slack.com/archives/C0475Q1CS9G/p1732717604175859

Co-authored-by: Dhruv Nair <[email protected]>

danhipke · 2024-12-23T07:00:12Z

Applied suggestions.

danhipke · 2025-01-02T22:59:41Z

Any other changes needed before this can be merged?

bghira · 2025-01-03T16:07:44Z

fwiw ultralytics overrides torch.load behaviour to always use mmap so if that's used in combination here, they have to be loaded in the correct order or else the ultralytics one takes priority and removes the capability to disable mmap. this happens when using some face fix or photomaker pipeline etc.
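
A generic sketch of the precedence problem described here, under loud assumptions: `fake_lib`, its `load` function, and `install_force_mmap_patch` are invented for illustration and do not reproduce how ultralytics or torch actually behave. It only shows that once a wrapper forcing `mmap=True` is installed over a loader, a caller asking for `mmap=False` no longer gets it.

```python
import types

# Hypothetical stand-in for a library module that exposes a loader.
fake_lib = types.SimpleNamespace()


def _load(path, mmap=False):
    # Reports the effective settings instead of reading a file.
    return {"path": path, "mmap": mmap}


fake_lib.load = _load


def install_force_mmap_patch(lib):
    """Wrap lib.load so that every call runs with mmap=True."""
    original = lib.load

    def patched(path, **kwargs):
        kwargs["mmap"] = True  # the caller's choice is discarded here
        return original(path, **kwargs)

    lib.load = patched


before_patch = fake_lib.load("weights.pt", mmap=False)
install_force_mmap_patch(fake_lib)
after_patch = fake_lib.load("weights.pt", mmap=False)
```

In this toy setup the same call returns `mmap=False` before the patch and `mmap=True` after it, which is the kind of silent override the comment warns about.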

DN6 · 2025-01-09T18:00:40Z

Hi @danhipke I think you just have to run make style && make quality so that the quality checks pass.

danhipke added 15 commits December 17, 2024 02:15

Add no_mmap arg.

fe205a6

Fix arg parsing.

a6b4d8f

Update another method to force no mmap.

3cf01bf

logging

2e08242

logging2

bcca53b

propagate no_mmap

c895d86

logging3

c081e0b

propagate no_mmap

7231c28

logging4

0c472b2

fix open call

c4d4d60

clean up logging

4f84222

cleanup

5fab6d1

fix missing arg

1d8cf69

update logging and comments

5ef288f

fix merge conflict

fec5753

danhipke changed the title ~~No mmap~~ Add a no_mmap option to the from_single_file loader to improve load performance on network mounts Dec 19, 2024

danhipke mentioned this pull request Dec 19, 2024

Safetensors loading uses mmap with multiple processes sharing the same fd cause slow gcsfuse performance #10280

Closed

hlky and others added 2 commits December 20, 2024 12:56

Merge branch 'main' into no-mmap

3cc50f0

Rename to disable_mmap and update other references.

f80644d

danhipke changed the title ~~Add a no_mmap option to the from_single_file loader to improve load performance on network mounts~~ Add a disable_mmap option to the from_single_file loader to improve load performance on network mounts Dec 20, 2024

sayakpaul and others added 6 commits December 20, 2024 21:25

[Docs] Update ltx_video.md to remove generator from `from_pretrained(…

ffe5aba

…)` (huggingface#10316) Update ltx_video.md to remove generator from `from_pretrained()`

docs: fix a mistake in docstring (huggingface#10319)

3fc4a42

Update pipeline_hunyuan_video.py docs: fix a mistake

[docs] Fix quantization links (huggingface#10323)

dbbcd0f

Update overview.md

[Sana]add 2K related model for Sana (huggingface#10322)

dfebda2

add 2K related model for Sana

Merge branch 'main' into no-mmap

a6e3745

wlhee mentioned this pull request Dec 22, 2024

GCSFuse is extremely slow for StableDiffusionPipeline.from_single_file GoogleCloudPlatform/gcsfuse#2828

Closed

DN6 reviewed Dec 23, 2024

src/diffusers/loaders/single_file.py (outdated, resolved)

src/diffusers/loaders/single_file_model.py (outdated, resolved)

danhipke and others added 2 commits December 22, 2024 22:59

Update src/diffusers/loaders/single_file_model.py

6720c51

Co-authored-by: Dhruv Nair <[email protected]>

Update src/diffusers/loaders/single_file.py

2926158

Co-authored-by: Dhruv Nair <[email protected]>

Merge branch 'main' into no-mmap

5fdd062

sayakpaul requested a review from yiyixuxu January 9, 2025 13:27

yiyixuxu added the close-to-merge label Jan 9, 2025

yiyixuxu approved these changes Jan 9, 2025

DN6 added the roadmap Add to current release roadmap label Jan 10, 2025

make style

22b3370

DN6 merged commit 52c05bd into huggingface:main Jan 10, 2025
12 checks passed

Narsil mentioned this pull request Feb 4, 2025

Loading bigger models is very slow using AutoModelForCausalLM.from_pretrained huggingface/safetensors#562

Open

4 tasks

richardm1 mentioned this pull request Feb 19, 2025

[Bug]: vLLM takes forever to load a locally stored 7B model vllm-project/vllm#6876

Closed

richardm1 mentioned this pull request Mar 7, 2025

Speed up the first run. comfyanonymous/ComfyUI#1992

Open

richardm1 mentioned this pull request Jun 16, 2025

Add --mmap-torch-files to enable use of mmap when loading ckpt/pt comfyanonymous/ComfyUI#8021

Merged

guoyuhong mentioned this pull request Jun 23, 2025

Support weight loading without mmap sgl-project/sglang#7469

Merged

6 tasks
