- Please see and Comment - Only 4 steps, amazing images #1205
Replies: 5 comments 2 replies
-
I'll give this a shot. 12 GB VRAM RTX 3060 and 16 GB system RAM. I'll let you know my results later, because these look amazing.
-
Wow, that's quite the feat for just 4 steps! We've seen before with Turbo and similar models that "creativity" was a bit limited, with results somewhat similar to each other across random seeds. How does this merge perform?
-
I get a weird error when I try to load this and Forge crashes.
-
I finally got it working :D I only tested with one prompt, but it's pretty nice. Thanks for the recommendation!
-
Hi mate,
When I use the flux1-dev-bnb-nf4-v2.safetensors version, image generation takes a while at first; the time decreases the more I use it, and I don't understand why.
So I'm testing a merged version that I found on Hugging Face, https://huggingface.co/drbaph/FLUX.1-schnell-dev-merged-fp8-4step, and I'm liking the results.
With this merged version I can generate very good images with only 4 steps, at roughly 20 s per image.
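For anyone who prefers scripting it outside Forge, here is a minimal sketch of 4-step generation with diffusers. Assumptions: diffusers >= 0.30 (which added `FluxPipeline`; the 0.29.2 in my setup below is what my Forge install bundles), a CUDA GPU, and a diffusers-format checkpoint — the linked repo ships a single .safetensors, so you may need to convert it first or just load it through Forge. The function name and `model_id` are illustrative:

```python
def generate_4step(prompt, model_id, seed=0):
    """Sketch: sample a schnell/dev merge in 4 steps.
    Assumes diffusers >= 0.30 (FluxPipeline) and a diffusers-format model."""
    import torch
    from diffusers import FluxPipeline

    pipe = FluxPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    pipe.enable_model_cpu_offload()  # keeps an 8 GB card from OOMing, similar to Forge's CPU swap
    generator = torch.Generator("cpu").manual_seed(seed)
    result = pipe(
        prompt,
        num_inference_steps=4,   # the whole point of the merge
        guidance_scale=0.0,      # schnell-style: CFG off, negatives ignored
        generator=generator,
    )
    return result.images[0]
```

The `guidance_scale=0.0` mirrors what Forge logs below ("Skipping unconditional conditioning when CFG = 1").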
My setup is:
Windows 11 Pro - Ryzen 5 3500X
32 GB RAM
RTX 3050, 8 GB VRAM
torch: 2.3.1+cu121, autocast half
cuda: 12.1
cudnn: 8907
driver: 560.70
diffusers: 0.29.2
transformers: 4.44.0
python: 3.10.6
See some results:
All these images were generated with the FLUX.1-schnell-dev-merged-fp8-4step model, with only 4 steps, without VAE, with BNB-NF4. The average time per generation was only about 20 s.
The first load takes a while, but once the model is resident, from the second generation onwards the time drops considerably.
See more details below:
Stable Diffusion PATH: F:\ForgeFlux\webui
Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug 1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Version: f2.0.1v1.10.1-previous-310-g695aad95
Commit hash: 695aad95e45a1cd24c016d3232d2be08b7faad17
CUDA 12.1
Launching Web UI with arguments: --precision full --opt-split-attention --always-batch-cond-uncond --no-half --skip-torch-cuda-test --pin-shared-memory --cuda-malloc --cuda-stream --ckpt-dir 'F:\ModelsForge\Checkpoints' --lora-dir 'F:\ModelsForge\Loras'
Using cudaMallocAsync backend.
Total VRAM 8191 MB, total RAM 32705 MB
pytorch version: 2.3.1+cu121
Set vram state to: NORMAL_VRAM
Always pin shared GPU memory
Device: cuda:0 NVIDIA GeForce RTX 3050 : cudaMallocAsync
VAE dtype preferences: [torch.bfloat16, torch.float32] -> torch.bfloat16
CUDA Using Stream: True
Using pytorch cross attention
Using pytorch attention for VAE
ControlNet preprocessor location: F:\ForgeFlux\webui\models\ControlNetPreprocessor
[-] ADetailer initialized. version: 24.8.0, num models: 10
sd-webui-prompt-all-in-one background API service started successfully.
17:32:07 - ReActor - STATUS - Running v0.7.1-a1 on Device: CUDA
2024-08-16 17:32:09,701 - ControlNet - INFO - ControlNet UI callback registered.
Model selected: {'checkpoint_info': {'filename': 'F:\ModelsForge\Checkpoints\FLUX1-SchnellDev-Merged-fp8-4step.safetensors', 'hash': '9e0fb423'}, 'additional_modules': [], 'unet_storage_dtype': 'nf4'}
Running on local URL: http://127.0.0.1:7860
To create a public link, set `share=True` in `launch()`.
IIB Database file has been successfully backed up to the backup folder.
Startup time: 35.7s (prepare environment: 12.8s, launcher: 2.4s, import torch: 3.8s, initialize shared: 0.1s, other imports: 1.1s, opts onchange: 0.8s, list SD models: 0.4s, load scripts: 5.5s, create ui: 5.0s, gradio launch: 2.7s, app_started_callback: 1.0s).
Environment vars changed: {'stream': False, 'inference_memory': 1024.0, 'pin_shared_memory': False}
Loading Model: {'checkpoint_info': {'filename': 'F:\ModelsForge\Checkpoints\FLUX1-SchnellDev-Merged-fp8-4step.safetensors', 'hash': '9e0fb423'}, 'additional_modules': [], 'unet_storage_dtype': 'nf4'}
[Unload] Trying to free 953674316406250018963456.00 MB for cuda:0 with 0 models keep loaded ...
StateDict Keys: {'transformer': 776, 'vae': 244, 'text_encoder': 198, 'text_encoder_2': 220, 'ignore': 0}
Using Default T5 Data Type: torch.float16
Working with z of shape (1, 16, 32, 32) = 16384 dimensions.
K-Model Created: {'storage_dtype': 'nf4', 'computation_dtype': torch.bfloat16}
Model loaded in 93.1s (unload existing model: 0.2s, forge model load: 92.8s).
Skipping unconditional conditioning when CFG = 1. Negative Prompts are ignored.
To load target model JointTextEncoder
Begin to load 1 model
[Unload] Trying to free 13465.80 MB for cuda:0 with 0 models keep loaded ...
[Memory Management] Current Free GPU Memory: 7184.00 MB
[Memory Management] Required Model Memory: 9570.62 MB
[Memory Management] Required Inference Memory: 1024.00 MB
[Memory Management] Estimated Remaining GPU Memory: -3410.62 MB
[Memory Management] Loaded to GPU for backward capability: 73.14 MB
[Memory Management] Loaded to CPU Swap: 4790.00 MB (blocked method)
[Memory Management] Loaded to GPU: 4852.99 MB
Moving model(s) has taken 18.67 seconds
Distilled CFG Scale will be ignored for Schnell
To load target model KModel
Begin to load 1 model
[Unload] Trying to free 9137.91 MB for cuda:0 with 0 models keep loaded ...
[Unload] Current free memory is 1637.82 MB ...
[Unload] Unload model JointTextEncoder
[Memory Management] Current Free GPU Memory: 7121.19 MB
[Memory Management] Required Model Memory: 6241.47 MB
[Memory Management] Required Inference Memory: 1024.00 MB
[Memory Management] Estimated Remaining GPU Memory: -144.28 MB
[Memory Management] Loaded to CPU Swap: 1426.84 MB (blocked method)
[Memory Management] Loaded to GPU: 4814.55 MB
Moving model(s) has taken 151.16 seconds
100%|██████████| 4/4 [00:32<00:00, 8.12s/it]
To load target model IntegratedAutoencoderKL
Begin to load 1 model
[Unload] Trying to free 2730.93 MB for cuda:0 with 0 models keep loaded ...
[Unload] Current free memory is 2037.39 MB ...
[Unload] Unload model KModel
[Memory Management] Current Free GPU Memory: 7114.03 MB
[Memory Management] Required Model Memory: 159.87 MB
[Memory Management] Required Inference Memory: 1024.00 MB
[Memory Management] Estimated Remaining GPU Memory: 5930.16 MB
Moving model(s) has taken 1.79 seconds
Total progress: 100%|██████████| 4/4 [00:28<00:00, 7.11s/it]
Skipping unconditional conditioning when CFG = 1. Negative Prompts are ignored.
To load target model JointTextEncoder
Begin to load 1 model
[Unload] Trying to free 13560.04 MB for cuda:0 with 0 models keep loaded ...
[Unload] Current free memory is 6944.74 MB ...
[Unload] Unload model IntegratedAutoencoderKL
[Memory Management] Current Free GPU Memory: 7104.61 MB
[Memory Management] Required Model Memory: 9643.11 MB
[Memory Management] Required Inference Memory: 1024.00 MB
[Memory Management] Estimated Remaining GPU Memory: -3562.50 MB
[Memory Management] Loaded to GPU for backward capability: 145.52 MB
[Memory Management] Loaded to CPU Swap: 4998.00 MB (blocked method)
[Memory Management] Loaded to GPU: 4789.86 MB
Moving model(s) has taken 1.51 seconds
Distilled CFG Scale will be ignored for Schnell
To load target model KModel
Begin to load 1 model
[Unload] Trying to free 9137.91 MB for cuda:0 with 0 models keep loaded ...
[Unload] Current free memory is 1796.03 MB ...
[Unload] Unload model JointTextEncoder
[Memory Management] Current Free GPU Memory: 7100.03 MB
[Memory Management] Required Model Memory: 6241.47 MB
[Memory Management] Required Inference Memory: 1024.00 MB
[Memory Management] Estimated Remaining GPU Memory: -165.44 MB
[Memory Management] Loaded to CPU Swap: 1446.64 MB (blocked method)
[Memory Management] Loaded to GPU: 4794.75 MB
Moving model(s) has taken 2.27 seconds
100%|██████████| 4/4 [00:13<00:00, 3.41s/it]
To load target model IntegratedAutoencoderKL
Begin to load 1 model
[Unload] Trying to free 2730.93 MB for cuda:0 with 0 models keep loaded ...
[Unload] Current free memory is 2040.43 MB ...
[Unload] Unload model KModel
[Memory Management] Current Free GPU Memory: 7099.45 MB
[Memory Management] Required Model Memory: 159.87 MB
[Memory Management] Required Inference Memory: 1024.00 MB
[Memory Management] Estimated Remaining GPU Memory: 5915.57 MB
Moving model(s) has taken 1.00 seconds
Total progress: 100%|██████████| 4/4 [00:12<00:00, 3.02s/it]
Skipping unconditional conditioning when CFG = 1. Negative Prompts are ignored.
To load target model KModel
Begin to load 1 model
[Unload] Trying to free 9137.91 MB for cuda:0 with 0 models keep loaded ...
[Unload] Current free memory is 6939.57 MB ...
[Unload] Unload model IntegratedAutoencoderKL
[Memory Management] Current Free GPU Memory: 7099.45 MB
[Memory Management] Required Model Memory: 6241.47 MB
[Memory Management] Required Inference Memory: 1024.00 MB
[Memory Management] Estimated Remaining GPU Memory: -166.02 MB
[Memory Management] Loaded to CPU Swap: 1446.64 MB (blocked method)
[Memory Management] Loaded to GPU: 4794.75 MB
Moving model(s) has taken 1.66 seconds
100%|██████████| 4/4 [00:13<00:00, 3.42s/it]
To load target model IntegratedAutoencoderKL
Begin to load 1 model
[Unload] Trying to free 2730.93 MB for cuda:0 with 0 models keep loaded ...
[Unload] Current free memory is 1995.33 MB ...
[Unload] Unload model KModel
[Memory Management] Current Free GPU Memory: 7098.29 MB
[Memory Management] Required Model Memory: 159.87 MB
[Memory Management] Required Inference Memory: 1024.00 MB
[Memory Management] Estimated Remaining GPU Memory: 5914.42 MB
Moving model(s) has taken 1.00 seconds
Total progress: 100%|██████████| 4/4 [00:12<00:00, 3.01s/it]
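The `[Memory Management]` lines in the log above follow a simple budget: estimated remaining = current free − required model − required inference, and when that goes negative, part of the model is kept in CPU swap ("blocked method"). A minimal sketch of that arithmetic, with numbers copied from the first JointTextEncoder load above (the function name is illustrative, not Forge's actual code):

```python
def gpu_budget(free_mb, model_mb, inference_mb):
    """Compute remaining GPU memory the way the log lines do,
    and whether part of the model must be swapped to CPU."""
    remaining = free_mb - model_mb - inference_mb
    return remaining, remaining < 0

# First JointTextEncoder load: 7184.00 - 9570.62 - 1024.00
remaining, needs_swap = gpu_budget(7184.00, 9570.62, 1024.00)
print(round(remaining, 2), needs_swap)  # -3410.62 True
```

The same formula reproduces the KModel line (7121.19 − 6241.47 − 1024.00 = −144.28), which is why even the sampler itself gets a small CPU-swap slice on 8 GB.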
See, only 17 s to generate the image:
Result in 4 steps, 17 s...
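That 17 s roughly decomposes from the log: 4 sampler steps at ~3.0 s/it, plus the text-encoder/VAE model moves around them. A quick sketch of the arithmetic (the s/it figure is from the log; the ~5 s overhead is my estimate, not a measured value):

```python
def total_image_time(steps, sec_per_it, overhead_s):
    """Estimated wall-clock time for one image: sampling plus fixed overhead."""
    return steps * sec_per_it + overhead_s

# 4 steps at ~3.01 s/it, plus ~5 s of model moves and VAE decode
print(round(total_image_time(4, 3.01, 5.0), 2))  # 17.04
```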
Feel free to test it too.
Best regards from Brazil!