feat: latest comfyui; fix: better GPU utilization for SD15 #270

Merged Sep 23, 2024 (50 commits)

Commits
- `daad6b5` feat: use `horde_engine~=2.14.2` (tazlin, Aug 23, 2024)
- `5dfc26b` fix: use `--novram` approach (tazlin, Aug 24, 2024)
- `aac8c6e` refactor/fix: `unload_models_from_vram` more often (tazlin, Aug 24, 2024)
- `c344985` feat: use `horde_engine~=2.14.3` (tazlin, Aug 24, 2024)
- `364c7b1` fix: always unload models from ram (tazlin, Aug 24, 2024)
- `4234412` fix: more aggressively unload from system ram (tazlin, Aug 24, 2024)
- `5debcf2` feat: use `horde_engine~=2.14.4` (tazlin, Aug 26, 2024)
- `c90a019` fix: unload models more often and more appropriately (tazlin, Aug 26, 2024)
- `73418c1` docs: add missing arg docstrings (tazlin, Aug 26, 2024)
- `602a958` feat: more configurable memory management (tazlin, Aug 27, 2024)
- `47f08c7` chore: log addtl info/config; warn for incorrect high memory mode (tazlin, Aug 27, 2024)
- `1c46af6` feat: use `horde_engine~=2.14.5` (tazlin, Aug 27, 2024)
- `9e60095` fix: don't pass memory arg to comfyui when init safety process (tazlin, Aug 27, 2024)
- `67ba52b` feat: more informative kudos/user log messages (tazlin, Aug 27, 2024)
- `55ddc03` fix: more acuate units in log messages (tazlin, Aug 28, 2024)
- `22202b3` fix: pass very high memory mode config to inf. proc. (tazlin, Aug 28, 2024)
- `6dd52aa` feat: use `horde_sdk~=0.14.1` (tazlin, Aug 29, 2024)
- `00b6033` fix: print to console `PROCESS_ENDED` message's info (tazlin, Aug 29, 2024)
- `3d41fdd` chore: version bump (tazlin, Sep 11, 2024)
- `3f7a47c` feat: use `horde_model_reference>=0.9.0` for flux support (tazlin, Sep 11, 2024)
- `70f7a0b` fix: use latest compat. `horde_model_reference` (tazlin, Sep 11, 2024)
- `aa0ca39` fix: use `horde_sdk==0.14.2` (tazlin, Sep 11, 2024)
- `e21f192` fix: clarify "currently popped" in log messages (tazlin, Sep 12, 2024)
- `b0bf64b` doc: custom models (db0, Sep 13, 2024)
- `1ab8b07` adds extra_slow_worker and limit_max_steps vars (db0, Sep 13, 2024)
- `48206d8` feat: `horde_sdk==0.14.3` for `extra_slow_worker`/`limit_max_steps` (tazlin, Sep 13, 2024)
- `6d74831` fix: remove redundant bridge data fields (already in SDK) (tazlin, Sep 13, 2024)
- `b1f5013` style: fix (tazlin, Sep 13, 2024)
- `f1bbd5e` feat: use `horde_engine~=2.14.6` (tazlin, Sep 14, 2024)
- `0934de5` feat: use `horde_engine==2.15.0` (tazlin, Sep 16, 2024)
- `6dfcf93` fix: add flux to known slow/vram heavy models (tazlin, Sep 16, 2024)
- `0c1de03` fix: enforce constraints on other configs w/ `extra_slow_worker` (tazlin, Sep 17, 2024)
- `e142601` fix: respect `exit_on_unhandled_faults` on deadlocks (tazlin, Sep 17, 2024)
- `5ef71b4` feat: adds remove_maintenance_on_init secret var (db0, Sep 20, 2024)
- `519f734` style: fix (tazlin, Sep 21, 2024)
- `91d6f35` fix: less flux slowdown (tazlin, Sep 21, 2024)
- `ddc7bfe` style: fix (tazlin, Sep 22, 2024)
- `a52d5a3` fix: better process crash handling/logging (tazlin, Sep 22, 2024)
- `f5ba73c` fix: dont default to `low_memory_mode` by default (tazlin, Sep 22, 2024)
- `5e01bfe` fix: detect more deadlocks; less crashes w/ unsresponsive logic (tazlin, Sep 22, 2024)
- `d817884` chore: version bump (tazlin, Sep 22, 2024)
- `f972979` fix: use `horde_engine==2.15.2` (tazlin, Sep 22, 2024)
- `2f88886` fix: reset failed job counter after conseq. pause (tazlin, Sep 22, 2024)
- `9aa79b7` feat: time spent w/o jobs logging (tazlin, Sep 23, 2024)
- `c5f1bd9` chore: add suggested settings in README.md (tazlin, Sep 23, 2024)
- `b54376f` chore: update `bridgeData_template.yaml` (tazlin, Sep 23, 2024)
- `8de3e4d` docs: fix extra line in tempalte (tazlin, Sep 23, 2024)
- `e8de582` chore: version bump (tazlin, Sep 23, 2024)
- `79e71b6` style: fix (tazlin, Sep 23, 2024)
- `f89812c` chore: version meta update to require >=9.0.2 by sept 26 UTC (tazlin, Sep 23, 2024)
8 changes: 4 additions & 4 deletions .pre-commit-config.yaml

```diff
@@ -19,7 +19,7 @@ repos:
       - id: mypy
         args: []
         additional_dependencies:
-          - pydantic==2.7.4
+          - pydantic==2.9.2
           - types-requests
           - types-pytz
           - types-setuptools
@@ -40,7 +40,7 @@ repos:
           - horde_safety==0.2.3
           - torch==2.3.1
           - ruamel.yaml
-          - horde_engine==2.13.3
-          - horde_sdk==0.14.0
-          - horde_model_reference==0.8.1
+          - horde_engine==2.15.2
+          - horde_sdk==0.14.7
+          - horde_model_reference==0.9.0
           - semver
```
72 changes: 72 additions & 0 deletions README.md

@@ -102,6 +102,52 @@ You can double click the provided script files below from a file explorer or run
1. Make a copy of `bridgeData_template.yaml` to `bridgeData.yaml`
1. Edit `bridgeData.yaml` and follow the instructions within to fill in your details.

#### Suggested settings

Models are loaded as needed and just-in-time. You can offer as many models as you want, **provided you have an SSD, at least 32 GB of RAM, and at least 8 GB of VRAM** (see [Important Info](#important-info)). Workers with HDDs are not recommended at this time; if you do use an HDD, offer exactly one model. A typical SD1.5 model is around 2 GB, while a typical SDXL model is around 7 GB. Offering `all` models is currently around 700 GB total, and we commit to keeping that number below 1 TB with any future changes.
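Using the rough per-model sizes above (about 2 GB per SD1.5 model and 7 GB per SDXL model; real files vary, so treat this only as a lower bound), a quick back-of-the-envelope disk estimate:

```python
# Rough disk estimate based on the approximate per-model sizes quoted
# in this README; actual model files vary in size.
SD15_GB = 2
SDXL_GB = 7

def estimated_disk_gb(num_sd15: int, num_sdxl: int) -> int:
    """Approximate disk space (GB) needed to offer the given model counts."""
    return num_sd15 * SD15_GB + num_sdxl * SDXL_GB

# e.g. offering 10 SD1.5 models and 5 SDXL models needs roughly 55 GB
print(estimated_disk_gb(10, 5))
```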

> Note: We suggest you disable any 'sleep' or reduced power modes for your system while the worker is running.

- If you have a **24gb+ vram card**:

  ```yaml
  safety_on_gpu: true
  high_memory_mode: true
  high_performance_mode: true
  post_process_job_overlap: true
  unload_models_from_vram_often: false
  max_threads: 1 # If you have Flux/Cascade loaded, otherwise 2 max
  queue_size: 2 # You can set to 3 if you have 64GB or more of RAM
  max_batch: 8 # or higher
  ```

- If you have a **12gb - 16gb card**:

  ```yaml
  safety_on_gpu: true # Consider setting to `false` if offering Cascade or Flux
  high_memory_mode: true
  moderate_performance_mode: true
  unload_models_from_vram_often: false
  max_threads: 1
  max_batch: 4 # or higher
  ```

- If you have an **8gb-10gb vram card**:

  ```yaml
  queue_size: 1 # no higher than 1, or only offer Flux
  safety_on_gpu: false
  max_threads: 1
  max_power: 32 # no higher than 32
  max_batch: 4 # no higher than 4
  allow_post_processing: false # If offering SDXL or Flux, otherwise you may set to true
  allow_sdxl_controlnet: false
  ```

- Be sure to close every other VRAM-consuming application, and do not use the computer for any other purpose while the worker is running.

- For workers with **low-end cards or low performance for other reasons**:

  ```yaml
  extra_slow_worker: true
  # Gives you considerably more time to finish jobs, but requests will only go to your
  # worker if the requester opts in (even anon users do not use extra_slow_workers by
  # default). Only consider this if you have consistently seen less than 0.3 MPS/S or
  # less than 3000 kudos/hr and you are sure the worker is otherwise configured correctly.
  limit_max_steps: true
  # Reduces the maximum total number of steps in a single job you will receive,
  # based on the model baseline.
  preload_timeout: 120
  # Gives you more time to load models off disk. Note: abusing this value can lead to a
  # major loss of kudos and may also lead to maintenance mode, even with
  # `extra_slow_worker: true`.
  ```

### Starting/stopping

@@ -166,6 +212,32 @@ To update:
- **Advanced users**: If you do not want to use mamba or you are comfortable with python/venvs, see [README_advanced.md](README_advanced.md).
1. Continue with [Starting/stopping](#startingstopping) instructions above

# Custom Models

You can host image models on the horde that are not available in our model reference, but the process is a bit more involved.

To start with, you need to request the `customizer` role from the horde team; you can ask for it in the Discord channel. The role is assigned manually to prevent abuse of this feature.

Once you have the customizer role, you need to download the model files you want to host. Place them in any location on your system.

Finally, you need to point your worker to their location and provide some information about them. In your `bridgeData.yaml`, add lines like the following:

```yaml
custom_models:
- name: Movable figure model XL
baseline: stable_diffusion_xl
filepath: /home/db0/projects/CUSTOM_MODELS/PVCStyleModelMovable_beta25Realistic.safetensors
```

Then add the same `name` to your `models_to_load`.
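For example, assuming the custom model entry above, the matching `models_to_load` entry would be:

```yaml
models_to_load:
  - "Movable figure model XL"
```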

If everything was set up correctly, you should see a `custom_models.json` file in your worker directory after the worker starts, and your worker should offer the model.

Note that:

* You cannot serve custom models with the same name as any of our regular models.
* The horde doesn't know your model, so it will treat it as an SD 1.5 model for kudos rewards, and it cannot warn users who send the wrong parameters (such as `clip_skip`).
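The first caveat, plus the file path in your config, can be checked up front. The sketch below is a hypothetical pre-flight helper (not part of the worker's actual API; the `name`/`baseline`/`filepath` fields mirror the `bridgeData.yaml` example above):

```python
from pathlib import Path

def validate_custom_models(entries, regular_model_names):
    """Illustrative check of custom_models entries: flag name clashes
    with regular models and missing model files."""
    problems = []
    for entry in entries:
        if entry["name"] in regular_model_names:
            # Custom models must not reuse a regular model's name
            problems.append(f"{entry['name']}: clashes with a regular model name")
        if not Path(entry["filepath"]).is_file():
            problems.append(f"{entry['name']}: model file not found")
    return problems

entries = [{
    "name": "Movable figure model XL",
    "baseline": "stable_diffusion_xl",
    "filepath": "/home/db0/projects/CUSTOM_MODELS/PVCStyleModelMovable_beta25Realistic.safetensors",
}]
# An empty list means the entries passed both checks
print(validate_custom_models(entries, {"Deliberate"}))
```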

# Docker

See [README_advanced.md](README_advanced.md).