Would it be possible to use models in the CompVis checkpoint format used by stabilityai and supported in HF diffusers? My personal goals are:
- Support for 1.5 and 2.1 ckpt models (converted over to .ot)
- Support for third-party merges and training checkpoints (based on 1.5 or 2.1, so they would probably work the same), e.g. Analog Diffusion 1.0
I tried the following to convert the file, then listed the tensor names with the tensor-tools example. Maybe these can be extracted and compiled back together?
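import numpy as np
import torch
# Load onto the CPU so the conversion also works on machines without a GPU.
model = torch.load("./data/analog-diffusion-1.0.ckpt", map_location="cpu")
# Keep only tensor entries; detach and move to the CPU before converting to NumPy.
x = {k: v.detach().cpu().numpy() for k, v in model["state_dict"].items() if isinstance(v, torch.Tensor)}
np.savez("./data/analog-diffusion-1.0.npz", **x)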
cargo run --release --example tensor-tools cp ./data/analog-diffusion-1.0.npz ./data/analog-diffusion-1.0.ot
cargo run --release --example tensor-tools ls ./data/analog-diffusion-1.0.ot
./data/analog-diffusion-1.0.ot: model.diffusion_model.input_blocks.0.0.weight Tensor[[320, 4, 3, 3], Half]
./data/analog-diffusion-1.0.ot: model.diffusion_model.input_blocks.0.0.bias Tensor[[320], Half]
./data/analog-diffusion-1.0.ot: model.diffusion_model.time_embed.0.weight Tensor[[1280, 320], Half]
./data/analog-diffusion-1.0.ot: model.diffusion_model.time_embed.0.bias Tensor[[1280], Half]
./data/analog-diffusion-1.0.ot: model.diffusion_model.time_embed.2.weight Tensor[[1280, 1280], Half]
./data/analog-diffusion-1.0.ot: model.diffusion_model.time_embed.2.bias Tensor[[1280], Half]
./data/analog-diffusion-1.0.ot: model.diffusion_model.input_blocks.1.1.norm.weight Tensor[[320], Half]
...
./data/analog-diffusion-1.0.ot: cond_stage_model.transformer.text_model.encoder.layers.11.self_attn.out_proj.weight Tensor[[768, 768], Half]
./data/analog-diffusion-1.0.ot: cond_stage_model.transformer.text_model.encoder.layers.11.self_attn.out_proj.bias Tensor[[768], Half]
./data/analog-diffusion-1.0.ot: cond_stage_model.transformer.text_model.encoder.layers.11.layer_norm1.weight Tensor[[768], Half]
./data/analog-diffusion-1.0.ot: cond_stage_model.transformer.text_model.encoder.layers.11.layer_norm1.bias Tensor[[768], Half]
./data/analog-diffusion-1.0.ot: cond_stage_model.transformer.text_model.encoder.layers.11.mlp.fc1.weight Tensor[[3072, 768], Half]
./data/analog-diffusion-1.0.ot: cond_stage_model.transformer.text_model.encoder.layers.11.mlp.fc1.bias Tensor[[3072], Half]
./data/analog-diffusion-1.0.ot: cond_stage_model.transformer.text_model.encoder.layers.11.mlp.fc2.weight Tensor[[768, 3072], Half]
./data/analog-diffusion-1.0.ot: cond_stage_model.transformer.text_model.encoder.layers.11.mlp.fc2.bias Tensor[[768], Half]
./data/analog-diffusion-1.0.ot: cond_stage_model.transformer.text_model.encoder.layers.11.layer_norm2.weight Tensor[[768], Half]
./data/analog-diffusion-1.0.ot: cond_stage_model.transformer.text_model.encoder.layers.11.layer_norm2.bias Tensor[[768], Half]
./data/analog-diffusion-1.0.ot: cond_stage_model.transformer.text_model.final_layer_norm.weight Tensor[[768], Half]
./data/analog-diffusion-1.0.ot: cond_stage_model.transformer.text_model.final_layer_norm.bias Tensor[[768], Half]
Full list: analog-diffusion-1.0.ot.log
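To make the "extract and compile back together" idea concrete, here is a minimal sketch of how the tensors could be grouped by component using the name prefixes from the listing above. The `model.diffusion_model.` and `cond_stage_model.transformer.` prefixes come from that listing; the `first_stage_model.` prefix for the VAE, the component labels, and the `split_by_prefix` helper are my assumptions for illustration and not part of this repo.

```python
from collections import defaultdict

# Prefixes seen in the listing above, plus "first_stage_model." which is an
# assumption (it does not appear in the excerpt shown in this issue).
PREFIXES = {
    "unet": "model.diffusion_model.",
    "clip": "cond_stage_model.transformer.",
    "vae": "first_stage_model.",
}


def split_by_prefix(state_dict, prefixes=PREFIXES):
    """Group entries by component and strip the component prefix from each name."""
    groups = defaultdict(dict)
    for name, value in state_dict.items():
        for component, prefix in prefixes.items():
            if name.startswith(prefix):
                groups[component][name[len(prefix):]] = value
                break
        else:
            # Anything that matches no known prefix is kept under "other".
            groups["other"][name] = value
    return dict(groups)
```

Each group could then be written out with `np.savez` and converted with `tensor-tools cp` as above. Stripping the prefix alone probably is not enough: names such as `input_blocks.0.0.weight` may still need to be renamed to whatever the loader on the Rust side expects.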
Thanks!