support for customizing LoRA multipliers through the sdapi #1982
wbruna wants to merge 4 commits into LostRuins:concedo_experimental
|
Does it have any implications on memory use or runtime file loading? |
|
For For |
|
Personally I have seen this request a few times. There is demand for it. If it's a bit slower during a switch, that is better than not having it at all. Just make sure nothing changes if it's not used. |
Force-pushed from b0735b5 to 1ddd1a8.
|
Got a first somewhat-working version. I've included code for the As suspected,
What do you think? |
|
By the way, it's also possible to support the |
Force-pushed from 8d4bc54 to f013f51.
|
Should be ready enough for reviewing. As described before:
|
|
Cleaned up the code, and reorganized the commits. Tested with Klein 9b and SDXL. Probably needs some polishing on the launcher and config side, once we decide the zero-multiplier approach is OK. I'll leave this aside a bit, to focus on master-509-4cdfff5 🙂 |
|
the default behavior right now (before this PR), is when one multiplier is provided (which is the current status quo of the launcher), all loras are initialized at the same strength, which is what should be default i think. E.g. Then the API override should augment it to a new value temporarily for that request (only adjustable for those loras loaded at mult 0). Also I think |
Intentionally omitted, since it could be considered sensitive information. Usually, we'd have a root directory for all the LoRA files, then we could show subpaths under it. But all LoRAs now are specified by full path, so we can't know which part could be shown. (@LostRuins , a |
Alright, I'll adjust it later (and fix the bug @Riztard mentioned).
|
|
Rebased on top of #2006 to get a fix for zero-multiplier LoRAs getting stuck, and to be able to test both PRs at the same time; but I'll keep the branches separate. Also restored the behavior when a single multiplier is specified. Now:
|
Force-pushed from ca2cced to 54cf43a.
Force-pushed from e59abca to 8115263.
|
hmm i merged your other big PR, i dunno why the files are still shown here as modified |
|
Weird. Well, rebasing cleaned it up. |
|
Besides the above, the logic for zero multipliers seems slightly confusing. So I think the goal is for only LoRAs with multiplier = 0.0 to be adjustable over the API. I.e. if you load loraA=0.8, loraB=0.0, loraC=1.0, then only loraB is adjustable over the API. But in the |
Why do only some LoRAs need to be adjustable? Does having fewer adjustable ones mean better performance, less hassle, or something else?
That is what's happening.
That boolean is not per-LoRA: the only thing it does is to short-circuit the LoRA list processing when no dynamic LoRAs were requested. I'll drop that part to simplify the code; the |
Yes; supporting any dynamic LoRAs means more memory usage and extra processing time for all LoRAs. It may be possible to optimize that, but the code would be too fragile without upstream changes.
I do not disagree, but we can't document a feature before agreeing on what it should do 🙂 |
|
@Riztard it does make perfect sense I think. If the user has specified a multiplier (e.g. 0.75) in the launcher, then that multiplier should be obeyed for that LoRA. If the user chose not to specify a multiplier (e.g. 0.0) in the launcher, only then is the backend going to use what the API requests (defaulting to disabled since Wx0=0). But since multiple multipliers can be applied separately, this logic extends to each one individually, so LoRA A might be adjustable (because it was set to 0) while LoRA B is fixed at 0.65 as per launcher args. Do you think that's not intuitive? |
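For reference, the per-LoRA rule described above could be sketched roughly like this. It is only an illustration, not the code in this PR: the function name `effective_multipliers` and the dict-based inputs are hypothetical, and it assumes API requests identify LoRAs by name.

```python
def effective_multipliers(launcher, requested=None):
    """Resolve the multiplier used for each LoRA in a single request.

    launcher:  dict of LoRA name -> multiplier given at load time.
    requested: dict of LoRA name -> multiplier asked for over the API
               (hypothetical shape; the real request format may differ).
    """
    requested = requested or {}
    result = {}
    for name, mult in launcher.items():
        if mult != 0.0:
            # Non-zero launcher multiplier: fixed, API overrides are ignored.
            result[name] = mult
        else:
            # Zero launcher multiplier: adjustable, disabled unless requested.
            result[name] = float(requested.get(name, 0.0))
    return result


launcher = {"loraA": 0.8, "loraB": 0.0, "loraC": 1.0}
print(effective_multipliers(launcher, {"loraA": 0.2, "loraB": 0.5}))
```

With the loraA=0.8, loraB=0.0, loraC=1.0 example, only loraB follows the request; loraA and loraC keep their launcher values.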
Kinda, if the user knows about that. |
|
Folded the fixes into each commit, and added the LoRA tags to the generated image metadata. |
Also fix typo in the function name.
The `sdloramult` flag now accepts a list of multipliers, one for each LoRA. If all multipliers are non-zero, LoRAs load as before, with no extra VRAM usage or performance impact. If any LoRA has a multiplier of 0, we switch to `at_runtime` mode, and these LoRAs will be available to multiplier changes via the `lora` sdapi field and show up in the `sdapi/v1/loras` endpoint. All LoRAs are still preloaded on startup, and cached to avoid file reloads. If the list of multipliers is shorter than the list of LoRAs, the multiplier list is extended with the first multiplier (1.0 by default), to keep it compatible with the previous behavior.
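The load-time behavior described above can be illustrated with a small Python sketch. This is not the actual implementation: `plan_lora_loading` is a hypothetical helper, the mode name `"preapplied"` is a placeholder for the previous non-dynamic behavior (only `at_runtime` is named in this PR), and dropping surplus multipliers is an assumption.

```python
def plan_lora_loading(lora_paths, multipliers):
    """Return (per-LoRA multipliers, apply mode, adjustable LoRAs)."""
    # With no multipliers given, the first multiplier defaults to 1.0.
    mults = list(multipliers) if multipliers else [1.0]
    # A short list is extended with the first multiplier, so a single
    # value still applies to every LoRA, as before.
    fill = mults[0]
    # Assumption: multipliers beyond the number of LoRAs are dropped.
    mults = (mults + [fill] * len(lora_paths))[:len(lora_paths)]
    # LoRAs loaded with a zero multiplier are the API-adjustable ones.
    adjustable = [p for p, m in zip(lora_paths, mults) if m == 0.0]
    # Any zero multiplier switches to at_runtime mode; otherwise nothing
    # changes ("preapplied" is a placeholder name, not from the PR).
    mode = "at_runtime" if adjustable else "preapplied"
    return mults, mode, adjustable


print(plan_lora_loading(["a.safetensors", "b.safetensors", "c.safetensors"], [0.8, 0.0]))
```

In this sketch, `[0.8, 0.0]` for three LoRAs becomes `[0.8, 0.0, 0.8]`, and only the second LoRA would be exposed for multiplier changes.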



This is still just an idea!

Since we just got support for multiple LoRAs, we could include LoRA customization on the API side, by adding:

- a `/sdapi/v1/loras` endpoint
- a `lora` field at `/sdapi/v1/txt2img` and `/sdapi/v1/img2img`

I recently implemented support on my Python client script for the mainline sd-server implementation, so I have a reasonable idea about how complicated that would be. I'm also aware that the sd.cpp C API would have to be adapted to allow changing LoRA weights without reloading the models.
Do you think this would be worth implementing?