UPSTREAM PR #1156: fix: sanitize LoRA paths and enable dynamic loading#43

Open
loci-dev wants to merge 6 commits into `main` from `loci/pr-1156-master`

Conversation


@loci-dev loci-dev commented Feb 2, 2026

Note

Source pull request: leejet/stable-diffusion.cpp#1156


- Implement `sanitize_lora_path` in `SDGenerationParams` to prevent directory traversal attacks via LoRA tags in prompts.
- Restrict LoRA paths to be relative and strictly within the configured LoRA directory. No subdirectories are allowed; the drawback is that users cannot organize their LoRAs into subfolders.
- Update server example to pass `lora_model_dir` to `process_and_check`, enabling LoRA extraction from prompts.
- Force `LORA_APPLY_AT_RUNTIME` in the server to allow applying LoRAs dynamically per request without reloading the model and without accumulating weights across requests.
- Remove the restriction that LoRA models must be in the root of the LoRA directory, allowing them to be organized in subfolders.
- Refactor the directory containment check to use `std::mismatch` instead of `lexically_relative` to verify the path is inside the allowed root.
- Remove redundant `lexically_normal()` call when resolving file extensions.
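The containment check described above can be sketched with `std::filesystem` and `std::mismatch`: normalize the joined path, then verify that every component of the root appears, in order, as a prefix of it. This is a minimal illustration under assumed names and signatures, not the PR's exact code:

```cpp
#include <algorithm>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Illustrative sketch of a path sanitizer: returns the normalized in-tree
// path when `candidate` stays inside `lora_dir`, or an empty string when it
// is absolute or escapes the root (e.g. via "../"). Subdirectories inside
// the root are accepted, matching the later commit in this PR.
static std::string sanitize_lora_path(const std::string& lora_dir,
                                      const std::string& candidate) {
    fs::path name(candidate);
    if (name.is_absolute()) {
        return "";  // absolute paths are never accepted
    }
    fs::path root = fs::path(lora_dir).lexically_normal();
    fs::path full = (root / name).lexically_normal();
    // Containment check via std::mismatch: walk both component sequences in
    // lockstep; if any root component is unmatched, the path walked out.
    auto [root_it, full_it] = std::mismatch(root.begin(), root.end(),
                                            full.begin(), full.end());
    if (root_it != root.end()) {
        return "";  // e.g. "../secret.bin" normalized outside the root
    }
    return full.string();
}
```

Normalizing before comparing is what defeats traversal: `models/lora/../secret.bin` collapses to `models/secret.bin`, whose components no longer start with `models/lora`.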
@loci-dev force-pushed the `main` branch 22 times, most recently from `f99a420` to `a234621` on February 4, 2026 04:39

loci-review bot commented Feb 6, 2026

Overview

Analysis of 48,102 functions (100 modified, 10 new, 4 removed) across two binaries reveals minimal performance impact from security enhancements. Power consumption: build.bin.sd-server decreased 0.06% (512,975.76 nJ → 512,668.64 nJ), build.bin.sd-cli increased 0.1% (479,167.23 nJ → 479,645.75 nJ).

Function Analysis

extract_and_remove_lora (both binaries): Response time increased 21.8% (+49.5μs) due to the new sanitize_lora_path() security validation preventing path traversal attacks. Throughput time remained effectively constant (+0.6%), confirming the overhead sits in filesystem validation calls, not core logic. This is a justified security-correctness tradeoff for server deployments, with negligible impact since the function is called once per generation request during initialization.
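For context on what this per-request hook does, LoRA extraction scans the prompt for inline tags and strips them before the text encoder runs. The sketch below assumes a `<lora:name:weight>` tag syntax and an illustrative helper name; the real parser in stable-diffusion.cpp may differ:

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Illustrative sketch: pull <lora:name:weight> tags out of a prompt,
// returning (name, weight) pairs and removing the tags from the prompt
// in place. Each returned name would then be passed through the path
// sanitizer before any file is loaded. Weight defaults to 1.0 when the
// tag omits it, e.g. <lora:style>.
static std::vector<std::pair<std::string, float>>
extract_lora_tags(std::string& prompt) {
    std::vector<std::pair<std::string, float>> found;
    std::regex tag(R"(<lora:([^:>]+)(?::([0-9.]+))?>)");
    for (auto it = std::sregex_iterator(prompt.begin(), prompt.end(), tag);
         it != std::sregex_iterator(); ++it) {
        float weight = (*it)[2].matched ? std::stof((*it)[2].str()) : 1.0f;
        found.emplace_back((*it)[1].str(), weight);
    }
    // Strip the tags so the text encoder never sees them.
    prompt = std::regex_replace(prompt, tag, "");
    return found;
}
```

Because the scan runs once per request over a short string, its cost is dwarfed by the filesystem validation measured above, which matches the report's finding that the overhead is in validation calls rather than parsing.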

Standard library regressions (compiler/toolchain differences, no source changes): std::deque::back +243% throughput (+194ns), std::to_string(int) +121% throughput (+197ns), mz_zip_get_file_modified_time +130% throughput (+99ns), _S_max_size +206% throughput (+212ns). All occur in non-critical initialization or utility paths.

Standard library improvements: std::vector<ggml_tensor*> copy constructor -34% throughput (-74ns), ggml_e8m0_to_fp32_half -24% throughput (-35ns), std::to_string(long) -44% throughput (-133ns), benefiting tensor operations and memory management.

Other analyzed functions showed negligible changes in non-critical paths.

Additional Findings

Core ML inference pipeline (diffusion sampling, attention mechanisms, VAE operations) remains unaffected. The 5 commits focused on "sanitize LoRA paths and enable dynamic loading" successfully implement security hardening with <0.01% impact on end-to-end image generation time (5-30 seconds). Compiler optimizations offset security overhead, resulting in near-zero net power consumption change.

🔎 Full breakdown: Loci Inspector.
💬 Questions? Tag @loci-dev.

@loci-dev temporarily deployed to stable-diffusion-cpp-prod on February 22, 2026 04:18 via GitHub Actions

loci-review bot commented Feb 22, 2026

Overview

Analysis of 48,312 functions across two stable diffusion inference binaries reveals minimal overall performance impact despite 99 modified functions. Power consumption shows negligible changes: build.bin.sd-server decreased 0.062% (518,798.18 nJ → 518,475.41 nJ) and build.bin.sd-cli decreased 0.029% (483,665.30 nJ → 483,523.85 nJ). Ten new functions were added and four removed, with 48,199 functions unchanged.

Function Analysis

All significant performance changes occur in C++ standard library functions rather than application code, indicating compiler optimization differences between builds:

Regressions:

  • std::vector<ggml_backend_feature>::begin() (build.bin.sd-server): +180.81ns response time (+217%), +180.81ns throughput (+289%)
  • std::vector<ggml_backend_device*>::_S_max_size() (build.bin.sd-server): +212.41ns response time (+152%), +212.41ns throughput (+206%)
  • std::swap<nlohmann::json> (build.bin.sd-cli): +76.13ns response time (+76%), +76.13ns throughput (+106%)
  • std::less<void>::operator() variants (build.bin.sd-server): +44ns response time (+30%), +44ns throughput (+69%) each

Improvements:

  • std::vector<gguf_kv>::end() (build.bin.sd-server): -183.29ns response time (-69%), -183.29ns throughput (-75%)
  • std::vector<std::thread>::end() (build.bin.sd-cli): -183.29ns response time (-69%), -183.29ns throughput (-75%)
  • std::shared_ptr<T5CLIPEmbedder>::_M_destroy() (build.bin.sd-cli): -189.00ns response time (-38%), -188.71ns throughput (-64%)

No application source code changes were detected for any analyzed functions. Performance variations stem from GCC 13 standard library implementation differences or compiler optimization flag changes. The balanced improvements and regressions result in negligible net impact, confirmed by sub-0.1% power consumption changes.

Additional Findings

All analyzed functions affect initialization, memory management, and utility operations rather than compute-intensive inference paths. String comparison regressions impact model loading (tensor name lookups during GGUF parsing), but cumulative overhead remains under 2ms for typical models. Core ML operations (GGML tensor kernels, attention mechanisms, VAE processing) execute in backend-specific implementations not included in this analysis, explaining why standard library changes have minimal impact on overall performance.

🔎 Full breakdown: Loci Inspector.
💬 Questions? Tag @loci-dev.
