5 pipeline kinds · 16 GB VRAM · 1 worker thread · 1547 Python LOC · 8 MCP tools

flux owns the GPU. Generation jobs are serialized through one worker thread because 16 GB of VRAM has headroom for one FLUX pass at a time, and parallelism here would be a regression. The MCP surface that other projects on this machine call (flux_generate, flux_edit, flux_fill, flux_variation, flux_structural, flux_search_gallery) routes through MasterAgent on :8420 and into this loopback service. Every result lands in a SQLite gallery with prompt, model, params, and thumbnail.
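
As an illustration of the gallery described above, here is a minimal sketch of storing and searching results in SQLite. The table name `images`, its columns, and the helpers `open_gallery`, `record_result`, and `search_gallery` are assumptions made for this example; the actual schema in gallery.py may differ.

```python
import json
import sqlite3

# Hypothetical schema: names and columns are illustrative, not taken from gallery.py.
SCHEMA = """
CREATE TABLE IF NOT EXISTS images (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    prompt     TEXT NOT NULL,
    model      TEXT NOT NULL,
    params     TEXT NOT NULL,   -- JSON-encoded generation parameters
    thumbnail  TEXT NOT NULL,   -- path to the thumbnail file
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
)
"""


def open_gallery(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the gallery database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record_result(conn, prompt, model, params, thumbnail) -> int:
    """Insert one generation result and return its row id."""
    cur = conn.execute(
        "INSERT INTO images (prompt, model, params, thumbnail) VALUES (?, ?, ?, ?)",
        (prompt, model, json.dumps(params), thumbnail),
    )
    conn.commit()
    return cur.lastrowid


def search_gallery(conn, text: str) -> list:
    """Return rows whose prompt contains `text`, newest first."""
    rows = conn.execute(
        "SELECT id, prompt, model, params, thumbnail FROM images "
        "WHERE prompt LIKE ? ORDER BY id DESC",
        (f"%{text}%",),
    ).fetchall()
    return [
        {"id": r[0], "prompt": r[1], "model": r[2],
         "params": json.loads(r[3]), "thumbnail": r[4]}
        for r in rows
    ]
```

Storing params as JSON text keeps the sketch schema-free per pipeline kind; a real gallery might prefer dedicated columns for fields it needs to filter on.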

Tech scope

  • Five pipeline kinds: text-to-image, image-to-image (Kontext), inpainting (Fill), structural conditioning (Control), and image-conditioned variation (Redux). Each is one entry in MODEL_REGISTRY in worker/config.py.
  • config.ensure_dirs() runs before any HuggingFace import so HF_HOME is set correctly and models cache where they should. Any new module that imports transformers / diffusers at the top level must be loaded after import config.
  • Cancellation is cooperative. The API sets a per-job Event; the pipeline's per-step callback raises Cancelled() when it sees it. Adding a new pipeline kind requires wiring this callback; otherwise cancellation silently breaks.
  • Prompt composer (compose.py + prompts/<pack>/) reads pack files (pack.json, fixed_block.txt, variable_*.txt, subjects/*.txt), assembles FIXED + SUBJECT + VARIABLE + EXTRA (capped at 2000 chars), runs a forbidden-pattern audit (negation-aware), and either returns the prompt or queues a job via /compose/generate.
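
The cooperative cancellation described in the list above can be sketched as follows. This is a simplified illustration: `make_step_callback` and `run_steps` are hypothetical names, and the callback takes only a step index, whereas a real pipeline's per-step hook will pass more arguments.

```python
import threading


class Cancelled(Exception):
    """Raised inside the pipeline loop to unwind a cancelled job."""


def make_step_callback(cancel_event: threading.Event):
    """Build a per-step callback bound to one job's cancellation Event."""
    def on_step(step: int) -> None:
        # The callback only checks the flag; the API side is what sets it.
        if cancel_event.is_set():
            raise Cancelled()
    return on_step


def run_steps(num_steps: int, on_step) -> int:
    """Stand-in for a pipeline: calls the callback once per step (hypothetical)."""
    completed = 0
    for step in range(num_steps):
        on_step(step)  # raises Cancelled if the job was cancelled
        completed += 1
    return completed
```

The point of the sketch is the failure mode named above: a pipeline loop that never calls `on_step` would run to completion no matter how often the Event is set.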

Layout

worker/
├── main.py       # FastAPI entry, lifespan, /healthz, route registration
├── jobs.py       # single-thread job queue + cancellation
├── pipeline.py   # diffusers pipeline loaders, keyed by MODEL_REGISTRY[kind]
├── compose.py    # prompt assembly + audit (used by /compose/* routes)
├── gallery.py    # SQLite persistence
├── progress.py   # per-step progress callbacks
├── schemas.py    # Pydantic request models for every route
├── config.py     # paths, ports, MODEL_REGISTRY
└── deploy/       # register-service.ps1
prompts/<pack>/   # composer packs
data/             # runtime: gallery.sqlite, images/, uploads/, hf-cache/
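
The hf-cache/ directory under data/ is where the import-order rule for config.ensure_dirs() pays off. A minimal sketch of that idea, assuming ensure_dirs() creates the runtime directories and points HF_HOME at hf-cache/; the directory root and the function body here are illustrative, not the contents of config.py.

```python
import os
import tempfile
from pathlib import Path

# Hypothetical data root for the sketch; the real config.py defines its own paths.
DATA_DIR = Path(tempfile.gettempdir()) / "flux-sketch-data"


def ensure_dirs(data_dir: Path = DATA_DIR) -> Path:
    """Create runtime directories and set HF_HOME before any HuggingFace import."""
    for name in ("images", "uploads", "hf-cache"):
        (data_dir / name).mkdir(parents=True, exist_ok=True)
    hf_home = data_dir / "hf-cache"
    # HuggingFace libraries read HF_HOME when they are imported,
    # so this must run before transformers / diffusers are loaded.
    os.environ["HF_HOME"] = str(hf_home)
    return hf_home


# Intended import order in a module that needs the HuggingFace stack:
#   import config
#   config.ensure_dirs()
#   import diffusers   # only after HF_HOME is set
```

Because the environment variable is read at import time, setting it afterwards has no effect on where models are cached, which is why the ordering matters more than the function itself.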

Why local

The image work in every other project on this site — including meshgen’s featured images and Pinterest pins — runs through this worker. The headline cost difference (zero API tokens for the pixel data) is only part of it. The other part is that the prompt pipeline, the seeds, the audit rules, and the gallery all stay on the same disk as the projects that consume them.

Request flow: compose → schemas → queue → pipeline → callback → gallery → MCP return (as of 2026-04-26)
Real LOC across 8 modules (as of 2026-04-26): main.py · 350, compose.py · 283, pipeline.py · 254, gallery.py · 238, jobs.py · 213, progress.py · 78, schemas.py · 76, config.py
Pipeline kinds (as of 2026-04-26): text2img 100 · kontext 45 · fill 30 · control 20 · redux 15

Surface

The surface presented to other projects on this machine is the MCP layer (flux_generate, flux_edit, flux_fill, flux_variation, flux_structural, plus flux_search_gallery and the job-lifecycle pair flux_get_job / flux_cancel_job) routed through MasterAgent on :8420 into the loopback FastAPI on :8421. From the caller's side it is a request and a job id; from the worker's side it is a serialized queue of one.
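
The "request and a job id" on one side and "serialized queue of one" on the other can be sketched with a single worker thread draining a queue. The class `SingleWorkerQueue` and its `submit` / `wait` methods are hypothetical names for this illustration, not the interface of jobs.py.

```python
import queue
import threading
import uuid
from typing import Callable, Optional


class SingleWorkerQueue:
    """Runs submitted jobs strictly one at a time on a single daemon thread."""

    def __init__(self) -> None:
        self._jobs = queue.Queue()
        self._done = {}      # job id -> threading.Event, set when the job finishes
        self.results = {}    # job id -> return value, or the exception it raised
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, fn: Callable[[], object]) -> str:
        """Enqueue a job and return its id immediately (the caller's view)."""
        job_id = uuid.uuid4().hex
        self._done[job_id] = threading.Event()
        self._jobs.put((job_id, fn))
        return job_id

    def wait(self, job_id: str, timeout: Optional[float] = None) -> object:
        """Block until the job finishes, then return its result."""
        if not self._done[job_id].wait(timeout):
            raise TimeoutError(job_id)
        return self.results[job_id]

    def _run(self) -> None:
        # The only consumer: jobs never overlap because nothing else calls fn().
        while True:
            job_id, fn = self._jobs.get()
            try:
                self.results[job_id] = fn()
            except Exception as exc:  # keep the worker alive if a job fails
                self.results[job_id] = exc
            finally:
                self._done[job_id].set()
```

With one consumer thread, serialization falls out of the structure rather than from locking, which matches the framing of the single worker as a consequence of the VRAM budget.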

Numbers

Five pipeline kinds is a hard count: adding a sixth means a new MODEL_REGISTRY entry, a new schema, and a wired-up per-step cancellation callback, without which cancellation silently breaks. Sixteen GB of VRAM has headroom for one FLUX pass at a time; a second concurrent generation would degrade both runs. One worker thread is the consequence, not a workaround. The prompt composer caps the assembled prompt at 2000 chars and runs a negation-aware forbidden-pattern audit before queuing.
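
The composer's cap and audit can be sketched roughly as below. Two assumptions are baked in and may not match compose.py: the cap is applied by truncating the assembled string, and "negation-aware" is taken to mean that a forbidden term is not flagged when the word directly before it is a negator such as "no" or "without". The names `assemble`, `audit`, and `NEGATORS` are hypothetical.

```python
import re

MAX_PROMPT_CHARS = 2000  # cap stated in the text

# Assumption: a forbidden term directly preceded by one of these words is negated.
NEGATORS = ("no", "not", "without", "never")


def assemble(fixed: str, subject: str, variable: str, extra: str = "") -> str:
    """Join FIXED + SUBJECT + VARIABLE + EXTRA and truncate to the cap (assumed)."""
    parts = [p.strip() for p in (fixed, subject, variable, extra) if p.strip()]
    return " ".join(parts)[:MAX_PROMPT_CHARS]


def audit(prompt: str, forbidden: list) -> list:
    """Return forbidden terms that appear at least once without a negator before them."""
    hits = []
    for term in forbidden:
        # Capture the word immediately before the term, if there is one.
        pattern = re.compile(
            r"(?:\b(\w+)\s+)?\b" + re.escape(term) + r"\b", re.IGNORECASE
        )
        for match in pattern.finditer(prompt):
            previous = (match.group(1) or "").lower()
            if previous not in NEGATORS:
                hits.append(term)
                break  # one un-negated occurrence is enough to flag the term
    return hits
```

Under these assumptions, `audit("studio photo, no watermark", ["watermark"])` returns an empty list while `audit("a fox with a watermark", ["watermark"])` flags the term. Looking only one word back is deliberately crude; phrases like "not a watermark" would still be flagged.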
