March Was All About AI Image Generation
March disappeared in a flash. Looking back, nearly every waking hour went into AI image generation work of some kind.
This post isn’t a technical deep-dive. Instead, it covers the lessons and frustrations from producing images for Harugorou (春語廊) — an image posting project on Patreon — over the past month.
What Is Harugorou?
Harugorou is a Patreon-based project where AI-generated artwork is created and shared with patrons. The skill level behind it is still firmly beginner-tier, and the only way to improve is to keep generating, keep experimenting, and keep shipping. Nothing goes well from day one.
The Battle Against Model Exposure Bias
The defining struggle of this month was fighting against a particular model tendency: clothing removal and exposure bias.
No matter how carefully the prompt specified fully clothed characters, the output consistently drifted toward wardrobe malfunctions and exposed skin. The model seemed almost determined — as if it would rather produce a broken output than keep clothing intact.
For scenes later in a narrative, a degree of exposure might be acceptable. But for early story scenes or images intended for general posting, that simply won’t work.
What followed was an absurd tug-of-war: one side insisting on clothing integrity, the other side insisting on taking it all off. It was strangely human in its stubbornness — and thoroughly unproductive.
The Illusion of Prompt Control
The initial assumption was that this could be handled at inference time with the usual toolkit: stronger negative prompts, token weight adjustments, ControlNet conditioning.
The result? Near-total failure.
When a tendency is deeply embedded in a model’s training data, inference-time prompt manipulation alone has limited power to override it. Trying to bend an outcome that the model has essentially already decided on is, in hindsight, a losing proposition from the start.
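For illustration, the kind of prompt-side mitigation described above might look like the following. These are hypothetical prompts, not the actual ones used for Harugorou; the `(term:weight)` emphasis syntax is the convention used by A1111/Forge-style UIs.

```text
prompt: 1girl, office worker, (fully clothed:1.4), (buttoned blazer:1.2),
        standing, city street, daylight
negative_prompt: (nsfw:1.5), (exposed skin:1.4), torn clothes,
        wardrobe malfunction, undressing
```

Even with weights pushed this hard, a bias baked into the training data tends to win: the weights shift probabilities at the margin, but they cannot delete what the model fundamentally wants to draw.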
Choosing Not to Fight
The conclusion was simple: the key is not fighting the model at all.
The plan was to teach the model a lesson. Instead, the model taught the lesson.
The practical solution is to separate clothing-focused generation from general-purpose generation. For scenes requiring accurate clothing, a model with strong garment reproduction is used first, and the output is then passed to a second model for refinement. This is essentially a two-stage pipeline with model separation by purpose.
It adds an extra step. But spending that time is far more efficient than endlessly pulling a slot machine with a confirmed 0% SSR drop rate — generating unwanted exposed images over and over with no hope of a different outcome.
For those familiar with workflows in Stable Diffusion ecosystems (Forge, ComfyUI, etc.), this is conceptually similar to an img2img handoff between specialized models. The first model handles composition and clothing accuracy; the second handles style and final rendering.
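As a conceptual sketch of that handoff (no real models involved; `clothing-model`, `style-model`, and the `denoise` value are stand-ins for whatever checkpoints and img2img strength one actually uses), the control flow looks roughly like:

```python
from dataclasses import dataclass, field

@dataclass
class Image:
    # Toy stand-in for a generated image: just a log of processing steps.
    steps: list = field(default_factory=list)

def generate_clothing_first(prompt: str) -> Image:
    # Stage 1: a checkpoint chosen for garment accuracy fixes
    # composition and clothing before style is considered.
    img = Image()
    img.steps.append(f"txt2img[clothing-model]: {prompt}")
    return img

def refine_style(img: Image, denoise: float = 0.45) -> Image:
    # Stage 2: an img2img pass with the general-purpose model.
    # A low denoise value preserves the clothing from stage 1;
    # a high value would let the exposure bias creep back in.
    img.steps.append(f"img2img[style-model] denoise={denoise}")
    return img

def two_stage(prompt: str) -> Image:
    return refine_style(generate_clothing_first(prompt))

result = two_stage("1girl, fully clothed, winter coat, snowy street")
for step in result.steps:
    print(step)
```

The essential knob is the denoising strength in the second pass: too high and the style model re-asserts its own tendencies over the stage-1 output, too low and its rendering barely applies.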
What’s Next
Improvement is ongoing. The goal for Harugorou is to produce work with a darker, more atmospheric quality — something that peers into the abyss — and share it with patrons.
AI image generation is genuinely enjoyable, but it is also full of moments where things refuse to go as planned. Ironically, those moments of resistance are exactly where workflow insights hide. Choosing not to fight is itself a valid strategy.