Persistent Characters in Stable Diffusion
You can create an unlimited number of characters for fun, social media, games, or visual novels, in any art style.
There are several popular ways to go about this: training your own AI model (which requires the most time), writing a hyper-specific prompt with many parameters (like copying a prompt from the Internet), or taking a hybrid approach: using low weights on models that already exist and worrying about things like seed values last. This lesson teaches the hybrid way.
In this first example, we will create a fictional AI Persona, a character that doesn’t exist. If you’re interested in creating a specific celebrity, you’ll quickly learn how to do that, too.
Step 1: Describe the person
One of the first things you should do is invent their full name. It may feel a little silly at first, but repeating the name later helps the AI understand who and what you’re talking about. A name won’t be enough on its own, so support it with other very specific attributes like age, ancestry, and the most recognizable physical features, such as hair color and freckles. Then, towards the end, describe what they are doing, what they are wearing, and where they are, in that order.
WHO, WHAT, WHERE
A portrait of a woman named Louise Betsy Jean, a 24 year old Texan American, bright orange curly hair, freckles, wearing jeans and a white shirt, is washing her red Camaro on her driveway in front of her house
This is a good start, but on its own it won’t always work the way we want. We’ll need a few more elements.
Step 2: Find a stunt double
With over 3000 models in our system at the time of this tutorial, there’s surely someone that roughly resembles the persona you’re trying to create.
In the fat cat and logo creation lessons that are linked above, we used models to support the visual style of the image. The same concept applies, but this time we’ll use weights to control how much they influence our image.
Pop up the models browser and click the AI Personas tag, and search for “doll”. An assortment of personas will appear, and you’ll notice that these are either LoRA or Textual Inversion files. These are mini models that we will use to guide the general likeness of the person.
Right off the bat, “highlands doll” looks the most like the person I’m making up, so click that model and add it.
Step 3: Start the LoRA weight at around 0.25
You can either type it out or use the little knob slider in the WebUI. Using a low weight on a character LoRA is like giving them DNA: a common ground that makes similarities more likely across images. Starting low is a good idea; too high a weight can conflict with the poses and effects we’ll want to add later.
You can put the LoRA of your choice anywhere in your prompt. Most people like to add it at the end, so it’s easy to find and edit later.
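Putting Steps 1 through 3 together, the working prompt with the low-weight LoRA appended might look like this (a sketch; the `<highlands-doll:0.25>` tag follows the model syntax used in the copied prompt later in this lesson):

```
A portrait of a woman named Louise Betsy Jean, a 24 year old Texan American, bright orange curly hair, freckles, wearing jeans and a white shirt, is washing her red Camaro on her driveway in front of her house <highlands-doll:0.25>
```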
Step 4: Choose a base art style and negative
HighLands Doll is a LoRA that is part of the Stable Diffusion 1.5 family, so I’ll need to choose a base and negative that’s also from this family. For that reason, I cannot use SDXL in this example.
I’m going for a realistic photo, so I’ll choose <realvis51> as my base art style; it’s a fantastic model, found under Full Models. It tends to have a cinematic look, so if you’re looking for something more natural, try full base models like Photon, Lazy, or Photogasm.
Your setup should look something like this
Optional but recommended:
Add #boost to your prompt.
Why: Adding negative embeddings to your prompt is the easiest way to boost quality. You can do this the easy way, by memorizing a few handy recipes that include them (like #boost), or do it like a pro and pick out the ones you like most. You can add as many negatives as you like to your prompt, but dial it back if you’re getting artifacts.
[[<verybad-negative:-2>]] – a powerful, general purpose choice
[[<fast-negative:-2>]] – similar to the above, and seems to complement it
[[<negative-hands:-1.5>]] – it’s not perfect, but it seems to help hands
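Assembled, a full hybrid prompt with the base style, the low-weight character LoRA, and a couple of negative embeddings might look something like this (a sketch; the bracket and tag syntax matches the copied prompt shown later in this lesson):

```
A portrait of a woman named Louise Betsy Jean, a 24 year old Texan American, bright orange curly hair, freckles, wearing jeans and a white shirt, is washing her red Camaro on her driveway in front of her house <realvis51> <highlands-doll:0.25> [[<verybad-negative:-2>]] [[<fast-negative:-2>]]
```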
Remember to upscale your results.
Step 5: Render it
Pretty spot-on! And look how realistic it is, without a complicated prompt. We got almost* exactly what we asked for.
*Blooper: Notice that the AI ignored my request for a “red” Camaro and gave us a white one instead. This tells us two things: (1) words (aka tokens) towards the end of a prompt have a higher risk of getting lost, and (2) there is very likely more training data on white cars in the models we selected for this kind of scene, which is known as bias. Garbage in, garbage out: the AI is only as smart as its training. If this were an important point, we could correct it by making ((red Camaro)) a more heavily weighted positive prompt. It’s not the focus of this lesson, so let’s continue.
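For example, if the red car mattered, one fix would be wrapping those tokens in double parentheses to weight them more heavily (a sketch; the `((…))` emphasis syntax follows the convention used above):

```
... is washing her ((red Camaro)) on her driveway in front of her house
```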
Step 6. Click the Prompt button
We didn’t assign every possible Stable Diffusion parameter in our prompt, so a few were automatically selected for us. Among them is something called a seed value. A seed is not a specific photo or person, but a way to repeat a photo. If you know the seed, guidance, sampler, and models used in an image, you can repeat the results over and over, and that’s a sure-fire way to move a character to another scene.
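The idea behind seeds can be sketched in a few lines of ordinary Python: a seed doesn’t store an image, it just makes the random starting noise repeatable. The `noise_sample` function below is a hypothetical stand-in for the diffusion noise generator, not real Stable Diffusion code.

```python
import random

def noise_sample(seed, n=4):
    """Hypothetical stand-in for diffusion starting noise:
    the same seed always yields the same numbers."""
    rng = random.Random(seed)  # seeded generator, fully deterministic
    return [round(rng.random(), 4) for _ in range(n)]

a = noise_sample(346034)  # the seed from this lesson's copied prompt
b = noise_sample(346034)  # same seed again
c = noise_sample(12345)   # a different seed

print(a == b)  # True: same seed, same starting noise
print(a == c)  # False: different seed, different noise
```

Same seed in, same numbers out; that’s why copying the seed along with the sampler, guidance, and models lets you replay an image.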
Step 7. Click Copy Prompt
Hit the copy button as shown above, bottom left
Starting a new image, same character
Step 1. Paste that old hybrid prompt
The prompt has the codes for guidance, seed, sampler, models, and negatives all in one line. They can be pasted into the positive prompts box and that syntax will run exactly as it did the first time.
We don’t have to select the models and negatives again.
Step 2. Change the situation and render!
Your new image should follow the format of the copied prompt.
We’re going on a fancy date, she’s wearing a great dress, and we’re in Switzerland. Why not. Paste the entire prompt but change the details slightly, like this:
/seed:346034 /recipe:boost [[<verybad-negative:-1.5>]] /size:768x768 /sampler:dpm2m ((((/recipe:boost /steps:less /guidance:8 A portrait of a woman named Louise Betsy Jean, a 24 year old Texan American, bright orange curly hair, freckles, wearing a red dress, at a fancy restaurant in Switzerland <realvis51> <highlands-doll:0.24> )))) high quality, best quality, in focus
Slide the LoRA weight up to 0.5 for a stronger likeness. There’s no substitute for training your own character model, but this hybrid approach is more reliable than prompting alone and repeating seed values, although that works, too.
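Concretely, that just means editing the LoRA tag in the pasted prompt and leaving everything else untouched, for example:

```
/seed:346034 /recipe:boost [[<verybad-negative:-1.5>]] /size:768x768 /sampler:dpm2m ((((/recipe:boost /steps:less /guidance:8 A portrait of a woman named Louise Betsy Jean, a 24 year old Texan American, bright orange curly hair, freckles, wearing a red dress, at a fancy restaurant in Switzerland <realvis51> <highlands-doll:0.5> )))) high quality, best quality, in focus
```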
The negative prompts and negative inversions can impact the look of the character. Try not to change those; lock them in.
When taking a hybrid approach to making an AI persona, describing things like cheekbones, face shape (round, heart-shaped), kind of nose, and so on can help.
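For example, a few descriptors woven into the character’s anchor phrase (these particular features are illustrative, not part of the tutorial’s persona):

```
A portrait of a woman named Louise Betsy Jean, a 24 year old Texan American, heart-shaped face, high cheekbones, small upturned nose, bright orange curly hair, freckles
```

followed by the usual wardrobe and scene description.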
Add more LoRAs: check the models system for clothing, hair, poses, and more. Play with the weights; keep them low at first and increase them if the results are not sharp. It requires a little patience, but you can get exactly what you want every time. But remember: three LoRAs is a lot, so don’t overdo it.
If we change a ton of details here, especially the word order, the hybrid technique doesn’t work as well. Those sorts of bold moves are for people who have trained their own model or are comfortable sliding the LoRA weight up higher.
The LoRA will look completely different on every model, and so will your prompt. Remember that models are trained on photos and descriptions by different people (the model creators), so it’s best to stick to a few models that you really like and get to know their properties well before moving on to others.
Model (also known as a checkpoint, LoRA, inversion, or concept) is a blanket term for an AI model that can produce a certain effect. Learn more about models in Lesson 2.
Weight has a few definitions, but it most commonly means how strongly a model should impact the image. In our system, the maximum weight of a model is 2, while the total absence of a model is -2. The default weight is 0.7. Increasing or reducing the weight changes how much influence the model has. When using multiple models, lowering each weight reduces the chance of conflicts. Lowering weights is the best way to troubleshoot models and reduce artifacts.
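On that scale, a few examples of how the tag syntax maps to influence (a sketch, assuming an unweighted tag falls back to the stated 0.7 default):

```
<realvis51>                 no weight given, uses the default (0.7)
<highlands-doll:0.25>       low influence, a common ground for a character
<highlands-doll:2>          maximum influence
[[<verybad-negative:-2>]]   negative weight, pushes the concept out of the image
```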
Textual Inversions, aka Embeddings – focused, small models that can be used together with other models, and whose weight (amount of influence) can be controlled. They have fewer “layers,” so they are most commonly used for simple reinforcements. These files are tiny, so adding over ten of them is usually not a problem, as long as they are complementary and not creating a push-and-pull effect. We jokingly have a recipe called #everythingbad that loads about 20 negative inversions, and it still results in nice images! Try that in your next prompt.