Popular AI image generation service Midjourney has introduced one of its most frequently requested features: the ability to consistently recreate the same character across new images.
This, by its very nature, has been a major hurdle for previous AI image generators.
That’s because most AI image generators rely on “diffusion models,” tools similar to, or based on, Stability AI’s open source Stable Diffusion image generation algorithm. As we’ve learned from similar tools, these work, roughly, by taking text a user types in and trying to piece together, pixel by pixel, an image matching that description, based on the pairings of images and text tags in a large (and controversial) training dataset of millions of human-created images.
Why consistent characters are so powerful and elusive for AI-generated images
But, as with text-based large language models (LLMs) such as OpenAI’s ChatGPT and Cohere’s new Command-R, the problem with all generative AI applications is their inconsistency of responses: the AI generates something new for every prompt entered into it, even when the same prompt is repeated or some of the same keywords are used.
This is great for generating entirely new content (in Midjourney’s case, images). But what if you’re creating a storyboard for a film, a novel, a graphic novel, a comic book, or some other visual medium, and you want the same character or characters to move through it, appearing in different scenes and settings with different facial expressions and props?
This exact scenario, typically necessary for narrative continuity, has so far been extremely difficult to achieve with generative AI. But Midjourney is now taking a stab at it, introducing a new tag, “--cref” (short for “character reference”), that users can add to the end of their text prompts in the Midjourney Discord. It will attempt to match the character’s facial features, body type, and even clothing from a URL the user pastes in after the tag.
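At its simplest, then, a prompt using the new tag takes the following shape (the prompt text here is purely illustrative, and “[URL]” is a placeholder for a real image link):

a knight riding through a misty forest --cref [URL]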
As this functionality matures and is refined, it could take Midjourney from being a cool toy and source of ideas to a more professional tool.
How to use the new Midjourney Consistent Character feature
The new tag works best on previously generated Midjourney images. So a user’s workflow would be to first generate or retrieve the URL of a previously generated character.
Suppose you’re starting from scratch and generating a new character with the prompt: “muscular bald man with a beard and an eyepatch.”
Upscale the image you like best, then control-click it in the Midjourney Discord server to find the “Copy Link” option.
Then, type a new prompt, such as “wearing a white tuxedo, standing in a villa --cref [URL],” pasting in the URL of the character image you just copied, and Midjourney will attempt to generate that same character as before in the newly specified setting.
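Put together, the whole workflow amounts to two prompts, roughly like the following (with “[URL]” standing in for the Discord image link copied in the previous step):

Prompt 1: muscular bald man with a beard and an eyepatch
Prompt 2: wearing a white tuxedo, standing in a villa --cref [URL]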
As you can see, the results are far from an exact match of the original character (or even the original prompt), but they’re definitely encouraging.
Additionally, users can control how closely the new image reproduces the original character by adding the tag “--cw” followed by a number from 1 to 100 at the end of the new prompt, after the “--cref [URL]” string, like so: “--cref [URL] --cw 100.” The lower the “cw” number, the more variance the resulting image will have. The higher the “cw” number, the more closely the resulting new image will follow the original reference.
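For example, here is roughly how the same prompt would look at the two ends of the weight scale (again, “[URL]” stands in for your character image link):

wearing a white tuxedo, standing in a villa --cref [URL] --cw 100
wearing a white tuxedo, standing in a villa --cref [URL] --cw 8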
As you can see in this example, entering a very low “--cw 8” actually returned what we asked for: the white tuxedo. However, the character’s signature eyepatch has now been removed.
Well, there’s nothing a little “vary region” can’t solve.
OK, the eyepatch is on the wrong eye…but we’re getting there!
You can also combine multiple characters into one by using two “--cref” tags side by side, each with its own URL.
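For instance, a hypothetical prompt blending two previously generated characters might look like this (“[URL1]” and “[URL2]” are placeholders for the two character image links):

two friends sharing a meal at a cafe --cref [URL1] [URL2]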
The feature just rolled out earlier this evening, but it’s already being put through its paces by artists and creators. If you have Midjourney, give it a try. And read founder David Holz’s full note about it below.
Hi @everyone, today we’re testing a new “Character Reference” feature. It’s similar to the “Style Reference” feature, except instead of matching a reference style, it tries to make the character match the “Character Reference” image.
How to use
- Type “--cref URL” after your prompt, with a URL to an image of the character
- You can use “--cw” to modify the reference strength from 100 to 0
- Strength 100 (“--cw 100”) is the default and uses the face, hair, and clothing
- At strength 0 (“--cw 0”), it focuses only on the face (good for changing clothes, hair, etc.)
What it means
- This feature works best when using characters made from Midjourney images. It’s not designed for real people or photos (which, like regular image prompts, may get distorted)
- Cref works like a regular image prompt, except it “focuses” on the character’s traits
- The precision of this technique is limited; it won’t copy exact dimples, freckles, or T-shirt logos
- Cref works with both Niji and normal MJ models, and can also be combined with “--sref”
Advanced features
- You can use more than one URL to blend the information and characters from multiple images, like this: “--cref URL1 URL2” (this is similar to multiple-image or style prompts)
How does it work on the web alpha?
- When you drag or paste an image into the imagine bar, three icons now appear. Selecting one of these sets whether the image is used as an image prompt, a style reference, or a character reference. To use an image for multiple categories, hold down the Shift key while selecting the options
Please note that this and other features may change abruptly while MJ V6 is in alpha, but an official V6 beta is coming soon. We welcome everyone’s thoughts on ideas and features. We hope you enjoy this early release and that it helps you as you build your stories and worlds.