Dall-E mini: Creator explains blurred faces, going viral and the future of the project (2024)

AI image generators are having their moment right now. Thanks to OpenAI and their creation known as Dall-E 2, people across the internet have been able to make their own detailed images just from worded prompts.

But soon after OpenAI’s creation, Google released a direct competitor, using OpenAI’s open-source code to help create Imagen – an equally impressive AI image generator, again capable of making images from simple phrases.

However, while both of these inventions were revolutionary in the AI world, they were only available to a select few, offering waitlists as they slowly gave access to new users.

Shortly after, the internet exploded with people making their own Dall-E images, albeit at a much lower level of quality. It wasn’t because OpenAI suddenly opened up access but instead because someone had made their own version of the software based heavily on the original, known as Dall-E mini.

We spoke to the creator of Dall-E mini about how it came to be, its viral potential and the future of the project.

What is Dall-E mini and how did it come to be?

Dall-E mini is yet another AI image generator taking the internet by storm. However, where it differs is that it is completely free for everyone to use. Despite the near-identical name, it has nothing to do with OpenAI, other than making use of the large amount of publicly-available information OpenAI has provided on their model.

Instead, this project was created by a software engineer known as Boris Dayma. “When I heard about it [Dall-E], I thought that was so cool and that I want to build something like that. So I read their paper on the model, but I would never understand it, it was so complicated,” says Dayma.

It wasn’t until July 2021 that Boris had the chance to try and recreate this project when he signed up for a competition run by Google and Hugging Face, an AI community. He was paired up with a team and given support on his project, where they all decided to try and create an AI image generator like Dall-E.

“By the end of the month, we had something kind of cool. It was not at the level it is now, but it could produce simple prompts like beach at night or day. We won the competition and I continued to work on the product, making improvements since then.”

The model didn’t take off at first, attracting just a small audience, but around two months ago the internet picked it up, embracing it for its ability to produce viral images.

One key difference with Dall-E mini is that it is not filtered at all, due to the smaller team and free-to-use nature. Unlike Google's Imagen and OpenAI's Dall-E 2, which have safety protocols, it will accept any prompt. People are therefore able to use Dall-E mini for everything from cartoons performing a Ted Talk and celebrities playing Quidditch, to racist imagery, extreme violence or depictions of real-world traumatic situations.

Going viral

With this free service going viral online, there were suddenly a lot more people than just Boris using the platform. His main takeaway was the creativity of its newfound users.

“I would write something like a view of a lake under the moonlight, or Eiffel Tower on the moon and these were my most complex prompts. But when I see what people use it for, I’m amazed. I don’t have that level of creativity and they learn how to tweak the model to create really specific prompts that I could never come up with,” says Boris.

He has even taken to scrolling through Twitter when he needs to relax, checking out what people can create. He has a particular fondness for the use of the term ‘trail cam’, creating grainy images that look like they have come from a low-res camera at night.

Blurred faces and creative inputs

Despite the model’s popularity, it isn’t without its limits. Dall-E mini clearly struggles to match the image quality of OpenAI’s original model or Google’s more recent Imagen.

While any term, no matter how niche, will likely produce a result that matches, you could find yourself squinting to see the resemblance. Celebrities and cartoon characters often come out as blobs that vaguely resemble the original and, weirder still, the model really can’t do faces.

“The image is encoded into a very short sequence of numbers so that the model can learn faster. Because of this, the model makes a lot of mistakes. However, when you draw the Moon, a landscape or a tree, you don’t really notice the issues there.

“When it is on a face, we pay a lot more attention. If the eyes are out of order or the nose is misshaped, it is weird. It is the same on animals and cartoon characters, it’s just something we pay more attention to than misshaped objects. Really, the model is equally good or bad at everything.”
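The idea of encoding an image into a short sequence of numbers can be illustrated with a toy sketch. The article does not describe the model's actual encoder, so everything here is an assumption made for illustration: the four-entry codebook, the 2x2 patch size and the function names are all hypothetical. Each patch of pixels is replaced by the index of its nearest codebook entry, so 16 pixels shrink to 4 numbers, and decoding those numbers gives back only an approximation of the original.

```python
# Toy illustration only: a made-up codebook of 2x2 grayscale patches.
# This is NOT Dall-E mini's encoder, just the general "image -> short
# sequence of numbers" idea under simplified assumptions.
CODEBOOK = [
    (0, 0, 0, 0),          # 0: dark patch
    (255, 255, 255, 255),  # 1: bright patch
    (0, 255, 0, 255),      # 2: dark left column, bright right column
    (0, 0, 255, 255),      # 3: dark top row, bright bottom row
]


def patches(image, size=2):
    """Yield size x size patches of a 2D list, row by row, as flat tuples."""
    for r in range(0, len(image), size):
        for c in range(0, len(image[0]), size):
            yield tuple(
                image[r + dr][c + dc] for dr in range(size) for dc in range(size)
            )


def nearest_code(patch):
    """Index of the codebook entry with the smallest squared distance."""
    return min(
        range(len(CODEBOOK)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(patch, CODEBOOK[i])),
    )


def encode(image):
    """Replace every patch with the index of its nearest codebook entry."""
    return [nearest_code(p) for p in patches(image)]


def decode(tokens, width, size=2):
    """Rebuild an approximate image from the token sequence."""
    per_row = width // size
    rows = []
    for start in range(0, len(tokens), per_row):
        block = tokens[start:start + per_row]
        for dr in range(size):
            row = []
            for t in block:
                row.extend(CODEBOOK[t][dr * size:(dr + 1) * size])
            rows.append(row)
    return rows


IMAGE = [
    [10, 5, 250, 240],
    [0, 20, 255, 245],
    [5, 250, 10, 0],
    [0, 240, 200, 255],
]

TOKENS = encode(IMAGE)           # 16 pixel values become 4 numbers
RECON = decode(TOKENS, width=4)  # close to IMAGE, but fine detail is lost
print(TOKENS)
print(RECON)
```

In this sketch the pixel with value 200 comes back as 255 after decoding. Small errors of that kind are easy to overlook in a landscape and much easier to spot in a face, which is the point made in the quote above.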

This doesn’t mean that the model is incapable of making faces, it simply requires a lot of work on the user’s part. Some have found ways to force the model to create a face by writing long and detailed prompts, listing the size and location of each part of the face.

Dealing with the huge numbers and the future of Dall-E mini

While the free nature of Dall-E mini is what makes it stand out, that openness has its drawbacks. Unlike OpenAI’s queue system, which offers access to a few thousand people at a time, Dall-E mini was instantly available to everyone.

“The number of people using it is crazy right now. As it became viral, I made small changes to make it more efficient and then I could handle more traffic, but then the traffic would increase again, and I could never keep up.

“I’m looking to scale it up with more servers and be able to adapt. Little by little we’re able to support more traffic and hopefully in the future, traffic won’t be an issue.”

However, with more scale and growth, Boris is now asking the same question that both OpenAI and Google will be asking – whether this can keep going without any financial aid or monetisation.

“I think monetisation is important. I want to be able to make it scalable so everyone can use it now, and it is very important to me to make it free for everyone to use. My goal is for this to be a self-sustainable project that everyone can use for free.”

Read more:

  • We badly described cartoon characters to an AI. Here’s what it drew
  • Artificial intelligence quietly relies on workers earning $2 per hour
  • Artificial intelligence could help to predict the next virus to jump from animals to humans

FAQs

How to get better DALL-E results?

Dall-E prompt tips for effective results:
  1. Be precise in your description.
  2. Use specific keywords.
  3. Add movement.
  4. Describe the ambiance and atmosphere.
  5. Specify a style or theme.
  6. Use examples or references.
  7. Play with light and time of day.
  8. Consider perspective and composition.

How to get free DALL-E credits?

  1. Sign up for an account with DALL-E 2 using your email.
  2. After you sign up, you will earn 50 free credits in the first month and 15 free credits in the second month. Each credit allows you to create four photos.
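As a rough worked example of what those credits amount to, the short sketch below multiplies the figures quoted in the answer above (50 credits, then 15 credits, four images per credit). The function and constant names are hypothetical, and the credit amounts are taken from this answer rather than checked against current OpenAI terms.

```python
# Figures come from the answer above; names are hypothetical.
IMAGES_PER_CREDIT = 4
FREE_CREDITS_BY_MONTH = {1: 50, 2: 15}


def free_images(months):
    """Total images obtainable from free credits over the given months."""
    credits = sum(FREE_CREDITS_BY_MONTH.get(m, 0) for m in months)
    return credits * IMAGES_PER_CREDIT


print(free_images([1]))     # 200 images in the first month
print(free_images([1, 2]))  # 260 images across the first two months
```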

Why is DALL·E 2 bad at faces?

The reason it can't do faces well is very likely the filters applied to try to stop people making pictures of real people. This is probably also the explanation for the random misses where it paints pictures of something that's not a llama.

Why does AI generate weird faces?

Unlike humans who get inspiration from reality, AI models derive what they make only from what they've been trained on. They don't really “know” anything; the best they can do is spot and reproduce trends from data. They can't make sense of what they're given, which is why some AI images are so weird.

Is DALL-E better than Midjourney?

User reviews indicate that DALL-E is better for photorealistic images, especially with the improvements that have been made through DALL-E 3. Midjourney tends to be better for illustrations, surrealism, and digital art queries.

What are the disadvantages of DALL-E?

Challenges and limitations of DALL-E's capabilities:
  • Difficulty in generating highly detailed images.
  • Inconsistency in image generation based on slight textual variations.
  • Inability to ask for clarification when given ambiguous input.

What is the algorithm behind DALL-E?

DALL-E is a neural network built on a transformer model. Transformers handle input data flexibly enough to run a variety of generative tasks. DALL-E is one such application, transforming text into an image as the user requests.

Why does DALL-E misspell words?

At its core, DALL-E is optimized for visual creativity, not textual accuracy. Generating text within images is a complex task for the model.

How many words can DALL-E take?

The prompt is a text description of the desired image(s), and its limit is counted in characters rather than words. The maximum length is 1000 characters for dall-e-2 and 4000 characters for dall-e-3. You can find more in the docs for the image endpoint or the new DALLE3 cookbook page…
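A prompt can be checked against those limits before it is sent anywhere. The sketch below is a minimal local check that assumes only the two character limits quoted above. It does not call any OpenAI API, and the helper names `check_prompt` and `truncate_prompt` are hypothetical.

```python
# Character limits as quoted in the answer above.
MAX_PROMPT_CHARS = {"dall-e-2": 1000, "dall-e-3": 4000}


def check_prompt(model, prompt):
    """Return (fits, characters_over) for a prompt under the quoted limit."""
    limit = MAX_PROMPT_CHARS[model]
    over = max(0, len(prompt) - limit)
    return over == 0, over


def truncate_prompt(model, prompt):
    """Cut a prompt down to the quoted limit for the given model."""
    return prompt[:MAX_PROMPT_CHARS[model]]


prompt = "Eiffel Tower on the moon, trail cam footage at night"
print(check_prompt("dall-e-2", prompt))
print(check_prompt("dall-e-2", "a" * 1001))
```

Truncating cuts the prompt wherever the limit falls, possibly mid-word, so shortening the description by hand is usually the better choice.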

Is Dall E free of rights?

Subject to the Content Policy and Terms, you own the images you create with DALL·E, including the right to reprint, sell, and merchandise — regardless of whether an image was generated through a free or paid credit.

Is Dall E completely free?

With DALL-E 3, the possibilities are endless, and it's completely free to access. So, just sign up with Bing Image Creator and get access to DALL-E 3. Once you do, you'll be amazed at what this AI image generator can create.

Is Dall E illegal?

AI does not have the status of “natural person” required to hold copyright, so the images and other content it generates are instantly public domain. OpenAI gives you the right to use and “own” the generations produced from your input.

Does DALL-E distort faces?

Like your real dreams, these digital dreams have unreadable text or blank or distorted faces in many cases, among many other signs that all of this is the end product of the simulated brains we've successfully tapped into here in our corner of the matrix. It's because it uses a VQGAN.

How do you make DALL-E images look better?

Let's look at some tips to create perfect DALL-E 2 prompts for better image generation.
  1. Get specific about the style.
  2. Talk about every little element in your image.
  3. Don't forget to talk about colors.
  4. Talk about emotions.
  5. Describe the composition of the image.
  6. Avoid complicated prompts.

Why are the faces in my photos blurry?

Your shutter speed is too slow. For anyone who handholds their camera, a too-slow shutter speed is the number one culprit of blurry photos. The slower your shutter speed, the more likely it is that vibrations in your camera – generally caused by tiny movements in your hands and arms – will create blur.

Can DALL-E generate faces?

The AI art generator DALL-E 2 has become something of a household name in a short space of time, and it continues to roll out game-changing features. It's already stunned us with its ability to create photorealistic images of people who don't exist, and now users can do the same with real people.
