I tested the first major AI image generator, Dall-E, when it originally launched. Since then, I’ve watched as the world of generative AI has exploded, but one feature has always bugged me: text in images.
But as faces looked clearer, and hands went down to the correct number of fingers, every model still seemed to really struggle with creating text. Whether that was on a poster, a sign or even a T-shirt, it often looked like a giant smudge of hieroglyphics.
In recent updates, the problem has started to fade. ChatGPT could reliably recreate text, but only to a certain extent, quickly having a meltdown if your request became too specific.
Then in stepped Gemini 3 or, more specifically, Nano Banana Pro, Google’s big AI image upgrade. The update improved a lot of key areas for the tool, but text recreation was by far the biggest in my eyes.
What’s new with Nano Banana Pro
Nano Banana Pro has improved image quality and added the ability to see if an image is AI-generated. It can also now edit multiple reference images into one coherent final product. What’s more, you can now translate text in images into another language and also create complicated text-based images.
As our How To editor, Kaycee Hill, pointed out, this has made it an incredible tool for creating infographics. With a simple prompt, Gemini 3 can pump out a complicated infographic, including clear text and accompanying images to explain it.
On top of that, the AI model now has a better understanding of fonts, text colors and sizes. This allows for far more creativity than before, letting you customize your infographics, labels and magazine covers.
In one example from Gemini, an image of an astronaut is turned into a storyboard sketch, complete with legible written text and the reference image turned into a drawing.
Elsewhere, Gemini creates an energy drink brand, writing the text on the can in English. Then, using the prompt: “translate all the English text on the three yellow and blue cans into Korean, while keeping everything else the same,” the cans transformed, keeping the text in the same place, now simply translated.
Is text in an image really that exciting?
Over the years, AI has faced challenges that have been clear identifiers of its weaknesses. For a while, AI image generators couldn’t make hands, but now they can. AI video generators couldn’t recreate the intricate nature of gymnastics, but they are improving rapidly. Now, AI image generators are finally getting text right.
This opens up a huge avenue for these kinds of tools that just weren’t reliable before. Translating text in images, creating detailed infographics and reliably remaking different fonts are incredibly useful across a number of sectors.
As this kind of technology improves, it could be used to create entire storyboards, magazines or posters completely from scratch.
Not only that, but this is an area where Gemini has a sizable advantage, leaving the likes of ChatGPT behind.
