With DeepFloyd, generative AI art gets a text upgrade

by Kyle Wiggers

Generative AI has reached impressive fidelity these days, as viral memes like Balenciaga Pope suggest. The latest systems can conjure up scenescapes from city skylines to cafes, creating images that appear startlingly realistic, at least at first glance.

But one of the longstanding weaknesses of text-to-image AI models is, ironically, text. Even the best models struggle to generate images with legible logos, much less text, calligraphy or fonts.

But that might change.

Last week, DeepFloyd, a research group backed by Stability AI, unveiled DeepFloyd IF, a text-to-image model that can "smartly" integrate text into images. Trained on a data set of more than a billion images and text, DeepFloyd IF, which requires a GPU with at least 16GB of RAM to run, can create an image from a prompt like "a teddy bear wearing a shirt that reads 'Deep Floyd'" — optionally in a range of styles.

Techcrunch blog!

Friday, May 5, 2023
