In case you’re wondering, the picture above is “an intricate drawing of eternity.” But it’s not the work of a human artist; it’s the creation of BigSleep, the latest amazing example of generative artificial intelligence (A.I.) in action.
A bit like a visual version of text-generating A.I. model GPT-3, BigSleep is capable of taking any text prompt and visualizing an image to fit the words. That could be something esoteric like eternity, or it could be a bowl of cherries, or a beautiful house (the last of which can be seen below). Think of it like a Google Images search, only for pictures that have never previously existed.
How BigSleep works
“At a high level, BigSleep works by combining two neural networks: BigGAN and CLIP,” Ryan Murdock, BigSleep’s 23-year-old creator, a student studying cognitive neuroscience at the University of Utah, told Digital Trends.
The first of these, BigGAN, is a system created by Google that takes in random noise and outputs images. BigGAN is a generative adversarial network: a pair of dueling neural networks, an image-generating network and a discriminator network, that carry out what Murdock calls an “adversarial tug-of-war.” Over time, the interaction between generator and discriminator improves both networks.
CLIP, meanwhile, is a neural net made by OpenAI that has been taught to match images and descriptions. Give CLIP text and images, and it will attempt to figure out how well they match and give them a score accordingly.
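To make the idea of scoring a text-image match concrete, here is a minimal sketch in plain Python. It does not call CLIP itself. It assumes the text and each image have already been turned into lists of numbers (embeddings) and compares them with cosine similarity, the kind of measure CLIP-style models rely on. The three-number vectors, the file names, and the `rank_images` helper are all made up for illustration.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_images(text_embedding, image_embeddings):
    """Sort (name, score) pairs so the best match to the text comes first."""
    scores = {
        name: cosine_similarity(text_embedding, embedding)
        for name, embedding in image_embeddings.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical hand-made "embeddings". Real ones are learned by the model
# and are far longer than three numbers.
text = [1.0, 0.0, 1.0]  # pretend this encodes "a bowl of cherries"
images = {
    "cherries.png": [0.9, 0.1, 0.8],
    "house.png": [0.0, 1.0, 0.1],
}

ranking = rank_images(text, images)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

The image whose vector points in nearly the same direction as the text vector gets the higher score, which is the sense in which a scorer can say a picture "matches" a description.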
Murdock explained that, by combining the two, BigSleep searches through BigGAN’s outputs for images that maximize CLIP’s score. It slowly tweaks the noise input to BigGAN’s generator until CLIP says that the images being produced match the description. Generating an image to match a prompt takes about three minutes in total.
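The search loop described here can be sketched with toy stand-ins. In the Python below, `toy_generator` plays the role of BigGAN (noise in, "image" out) and `toy_score` plays the role of CLIP (a match score), but neither is the real network, and the four-number target is invented. The sketch also swaps in simple random hill climbing, which keeps a tweak to the noise only if the score goes up. That is an assumption chosen for readability, not a description of the optimizer BigSleep actually uses.

```python
import math
import random

def toy_generator(latent):
    """Stand-in for BigGAN: maps a noise vector to a fake 'image' vector."""
    return [math.tanh(v) for v in latent]

def toy_score(image, target):
    """Stand-in for CLIP: cosine similarity between fake image and target."""
    dot = sum(x * y for x, y in zip(image, target))
    norm = math.sqrt(sum(x * x for x in image)) * math.sqrt(sum(y * y for y in target))
    return dot / norm if norm else 0.0

def search_latent(target, steps=2000, step_size=0.1, seed=0):
    """Random hill climbing: nudge the noise, keep the nudge if the score improves."""
    rng = random.Random(seed)
    latent = [rng.gauss(0, 1) for _ in target]  # start from random noise
    start = best = toy_score(toy_generator(latent), target)
    for _ in range(steps):
        candidate = [v + rng.gauss(0, step_size) for v in latent]
        score = toy_score(toy_generator(candidate), target)
        if score > best:  # only accept tweaks that the scorer prefers
            latent, best = candidate, score
    return latent, start, best

# Hypothetical target standing in for the encoded text prompt
target = [0.5, -0.3, 0.8, 0.1]
latent, start_score, final_score = search_latent(target)
print(f"score before search: {start_score:.3f}, after: {final_score:.3f}")
```

The point of the sketch is the division of labor: the generator never sees the prompt, and the scorer never draws anything. Only the noise vector changes, steered by the score.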
“BigSleep is significant because it can generate a wide variety of concepts and objects fairly well at 512 x 512 pixel resolution,” Murdock said. “Previous work has produced impressive results, but, by my knowledge, much of it has been restricted to lower-resolution images and more everyday objects.”
BigSleep isn’t the first time A.I. has been used to generate images. Its name is reminiscent of DeepDream, an A.I. created by Google engineer Alex Mordvintsev that creates psychedelic imagery using classification models. A GAN-based system was also used to create the A.I. painting sold at auction in 2018 for a massive $432,500. Even so, BigSleep is certainly a fascinating step forward.
To try out BigSleep for yourself, Murdock suggested checking out his Google Colab notebook for the project. There’s a bit of a learning curve in using the Colab GUI and a few other steps, but it’s free to take for a spin. Other ways of testing it will likely also open up in the weeks to come. If you’re interested, you can also visit r/MediaSynthesis, where users are posting some of the best images they’ve generated with the system so far.