Nvidia's GauGAN AI can now turn your crappy art into decent synthetic videos

By Matthew Griffin Intelligence and the Senses 20th November 2020

WHY THIS MATTERS IN BRIEF

AIs like Nvidia’s GauGAN are improving quickly and can generate increasingly lifelike and sophisticated synthetic content, which will turn the entire global creative industry on its head.

Synthetic content, content that’s created by machines and not humans, has come a long way in just the past five years, whether it’s AIs that can make art or music, write books, or create games, images, or videos from scratch using nothing more than their well-trained AI minds.

Earlier this year, in March, during its GPU Technology Conference in California, Nvidia took the wraps off GauGAN, a semantic image synthesis system built on a generative adversarial network (GAN) that I’ve written about before and talked about in keynotes. It lets users create amazing lifelike landscape images of places that never existed using nothing more than their crappy art abilities and a computer stylus. And now, thanks to some human ingenuity, one researcher has shown how the same technology can be used to help people create increasingly convincing synthetic videos and virtual worlds, as you can see in the video below.

See how the tech has evolved in a short space of time

In the first month following the beta version’s release on Playground, the web hub for Nvidia’s AI and deep learning research, the company says GauGAN was used to create 500,000 images, including concept art for films and video games. And now Nvidia has said it’s updating GauGAN with a new filter feature that layers on lighting and styles from uploaded photos, which will make the images it produces even more lifelike.

“As researchers working on image synthesis, we’re always pursuing new techniques to create images with higher fidelity and higher resolution,” said Nvidia researcher Ming-Yu Liu. “That was our original goal for the project.”

GauGAN, named after the post-Impressionist painter Paul Gauguin, improves upon Nvidia’s Pix2PixHD system introduced last year, which was similarly capable of rendering synthetic worlds but left artifacts in its images. The machine learning model underpinning GauGAN was trained on more than one million images from Flickr, imbuing it with an understanding of the relationships among over 180 objects including snow, trees, water, flowers, bushes, hills, and mountains. For instance, trees next to water have reflections, and the type of precipitation changes depending on the season depicted.

Paintbrush and paint bucket tools allow users to design their own landscapes with labels including river, grass, rock, and cloud, and the aforementioned style transfer feature lets them modify the colors and aesthetic on the fly. For example, images can adopt a warm sunset glow, or display the cooler lights of a city skyline. Alternatively, they’re able to upload their own landscape images, which GauGAN converts to segmentation maps — maps describing the location of objects in rough detail — that serve as foundations for artwork.
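To make the idea of a segmentation map concrete, here is a minimal sketch in plain Python. It is not Nvidia's code: the four-label vocabulary, the `paint_rect` helper, and the tiny canvas size are all assumptions made for illustration. It shows how a "painted" canvas can be stored as a grid of label indices, and how that grid can be expanded into one binary mask per label, which is a common way to feed label maps to an image synthesis model.

```python
# Hypothetical label set for illustration; the article says GauGAN
# understands over 180 object types, so the real vocabulary is much larger.
LABELS = ["sky", "mountain", "water", "grass"]


def paint_rect(seg_map, label, top, left, bottom, right):
    """Fill a rectangle with a label index (a crude stand-in for a paint bucket)."""
    idx = LABELS.index(label)
    for r in range(top, bottom):
        for c in range(left, right):
            seg_map[r][c] = idx
    return seg_map


def one_hot(seg_map, num_classes):
    """Turn an H x W grid of label indices into num_classes binary H x W masks."""
    return [
        [[1 if cell == k else 0 for cell in row] for row in seg_map]
        for k in range(num_classes)
    ]


# Start with a 4 x 6 canvas that is all sky, then paint regions on top.
HEIGHT, WIDTH = 4, 6
seg = [[LABELS.index("sky")] * WIDTH for _ in range(HEIGHT)]
paint_rect(seg, "mountain", 1, 0, 2, 6)
paint_rect(seg, "water", 2, 0, 4, 4)
paint_rect(seg, "grass", 2, 4, 4, 6)

masks = one_hot(seg, len(LABELS))

# Print the map using the first letter of each label.
for row in seg:
    print(" ".join(LABELS[i][0] for i in row))
```

Each cell of the grid says only which kind of object belongs there, not what it should look like; filling in plausible texture, lighting, and reflections is the job of the trained generator.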

Nvidia says that GauGAN has been used by a health care organization exploring its use as a therapeutic tool for patients, and by a modeler — Colie Wertz — whose credits include Star Wars, Transformers, and Avengers movies.

“We want to make an impact with our research,” said Liu. “This work creates a channel for people to express their creativity and create works of art they wouldn’t be able to do without AI. It’s enabling them to make their imagination come true.”

The code for GauGAN’s AI model was open-sourced on GitHub earlier this year, and an interactive demo is available on Nvidia’s website.

GauGAN is one of the newest reality-bending AI tools from Nvidia, creator of deepfake tech like StyleGAN, which can generate lifelike images of people who never existed. Last September, researchers at the company described a system in an academic paper that can craft synthetic scans of brain cancer. And in December, Nvidia detailed a generative model that’s capable of creating virtual environments using real-world videos.

GauGAN’s initial debut preceded GAN Paint Studio, a publicly available AI tool that lets users upload any photograph and edit the appearance of depicted buildings, flora, and fixtures. Elsewhere, generative machine learning models have been used to produce realistic videos by watching YouTube clips, create images and storyboards from natural language captions, and animate and sync facial movement with audio clips containing human speech.
