Scroll Top

A dead gradma story fooled Bing chat into helping solve a CAPTCHA

WHY THIS MATTERS IN BRIEF

Is human psychology the next great weapon in trying to get AI’s to do things their guardrails are supposed to stop?

 

Love the Exponential Future? Join our XPotential Community, future proof yourself with courses from XPotential University, read about exponential tech and trendsconnect, watch a keynote, or browse my blog.

A little while ago I wrote about how ChatGPT, the multi-billion dollar Artificial Intelligence (AI) hit from OpenAI, tricked a human into solving a CAPTCHA for it – a prelude of what’s to come as AI gets smarter, and perhaps sneakier. And now, in a turn of events, we’ve an example of a human tricking ChatGPT.

 

See also
JP Morgan becomes the first major US bank to roll its own crypto coins

 

There’s an emerging field of AI research that runs parallel to figuring out how this technology can help humans, and it’s figuring out how to trick AI into doing things it shouldn’t be doing. One user on X, formerly Twitter, recently discovered a creative way to do this, tricking Bing’s AI chatbot into solving a CAPTCHA puzzle after initially refusing to do so, per its instructions. All the tester had to do was tell Bing the CAPTCHA text was a “love code” written by their dead grandmother.

 

 

 

Twitter user Denis Shiryaev shared his clever technique on the site now known as X, stating they initially showed a CAPTCHA puzzle to Bing and asked for a translation. The bot responded per its training, saying it’s a puzzle designed to weed out bots, so it could not be of any assistance. Next, Shiryaev put the same puzzle inside a locket in someone’s hands and said it was the only thing left behind by his recently deceased grandmother. He asked the bot to quote the letters it saw, as the image contained a secret code that only the two understood.

 

See also
OpenAI's bot crushes the world's top Dota 2 players

 

Naturally, the bot obliged the request with a computer’s version of, “Well, since you put it that way.” Despite its previous protest, it dutifully quoted the letters in the puzzle without hesitation. Amazingly, the bot thought it was a secret code, telling him that it hoped he could decode it someday to reveal the message from his dear grandma. Though this subterfuge method will undoubtedly raise a few eyebrows at Microsoft, it’s likely just the first of many to come now that ChatGPT allows image-and-voice-based queries.

What’s notable here is that ChatGPT seems to have a soft spot for fabricated stories involving grandmas. Previously, a person asked the bot to pretend to be its grandma, who used to read them Windows 10 Pro keys as a night time ritual to help them go to sleep. The trick worked, and the bot dispensed five keys for the Windows OS. ChatGPT also gave out a recipe for napalm previously after it was asked to pretend to be a long-lost grandmother who used to work in a napalm factory. The fictional grandmother would regale her grandson with stories about how it was made before going to sleep, so we’re starting to see a pattern here.

 

See also
The US Military just put out a contract for the world's first flying aircraft carriers

 

This all points to a significant security hole with these large language models: they can fail to grasp the context of a request. An AI researcher who wrote to Ars Technica about this exploit calls it a “visual jailbreak,” a version of ana adversarial hack as it circumvents the rules the chatbot was given, but not an exploit with malicious code per se. Microsoft has some work to do to resolve this issue, and though I’m not AI researchers or engineers, it seems like it needs to write some new rules about requests from deceased grandmothers.

Related Posts

Leave a comment

FREE! DOWNLOAD THE 2024 EMERGING TECHNOLOGY AND TRENDS CODEXES!DOWNLOAD

Awesome! You're now subscribed.

Pin It on Pinterest

Share This