It’s a rainy afternoon. You are right in the middle of a client presentation when your mobile starts ringing off the hook.
It’s your dad on a video call. You excuse yourself and take the call. Your dad looks panic-stricken.
He tells you he is at a hardware store buying building materials for house repairs, and he has just realized he does not have his credit card. He implores you to transfer funds to the supplier’s account.
You immediately spring to his rescue and transfer the money. There is no reason to doubt your own father, except that it isn’t really your father on the video call.
You find out later that you were duped. The entity you spoke to was a deepfake – a hyper-realistic simulation of your father’s image and voice.
This makes the current phone scams – where the perpetrators talk gullible people into disclosing their financial information – look like kindergarten pranks. The one saving grace of such scams is that you always know you are talking to a stranger.
Deepfakes, on the other hand, could involve someone very familiar, and they could sound and look totally legit. This example is just the tip of the iceberg of the vast deceptions deepfakes can unleash on unsuspecting citizens.
A deepfake is an audio or video clip that has been created using Artificial Intelligence (AI) with the primary or ulterior purpose of making a person appear to say something he or she never said.
A portmanteau of deep learning and fake, deepfakes rose to prominence in 2017, when some tech hoodlums started using neural networks to superimpose the faces of celebrities onto those of porn actors.
However, what started as a goofy face-swapping, lip-syncing operation has the potential to turn into something far more sinister.
In 2018, a YouTube video surfaced in which comedian and horror director Jordan Peele dubbed his own impersonation of former US President Barack Obama over doctored footage of Obama.
Many viewers were bamboozled when they saw the clip. A few discerning ones, however, claimed it failed to fool them. What these skeptics did not figure into their calculations was the rapid rate at which AI was advancing and how soon their confidence would give way.
Listen to this audio clip of Joe Rogan before reading further:
It was an AI-rendered simulation of Rogan’s voice, created by the machine-learning firm Dessa.
This clip threw me for a loop, and I have been listening to Rogan’s podcasts for over two years.
The weaponization of machine learning
In 2001, Paul Viola and Michael Jones proposed a real-time object detection technique now known as the Viola-Jones object detection framework. This framework allows machines to easily detect faces, and even today it lies at the heart of many face-mapping tools.
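The speed of the Viola-Jones detector comes from the integral image (summed-area table), which lets any rectangular pixel sum, and hence any Haar-like feature, be computed in constant time from four table lookups. Here is a minimal sketch in NumPy; the function names are illustrative, not taken from any particular library:

```python
import numpy as np

def integral_image(img):
    """Summed-area table, padded so ii[y, x] = sum of img[:y, :x]."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, top, left, h, w):
    """Sum of the h-by-w rectangle at (top, left), in O(1) via four lookups."""
    return (ii[top + h, left + w] - ii[top, left + w]
            - ii[top + h, left] + ii[top, left])

def haar_two_rect(ii, top, left, h, w):
    """Two-rectangle Haar-like feature: left half minus right half."""
    half = w // 2
    return rect_sum(ii, top, left, h, half) - rect_sum(ii, top, left + half, h, half)

# Demo on a tiny 4x4 "image"
img = np.arange(16).reshape(4, 4)
ii = integral_image(img)
print(rect_sum(ii, 1, 1, 2, 2))   # prints 30, the sum of img[1:3, 1:3]
```

A real detector evaluates thousands of such features through a cascade of boosted classifiers, but every one of them reduces to these constant-time rectangle sums, which is what makes real-time face detection feasible.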
Like any piece of technology, machine learning and its offshoots run the risk of being weaponized when they fall into the wrong hands.
Because deepfakes appear so authentic, they can become a potent disinformation weapon. Add to that our tendency to believe what we hear and see. It can turn fiction into fact, muddling our understanding of truth.
Today, an expert with voice samples of the person they want to spoof can produce a falsified audio or video clip in seconds.
And, mind you, a deepfake doesn’t have to be a complete message. A 15-second doctored clip spreading like wildfire on social media could do enormous damage to a politician’s or a corporate honcho’s career.
AI and its subsets – machine learning and deep learning – are momentous technologies that carry implications, both good and bad.
Today, only specialists who know how to combine technological ingenuity with computing power can create deepfakes. But as the tech curve climbs higher, speech synthesis and facial reenactment will become cheaper to produce.
Soon there will be apps that render near-perfect manipulated audio or video in seconds. Call it the dark side of the democratization of technology, but it’s inevitable.
AI editors like FaceApp are already blurring the lines between reality and fiction, raising red flags among law enforcement agencies. The FBI even called the Russian app a counterintelligence threat.
Dessa’s principal machine-learning architect, Alex Krizhevsky says, “Human-like speech synthesis is soon going to be a reality everywhere.”
In short, it’s a scary, dystopian scenario.
What can be done about it?
Since deepfakes are an evolving threat, no flawless solutions exist as of today. Broadly speaking, there are two ways to combat deepfakes: raising public awareness and deploying technological countermeasures.
People need to be told that seeing is no longer believing and that the words and actions in a video can be startlingly fabricated.
If you see an out-of-place video that stirs up sentiment, do not take its veracity on trust. Awareness can go a long way in stemming the distribution of deepfakes; stopping their production is an altogether different challenge.
On the tech front, a number of organizations – startups, university labs, and research institutions – are devising measures to detect deepfakes. Apparently, spotting a fake image – via unrealistic skin tones and mismatched background hues – is much easier than spotting a fake video.
Vincent Nozick of Université Gustave Eiffel explains that while a single image contains lots of small, telltale imperfections, a video is a succession of images with far less noise, because the noise has been absorbed by very high compression.
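To make Nozick’s point concrete, here is a toy illustration (my own sketch under simplifying assumptions, not his actual method): a high-pass residual score shows how a compression-like blur wipes out the very noise that image-forensics tools look for.

```python
import numpy as np

def noise_score(img):
    """Mean absolute response of a 5-point Laplacian high-pass filter.
    Sensor noise survives the filter; smooth regions score near zero."""
    c = img[1:-1, 1:-1]
    lap = (img[:-2, 1:-1] + img[2:, 1:-1]
           + img[1:-1, :-2] + img[1:-1, 2:] - 4 * c)
    return np.abs(lap).mean()

rng = np.random.default_rng(0)
# A flat gray still with additive sensor noise
noisy = 128 + 10 * rng.standard_normal((64, 64))

# Crude stand-in for lossy video compression: a 3x3 box blur
# that kills high-frequency detail (np.roll wraps edges; fine for a demo)
k = np.ones((3, 3)) / 9
blurred = sum(np.roll(np.roll(noisy, dy, 0), dx, 1) * k[dy + 1, dx + 1]
              for dy in (-1, 0, 1) for dx in (-1, 0, 1))

print(noise_score(noisy) > noise_score(blurred))   # prints True
```

The blurred frame scores far lower: the fine-grained noise a forensic detector would analyze has been averaged away, which is exactly why heavily compressed video frames offer fewer statistical clues than standalone images.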
The US-based non-profit research institute SRI International is developing a multi-pronged deepfake-detection technique called SAVI (Spotting Audio-Visual Inconsistencies). But the success of these technologies is also contingent on what the rogue parties do. Will they find new ways to blunt or trump the detection countermeasures?
In my opinion, it is going to be a long battle between detectors and deepfake creators. Right now, public awareness coupled with technological solutions seems to be the key to thwarting this impending threat.
©BookJelly. All rights reserved