Growing distrust may even lead some of you to seek out primary sources, such as official documents or recordings in which the important words were actually spoken.
Your quest for that authentic digital evidence is about to get much trickier.
Computer scientists and startup founders are already working on technologies that make it easy for users to hijack anyone’s face and voice and generate fake news media more technically impressive than ever. The average internet user will be unable to distinguish it from the real thing.
Let’s cover the visual aspect first.
Led by Professor Justus Thies, the combined efforts of scientists from the University of Erlangen-Nuremberg, Max Planck Institute for Informatics, and Stanford have brought the world a system that allows one person’s facial expressions to be visually dubbed onto the video recording of someone else.
The system, named Face2Face, can operate using an ordinary webcam and can produce YouTube-quality videos. You can see for yourself how it handles fabricating TV performances from George W. Bush and Vladimir Putin:
As you may have noticed, the researchers stayed silent during the dubbing, because the technology isn’t designed to handle audio.
For that, we need a second solution: one that synthesizes the statements we want to put into the famous mouth Face2Face has opened for us.
Canadian startup Lyrebird* is creating a system that requires only a one-minute sample of someone’s speech to learn that person’s voice characteristics and patterns. It can then render any statement, even with varying intonation, in that voice.
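Lyrebird’s actual models are, of course, far more sophisticated than anything shown here, and their internals are not public. But the underlying principle – measure some characteristic of a voice from a short sample, then reuse it to generate brand-new audio the speaker never produced – can be illustrated with a deliberately crude toy. In this sketch (entirely invented for illustration, and no relation to Lyrebird’s method), the only “voice characteristic” learned is the average pitch of a synthetic one-minute sample, and the “synthesized speech” is just a sine tone at that pitch:

```python
import numpy as np

SR = 16_000  # sample rate in Hz


def estimate_pitch(sample: np.ndarray) -> float:
    """Crude pitch estimate for a near-periodic signal: count zero crossings.

    A periodic waveform crosses zero twice per cycle, so
    pitch ~= crossings / (2 * duration).
    """
    signs = np.signbit(sample)
    crossings = np.count_nonzero(signs[1:] != signs[:-1])
    duration = len(sample) / SR
    return crossings / (2 * duration)


def synthesize(pitch_hz: float, duration_s: float) -> np.ndarray:
    """'Speak' a brand-new utterance: a plain sine tone at the learned pitch."""
    t = np.arange(int(SR * duration_s)) / SR
    return np.sin(2 * np.pi * pitch_hz * t)


# Stand-in for a one-minute voice sample: a 180 Hz tone
# (roughly a typical human speaking pitch).
sample = np.sin(2 * np.pi * 180.0 * np.arange(SR * 60) / SR)

learned_pitch = estimate_pitch(sample)   # "learn" the voice from the sample
fake_audio = synthesize(learned_pitch, duration_s=2.0)  # new, never-recorded audio
```

The real systems replace the single pitch number with a learned representation of timbre, accent, and prosody, and the sine generator with a neural synthesizer – but the shape of the pipeline (analyze a short sample, then generate arbitrary new audio conditioned on it) is the same.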
You can check out its possibilities by listening to the following sample in which Donald Trump (or his voice, anyway) praises the benefits of Lyrebird:
Or in this conversation between Obama, Trump and Clinton:
The creators of Lyrebird are aware of the potential impact that their project may have on society and on the judicial system in particular. They claim, however, that by making the results of their work public they are raising awareness that such forgeries are possible. As they write in the part of their website dedicated to ethics (https://lyrebird.ai/ethics):
“Voice recordings are currently considered as strong pieces of evidence in our societies and in particular in jurisdictions of many countries. Our technology questions the validity of such evidence as it allows to easily manipulate audio recordings. This could potentially have dangerous consequences such as misleading diplomats, fraud and more generally any other problem caused by stealing the identity of someone else.
By releasing our technology publicly and making it available to anyone, we want to ensure that there will be no such risks. We hope that everyone will soon be aware that such technology exists and that copying the voice of someone else is possible. More generally, we want to raise attention about the lack of evidence that audio recordings may represent in the near future.”
Even if the current results are still a little clumsy, they are firm proof that such ideas can be implemented. As with any technology, it is only a matter of time before it matures, which here means fluent, natural-sounding fake dialogue.
All this is very interesting – but slightly terrifying too. It raises many questions: could it be that in a couple of years audio and video evidence will carry no weight in court, so that we will need eyewitnesses for all legal purposes? Will we be able to invalidate any statement that was not witnessed live by a large crowd simply by calling it fake? How will you react when someone “kindly” passes you a recording in which your spouse “confesses” to having cheated on you for years?
If you are interested in the technical details, you can take a look at the Face2Face project website (http://www.graphics.stanford.edu/~niessner/thies2016face.html) or become a beta tester for Lyrebird (https://lyrebird.ai/contact).