AI-synthesized Faces Indistinguishable From Real Faces

The line between what is real and what is not is blurring, according to the findings of a detailed new study published last month: the average person on the street can no longer reliably distinguish photos of real people from those generated with existing AI technology.

Titled “AI-synthesized faces are indistinguishable from real faces and more trustworthy”, the study was conducted by Sophie Nightingale from Lancaster University and Hany Farid from the University of California, Berkeley.

A total of 757 participants were recruited through Amazon Mechanical Turk and assigned to one of three different experiments that tasked them with either identifying synthetic faces or offering their perceived trustworthiness of subjects from a total of 128 faces.

Synthesizing faces

The faces for this study were generated with StyleGAN2, a state-of-the-art generative adversarial network (GAN) introduced by Nvidia researchers in 2019. GANs themselves are hardly new and are used for a variety of tasks, such as generating realistic photographs, enhancing the resolution of images, or creating emojis from photos.

A GAN essentially pits two neural networks, a generator and a discriminator, against each other. For this study, the generator starts with a random array of pixels and iteratively learns to synthesize a realistic face, while the discriminator is tasked with distinguishing the synthetic face from a collection of real faces.

“If the synthesized face is distinguishable from the real faces, then the discriminator penalizes the generator. Over multiple iterations, the generator learns to synthesize increasingly more realistic faces until the discriminator is unable to distinguish it from real faces,” the authors explained.
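The adversarial loop the authors describe can be illustrated with a toy example. The following sketch is not StyleGAN2 (which uses deep convolutional networks trained on face photos); it is a minimal numpy stand-in that trains a one-parameter "generator" against a logistic "discriminator" on one-dimensional data, showing the same penalize-and-improve dynamic in miniature. All names and numbers here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for real face photos: "real" samples drawn near 5.0.
def real_batch(n):
    return rng.normal(5.0, 0.5, size=(n, 1))

# Generator: a single affine map of noise (StyleGAN2 uses deep networks).
g_w, g_b = rng.normal(size=(1, 1)), np.zeros(1)
# Discriminator: logistic regression, outputs P(sample is real).
d_w, d_b = rng.normal(size=(1, 1)), np.zeros(1)

def generate(z):
    return z @ g_w + g_b

def discriminate(x):
    return 1.0 / (1.0 + np.exp(-(x @ d_w + d_b)))

lr, n = 0.05, 64
for step in range(2000):
    z = rng.normal(size=(n, 1))
    fake, real = generate(z), real_batch(n)

    # Discriminator update: push D(real) toward 1 and D(fake) toward 0
    # (this is the "penalty" signal for a distinguishable fake).
    pr, pf = discriminate(real), discriminate(fake)
    d_w += lr * (real.T @ (1 - pr) - fake.T @ pf) / n
    d_b += lr * np.mean((1 - pr) - pf)

    # Generator update: push D(fake) toward 1, i.e. fool the discriminator.
    pf = discriminate(fake)
    grad_fake = (1 - pf) @ d_w.T  # gradient of log D(fake) w.r.t. fake
    g_w += lr * (z.T @ grad_fake) / n
    g_b += lr * np.mean(grad_fake, axis=0)
```

After training, the mean of the generated samples should drift toward the real data's mean of 5.0, mirroring on a tiny scale how the generator "learns to synthesize increasingly more realistic faces until the discriminator is unable to distinguish" them.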

Not much better than chance

It is worth noting that participants were told they would be paid extra if their accuracy was within the top 20%, and that failing to catch nine out of 10 planted "obviously synthetic" faces would see their payment withheld and their results excluded from the study.

In the first experiment (Experiment 1), average accuracy was 48.2%, close to the 50% expected by chance. This improved slightly to 59.0% when participants were given training and trial-by-trial feedback (Experiment 2).

Interestingly, synthetic white faces were the least accurately classified in both experiments, a result the authors hypothesized could be due to their overrepresentation in the StyleGAN2 training dataset, which would produce more realistic synthetic white faces.

In the final experiment (Experiment 3), participants generally deemed the synthetically generated photos to be more trustworthy than those of real people. Indeed, the three faces rated most trustworthy were all synthetic. According to the authors, this might be because synthesized faces tend to look more like an "average" face, and average faces tend to be deemed more trustworthy.

Implications of the study

Much has been written about the threats of deep fakes and it doesn’t take much imagination to understand the potential of synthesized faces for creating fake social media profiles to support social engineering efforts or for outright scams.

In Singapore alone, nearly one billion Singapore dollars (USD734 million) has been lost to scams since 2016, with Internet love scams costing victims some SGD33 million (USD24.2 million) in 2020 alone. And while a reverse image search can expose love scams that reuse existing photos, it falls apart with synthetic images.

Perhaps equally as bad would be a world where any image or video can be easily faked, noted the authors, as that would allow any “inconvenient” or “unwelcome” digital recording to be disputed and called into question as synthetic.

“Synthetically generated faces are not just highly photorealistic; they are nearly indistinguishable from real faces and are judged more trustworthy,” wrote the authors.

“Although progress has been made in developing automatic techniques to detect deep-fake content, current techniques are not efficient or accurate enough to contend with the torrent of daily uploads.”

For now, they call for the parallel development of “safeguards” to help mitigate the inevitable harms from the resulting synthetic media, including the incorporation of watermarks for reliable identification of synthetic media.

Images used in the study can be accessed here, and anonymized responses and training data can be found here.

Incidentally, you don’t even need to set up StyleGAN2 on a powerful workstation to generate fake imagery: high-quality fake imagery generated using StyleGAN can be found here.

Paul Mah is the editor of DSAITrends. A former system administrator, programmer, and IT lecturer, he enjoys writing both code and prose. You can reach him at [email protected].

Image credit: iStockphoto/Kubkoo