Is The Secret Of The Unnerving Deepfake Hidden In Artificial Neural Networks?

Artificial intelligence (AI) now dominates the technology agenda to the point of setting the course of the future, and it is modeled on the human brain, the foundation of the very concept of intelligence. This was inevitable, given that no higher form of intelligence has yet been discovered in the universe. Deciphering the workings of human intelligence is the most crucial step in developing AI technologies that expand the boundaries of intelligence.

Computers and other data-processing devices can now perform functions at a level that pushes the limits of human intelligence, thanks to artificial neural networks (ANNs) that imitate the neural networks of the human brain. To make this possible, learning algorithms are built into an ANN so that, over time, it acquires the ability to deep-learn.


Deepfake synthetic media, developed with AI technologies, is often described as one of the biggest threats facing humanity. It is therefore crucial to focus on the root of the problem when establishing security strategies and methods to fight this menace; the key to the solution is very likely hidden in the same place as the source of the problem, waiting to be discovered. Much of what has been said and written about deepfakes suggests that generative adversarial networks (GANs), trained through deep learning, create most deepfake synthetic media. However, the GAN is only one of the ANN types developed with AI.


Deciphering how deepfake synthetic media is developed may be the most critical step in catching it at an early stage, much like a cancerous cell. If finding the clues to the secret of deepfake media indeed depends on identifying the ANN that creates it, then reducing the matter to a single network type, the GAN, is far too restrictive an approach.

Artificial neural networks can “deep learn.”


Broadly, deepfake can be described as a technology for developing fake synthetic media: it models a person's face, voice, body, and limbs with AI-based sound- and image-processing techniques so that they convincingly appear to be something else, while incorporating sounds and images derived from the original. Anyone who follows the subject at the level of a curious online user and reads the news about it knows that deep learning and ANNs play a vital role in the development of deepfake media. This is what makes a deepfake both deep and fake.


Deep learning is a machine-learning method. Through deep-learning algorithms, an ANN acquires the ability to teach itself how to process a given data set. The process of transforming real data input into synthetic output, using algorithms that vary with the ANN's structure, continues until the desired result, a plausible deepfake sound or video, is achieved.
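To make the idea concrete, here is a minimal training-loop sketch, assuming PyTorch; the tiny model, the random stand-in data, and the "plausibility" threshold are hypothetical placeholders, not the pipeline of any real deepfake tool. It only illustrates the iterate-until-plausible process described above.

```python
# Minimal sketch of the iterative training idea, assuming PyTorch.
# Model, data, and the "plausibility" threshold are illustrative placeholders.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 128))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

real_samples = torch.randn(1000, 128)        # stand-in for real input data
targets = torch.randn(1000, 128)             # stand-in for the desired output

PLAUSIBILITY_THRESHOLD = 0.05                # training stops once outputs are "good enough"
for _ in range(10_000):
    synthetic = model(real_samples)          # transform real input into synthetic output
    loss = loss_fn(synthetic, targets)       # how far the output is from the desired result
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if loss.item() < PLAUSIBILITY_THRESHOLD: # "desired result or plausibility" reached
        break
```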


The ANN is an information technology inspired by the way the human brain processes information. As mentioned above, an ANN mimics the operation of this biological neural system. Neurons in the brain form a network that processes and transfers information; in the same way, artificial neurons (nodes) in an ANN form a network by connecting to each other through purpose-built designs. In other words, the nodes and the way they are linked affect the outcome: these connections determine both the structure and the data-processing power of the ANN.
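As a toy illustration of the idea that the links between nodes shape the outcome, here is a single artificial neuron sketched in plain Python with NumPy; the numbers are arbitrary and chosen only for illustration.

```python
# A single artificial neuron: the "links" are weights, and changing them
# changes the output. Purely illustrative values.
import numpy as np

def neuron(inputs, weights, bias):
    # weighted sum of the incoming connections, passed through an activation
    z = np.dot(inputs, weights) + bias
    return 1.0 / (1.0 + np.exp(-z))   # sigmoid activation

x = np.array([0.5, -1.2, 3.0])        # signals arriving from three other nodes
w = np.array([0.8, 0.1, -0.4])        # the connection strengths ("links")
print(neuron(x, w, bias=0.2))         # different weights would give a different outcome
```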

Neurons are organized into three kinds of layers: input, hidden, and output. The input layer takes the data and passes it to the hidden layers, which perform the mathematical calculations. The notion of "deep" refers to the presence of multiple hidden layers, and the number and function of those layers account for the structural differences between ANN types. The output layer presents the results of the process. Deep learning, in essence, means training the ANN: large data sets must be processed with exhaustive mathematical calculations, and optimization algorithms are used to minimize the discrepancy between the network's output and the desired output.
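This layer structure can be sketched in a few lines; the following assumes PyTorch, and the layer sizes are arbitrary.

```python
# Minimal sketch of the input/hidden/output layer structure, assuming PyTorch.
# Two hidden layers make this a "deep" network; sizes are arbitrary.
import torch
import torch.nn as nn

deep_net = nn.Sequential(
    nn.Linear(64, 128), nn.ReLU(),   # input layer feeding the first hidden layer
    nn.Linear(128, 128), nn.ReLU(),  # second hidden layer ("deep" = more than one)
    nn.Linear(128, 10),              # output layer: presents the result of the process
)

x = torch.randn(32, 64)              # a batch of 32 input vectors
print(deep_net(x).shape)             # torch.Size([32, 10])

# Training would then use an optimization algorithm (e.g. torch.optim.SGD or Adam)
# to minimize the discrepancy between the network's output and the desired output,
# as in the loop sketched earlier.
```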

ANNs are more than GANs.


The matter is quite technical and calls for an information-processing engineering infrastructure; as such, it is not the subject of this article. Still, it is worth listing the main ANN types without going into detail.


We have discussed generative adversarial network (GAN) technology in many previous articles. As explained there, "A synthetic output created by GAN, one of the two competing ANNs, is examined by another ANN to enhance authenticity in a game of cat and mouse." FaceSwap and FakeApp, two widely available AI-based face-swap apps, are often described as using GAN technology as their ANN. GANs can also be extended to a conditional model (the conditional GAN) when the generator and discriminator are conditioned on some additional information, such as a class label.
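For readers who want to see the "cat and mouse" in code, here is a compact sketch of a GAN training step, assuming PyTorch; the layer sizes, learning rates, and random stand-in images are illustrative only and do not reproduce any particular face-swap app.

```python
# Compact sketch of adversarial training, assuming PyTorch. All values are placeholders.
import torch
import torch.nn as nn

generator = nn.Sequential(nn.Linear(100, 256), nn.ReLU(), nn.Linear(256, 784), nn.Tanh())
discriminator = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1), nn.Sigmoid())

g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
bce = nn.BCELoss()

real_images = torch.randn(64, 784)        # stand-in for a batch of real images
real_label = torch.ones(64, 1)
fake_label = torch.zeros(64, 1)

for _ in range(1000):
    # 1) The discriminator examines both real images and the generator's synthetic output.
    fake_images = generator(torch.randn(64, 100))
    d_loss = (bce(discriminator(real_images), real_label)
              + bce(discriminator(fake_images.detach()), fake_label))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # 2) The generator tries to fool the discriminator, gradually enhancing authenticity.
    g_loss = bce(discriminator(fake_images), real_label)
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()

# In a conditional GAN, some extra information (e.g. a class label) would be
# concatenated to the generator's noise input and to the discriminator's input.
```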
Convolutional neural networks (CNNs) are deep ANNs used for classifying images (e.g., naming what an image shows), clustering them by similarity (as in photo search), and recognizing objects within scenes.
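A minimal CNN can likewise be sketched in a few lines, assuming PyTorch; the ten output categories and the random input images are placeholders.

```python
# Minimal CNN sketch, assuming PyTorch: convolutional layers extract visual
# features, and a final linear layer assigns one of ten hypothetical categories.
import torch
import torch.nn as nn

cnn = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),   # detect low-level patterns
    nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),  # combine them into higher-level features
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, 10),                                # "name" the image: 10 example categories
)

images = torch.randn(4, 3, 32, 32)   # a batch of four 32x32 RGB images
print(cnn(images).shape)             # torch.Size([4, 10]): one score per category
```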


In the ANN type called the autoencoder (AE), the hidden layers between the input and output layers contain a minimal number of nodes. Because the calculations in this bottleneck are performed on so few nodes, the network gains speed and efficiency. AEs are used for classification, clustering, and, above all, compression. Subgroups that share a similar structure but differ in how they operate include the adversarial autoencoder (AAE), the conditional adversarial autoencoder (CAAE), the variational autoencoder (VAE), and the conditional variational autoencoder (CVAE).
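The bottleneck idea is easy to see in code; the following sketch, again assuming PyTorch, uses arbitrary sizes and random stand-in data.

```python
# Minimal autoencoder sketch, assuming PyTorch: the bottleneck has far fewer
# nodes than the input, which is what gives the AE its compression ability.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 32))  # 784 -> 32: compression
decoder = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 784))  # 32 -> 784: reconstruction

autoencoder = nn.Sequential(encoder, decoder)
optimizer = torch.optim.Adam(autoencoder.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

data = torch.randn(256, 784)                  # stand-in for flattened face images
for _ in range(100):
    reconstruction = autoencoder(data)
    loss = loss_fn(reconstruction, data)      # learn to reproduce the input from the tiny code
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Variants such as the VAE or AAE keep this encoder/bottleneck/decoder structure
# but change how the bottleneck code is learned and constrained.
```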

Where does the solution lie: GAN or AE?


In the fight against deepfakes, we need to go to the source and decipher the ANN technologies that make synthetic media development possible. Even though most of the published research and articles direct our attention to GANs, perhaps we should shift that focus.


Zemane developed the one-of-its-kind Deepware Scanner, an online deepfake detector, taking a giant leap toward a solution against this major cyber-threat. As a Turkey-based international cyber-security company, we have obtained research results that run counter to the common assumption: Zemane's R&D studies show that the GAN is not the most widely used technology for creating deepfake videos. The autoencoder (AE) is.

In the race to develop a cyber-security technology that safeguards against the deepfake threat, the primary clues lie in the structural features of the AI application used to create the synthetic media. On the one hand, we move forward on the basis of the scientific results of our own R&D, rather than relying on what has been said and written, such as the claim that GANs are the most used ANNs. On the other hand, we are preparing our fight against malicious cyber-attackers by taking into account every new possibility offered by innovative technologies, so that we can stay one step ahead of them.