TRUSTAI addresses the role of digital change and its significance for us humans and society in a dialogical interplay between the visitors and a machine.
Today, procedural media in particular – media whose functionality is based on computational processes – determine how the individual perceives society and how society perceives the individual. In TRUSTAI, the machine demonstrates to visitors how it can influence their lives with the data it collects.
On a table is a glass cube with a holographic representation of a human face. An AI enters into communication with the user, although its exact capabilities remain unclear. In the course of the dialogue, the machine hijacks the visitor's face and assumes their identity. In TRUSTAI, the human face is the central motif and, along with the voice, the only channel of communication. We become involved in a conversation, we open up, we reveal things. These are seemingly banal things, but a learning AI can combine them into surprisingly accurate conclusions about our personality. How complex is the machine with which we communicate? What processes are really running in the background, and to what extent does it have access to our data?
Today, media content is not only filtered by algorithms but also produced by them. Media users usually »retouch« the image that society perceives of them. However, truths about others can also be produced outside of their control: so-called deepfakes, in which faces in videos are convincingly faked, are perceived by us as authentic images of reality.
The fourth estate, public power, has today migrated from the press and broadcasting into the hands of large corporations and technically powerful states, which use it to pursue their own agendas. These corporations collect data, make judgments with the help of algorithms, and use them to create their own truths. In the end, the individual can no longer distinguish between media truth and factual truth.
In TRUSTAI, the visitor experiences a feeling of insecurity that is increasingly spreading through society as a result of the digital transformation. To what extent the machine can be trusted remains unclear. But according to Niklas Luhmann's systems theory, trust in self-regulatory processes is a central pillar of our society. With the loss of system trust, society also falls apart.
The installation was prompted by scientific results in a field of research known as face reenactment, in which a person's facial movements are transferred to another person's video in real time. The results shown in 2016 in the scientific publication Face2Face: Real-time Face Capture and Reenactment of RGB Videos seem frighteningly real. A little later, an anonymous developer published artificial intelligence-based software that can be used to exchange the faces of people in videos, not in real time, but freely available to everyone. These so-called DeepFake videos can be produced with little technical background and quickly reached such good quality that the question arose as to the consequences for the credibility of moving image material (Obama imitation insults US President Trump and Hello, Adele – is it really you?). Meanwhile, this technology is even making its way into mainstream media with virtual newscasters. What Photoshop heralded for still images, namely justified suspicion about the credibility of any image, is taken to a new level by the AI-driven production and manipulation of still and moving images: photorealistic images of people can be generated without technical knowledge. On the website thispersondoesnotexist.com, one can view faces generated using StyleGAN2, a freely available AI developed by NVIDIA. The principle can also be applied to any other type of object. But not only a person's face can be faked: in 2019, fraudsters used AI to mimic the voice of a CEO and obtained €220,000.
The research field of artificial intelligence has made enormous progress in recent years, particularly with »deep learning« based on artificial neural networks. Deep learning is one of many machine learning methods; in it, many parallel computational rules are connected in series. If one visualizes the dependence of the computational rules on each other, a network with different levels of computational units results. Each individual computational unit realizes a simple computational rule with several inputs and a single output. Although the perceptron (the historical prototype of these computational units), developed in 1958, had a neuron as its model, a neural network (so called only for this historical reason) is in fact far from a simulation of the brain and in no way emulates one.
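The idea of simple units connected in parallel and in series can be sketched in a few lines. The following is only an illustration with hand-picked example weights (the function names `unit`, `layer` and `network` are hypothetical), not the software used in the installation:

```python
# Illustrative sketch only: a single computational unit and a small layered
# network built from such units. All weights are made-up example values.

def unit(inputs, weights, bias):
    """One computational unit: a weighted sum of several inputs, one output."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 if total > 0 else 0.0  # step activation, as in the classic perceptron


def layer(inputs, weight_rows, biases):
    """Several units evaluated in parallel on the same inputs."""
    return [unit(inputs, w, b) for w, b in zip(weight_rows, biases)]


def network(inputs):
    """Two levels connected in series; with these weights they compute XOR."""
    hidden = layer(inputs, [[1.0, 1.0], [1.0, 1.0]], [-0.5, -1.5])  # OR unit, AND unit
    return unit(hidden, [1.0, -1.0], -0.5)  # fires for "OR but not AND"


if __name__ == "__main__":
    for a in (0.0, 1.0):
        for b in (0.0, 1.0):
            print(a, b, network([a, b]))
```

In a trained network the weights are not chosen by hand but adjusted automatically from example data; the structure of the computation stays the same.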
The technical success of neural networks is based on the fact that the large number of values stored in the misleadingly named neurons conceals a very sophisticated stochastic model of the properties of the data they were trained with. If a network is trained with images of human faces whose age is known, it is able to make a good estimate of a person's age. If a network is trained with images of faces labeled only as »male« or »female«, it is likewise only able to distinguish between the categories male and female. The network does not know intermediate tones. Thus, the network also encodes our biases, which we pass on by selecting such categories. If we use such a prejudiced network as a compass for our decisions, we cement existing social behavior. Unbalanced or unrepresentative datasets can also introduce biases into a network's calculations and lead to socially undesirable effects, such as when an AI makes recommendations for incarcerating people and disadvantages people of color. Researchers are already calling for a ban on emotion recognition based on facial images, which is widely used in business, because the widely used database is simply too clearly riddled with human errors of judgment.
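Why a network trained on two categories cannot express anything in between can be illustrated with the final step of a typical classifier. This is a minimal sketch under the assumption that the network ends in a softmax over the training labels; the names `LABELS` and `classify` and the input scores are hypothetical:

```python
import math

# The only categories present in the (hypothetical) training data.
LABELS = ("male", "female")


def softmax(scores):
    """Turn raw scores into probabilities that always sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]  # subtract peak for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores):
    """Distribute all probability over the fixed label set, whatever the input."""
    return dict(zip(LABELS, softmax(scores)))


if __name__ == "__main__":
    print(classify([2.0, 0.0]))  # a confident answer
    print(classify([0.0, 0.0]))  # maximal uncertainty, still only two labels
```

Even when the model is maximally uncertain, the whole probability is split between the two predefined labels; a category that was never part of the training data cannot appear in the output.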
»Data is the oil of the 21st century« is a popular phrase. If you want to train a powerful AI, you need a lot of good data. That's why the big Internet giants, but also many countries, are interested in collecting data about us in ever new ways. Possible applications often only open up much later, with the further development of technology and the merging of different data sources. Even the communication channels we use as a matter of course, such as email, chat or video conferencing, carry unexpected information about us. From images, supposedly innocuous attributes such as hair and skin color, beard, wearing glasses or wearing a mask can be determined by machine. Emotions can be assessed via the image, but also via the voice. Thus, data can be collected about us even if we are not aware of it. Who would have thought that artificial intelligence can be used to assess our political attitude and sexual orientation, or to derive a psychological personality profile from our eye movements, making our buying behavior or our influence on political decisions more calculable? Personalized prices for the products offered to us included? No need for body trackers: all it takes is a still video image of us to determine our pulse. We are waiting for the video conferencing tool that tells us whether our counterpart is speaking the truth. Artificial intelligence can already automatically detect if you are typing an email on the keyboard while feigning interest in the other person. Our health will become transparent if our image is sufficient to predict our predispositions to hereditary diseases or our life expectancy. The cell phone, the spy in our pocket, not only allows a movement profile of us to be created from the data tracked by Internet corporations. It continuously records data that can be assigned to us. In the process, we can be identified by our face wherever we are in the vicinity of cameras.
And if our face is turned away from the camera, artificial intelligence can also recognize people by their gait.
It is not even necessary to invoke the social scoring system introduced in China, in which a points system rewards desirable social behavior and sanctions undesirable behavior, as a deterrent vision. If I no longer start a course of study because I am certified to be at high risk of dropping out, or I vote for a party about which I am predominantly informed based on my profile, algorithms are improperly influencing my life. If an AI is better at detecting breast cancer than a doctor, or better at choosing a mate, all accessible for free, just for a small data fee, how long will we hold back before we voluntarily give up our sovereignty?
Text: Bernd Lintermann, 2021
AI generated faces:
- GANs (Generative Adversarial Networks) are a special AI method for generating various types of images, some of which we can no longer recognize as computer-generated. These can be photorealistic images of people, animals, landscapes or objects, but GANs can also imitate drawings or paintings, or change the mood of a landscape or the season. GANs were introduced in 2014 by Ian Goodfellow and colleagues and are based on the idea that two neural networks compete with each other, improving each other in the process. One network (the generator) produces an image of the desired type. These fakes are mixed with real images of the desired kind, and a second network (the discriminator) tries to pick out the fakes. In the learning process, the two networks improve each other's generation and recognition quality until the generating network is able to produce high-quality forgeries. On the website thispersondoesnotexist.com, you can see portraits generated in this way, but also pictures of cats, horses or three-dimensional chemical molecule structures.
- Implementation based on an AI project to generate photorealistic portrait images.
- Neue KI generiert fotorealistische Menschen und Katzen, derstandard.de, 02.02.2020
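The competition between generator and discriminator is commonly written as a two-player minimax game. The following is the standard formulation from the 2014 GAN paper, given here as background; the symbols are not taken from the installation's own software. $G$ is the generator, $D$ the discriminator, $x$ a real image, $z$ random noise fed to the generator:

```latex
\min_{G} \max_{D} \; V(D, G)
  = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)} \bigl[ \log D(x) \bigr]
  + \mathbb{E}_{z \sim p_{z}(z)} \bigl[ \log \bigl( 1 - D(G(z)) \bigr) \bigr]
```

The discriminator tries to maximize $V$ by assigning high probability to real images and low probability to fakes, while the generator tries to minimize it by producing images that the discriminator accepts as real.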
Image-based emotion recognition:
- »Emotionen« sind schwer definierbar, sueddeutsche.de, 27.03.2020
- Expertenstreit über Emotionserkennung durch KI, heise.de, 25.02.2020
Pulse detection with remote photoplethysmography (rPPG):
- Implementation based on a project for pulse detection in video streams of faces. The person's face is detected in the image stream of the cell phone camera and internally unwrapped into a normalized two-dimensional (isometric) layout, so that the person's forehead always appears at the same size and position in the resulting image. The average color of this image region is computed, and recurring patterns in the green component of that average are analyzed over time with signal processing techniques such as Fourier analysis to determine the pulse. No data is sent or stored during this process.
- Webcam fühlt den Puls des Nutzers, spiegel.de, 06.06.2013
- Remote heart rate measurement using low-cost RGB face video: A technical literature review, researchgate.net, 07.2018
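The signal processing step described above can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes the per-frame average green value of the forehead region is already available, uses a plain discrete Fourier transform, and searches only a plausible heart rate band (the function names and the 0.7 to 4 Hz limits are assumptions). The input here is a synthetic signal rather than camera data.

```python
import cmath
import math


def estimate_pulse_bpm(green_means, fps, low_hz=0.7, high_hz=4.0):
    """Estimate the pulse from per-frame mean green values of a forehead region."""
    n = len(green_means)
    mean = sum(green_means) / n
    centered = [g - mean for g in green_means]  # remove the constant brightness
    best_freq, best_power = 0.0, -1.0
    for k in range(1, n // 2):
        freq = k * fps / n  # frequency of DFT bin k in Hz
        if not (low_hz <= freq <= high_hz):
            continue  # outside the assumed heart rate band
        coeff = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                    for t, v in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_freq, best_power = freq, power
    return best_freq * 60.0  # beats per minute


def synthetic_signal(bpm, fps, seconds):
    """Made-up test data: a faint periodic variation on a constant green level."""
    freq = bpm / 60.0
    return [120.0 + 0.5 * math.sin(2 * math.pi * freq * t / fps)
            for t in range(int(fps * seconds))]


if __name__ == "__main__":
    signal = synthetic_signal(bpm=72, fps=30, seconds=10)
    print(round(estimate_pulse_bpm(signal, fps=30)))
```

Real recordings are far noisier than this synthetic signal, so practical systems add steps such as detrending, band-pass filtering and motion compensation before looking for the dominant frequency.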
Recognize facial attributes such as hair color, beard, glasses:
- Implementation based on an AI project for attribute recognition of faces in images. The image stream from the camera of the cell phone is analyzed locally on the PC for attributes in faces. No data is sent or stored in the process.
- Recognition of faces with masks was being researched even before the COVID-19 pandemic, accompanied by the creation of large datasets of faces with medical masks. One example is the MAFA dataset, published in 2017. From August 2020, datasets with community masks and AI-based software to detect them were also released.
- Implementation based on an AI project for mask recognition in images. The image stream from the cell phone's camera is scanned locally on the PC for faces with masks. No data is sent or stored in the process.
- Implementation based on an AI project for age recognition of faces in images. The image stream from the cell phone's camera is scanned locally on the PC for faces and their age is determined. No data is sent or stored in the process.
- Implementation based on an AI project for gender recognition in images. The image stream from the cell phone camera is analyzed locally on the PC for gender in faces. The AI is trained for binary gender recognition. The percentages displayed for gender represent the probability with which the AI performs the binary classification and have nothing to do with the real or perceived gender. No data is sent or stored.
- Custom implementation based on 3D face data captured by the smartphone. Using a 3D sensor, the cell phone provides a geometric description of the face, from which the geometry of the neutral facial expression can be inferred using knowledge about how the currently recognized facial expression distorts the face. Faces are recognized as the same if the neutral face geometries computed in this way do not differ significantly from each other. The installation keeps the face geometries of the last five visitors in main memory and neither sends nor permanently stores any geometry.
- Implementation based on the Apple speech recognition API. The audio stream from the cell phone's microphone is continuously analyzed locally on the phone for spoken language. No data is sent or stored in the process.
- Implementation based on the Apple image analysis API. The image stream from the cell phone's camera is analyzed locally on the phone for faces. No data is sent or stored in the process.
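The re-identification of recent visitors described in the list above (comparing neutral face geometries and keeping only the last five in memory) can be sketched as follows. The class name, the distance measure (root mean square distance between corresponding 3D points) and the threshold value are assumptions made for illustration; the source only states that geometries must "not differ significantly".

```python
from collections import deque
import math


class RecentVisitors:
    """Keeps a few recent neutral face geometries in memory only."""

    def __init__(self, capacity=5, threshold=0.1):
        self._faces = deque(maxlen=capacity)  # oldest entry is dropped automatically
        self._threshold = threshold  # assumed tolerance, in the units of the geometry

    @staticmethod
    def _distance(a, b):
        # Root mean square distance between corresponding 3D points.
        return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in zip(a, b)) / len(a))

    def seen_before(self, neutral_geometry):
        """Return True if the geometry matches a stored one; otherwise remember it."""
        known = any(self._distance(neutral_geometry, face) < self._threshold
                    for face in self._faces)
        if not known:
            self._faces.append(neutral_geometry)
        return known


if __name__ == "__main__":
    visitors = RecentVisitors()
    face = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    print(visitors.seen_before(face))  # first encounter
    print(visitors.seen_before(face))  # same geometry again
```

Because the geometries live only in a bounded in-memory queue, nothing is written to disk and a visitor is forgotten as soon as five newer visitors have been seen.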
The attempt to determine the biological sex of a person from an image of the face with the help of neural networks goes back to the 1990s. The norm established at that time of dividing gender into male or female in a binary fashion is reflected in the development of gender recognition technology to this day. While Google responded in February 2020 by announcing it would remove gender labeling from its datasets, the technique is widely available and widely used. It thus reinforces the classic gender norm. AI often has a conservative effect, consolidating social structures.
Even though AI tends to be seen as male in a representative survey of the German population, voice assistants such as Siri and Alexa have a female attribution. In science fiction, AIs are also more often portrayed as female than male. In an article, the British journalist Laurie Penny attributes this to our historically male-dominated society, in which men (who develop the technology) see women in a servant role.
Luhmann, Niklas: Vertrauen. Ein Mechanismus der Reduktion sozialer Komplexität. 2. erw. Aufl. Stuttgart 1973.
Thies, Justus; Zollhöfer, Michael; Stamminger, Marc; Theobalt, Christian; Nießner, Matthias: Face2Face: Real-time Face Capture and Reenactment of RGB Videos. In: Proc. Computer Vision and Pattern Recognition (CVPR), IEEE, June 2016.
Synthesizing High-Resolution Images with StyleGAN2.
Golomb, B. A.; Lawrence, D. T.; Sejnowski, T. J.: SEXNET: A neural network identifies sex from human faces. In: Proceedings of NIPS, 1990, pp. 572–579.
Concept: Bernd Lintermann, Florian Hertweck
Project management, content and software: Bernd Lintermann
Dramaturgy and direction: Florian Hertweck
Actress: Annemarie Brüntjen
Design: Matthias Gommel
AI generated faces: Daniel Heiss
Production: ZKM | Hertz-Labor
Production support: ZKM | Videostudio, Xenia Leidig, Moritz Büchner, Jan Gerigk, Thomas Schwab
English voice: Manon Kahle
English dubbing director: Jeff Burrell
English language version in co-production with EPFL Pavillons for the exhibition »Deep Fakes: Art and Its Double«, 2021