Interactivity and authenticity in AI-augmented videos
Author(s)
Sankaranarayanan, Aruna
Thesis PDF (25.76Mb)
Advisor
Lippman, Andrew
Abstract
The arrival of AI-augmented devices, illusions, and algorithms, together with a new wave of users engaging with these digital artefacts, creates the possibility of new and delightful digital experiences. Can AI-augmented digital media allow consumers to engage actively with the content they consume? What would interfaces look like that enable viewers not just to select the content they watch but to actively alter it? Could we build dials and buttons into a new AI-augmented video that viewers can manipulate to invoke a range of alterations to the media they see, now under their control? How would such alterations colour the underlying information in the videos? In such a world, would the viewer be able to tell the synthesized apart from the real? This thesis begins with a series of explorations that apply existing computer vision algorithms to modify the content of a video. I create artistic renditions of news and music videos using neural style transfer algorithms, show how segmentation models can change the way the underlying information is coloured by transplanting backgrounds and foregrounds between videos, utilize latent expression transformations to modify different intrinsic qualities of a person in an image, and create delightful virtual communication experiences by making certain objects disappear in a video. Inspired by the dramatic differences in visual affect between newscasters from 1969 and 2017, I then design and implement a new computer vision algorithm that allows viewers to modify the facial affect of a person in the video they are watching using a generative adversarial network. Recognizing the dilemmas associated with commercializing such augmentations for mass consumption, I explore how individuals discern neural-network-driven manipulations, or deepfakes.
To do this, I create a new deepfakes dataset of Presidents Donald Trump and Joseph Biden and test how individuals employ visual, auditory, and textual reasoning to differentiate between real and synthesized media objects. I also report findings that Biden voters shift towards motivated-reasoning-based discernment when the political content of a media object is visible in the audio or text modality.
Date issued
2021-09
Department
Program in Media Arts and Sciences (Massachusetts Institute of Technology)
Publisher
Massachusetts Institute of Technology