A technological monument, a celebratory milestone in human history, or the beginning of the downfall of modern-day democracies? In early February, OpenAI announced Sora, an innovative tool that turns textual prompts into videos that are at once impressive and shockingly realistic. This text-to-video model marks a significant advance in the integration of deep learning, natural language processing, and computer vision. The first of its kind, it transcends the limits set by previous text-to-video technologies, overcoming constraints on the kinds of visual data it can interpret, on video length, and on resolution. The tool is yet another leap in the realm of A.I., which, experts have warned, is accelerating faster than we can imagine. It has sparked both lively conversation about the positive uses of text-to-video prompts in graphic design and great concern over its political and humanitarian implications. While its release date has not yet been announced, its consequences are already in effect: it has the potential to upturn not only democratic values, but also national security, child protective rights, and the fine arts industry.
In September of 2022, Jason Allen entered the Colorado State Fair’s fine arts competition with an A.I.-generated piece and won first place. The artwork, titled Théâtre D’opéra Spatial, amazed the judges, and its artificial origin went undetected. The later discovery that the piece had been prompted through an A.I. generator called Midjourney left many artists outraged, sparking debates over the true meaning of art. Critics argue that Allen’s piece was outright plagiarism, having been constructed from millions of integrated art pieces; that actions like his would lead to the eventual demise of the fine arts industry; and that they were an embarrassment to artistic values. Allen, on the other hand, argued that A.I. is merely a tool and “without the person behind it there is no creative force.” “I won,” he said, “[A.I. is] here now. Recognize it. Stop denying the reality. AI isn’t going away.”
The creation of Sora allows artists not only to create art from text, but also to turn text into seemingly real videos. This has serious implications for the field of fine arts. OpenAI is currently facing a wave of lawsuits from artists, authors, and The New York Times over its alleged use of copyrighted material as training data. What does this mean for filmmakers and animators? While a text-to-video tool is far from achieving the level of artistry of filmmakers such as Hayao Miyazaki, the everyday animator might still be at risk.
With the upcoming U.S. Presidential election, along with several other elections around the globe, a worrisome risk posed by Sora is the spread of misinformation. While Sora, still in its early stages, is far from perfect at turning text into video, deep fake expert Hany Farid argues that “there is no reason to believe that text-to-video will not continue to rapidly improve—moving us closer and closer to a time when it will be difficult to distinguish the fake from the real.” Technology capable of producing realistically undetectable deep fakes has been, and will likely continue to be, used to skew and misinform voters. In early January of this year, an automated call, or robocall, impersonating the voice of U.S. President Joe Biden urged Democrats not to vote in the upcoming federal election. It was later discovered that a telemarketing company based out of Texas was behind the call, and the company was investigated for illegal voter suppression. The concern, as Lawon argued, “is not the big, bad deep fake of somebody at the top of the ticket, where all kinds of national press is going to be out there to verify it. It’s about your local mayor’s race.”
Generative AI may amplify cybersecurity risks, making it easier and cheaper to flood democratic nations with fake content during election season. This hands global powers who oppose democratic values the tools to undermine Western democracies. Don Fallis, a philosophy professor at Northeastern University, argues in a recent paper that counterfeit news misleads people and erodes their trust in epistemically authoritative institutions, such as the government, leading to epistemic decline. Fair representation, reliability, and trust are just a few pillars of Western democracies; the deceptive power of deep fakes may disrupt all three.
AI’s rapidly progressing ability to create fictional images democratizes video production, making it easier and more efficient for individuals to generate video content. In the wrong hands, however, video deep fakes have led to the detrimental dehumanization and exploitation of children. In April of last year, a Quebec man was sentenced to more than three years in prison for creating synthetic videos depicting child pornography. Using advanced deep fake technologies, he superimposed the faces of some individuals onto the bodies of others. The increased accessibility brought by recent advancements places software capable of producing hyperrealistic videos in everyone’s hands, including those of child sex predators, who can use it to create CSAM images or videos based on real children. While a child may not be physically harmed by the making of fictional depictions of child pornography, it is still a form of child sexualization, and one that may only perpetuate the use of deep fakes to create such videos.
Deep fakes are not a double-edged sword: there is no silver lining beneficial enough to balance the corruption and deterioration of national security, child protection, and democratic values that they cause. Rather, many AI experts have cautioned that we should steer clear of further refining deep fake technology until stricter regulations are in place. Just a few weeks ago, AI expert Yoshua Bengio, alongside 900 others, signed an open letter calling for stricter regulations surrounding the use and distribution of deep fakes, which the statement defines as “non-consensual or misleading AI-generated voices, images, or videos.” The letter makes three core demands: first, to criminalize deep fake child pornography; second, to establish penalties for individuals spreading harmful deep fakes; and third, to require software developers to prevent their products from creating harmful deep fakes.
Currently, very few effective regulations constrain the imminent threat of deep fakes. In the realm of political ads, the First Amendment to the U.S. Constitution protects even those who wish to lie: it is not illegal for candidates to lie in their paid advertisements, which complicates the regulation of deep fakes. While several U.S. states have attempted to regulate their use, there remains no federal legislation protecting against the threats deep fakes pose.
Yet the European Union’s AI Act, finalized in September of last year, offers some hope. The Act would enforce the world’s first comprehensive law governing the development and use of AI technology, though it is not expected to come into full force until the end of 2025 or 2026. The regulation aims to ensure that fundamental rights, democracy, the rule of law, and environmental sustainability are protected from high-risk AI.
Technology on its own is a neutral tool, assuming only the identity that its user assigns. The future of deep fakes stands at a crossroads, defined by how experts and authorities exercise their power to protect the lives of individuals of all ages, around the world. As further measures are taken to regulate the use of deep fakes and generative AI, anchoring new policies in human rights is crucial. As the human race moves forward, we must not forget to reflect on the past and remind ourselves, as Raquel Vasquez notes, that “progress is not how far you go, but how many people you bring along.” Technological advancements that disadvantage, marginalize, and put certain demographics at risk are not progress at all. As humans, we have a duty to ensure that technological exploitation is not hidden behind its glamour, in order to secure an equitable, sustainable, and brighter future.
Edited by Madeline Chisholm
Megan Tan is in her third year at McGill University, currently pursuing a BA&Sc in Cognitive Science with a minor in Philosophy. As a Staff Writer at Catalyst Publications, Megan aims to bridge her background in Behavioural Science with International Development as her writing is mainly focused on the Health and Technological dimensions of global political issues. Having grown up in Singapore, Qatar, and Canada, Megan strives to use her diverse upbringing to offer a multifaceted lens through which she examines the interplay of technology, health, and cognitive science.