
Human Subjects Protection in the Era of Deepfakes

Aimee Nishimura
Thursday, November 2, 2023, 10:17 AM
The unique risks posed by deepfakes require special consideration for the Defense Department’s use of the technology.
"Facial Recognition 1." (EFF Photos, https://tinyurl.com/2w279xtt; CC BY 2.0 DEED, https://creativecommons.org/licenses/by/2.0/)


For years, the U.S. government has funded research to detect and interdict deepfakes to combat foreign disinformation campaigns. Deepfakes are convincing digital forgeries—manipulated or generated images, video, and audio—created by artificial intelligence (AI). Last year, the U.S. military took a nearly unprecedented step by declaring its interest in deepfake technology specifically for offensive purposes. U.S. Special Operations Command (USSOCOM)—which is made up of the most exceptional units from the Army, Marine Corps, Navy, and Air Force—updated a procurement document to explicitly solicit technologies that “[p]rovide a next generation of ‘deep fake’ or other similar technology to generate messages and influence operations via non-traditional channels in relevant peer/near peer environments.” The government is asking for AI technology that trains on millions of data points—images of people’s faces and recordings of their voices—to produce fake images, audio, and video that mimic a real person or generate a realistic fake one. Beyond defensive uses, the government wants to produce fake content to influence overseas populations to serve U.S. interests. Fake information has long been used to support and undermine national security efforts. The development of highly sophisticated generative AI, however, is a game changer.

Deepfakes can be used for coordinated disruption and information warfare: to impair military and intelligence operations, manipulate elections, erode trust in public institutions, and threaten the economy, as well as for blackmail and reputational damage. For example, a deepfake could be used to generate war propaganda to sway public support in a given country. Further, deepfakes create a new type of harm dubbed the “liar’s dividend,” in which bad actors avoid accountability for their real words and actions by claiming to be victims of deepfakes. These harms are already manifesting in threats to democratic institutions and legitimacy. So it is no surprise that the Defense Department has prioritized the procurement of this technology, both to defend against it and to use it to conduct campaigns against adversaries.

There are reasons why the Defense Department’s exploration of these technologies may be applauded, especially because adversaries have already deployed them. Recently, various deepfakes have circulated that depict Ukrainian President Volodymyr Zelenskyy ordering his troops to surrender and Russian President Vladimir Putin declaring martial law. There was also a deepfaked trophy hunting photo of a U.S. diplomat with Pakistan’s national animal, as well as, for the first time, deepfaked news broadcasters in China, among other examples. A digital forensics professor at U.C. Berkeley has already documented several deepfake audio clips, images, and videos related to the upcoming 2024 presidential election, with more expected over the next year.

To develop offensive and defensive applications of this technology, the Defense Department relies on cooperation with private industry and academia. However, cooperation in the development of a potentially potent tool for disinformation comes at a time when public trust in the government is low. Public trust determines who will and won’t contract with the government, and ultimately what programs do and don’t get funded.

With respect to trust, deepfake technology, unlike other AI, is uniquely concerning from a privacy and ethical standpoint due to the volume and type of biometric data needed. It takes hundreds to thousands of images of a person to generate a believable deepfake. Biometric data is arguably the most personally identifiable data an individual possesses—people can replace a credit card or change a social media handle in ways they cannot replace or change their faces or voices. Beyond the privacy considerations of stockpiling the public’s most immutable data points, there are also ethical concerns about using an individual’s image or voice for military or political purposes without their consent.

Solving this quandary may not be the onerous task it might seem at first. While much of AI regulation is lagging behind the technology, there are existing laws on human subjects research (HSR) that should be applied to deepfake technology. Invoking these institutions and principles for Defense Department use of deepfake technology is especially crucial when public trust is near historic lows. Adherence to universal principles of human subjects protections in the technology the Pentagon develops and procures will help restore some of that lost trust while advancing its interest in protecting the security of the nation. In practice, this means empowering extant human subjects protections oversight within organizations that conduct this research, such as the Defense Department.

Deepfakes and the Policy That Informs Defense Department Use

Real, identifiable human subjects are at the core of deepfake technology. Early use of “deepfake” as a term is attributed to a 2017 Reddit user, “deepfakes,” who shared synthetic pornographic videos of celebrities online. In fact, the most common deepfakes in circulation are nonconsensual pornographic images and videos of women. A 2019 deepfake detection study by Deeptrace Labs found that 96 percent of identified deepfake videos were nonconsensual porn of mostly celebrity women. Pornographic or not, however, deepfakes are distortions of real people and are often identifiable as the original human subjects.

Most of the deepfake-related data sets available to the public are compiled from scraped images and videos of varying quality that are uploaded by users to platforms such as YouTube. High-quality, proprietary data sets are created with images and videos of actors—often paid—who provide their image and voice to be recorded directly by researchers. Hundreds, and sometimes thousands, of facial images of a specific individual are required to generate and detect deepfakes, with greater quantity and variety producing better output quality. Both of these collection methods raise important questions about affirmative, informed consent, and whether subjects are informed of the intended use of their data and the risks associated with storing and sharing such data. While scraped data raises ongoing civil liberties and intellectual property issues worthy of their own safeguards, it is meaningfully different from the high-quality biometric data necessary to test and train deepfake detection and generation models.
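To make the consent question concrete, the sketch below shows one hypothetical way a data collector could record, alongside each face image or voice clip, what a subject actually agreed to. The structure, field names (ConsentRecord, PermittedUse, and so on), and use categories are illustrative assumptions, not an existing standard or any vendor's actual practice.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum, auto


class PermittedUse(Enum):
    """Illustrative categories a subject might (or might not) consent to."""
    DETECTION_MODEL_TRAINING = auto()   # training deepfake *detection* models
    GENERATION_MODEL_TRAINING = auto()  # training deepfake *generation* models
    THIRD_PARTY_TRANSFER = auto()       # resale or transfer to another organization


@dataclass
class ConsentRecord:
    """Hypothetical documentation of a subject's affirmative, informed consent."""
    subject_id: str                     # pseudonymous identifier, not the subject's name
    consent_date: date
    disclosed_risks: list[str]          # risks explained to the subject in plain language
    permitted_uses: set[PermittedUse]   # uses the subject affirmatively agreed to
    revocable: bool = True              # whether the subject may later withdraw consent


@dataclass
class BiometricSample:
    """A single face image or voice clip tied back to its consent documentation."""
    sample_path: str
    modality: str                       # e.g., "face_image" or "voice_audio"
    consent: ConsentRecord


def is_use_permitted(sample: BiometricSample, proposed_use: PermittedUse) -> bool:
    """Check a proposed use against the consent the subject actually gave."""
    return proposed_use in sample.consent.permitted_uses


if __name__ == "__main__":
    consent = ConsentRecord(
        subject_id="actor-0042",
        consent_date=date(2023, 6, 1),
        disclosed_risks=["data may be stored indefinitely", "likeness may be synthesized"],
        permitted_uses={PermittedUse.DETECTION_MODEL_TRAINING},
    )
    sample = BiometricSample("recordings/actor-0042/clip_001.wav", "voice_audio", consent)

    # Training a detection model was consented to; reuse for generation work was not,
    # so the second check failing would signal the need for a re-consent process.
    print(is_use_permitted(sample, PermittedUse.DETECTION_MODEL_TRAINING))   # True
    print(is_use_permitted(sample, PermittedUse.GENERATION_MODEL_TRAINING))  # False
```

The last two checks illustrate the scenario discussed below: consent given for detection-model training does not automatically answer whether the same recordings may be resold or repurposed for generating deepfakes.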

Many technologies raise similar concerns and considerations when it comes to data subject consent and risks. However, the Defense Department has openly proclaimed its intention to use digital deception and disinformation campaigns at the tactical edge, and few other technologies create the same dignity harms through their intended use: generating and deploying deepfakes.

When the Defense Department wants to acquire data and technology, it must adhere to multiple authorities. Title 10 of the U.S. Code outlines the role of the armed forces, and § 980 expressly forbids Defense Department funds from being used for human subjects research absent informed consent. The National Defense Science & Technology Strategy communicates department priorities and goals with respect to defense research and investment. The Federal Acquisition Regulation and Defense Federal Acquisition Regulation Supplement (DFARS) provide precepts for government procurement of supplies and services, including standard solicitation provisions. Additionally, Department of Defense Issuances—which include formal directives, instructions, memos, and manuals—are policies and procedures that regulate defense activities. With respect to human subjects research oversight, the Component Office of Human Research Protections (DOHRP) ensures every institution within the department complies with federal laws that focus on participants’ rights and welfare.

At the federal level, some risks of AI deployment are acknowledged, and in the absence of regulations, both the Defense Department and the Office of the Director of National Intelligence (ODNI) have released frameworks that address the risks AI technology poses to the public’s constitutional rights and civil liberties. Both ODNI’s Artificial Intelligence Ethics Framework for the Intelligence Community and the Defense Department’s Responsible Artificial Intelligence Strategy and Implementation Pathway (RAI S&I) recommend standards for accountability and risk management to deploy reliable and safe AI systems. With respect to civil liberties risks, both frameworks discuss issues such as undesirable bias, data privacy, and risks created by third-party vendors. However, both frameworks fall short when it comes to research protections for human subjects: neither includes guidance for the handling, use, storage, or security of biometric data, and neither addresses requirements for informed consent, a cornerstone of human subjects protections. The National Institute of Standards and Technology (NIST) developed the AI Risk Management Framework, which briefly addresses privacy as a norm for safeguarding autonomy and dignity, including an “individual’s agency to consent to disclosure or control facets of their identities (e.g., body, data, reputation).”

To support human subjects protections in AI deployment, deepfake technology can and should be properly categorized as HSR, and appropriate human subjects review procedures should be applied to it. This requirement is aligned with the ODNI’s and Defense Department’s stated interests in responsible AI deployment, and with NIST’s privacy values, and is necessary for protecting civil liberties. In practice, I recommend adhering to the DFARS procedures for the protection of human subjects. These procedures closely mirror the Federal Policy for the Protection of Human Subjects, also known as the Common Rule.

Reframing Deepfake Technology as HSR

Since current AI frameworks don’t properly address the human subjects included in AI/machine learning data sets, scholars and policymakers should look to human subjects research laws and their accompanying guidance. When research involves human subjects, researchers are especially concerned with the protection of basic rights and dignity, as well as power imbalances that might influence civic participation, fairness, and bias. These considerations have led the U.S. government to prohibit some research practices outright, to impose certain procedures and limitations on others, and to prescribe the terms of informed consent. Oversight systems such as institutional review boards (IRBs) and the federal code already exist, so invoking them does not require completely new guidance.

Research ethics for human subjects is a relatively recent phenomenon. The now-infamous report of the 1975 Church Committee, which investigated federal intelligence operations, revealed illegal and unethical clandestine human experiments conducted by the U.S. government. The committee uncovered projects such as MK-ULTRA, in which the CIA dosed unwitting subjects with LSD, leading one to leap to his death while drugged. Other experiments included attempts to develop “hypnotically induced anxieties” and the Canadian Experiments—in which human subjects were placed into drug-induced comas for weeks at a time while subjected to constant audio loops of noise or repetitive messages. The Church Committee ultimately found serious wrongdoing on the part of the U.S. government that violated the “fundamental principles of our constitutional system of government.”

Additionally, the revelation of research such as the Tuskegee Study—in which Black men with syphilis were deceived into believing they were receiving treatment for their condition when they were actually receiving no treatment—led to the adoption of the foundational research ethics principles of respect for persons, beneficence, and justice. Respect for persons, in particular, has two moral requirements: to acknowledge autonomy—an individual’s ability to deliberate and act in their own interests—and to protect people with diminished autonomy. The principles, outlined in the Belmont Report (a government report on ethical principles and guidelines for the protection of human subjects of research), are applied through informed consent, assessment of risks and benefits, and selection of subjects. Just over a decade later, the U.S. formally adopted the Federal Policy for the Protection of Human Subjects, also known as the Common Rule, the standard of ethics for government-funded research involving human participants. The Common Rule lays out regulations for federally supported human subjects research, including IRB standards and procedures and requirements for informed consent. The Defense Department and 19 other agencies are signatories and must adhere to the Common Rule, which defines a human subject as any living individual about whom researchers obtain, use, analyze, or generate personally identifiable information.

The inclusion of deepfakes as HSR under the Common Rule is complicated by the fact that these regulations were not written with this application of technology in mind. Under the Common Rule, data sets that include personally identifiable information about humans are indisputably HSR. Thus, deepfakes may properly be categorized as HSR. There is an additional complexity concerning how consent should be secured. Broad consent can be obtained in lieu of informed consent for the storage, maintenance, and secondary research use of personally identifiable information. Initial informed consent is still required; however, secondary use does not require new consent—the logic here is that the intervention is in the past, so researchers can reanalyze the data however they wish. As such, a company that compiles a deepfake data set with the consent of paid actors, for example, would not need to reacquire consent from the data subjects if it collects their data for one purpose (training its own detection models) and sells that data for another purpose (government generation of deepfakes). However, this requires an initial commitment to informed consent that provides sufficient information about the research being conducted to help participants understand why one might or might not want to participate. To use this research, the Defense Department must ensure that informed consent is collected and that appropriate data management steps are taken in order to protect the rights of human subjects.

The problem isn’t that human subjects, consent, and ethical use have not been considered. They have. And from a normative perspective, these human protection codes do a good job filling the gaps that AI frameworks have yet to address. The problem is that data collection for deepfake technology constitutes HSR under the U.S. Code but simply has not been treated as such.

Robust protections for human subjects are vital in deepfake technology and research because a person’s image, face, and voice are the physical representation of their identity. Image search engines such as Google Images illustrate how easily an image or a video can be used to connect an individual to other identifying information. The ability to manipulate an individual’s likeness and voice to say and do things they have not can evoke embarrassment, irritation, humiliation, or a sense of being undermined. These are dignity harms. When a person is no longer the sole determiner of their actions—because another person can control how their movements and words are exhibited online—they lose at least some of their autonomy.

In a 2019 interview with HuffPost, one victim of nonconsensual deepfake pornography described seeing a deepfake pornographic video of herself: “When it’s Photoshop, it’s a static picture and can be very obvious that it’s not real. But when it’s your own face reacting and moving, there’s this panic that you have no control over how people use your image.” Recently, in response to the concerns of Hollywood studios using deepfake technology to replace paid actors, actor Sean Penn did not mince his words. “So you want my scans and voice data and all that. OK, here’s what I think is fair: I want your daughter’s, because I want to create a virtual replica of her and invite my friends over to do whatever we want in a virtual party right now. Would you please look at the camera and tell me you think that’s cool?” he said.

How This Might Work in Practice

To illustrate the promise and risk of government use of deepfake technology, it is helpful to examine a specific contract grantee. Since 2022, the Air Force Research Laboratory (AFRL) has awarded four Small Business Innovation Research contracts to the company DeepMedia for deepfake detection and multilingual translation. DeepMedia is an AI communication company founded in 2017 whose two primary products are a universal translator, which CEO Rijul Gupta refers to as its “generation product,” and a deepfake/AI detector that has been trained on proprietary data sets. According to COO Emma Brown, the data sets contain over one million faces, images, and voices.

DeepMedia—which is also registered as Anonymous A.I., Inc.—owns multiple products, including DubSync.ai and PolyTalk Translator. (DubSync.ai and PolyTalk Translator are both translating and dubbing apps where users upload their own videos to translate them into dozens of different languages. PolyTalk Translator, for example, is marketed as a way to easily communicate with family overseas.) Neither DubSync.ai, nor PolyTalk Translator, nor DeepMedia has published privacy or data use policies. Their failure to publish these policies is especially concerning because they offer public-facing applications that encourage users to upload their own images, videos, and audio without expressly informing them how this data might be used.

If deepfakes are HSR, then the Defense Department is required to observe proper human subjects protections with respect to data collection. The government must hold itself and its contractors to the appropriate standards if it is ever to ethically use deepfake technology. In the case of DeepMedia, this would require collecting and maintaining informed consent documents that include the basic elements outlined in 32 C.F.R. § 219.116(b), the Defense Department’s codification of the Common Rule. The documentation should clearly show that DeepMedia provided participants with sufficient information about the research for them to understand why one might or might not want to participate, as required by the 2018 revised Common Rule. These revisions speak directly to the importance of a person’s ability to make decisions for themselves, and to the federal government’s inclination to protect individual autonomy and dignity. In this case, the AFRL would need to have these documents reviewed through its IRB, as it would be required to do if it were conducting the HSR directly.
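In practice, a contracting office could screen submitted consent forms against a checklist of required elements before sending them to IRB review. The sketch below is one hypothetical way to do that; the element list is a simplified paraphrase of commonly cited basic elements of informed consent rather than the regulatory text, and the document format (a dictionary of section headings) is an assumption for illustration.

```python
# Minimal sketch: screen a submitted consent form for basic informed-consent
# elements. The element descriptions are paraphrases, not regulatory language.

REQUIRED_ELEMENTS = {
    "statement_of_research": "Statement that the activity is research, with its purposes and procedures",
    "foreseeable_risks": "Description of reasonably foreseeable risks or discomforts",
    "expected_benefits": "Description of any benefits to the subject or to others",
    "confidentiality": "How the confidentiality of identifiable data will be maintained",
    "voluntary_participation": "Statement that participation is voluntary and may be discontinued at any time",
    "future_use_of_data": "Whether identifiable information may be used for future research or shared",
    "contact_information": "Whom to contact with questions about the research or subjects' rights",
}


def missing_consent_elements(consent_form: dict[str, str]) -> list[str]:
    """Return descriptions of required elements the submitted form leaves blank or omits."""
    return [
        description
        for key, description in REQUIRED_ELEMENTS.items()
        if not consent_form.get(key, "").strip()
    ]


if __name__ == "__main__":
    # Hypothetical contractor submission with two sections left blank.
    submission = {
        "statement_of_research": "Voice and face recordings will train deepfake detection models.",
        "foreseeable_risks": "Your likeness could be synthesized; data may be stored indefinitely.",
        "expected_benefits": "",
        "confidentiality": "Recordings are stored under pseudonymous identifiers.",
        "voluntary_participation": "You may withdraw at any time without penalty.",
        "future_use_of_data": "",
        "contact_information": "research-office@example.org",
    }
    for gap in missing_consent_elements(submission):
        print("Missing:", gap)
```

A screen like this does not replace IRB review; it simply flags obviously incomplete documentation before reviewers assess whether the disclosures would actually help a reasonable person decide whether to participate.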

Requiring contractors to provide this additional documentation may create a compliance burden—albeit a minimal one. This burden, however, could be offset by the benefits of demonstrating a commitment by contractors and the government to uphold human subjects protections. The true cost of compliance would be the loss of access to certain data sources used by adversaries, who are often not beholden to the same ethical and protective standards. Both deepfake generation and detection require access to high-quality data sets, and limiting or preventing the use of such data might put the U.S. at a strategic disadvantage for detecting adversarial deepfakes that are already circulating. It would also limit the Defense Department’s ability to conduct influence operations, which may result in more costly or dangerous operations abroad. And as adversaries continue to deploy deepfakes, it’s reasonable to assume that the United States will continue to procure technology to detect them. To balance the potentially competing concerns of human subjects protections and the Defense Department’s ability to effectively develop and deploy AI systems, the DOHRP should be a stakeholder in the department’s Responsible Artificial Intelligence Strategy and Implementation Pathway. While there is an open question about what should be done about extant work, going forward, procurement of these data sets and technology can and should adhere to HSR protections.

The Future of Deepfakes and National Security

There are unanswered moral questions about government use of deepfake technology as it pertains to human subjects protections. If Defense Department-generated deepfakes are meant to be released to the public, what additional considerations should be made to ensure no negative repercussions befall the original human subject? Will the government take measures to ensure extra protections for biometric data produced by vulnerable populations, such as children or individuals with cognitive disabilities?

Current AI frameworks do not explicitly address data subject consent, and the relationship between AI research and HSR is currently unsettled. This creates ethical uncertainty and gaps in oversight of Defense Department use and procurement that may further erode public trust. The department would benefit from explicitly identifying deepfake technology as HSR and publishing special guidance to government agencies and contractors to ensure that proper informed consent is obtained from individuals whose biometric data are used in data sets. This is particularly relevant in the AI research area, where human subjects data use and protections are not universally understood or applied. This effort does not require new policy. Rather, it is already required by 10 U.S.C. § 980 and the Common Rule (to which the Defense Department is a signatory).

The Common Rule should be clarified to acknowledge the huge advancements in modern computing power and our ability to collect and store vast amounts of data—and how this ability to compile more complete dossiers on people requires robust scrutiny and safeguards. The Defense Department might also consider adding human subjects protection language and a taxonomy to its procurement process to help contractors and government officials determine when and to what extent technologies and data require HSR designation, along with additional safeguarding processes such as IRB review. Finally, the DOHRP should be added as a stakeholder to the RAI S&I, and the sections of the RAI S&I that cover data oversight and protections should be amended to include language specific to human subjects protection.

The world is changing faster than ever before thanks to AI, and people most often look to their governments to respond to those changes while advancing their general welfare. The Defense Department is the largest federal agency and is tasked with deterring war and ensuring the nation’s security. Ensuring that biometric data—such as deepfake data sets—complies with HSR protocols will advance both of these missions and help build trust with the American people.


Aimee Nishimura is a graduate student at the LBJ School of Public Affairs and a Cyber Student Fellow with the Strauss Center for International Security and Law at the University of Texas at Austin.
