The Open Data Market and Risks to National Security

Justin Sherman
Thursday, February 3, 2022, 8:01 AM

Rather than focusing on single vectors of data collection and transmission, the U.S. government must respond comprehensively to the many vectors of data collection, aggregation, buying, selling and sharing that pose risks to national security.

Published by The Lawfare Institute

As 2021 came to a close, the Washington Post published an investigation showing that the Chinese government is heavily monitoring the internet abroad to collect data on foreigners. The Post reviewed “bidding documents and contracts for over 300 Chinese government projects since the beginning of 2020” and found they included “orders for software designed to collect data on foreign targets from sources such as Twitter, Facebook and other Western social media.” In that story lies an important reminder for U.S. policymakers. While the Chinese government’s 2015 hack of the Office of Personnel Management sticks in many people’s minds as the most prominent case of data mining from Beijing, the reality is that the ecosystem and market of openly available data on U.S. citizens is a current and ongoing national security threat.

I testified to the Senate in December 2021 on data privacy and data threats to civil rights, consumer privacy and national security. Samm Sacks of New America and Yale Law School and Stacey Gray of the Future of Privacy Forum—whose in-depth insights are well worth reading—also testified about this issue set. Much of the hearing discussion focused on data brokerage: the virtually unregulated collection, aggregation, analysis, buying, selling and sharing of U.S. citizen data on the open market. This post expands on my testimony by broadening beyond Congress and focusing on how the entire U.S. federal government should mitigate the open data market’s risks to national security.

Data brokerage is a virtually unregulated practice. While there are some narrow controls around the collection, aggregation, analysis, buying, selling and sharing of certain types of data—such as with the Health Insurance Portability and Accountability Act (HIPAA) and covered health providers, or with the Family Educational Rights and Privacy Act (FERPA) and covered educational institutions—these regulations are limited and circumventable. It is remarkably easy to gather and share data on Americans, even millions at a time, without running into any legal barriers, regulatory requirements or mandatory disclosures.

Simultaneously, there are many shortfalls in existing executive branch approaches to these data risks. When the Committee on Foreign Investment in the United States (CFIUS), which screens foreign investments in U.S. companies for national security risks, forced a Chinese firm to sell dating app Grindr back to a U.S. company because of the highly sensitive data it held, the U.S. government addressed the risk that Beijing could acquire the data through a corporate owner. However, the U.S. government did not address the fact that Grindr can entirely legally sell its data to data brokers, or the fact that Grindr already widely shares user data with third parties, including via a software development kit made by Chinese giant Tencent. This is not an issue with CFIUS, which made a decision that benefited national security within its authorities. The issue lies with what the rest of the U.S. government can do (and in this case did not do) about other ways the data could be acquired by foreign states.

Similarly, the Trump administration’s executive order to “ban” TikTok was bad for many reasons, including that it was politically driven, exceeded the bounds of the law, and did not actually build good policy to deal with the privacy and security risks of foreign software. That said, those defending it were also defending a myopic approach to alleged national security risks. Instead of comprehensively assessing every way that data from TikTok could be shared with or sold to Chinese entities, and instead of assessing how the Chinese government could acquire the same or even more valuable data on the open market, the order focused only on the legal ownership question. Both regulatory shortfalls and weaknesses in the executive branch’s approach to data and security risks demand comprehensive action.

Congress should strictly control the sale of data broker information to foreign companies, citizens and governments; strictly control the sale of data in sensitive categories, like genetic and health information as well as location data; and stop companies from circumventing those controls by “inferring” data. The executive branch, for its part, should not wait for congressional legislation to address these risks. In particular, the Federal Trade Commission (FTC) should crack down on the sale of highly sensitive categories of information where possible, such as GPS data; the White House should stop federal law enforcement agencies from propping up the dangerous data broker market by purchasing their data; and the Department of Defense and the intelligence community should advance internal controls to protect against data monitoring.

Rather than focusing on single vectors of data collection and transmission, the U.S. government must respond comprehensively to the many vectors of the open data market that pose risks to national security.

The Need for Strong Federal Privacy Regulation

Privacy regulation is essential to protecting civil rights and individuals’ physical safety. For instance, law enforcement agencies legally can buy millions of Americans’ GPS locations without a warrant; data brokers also can sell data on survivors of domestic violence to the abusive individuals hunting them. The longer the federal government takes to crack down on these practices, the worse and more frequent the harms will be.

Privacy is also important from a competitive standpoint. The United States’ current lack of a strong federal privacy law undermines its marketing pitch for a “democratic technology model.” The more other states introduce privacy controls on companies, from Israel to Brazil to China, the worse this will look for the United States. Economic strength and technological innovation are also key to national security, and a strong U.S. privacy law could make American firms more trustworthy and competitive—and ensure that other countries keep allowing cross-border data flows to the U.S.

Beyond civil rights, individual safety, and competition, privacy is essential to national security. Our research at Duke University has found companies widely and publicly advertising data on the open market regarding millions of Americans’ sensitive demographic information, political preferences and beliefs, and whereabouts and real-time locations, as well as data on first responders, government employees, and current and former members of the U.S. military. Data brokers gather identity information such as your race, ethnicity, religion, gender, sexual orientation and income level; major life events like pregnancy and divorce; medical information like drug prescriptions and mental illness; your real-time smartphone location; details on your family members and friends; and where you like to travel, what you search online, what doctor’s offices you visit, and which political figures and organizations you support. Hundreds of data brokers make selling this data their entire business model. Thousands more companies, from small businesses to technology giants, buy, sell and share data as part of this ecosystem.

Foreign governments already collect data on U.S. citizens through various vectors to enhance their intelligence operations, military posturing and diplomatic insight, as noted in the aforementioned Post story on the Chinese government scraping open data overseas to target foreigners. A year prior, the Post reported that Shenzhen Zhenhua Data Technology, a small Chinese company, was seemingly collecting millions of social media data points on “foreign political, military and business figures, details about countries’ infrastructure and military deployments, and public opinion analysis.” The database reportedly had information “on more than 2 million people, including at least 50,000 Americans.” Foreign Policy reported in 2020 that the Chinese government used stolen data to expose CIA operatives in Africa and Europe. And just this month, the director of national intelligence published an advisory about companies and individuals selling commercial surveillance tools to track journalists, dissidents and other individuals (which includes U.S. government personnel). In summation, as Samm Sacks wrote in her testimony, “The Chinese government has embarked on an ambitious national data strategy with the goal of acquiring, controlling, and extracting value from large volumes of data.”

The open data market exacerbates all of these national security risks. Foreign citizens, companies and governments can legally buy highly sensitive data on Americans from U.S. companies. Criminals and terrorist groups could likewise acquire this data to target vulnerable U.S. populations, government employees, key leaders in science and industry, and military personnel and their families. Foreign companies could also acquire data for competitive purposes, exploiting the lack of protections the U.S. government has put in place for its own citizens’ data. Given, for instance, the Russian government’s propensity to set up front companies and organizations to run cyber and information operations, it would also be low-cost in both money and process for a foreign government to set up a front company to legally buy highly sensitive data such as political preferences and GPS location histories. All of this could be used for bribery, blackmail, routine intelligence collection and much worse.

In one tragic 2020 case, for example, an angry lawyer found a federal judge’s personal information online for search and sale, went to her home, shot her husband, and shot and killed her son. As the judge, Esther Salas, wrote in a New York Times op-ed, “Judges’ addresses can be purchased online for just a few dollars, including photos of our homes and the license plates on our vehicles. In my case, this deranged gunman was able to create a complete dossier of my life: he stalked my neighborhood, mapped my routes to work and even learned the names of my best friend and the church I attend. All of which was completely legal. This access to such personal information enabled this man to take our only child from my husband, Mark, and me.”

Building a Comprehensive U.S. Government Response

In order to rectify the multitude of national security risks to the United States from the open data market, Congress should strictly control the sale of data broker data to foreign companies, citizens and governments; strictly control the sale of data in sensitive categories, like genetic and health information and location data; and stop companies from circumventing those controls by “inferring” data. In doing so, Congress should identify entities that broker data as being involved in the buying and selling of individuals’ data. This will avoid replicating a fundamental flaw in Vermont’s and California’s data broker laws, where companies selling data on their own customers, even millions of them, are automatically excluded from scope. Legislators should also remember that true “anonymization” is a myth. While there are some statistical techniques that can obscure individuals’ personal information in a dataset, there is so much information out there that it’s all too easy to trace nameless data points to specific people. Data brokers’ claims of anonymization—implying that no harm is caused when a name is removed from a dataset and then sold on the open market—are false and deliberately misleading.
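To make the anonymization point concrete, here is a minimal sketch of a linkage attack, written in Python. Every record, address and name in it is invented for illustration, and the helper names (infer_home, reidentify) are hypothetical. It does not describe any particular broker's data or method. It only shows how a dataset with names removed might be joined to a second, named dataset through a shared attribute, here an inferred home address.

```python
from collections import Counter, defaultdict

# Toy data only: every record below is invented for this sketch.
# "Anonymized" location pings: names removed, device IDs kept.
anonymized_pings = [
    {"device_id": "A1", "hour": 2, "place": "12 Elm St"},
    {"device_id": "A1", "hour": 14, "place": "Federal Courthouse"},
    {"device_id": "A1", "hour": 3, "place": "12 Elm St"},
    {"device_id": "B2", "hour": 1, "place": "98 Oak Ave"},
    {"device_id": "B2", "hour": 13, "place": "City Hospital"},
]

# A separate, named dataset (for example, a people-search listing).
public_listing = {
    "12 Elm St": "Jane Doe",
    "98 Oak Ave": "John Roe",
}


def infer_home(pings):
    """Guess each device's home as its most frequent overnight location."""
    nights = defaultdict(Counter)
    for ping in pings:
        # Assumption: pings before 6:00 or from 22:00 onward count as overnight.
        if ping["hour"] < 6 or ping["hour"] >= 22:
            nights[ping["device_id"]][ping["place"]] += 1
    return {dev: counts.most_common(1)[0][0] for dev, counts in nights.items()}


def reidentify(pings, listing):
    """Attach a name to each device whose inferred home appears in the listing."""
    homes = infer_home(pings)
    return {dev: listing[home] for dev, home in homes.items() if home in listing}


linked = reidentify(anonymized_pings, public_listing)
print(linked)
```

In this toy case, both nameless devices are tied to a named person, along with their daytime visits to a courthouse and a hospital. Real datasets are larger and noisier, but the same join logic is why removing a name from a dataset does not by itself protect the person behind it.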

The executive branch, for its part, should not wait for congressional legislation to address these risks. In particular, the Federal Trade Commission should explore whether and how its authorities may penalize the sale of highly sensitive categories of information where possible, as with GPS data, though there is an ongoing debate over just how much authority the FTC would have. Historically, the FTC has taken some enforcement action against companies selling Americans’ data under its “unfair or deceptive acts or practices” authority. Some observers argue that this pattern of enforcement action could be expanded to include the sale of particularly highly sensitive categories of data. Reps. Jamie Raskin, Katie Porter and 42 other colleagues recently urged the FTC to do just that—specifically to:

  1. Define the sale, transfer, use, or purchase of precise location data collected by an app for purposes other than the essential function of the app as an “unfair act or practice.”
  2. Define app developers’ mislabeling of users’ location data as “anonymous” as a “deceptive practice.”
  3. Enforce its regulations against companies abusing consumers’ location data through its penalty authority.

The representatives also urged the Federal Communications Commission to reaffirm its prohibitions on the sale of location data.

On the flip side, however, many of the FTC’s previous enforcement actions against data brokers involved cases of individuals or companies buying and selling data to scam consumers—which falls much more clearly within the FTC’s “unfair or deceptive acts or practices” authority. Some observers contend the FTC does not have sufficient legal grounds to act more broadly against data brokerage without additional congressional authorizations. Thus, Congress should still give the FTC more authority and resources to regulate the data brokerage ecosystem for the aforementioned reasons of civil rights, consumer protection and national security. In the interim, the FTC should explore how its existing authorities might permit it to crack down on the open data market.

If the Biden administration claims to care about promoting a model of democratic technology governance at home, it should not permit federal law enforcement agencies to circumvent Fourth Amendment and other controls to buy highly sensitive data on Americans, which currently happens without robust transparency and oversight. Further, the federal government is a significant data broker client. Buying data from these companies only props up the industry of buying and selling Americans’ data on the open market—and as a result, only enables these national security risks to persist.

The obvious counterargument to this proposal, which many in the law enforcement space already advance, is that the purchasing of this data on U.S. citizens is necessary to conduct investigations. Law enforcement agencies, however, have conducted investigations for years without this data, and there is no qualitative evidence to suggest that the acquisition of such data significantly improves the efficiency of investigations or reduces crime. Further, the data that law enforcement agencies purchase may not even be accurate, compromising investigations themselves and raising additional civil rights concerns. For example, a 2019 district court ruling in Gonzalez vs. Immigration and Customs Enforcement found that “the databases on which ICE relies for information on citizenship and immigration status,” which draw on data from multiple data brokers, “often contain incomplete data, significant errors, or were not designed to provide information that would be used to determine a person’s removability.”

This claim that the data is useful for investigations does not change the fact that law enforcement agencies are effectively circumventing the Fourth Amendment and other protections by buying citizens’ data on the open market. The court in Gonzalez vs. Immigration and Customs Enforcement (2019) itself found that “the evidence presented at trial establishes that ICE violates the Fourth Amendment by relying on an unreliable set of databases to make probable cause determinations for its detainers.” From a policy perspective, law enforcement claims of investigatory “necessity” should not be considered until the question of evading civil rights protections is addressed. And the fact that the executive branch helps prop up the data broker industry must be acknowledged in addressing national security concerns. The Department of Homeland Security’s and the FBI’s respective counterintelligence missions, for example, are directly threatened by the open-market availability of highly intimate information on politicians, federal judges, senior government officials and U.S. intelligence assets.

Lastly, the Defense Department and the intelligence community should advance internal controls to protect against data monitoring. In 2018, when researchers found Fitbit and other fitness wearable data on the public internet displaying users’ movements all over the world (including U.S. military personnel walking around forward operating bases in Afghanistan and CIA operatives visiting “black sites”), the Defense Department swiftly issued a memo prohibiting personnel “from using geolocation features and functionality on government and nongovernment-issued devices, applications and services while in locations designated as operational areas.” The Defense Department and the intelligence community should conduct internal security reviews focused on how open data on U.S. personnel can undermine their missions and threaten national security. From this, they can propose internal controls to mitigate identified risks.

Of course, this proposal speaks to a broader challenge: There is only so much the U.S. government, and even U.S. companies and citizens, can do to place constraints on the openness of data flows and information. Additionally, there is only so much they should do; the solution to this set of challenges is not to pursue overly restrictive policies that end up undermining important features of the democratic system, like the ability to share information (including personal information) online. Foreign adversaries, particularly the Chinese and Russian governments (and their proxy networks), will also continue trying to steal, scrape and illicitly acquire Americans’ data no matter what. But the U.S. government should still do what it can to mitigate or eliminate some of these national security risks, particularly around sensitive categories of data. It certainly should not make it so easy for foreign adversaries, criminals and terrorists to acquire sensitive data on the open market to threaten individuals’ physical safety and undermine national security.

Data privacy is vitally important to protect civil rights and individuals’ physical safety. The open data market harms and is used to harm every American, particularly the most vulnerable, and the harms on the individual privacy and consumer exploitation fronts alone are sufficient reasons for action. Yet as the recent Post story underscored, the open data market poses substantial risks to national security—and Congress and the executive branch must build a comprehensive approach to mitigating them.

Justin Sherman is a contributing editor at Lawfare. He is also the founder and CEO of Global Cyber Strategies, a Washington, DC-based research and advisory firm; a senior fellow at Duke University’s Sanford School of Public Policy, where he runs its research project on data brokerage; and a nonresident fellow at the Atlantic Council.