
Getting Transparency Right

Daphne Keller, Max Levy
Monday, July 11, 2022, 9:01 AM

What’s the best path forward for platform transparency regulation?



Transparency is essential to getting every other part of platform regulation right. But defining sound transparency rules—identifying what information is needed most from platforms like Twitter or YouTube, and how to get it—is quite complicated. In a Senate hearing this spring, one of us testified about these complexities. This post builds on that congressional testimony. Its substantive points are relevant to EU regulators and U.S. state lawmakers currently considering transparency legislation, in addition to members of Congress. 

New legal models for platform transparency can be found in laws like the EU’s Digital Services Act and proposals like the draft Platform Accountability and Transparency Act and the Digital Services Oversight and Safety Act in the U.S. Collectively, these models set forth multiple transparency measures, each serving different purposes. 

Public transparency reports, for example, can disclose aggregate data that helps consumers and lawmakers understand the scale and evolving nature of abusive online speech, and track platforms’ responses. Special disclosures of more sensitive data to vetted researchers can allow independent analysis of information currently known only to platforms. New legal protections for researchers who “scrape” data from platforms can expand the pool of thinkers involved in assessing platform behavior, and avoid creating platform or government “gatekeepers” for data access. An optimal transparency framework likely includes all of these components and more. 

Collecting and analyzing data about platform content moderation is complex and time consuming. Both the relevant technologies and research priorities may change over time. For this reason, robust transparency laws will almost certainly require some degree of flexibility and ongoing oversight by regulators. 

The core policy judgments required for transparency laws, however, should be made by Congress. Congress should expressly address—and, in our opinion, reject—any legislative changes that would reduce internet users’ protections from surveillance or platforms’ immunities under the law known as Section 230 of the Communications Decency Act. Congress should also assess and make the judgment calls that will be examined in the likely event of First Amendment litigation over transparency laws. This post will examine those critical issues and also unpack others—including competition and consumer privacy—for which the better course may be for Congress to set policy guidance for regulators. 

Section 230

Section 230, which immunizes platforms for some kinds of unlawful content posted by users, has become Washington’s least popular internet law. As a result, there are obvious political advantages for lawmakers who tie their transparency proposals to Section 230 reform. Any advantages for sound policymaking are far less clear. Connecting platform immunity to transparency obligations may do more harm than good. 

The primary reasons are practical. Transparency reporting is difficult and painstaking work. Practitioners find legitimate grounds for disagreement about even basic definitions, like what counts as an “item” of content, or how to quantify “notices” or “appeals.” Platforms that do not have years of experience with transparency reporting—and even platforms that do have such experience, like Facebook—are all but guaranteed to make mistakes. Such inevitable errors should not cause the entire platform to lose business-critical immunities. Indeed, tying the two together risks creating truly perverse incentives for platforms. To reduce risks of erroneous reporting, they would have reason to adopt blunt content policies instead of nuanced ones, or to keep content moderation operations static instead of more dynamically iterating in response to unexpected events or new technologies. 

A narrower rule removing Section 230 protections only for claims related to specific transparency failings would be less drastic than one in which a single error caused platforms to lose legal protections entirely. It would still be a very bad idea, though. For plaintiffs’ lawyers and platforms, it would create a sort of lottery. New opportunities for litigation would vary, based not on considered public policy, but on what transparency mistakes a platform happens to make. A reporting error involving consumer reviews might open the door to litigation by restaurants disputing customers’ online claims; an error in explaining algorithmic ranking might bring lawsuits about which news sources platforms should prioritize. 

This litigation lottery is not anyone’s idea of sensible Section 230 reform. Whatever one thinks of laws like the Stop Enabling Sex Traffickers Act/Allow States and Victims to Fight Online Sex Trafficking Act (SESTA/FOSTA) and proposals like the Platform Accountability and Consumer Transparency (PACT) Act or the Eliminating Abuse and Rampant Neglect of Interactive Technologies (EARN IT) Act, they at least reflect considered approaches to addressing specific problems. The PACT Act, for example, sets forth a specific process for removing defamatory content from platforms. The EARN IT Act and SESTA/FOSTA, for all their shortcomings, target specific harms related to sexual abuse and trafficking using statutory language tailored to address those issues. Puncturing immunity in response to transparency failings would be more of an arbitrary, shotgun blast approach. 

Finally, involving Section 230 in transparency legislation would make the cost of legislators’ own mistakes much higher. An imperfect, first-generation transparency law would likely still merit support from many experts and organizations—it could do a lot of good, and its downsides would largely fall on platforms. Careless amendments to Section 230, by contrast, could directly threaten the safety and speech rights of ordinary internet users. Such society-wide consequences should not be tied to the nascent, rapidly evolving field of platform transparency. 

Privacy

Lawmakers should expressly address the tensions between platform transparency and user privacy in any proposed law. If Congress mandates disclosures that potentially expose users’ personal information to researchers, it should do so under strict rules. In no case should such disclosures reduce users’ protections from surveillance by the government. 

Researchers’ Access to Private Data

Both the EU’s Digital Services Act and many U.S. transparency proposals include requirements for platforms to disclose data to independent researchers. In many cases, those researchers’ work will be difficult or impossible to carry out unless the researchers can see private information about platform users. This is in many ways an irreducible problem: We cannot have both optimal research and optimal privacy. 

Ideally, lawmakers navigating these trade-offs would start by crafting robust baseline federal privacy legislation. They should also look to emerging models in the EU. Ultimately, U.S. transparency legislation should probably set bright-line rules on some issues and provide federal agencies with guidelines on others. Some of the many questions to be resolved include the following:

  • Can large data sets be “anonymized” enough to protect privacy? As privacy practitioners know, it can be surprisingly easy to re-identify individuals from aggregated or de-identified data sets. Researchers have identified individuals based on “anonymized” viewing data released by Netflix, for example. Technical tools like differential privacy—which adds calibrated statistical noise so that a data set reveals broad, pattern-level information about a user community without revealing reliable information about any specific user (a simplified sketch appears after this list)—can help, but may also interfere with data analysis and the reproducibility of research results. Lawmakers should be clear about the balance any laws strike between privacy and research goals, and what kinds of data platforms must disclose as a result. 
  • When, if ever, should researchers see content that is shared privately by users on a platform like Facebook? Disclosing the content of one-to-one messaging would be the modern equivalent of disclosing personal mail. It would raise major issues under the Fourth Amendment and laws like the Stored Communications Act. At the same time, researchers have legitimate questions about things like foreign electoral interference in forums ranging from Apple iMessages to large Facebook groups. Is disclosure more acceptable if a user’s post was shared with 20 people, or 100, or 5,000? 
  • Who sees sensitive content? Social media users often share content that is highly personal—for themselves or for third parties. A user who publicly or privately shares pictures of her cousin breastfeeding, or who discusses her uncle’s struggles with illness or bereavement, affects the privacy of those family members. This privacy impact is compounded if researchers review the posts. Without such access, though, important research on topics like medical misinformation may become impossible. Limits on access to the content of users’ posts would be particularly damaging for research about the impact of platforms’ own content moderation. If independent reviewers can’t see the affected user posts, platforms will be left to “check their own homework” on foundational questions about errors, bias, or disparate impact in their enforcement of on-site speech rules. 
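
To make that trade-off concrete, the following is a minimal sketch of the Laplace mechanism, one standard technique behind differentially private data releases. The function and variable names (dp_count, sampled_posts) are hypothetical, and a real deployment would also need to track a cumulative privacy budget across every query researchers run; the point is only that the privacy parameter epsilon directly controls how much statistical accuracy researchers give up.

import random

def dp_count(records, predicate, epsilon=0.5):
    """Return a noisy count of the records matching `predicate`.

    Adding or removing one user's record changes the true count by at most 1,
    so Laplace noise with scale 1/epsilon gives epsilon-differential privacy
    for this single query. Smaller epsilon means more noise and more privacy.
    """
    true_count = sum(1 for record in records if predicate(record))
    # A Laplace(0, 1/epsilon) draw equals the difference of two independent
    # exponential draws with rate epsilon.
    noise = random.expovariate(epsilon) - random.expovariate(epsilon)
    return true_count + noise

# Hypothetical use: a noisy count of posts a platform labeled as misinformation.
# print(dp_count(sampled_posts, lambda post: post["labeled_misinfo"], epsilon=0.1))

With epsilon near 1, the noisy count tracks the true count closely; with epsilon near 0.01, it can be off by hundreds. That tension between statistical usefulness and individual privacy is precisely what lawmakers would be balancing.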

A number of more fine-grained decisions about privacy are important enough to warrant congressional attention—and are raised by some draft legislation. Should researchers have access to public posts that users have since deleted, for example? Should an individual social media user’s history of policy violations be made public? Should transparency laws require platforms to track data about users or engagement, at the same time that many public interest groups want that tracking to stop? None of these questions are simple, and all deserve thoughtful consideration. 

Government Access to Private Data 

Platform transparency laws should not create backdoors for new government surveillance powers. It would be perverse if a bill intended to curb platform power instead let the government harness it, effectively using private platforms to collect information about citizens. U.S. lawmakers presumably do not intend such an end-run around the Fourth Amendment. But it is critical that their good intentions are captured in clear statutory language. 

Lawmakers should consult surveillance experts when drafting transparency mandates of any kind, in part because surveillance issues can be hard to spot. Laws that give vetted researchers access to private data, for example, should anticipate situations in which researchers believe they have seen evidence of a crime. Should such researchers be permitted or encouraged to share what they know with law enforcement? The possibility of information flowing to police in this manner should prompt careful evaluation under the Fourth Amendment and the Stored Communications Act. Existing protections under those laws should be expressly preserved.

Surveillance questions can also arise under laws that mandate transparency about users’ “public” posts. A person whose image was posted to Flickr, for example, should not be deemed to have waived all privacy rights when a company like Clearview AI uses that image to build and license a facial recognition database to law enforcement. Nor should transparency mandates compel platforms to create de facto equivalents of the social media monitoring tools that the FBI and U.S. Customs and Border Protection have been criticized for using. 

While U.S. law often recognizes a sharp distinction between publicly shared information and private information, real-world expectations of privacy are not so crisply bifurcated. As the Supreme Court has recognized, the fact that a person’s activity was carried out in public does not always mean he loses protections under the Fourth Amendment. Law enforcement may still need warrants to undertake particularly pervasive tracking using sophisticated tools like GPS, for example. The same recognition of users’ real-world privacy expectations should inform any new laws requiring platforms to collect and report data about them. 

Costs and Competition

Platform transparency has benefits and costs. The costs do not provide a reason to forgo transparency, but they do mean that lawmakers should assume a finite transparency “budget,” and allocate it carefully for maximum public benefit. 

Some of transparency’s costs are economic. For companies like Google or Facebook—which has reported spending $3.7 billion annually on a broad set of safety and security initiatives—the costs of collecting and publishing data may be insignificant. For smaller companies, even tracking existing content moderation may require expensive work rebuilding internal tools and expanding and retraining content moderation teams. This can make it harder for these smaller companies to compete with incumbents that have invested heavily in transparency for years. 

Another risk is that inflexible transparency mandates may inadvertently reshape platforms’ actual content moderation practices. In the simplest case, time spent compiling transparency reports may take away from moderators’ other priorities, including combating child abuse material or terrorist content. Platforms might also reduce costs or risks of error by enforcing fewer rules, or making those rules more simplistic. For example, a platform might reduce expenses by banning all nudity, rather than enforcing and reporting on a more nuanced rule that permits nudity in contexts such as art, historical documentation, or medical treatises.

Lawmakers should also consider what technical design changes are required or incentivized by any new laws. They should be wary, for example, of causing companies to collect user data they otherwise would not (such as race or other demographic information) or build tools that can also be used in ways harmful to users (like facial recognition or more invasive systems for monitoring users’ communications). 

Finally, there is reason for concern that standardized reporting requirements could drive different platforms to converge on similar rules for online speech—as may already be happening with standardized transparency demands from the advertising industry. This is a difficult issue, because there are good reasons to want data that allows “apples to apples” comparisons of different platforms’ practices. At the same time, requiring platforms to use identical categories for reporting on speech rules could nudge them all to enforce similar speech rules—leaving users to encounter the same restrictions on controversial speech in any major forum for online discussion. Such an online monoculture would be a sad decline from what the Supreme Court called the “astoundingly diverse content” of the earlier internet. 

Most transparency proposals and laws, including the Digital Services Act, address concerns about competition and platform diversity by varying transparency obligations based on a platform’s size, using metrics like user count, employee count, or revenue. A variation on this approach might be for lawmakers or regulators to experiment on the giants first before extending obligations to their smaller rivals. An agency might familiarize itself with the mechanics of data collection on platforms like YouTube and Facebook, for example, before setting out to tailor obligations for smaller platforms. (One author of this post was previously employed by Google and currently consults for Pinterest; the other worked at Twitter before law school. They take no position here on the correct size-based treatment for those companies.)

Other possible considerations for more tailored obligations include the following:

  • What specific risks are involved in a given platform’s business? There may, for example, be more legitimate public interest in data about unsafe products on e-commerce sites, copyright infringement on music sites, adult content on image hosting sites, or disinformation on political discussion sites.
  • How costly is a particular transparency requirement? The “prevalence” metric promoted by Facebook, for example, which estimates how widely violative content circulates across the entire platform based on analysis of a sample (a simplified sketch of the calculation appears after this list), is notoriously expensive to collect, particularly across diverse and evolving linguistic or cultural groups.
  • Is information competitively sensitive? Vague invocations of trade secret law should not inoculate companies from transparency laws. But some concerns about confidential information are real—particularly given that academic researchers often later find jobs in industry. Vetting assertions of competitive sensitivity is an appropriate job for an agency.
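
To illustrate why prevalence is costly, here is a minimal sketch of estimating it from a labeled random sample, under our own simplifying assumptions rather than any platform’s actual methodology. The names (estimate_prevalence, view_log, label_fn) are hypothetical; label_fn stands in for the expensive step, human review of each sampled item, which must be repeated for every language, content type, and reporting period.

import math
import random

def estimate_prevalence(view_log, label_fn, sample_size=1000, seed=0):
    """Estimate the share of content views that are views of violating content.

    Assumes a non-empty `view_log` of viewed items. `label_fn` represents the
    costly human-review step deciding whether an item violates policy. Returns
    a point estimate and a rough 95 percent confidence interval.
    """
    random.seed(seed)
    sample = random.sample(view_log, min(sample_size, len(view_log)))
    violating = sum(1 for item in sample if label_fn(item))
    p_hat = violating / len(sample)
    # Normal-approximation interval; adequate for illustration, not edge cases.
    margin = 1.96 * math.sqrt(p_hat * (1 - p_hat) / len(sample))
    return p_hat, (max(0.0, p_hat - margin), min(1.0, p_hat + margin))

Tightening the confidence interval requires a larger sample, and each additional sampled item is another piece of content for trained reviewers to assess, which is where the expense accumulates for smaller platforms.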

Again, the point is not that these considerations should prevail over transparency. Rather, it is that cost and competition concerns should be factors in assessing and tailoring transparency mandates. 

Speech

Transparency laws are broadly aligned with First Amendment values, in that they aim to improve public access to information. But poorly designed mandates may actually violate the First Amendment.

The most straightforward risk is that mandatory reporting might be challenged as unconstitutional compelled speech. The other and more subtle risk is that transparency laws might effectively reshape platforms’ substantive rules for content moderation, which could prompt First Amendment claims by both platforms and the affected users. This might happen as a byproduct of misguided standardization or as a consequence of burden and cost, as described in other sections of this post. More troublingly, as Eric Goldman has discussed, state actors might use nominally neutral transparency rules to pressure platforms to restrict or privilege particular speech—much as Texas Attorney General Ken Paxton appears to be doing now through punitive discovery demands to Twitter, prompted by its ouster of former President Trump. 

The issues raised in these legal challenges would be relatively novel. Until recently, only one circuit court ruling was directly on point: In Washington Post v. McManus (2019), the U.S. Court of Appeals for the Fourth Circuit struck down Maryland’s disclosure requirements for campaign ads on online newspapers and other platforms—holding that they cannot “be squared with the First Amendment.” In 2022, though, the U.S. Court of Appeals for the Eleventh Circuit allowed some transparency requirements in a Florida platform law to go into effect. Its ruling largely concerned other major constitutional issues, and its analysis of the law’s transparency mandates was brief compared to the Fourth Circuit’s; it emphasized the state’s interest in protecting consumers from being misled and the law’s burden on platforms. 

Law in this area is surely not done evolving. U.S. lawmakers and their staff would be wise to ask the skilled lawyers at the Congressional Research Service to further examine these questions, with a focus on the very distinct issues raised by different transparency models. Lawmakers should tailor legislative proposals accordingly. 

Other Considerations

This post, and the testimony it derives from, focuses primarily on privacy, competition, and speech issues. A number of other key questions should, however, also shape any new U.S. transparency laws: 

  • How can we maximize truly public access to information? There is great value in public data access, and in eliminating federal agencies, companies, or universities as gatekeepers. Some of the most important research on platforms to date has been accomplished by researchers “scraping” publicly visible data from platforms, or using archives of material that are available to the general public. These approaches are not without complex trade-offs of their own. But they do offer concrete models of success for legislation to build on.
  • Can U.S. law be aligned with the EU’s new requirements under the Digital Services Act? Not all platform regulation in Europe is a cultural or constitutional match for the U.S. But to the extent that the regulatory models can be reconciled, the result will be more useful information for the public and more streamlined obligations for platforms. It would also reduce the likelihood of future disputes between the U.S. and the EU about privacy and data protection in this area.
  • Will public disclosures give bad actors the information they need to game the system? Some of platforms’ most important content moderation efforts involve ongoing, adversarial, and iterative battles with the purveyors of spam or coordinated inauthentic behavior. Platforms should not be hamstrung in these efforts by transparency mandates that effectively disclose their tactics.
  • What violative content should be disclosed? In many cases, even otherwise illegal content, such as material supporting designated foreign terrorist organizations, can and should be lawfully used for research purposes. Other content, such as nonconsensual sexual images (“revenge porn,” the legal status of which varies by state) or photos of children, may be less appropriate to disclose. Transparency laws should reflect this variation.
  • Are new obligations sufficiently clear and paired with appropriate legal protections? Platforms should not be liable for making disclosures that are compelled by law. If platforms’ obligations are unclear, they or advocates for affected users should be able to seek agency clarification. 
  • How can the law incentivize productive communication among researchers, platforms, and the administering agency? Defining and collecting platform data sets can be complex, frustrating, and time consuming. Without clear and careful communication about technical details, the data that researchers obtain from platforms may be unusable or pose unnecessary privacy risks. To the extent possible, the law should facilitate open discussion among participants in this process, reserving any more adversarial engagement as a fallback mechanism. 
  • What degree of precision is needed to support the public purposes of transparency? Given the complexity and risk of error in data collection—particularly for companies that have not done it before—requiring perfect accuracy may set the bar too high. The law should provide some mechanism for disclosing and correcting at least some mistakes in transparency reports or disclosures to researchers.

Conclusion

Over the past decade, policymakers have become increasingly aware of the complex systems behind platform content moderation. Defining the right metrics and disclosures to explain these systems is no less complex. As in other areas of platform regulation, sound transparency laws will require attention to the real-world details of platform operation. They will also require lawmakers to resolve potentially competing policy priorities—particularly in relation to privacy, competition, and content regulation. 

Some of the questions raised by recent proposals have straightforward answers. Transparency laws should not reduce users’ protections from surveillance, for example, nor should they take away platforms’ legal protections under laws like Section 230. Other questions are harder and may lend themselves to more nuanced answers, perhaps involving agency resolution under congressional guidance. To craft wise transparency laws, lawmakers should seek input from experts in adjacent areas, including internet privacy and First Amendment law. They should also examine and build on existing functional transparency models, ranging from public data repositories to scraping-based academic research initiatives. They should, where possible, align U.S. transparency laws with other recently enacted laws, including the EU’s Digital Services Act. Crafting wise platform transparency regulation is complicated, but it is doable. We have the expertise and building blocks to improve public visibility into the critical speech forums of our age, and should get to work doing so.


Daphne Keller directs the Program on Platform Regulation at Stanford’s Cyber Policy Center. Her work, including academic, policy, and popular press writing, focuses on platform regulation and Internet users' rights in the U.S., EU, and around the world. She was previously Associate General Counsel for Google, where she had responsibility for the company’s web search products. She is a graduate of Yale Law School, Brown University, and Head Start.
Max Levy is a graduate of Stanford Law School and will begin work as an associate at Morrison & Foerster LLP in Fall 2022. Prior to law school he worked at Twitter and The Atlantic.
