Published by The Lawfare Institute
It’s a dark statistical fact that you, the reader, have likely received a notice stating your data has been breached and is likely in criminal hands. You’re the victim, but the data probably wasn’t stolen from your computer. The attack vector through which your data was breached might have been your employer or school, your health provider or government, a merchant you last used years ago, or a cloud service you use every day.
If it seems these attacks are already frequent and becoming more common still, your intuition is correct. And you don’t need to rely on anecdotes for evidence. A new review of data breach research from Massachusetts Institute of Technology Professor of Information Technology Stuart Madnick, and commissioned by Apple, found that the immense threat to personal data in the cloud is continuing to grow at an extraordinary rate—and criminals are finding new ways to profit. They ransom data, requiring payment to allow the user or provider to recover it. They threaten public disclosure of confidential data or communications, demanding payment to avoid that pain. And they use their ill-gotten gains—both the data and the payments—to enhance their next attack, perpetuating the cycle.
In March of this year, for example, Minneapolis Public Schools announced that students’ data had been breached, exposing over 300,000 sensitive files containing details describing sexual assaults, psychiatric hospitalizations, parental abuse, and suicide attempts. And the Better Outcomes Registry & Network, a perinatal and child registry in Ontario, Canada, reported a data breach in May resulting from its use of MOVEit, a file transfer service, which exposed sensitive records of at least 3.4 million individuals.
Law enforcement and national security officials are sounding the alarm about the gravity of the threat. The International Criminal Police Organization’s assistant director of cybercrime operations, Bernardo Pillot, noted the “unprecedented increase in both the number of cyber threats and their sophistication, with attacks becoming more tailored as criminals aim for maximum impact, and maximum profit.” U.K. Minister of Security Tom Tugendhat has warned that cybercriminals’ “attempts to shut down hospitals, schools and businesses have played havoc with people’s lives and cost the taxpayer millions.”
In August, a supplier for the U.K. Ministry of Defence experienced a breach that revealed sensitive information relating to a nuclear submarine base and a chemical weapons lab. And a few months before that, Microsoft suffered a breach in its cloud-based email platform, resulting in the theft of 60,000 emails from U.S. State Department accounts and the hacking of the U.S. commerce secretary’s email account. Data breaches in the cloud clearly aren’t limited to organizations that are understaffed or insufficiently sophisticated. Microsoft runs one of the most elaborate cybersecurity operations in the world—with reportedly 10,000 employees, amounting to nearly 5 percent of the company’s staff.
The fundamental driving force enabling these attacks is data centralization: personal information belonging to thousands, hundreds of thousands, or millions of users all being stored by each service provider’s cloud infrastructure. Centralizing data in this way has provided unprecedented ease of use and flexibility for service providers and users alike. But when it comes to security, centralized data repositories disproportionately favor attackers. The one-time breach of a company or service lets criminals steal the personal data for many—or all—of its users at once.
The world will not return to the pre-cloud era. Aggregated and centralized personal data is now common; organizations and governments must urgently embrace security models that can rise to the challenge. We have to, first, limit the impact of attacks on large-scale data repositories and, second, work to keep citizens in the driver’s seat when it comes to their data. That’s a tall order, but there is already a tool that can help meet these goals: Encryption can offer users both protection and control, making it one of the most potent solutions defenders can bring to bear.
There’s a global community of academic and industry cryptographers and government agencies, like the National Institute of Standards and Technology, working together to validate and recommend safe cryptographic building blocks, which are the underlying mathematical algorithms engineers use when applying cryptography to specific problems. But starting with secure building blocks isn’t enough; it is how they are assembled that determines the real-world security of any resulting system. Assessing that real-world security starts with one crucial question: Where are the encryption keys?
Encryption keys are the mathematical secrets required to read encrypted data. As with a physical door lock, the strength of the locking mechanism doesn’t matter if a criminal can get hold of the key. If all data on a server is encrypted, for example, but the key is stored on the same server’s hard drive, then the encryption generally provides no security benefit; any criminal in a position to steal the encrypted data can also steal the key. Here’s one way to understand this intuitively: A thief who breaks into your desk can steal the phone you put there, regardless of whether you protected that phone with a passcode. But if you did, they can’t get to the data on the phone—unless there’s a sticky note next to it that gives away the passcode.
End-to-end encryption means the keys are available only to the “endpoints” of the communication—the people and devices that are specifically intended to access the data. For an end-to-end encrypted messaging service, that means the key for a given message is available only to its sender and recipient, and for an end-to-end encrypted storage service, it means the key for a file is available only to the user who stored it and the people with whom the user chose to share it. Importantly, the service providers in the middle—in our example, the messaging service and file storage operators—don’t have the keys. And without the keys, these providers can’t read the data even if they store it, move it, send it around, or make copies. Properly encrypted data without keys is as useful for hackers as random noise. This is critical when considering the rise of cloud data breaches. As end-to-end encrypted data can’t be decrypted by service providers, it also can’t be decrypted by attackers—no matter how bad the breach.
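To make the question of where the keys live concrete, here is a minimal sketch in Python of an end-to-end encrypted storage flow. Everything in it is an assumption made for illustration: the `StorageProvider` class, the file name, and the cipher itself, which is a toy built from the standard library (a SHAKE-256 keystream plus an HMAC tag) so that the example runs without extra packages. It is not how any particular service works, and a real system would use a vetted authenticated cipher such as AES-GCM from a maintained library.

```python
import hashlib
import hmac
import secrets

# TOY cipher for illustration only: a SHAKE-256 keystream XORed with the
# plaintext, plus an HMAC-SHA256 tag. Do not use this to protect real data.

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    return hashlib.shake_256(b"stream" + key + nonce).digest(length)

def _mac_key(key: bytes) -> bytes:
    return hashlib.sha256(b"mac" + key).digest()

def encrypt(key: bytes, plaintext: bytes) -> bytes:
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    body = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(_mac_key(key), nonce + body, hashlib.sha256).digest()
    return nonce + body + tag

def decrypt(key: bytes, blob: bytes) -> bytes:
    nonce, body, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(_mac_key(key), nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or tampered data")
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))

class StorageProvider:
    """Hypothetical cloud service: it stores blobs but never sees a key."""
    def __init__(self):
        self.blobs = {}

    def put(self, name: str, blob: bytes) -> None:
        self.blobs[name] = blob

    def get(self, name: str) -> bytes:
        return self.blobs[name]

# The key is created on the user's device and is never sent to the provider.
device_key = secrets.token_bytes(32)
provider = StorageProvider()
provider.put("note.txt", encrypt(device_key, b"meet at noon"))

# The endpoint that holds the key can read the data back.
recovered = decrypt(device_key, provider.get("note.txt"))

# An attacker who breaches the provider gets the blob but has to guess the key.
stolen = provider.get("note.txt")
try:
    decrypt(secrets.token_bytes(32), stolen)
    attacker_succeeded = True
except ValueError:
    attacker_succeeded = False

print(recovered, attacker_succeeded)
```

The point of the sketch is the shape of the system rather than the cipher: the provider object holds only opaque bytes, so a breach of the provider exposes nothing readable unless the attacker also obtains a key from one of the endpoints.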
End-to-end encryption certainly cannot address all security problems in the cloud, and many services cannot be engineered with complete end-to-end encryption due to complexity, cost, or the need to interoperate with older systems. Modern end-to-end encrypted systems are significantly harder to build, and the tools and technologies they rely on are not yet familiar to software engineers, or easy for them to use. Though this friction can—and will—lessen with time, it means that even for services that could be built with it, end-to-end encryption is not yet common. Instead, most data in the cloud today is stored in a way that is easily accessible to service providers—and, thus, to any criminals who breach them. As Madnick and other experts note, there have been more breaches in the U.S. in the first three quarters of 2023 than in any complete prior year. In the U.K., Australia, and Canada combined, more than double the accounts were breached in the first half of 2023 compared to the first half of 2022. And nearly 3 billion personal records were breached globally in 2021 and 2022. These breaches are not only pervasive, but often recurring. A report by IBM studying 553 organizations globally that had experienced a data breach between March 2022 and March 2023 found that 95 percent had been breached more than once. The consequences can be dire for a wide range of organizations and the users who entrust them with their data—sometimes by choice, other times by necessity.
Plan B Must Not Be “Plan A, but Harder”
The common defensive approaches for user data in the cloud are clearly inadequate. New technical and policy prescriptions must avoid the trap of doubling down on the same unsuccessful strategies but demanding they be executed with more determination. Today’s defenses are in practice failing almost all stakeholders; the problem is one not merely of execution but also of vision. Both the national security apparatus and the private sector must grapple with a profound question: How can organizations enact fundamental design changes to data systems to proactively mitigate the seemingly ceaseless barrage of attacks?
If policy proposals sidestep this question and instead succeed in freezing the progress of data encryption, the consequences will be both obvious and inevitable: Criminals will continue large-scale breaches, encountering little meaningful resistance, and their sophistication will increase rapidly to track the growth of the profit opportunity. As we have already seen, attackers will fund new tools and techniques to execute ever more advanced breaches. In 2021, Mandiant reported that financially motivated actors, like ransomware groups, accounted for nearly one-third of known exploited zero-day vulnerabilities—referring to vulnerabilities that were exploited publicly before a patch was available from the vendor. And the Identity Theft Resource Center, a nonprofit organization, reported 86 zero-day attacks involved in data breaches in the first three quarters of 2023, compared to five throughout 2022. But new hacking tools are not all; attackers will also seek entirely new ways to monetize the breached data. The meteoric rise of machine learning, for example, has created an insatiable global hunger for data to feed into the training of artificial intelligence (AI) models. It would be unsurprising if data breach actors moved to tailor some of their work accordingly, finding opportunities to launder and sell breached personal information to legitimate but unsuspecting—or, sometimes, purposely uninquisitive—AI data pipelines.
While all companies and organizations face shared baseline risks of attack, the defensive solutions will necessarily vary. At Apple, our approach to safeguarding user data is guided by a set of foundational design principles.
First and foremost, Apple—the company and its services—stands separate, in terms of data security, from our customers’ devices such as the iPhone in your pocket. The hardware we make can be used without creating any account with Apple, and we work to rigorously minimize the amount of data Apple collects. Data that exists only on users’ Apple devices, as opposed to data on Apple’s servers, is disaggregated and inherently far better protected against attack, especially when coupled with industry-leading device security—including, as you’d expect, strong encryption for data on the device.
In cases when user information has to touch our servers, we are often able to offer a service while only ephemerally processing the user data, or while processing it under uncorrelated and randomized identifiers that bear no association to the user’s identity. One example is that Apple servers provide various location services to users, including mapping and routing, but do so with no awareness of the user’s identity, and without retaining a history of their whereabouts. Whenever possible, if a service doesn’t require identifying a user or storing their data, we build it not to do those things—even if this often exacts a meaningful engineering cost in terms of the difficulty and duration of development and testing.
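As a rough illustration of processing requests under randomized identifiers, the following Python sketch shows a device asking a routing server for directions under a fresh random token each time. The class names, the token length, and the per-request rotation are all assumptions made for this example; it does not describe how Apple's location services are actually built.

```python
import secrets

class RoutingServer:
    """Hypothetical server: it answers requests keyed by an opaque token only."""
    def __init__(self):
        # Everything a breach of this server could reveal about who asked.
        self.seen_tokens = []

    def route(self, token: str, origin: str, destination: str) -> str:
        # Only the token is recorded: no account ID and no stored trip history.
        self.seen_tokens.append(token)
        return f"route from {origin} to {destination}"

class Device:
    def __init__(self, account_id: str):
        self.account_id = account_id  # stays on the device; never sent

    def request_route(self, server: RoutingServer, origin: str, destination: str) -> str:
        token = secrets.token_hex(16)  # fresh random identifier per request
        return server.route(token, origin, destination)

server = RoutingServer()
phone = Device("user@example.com")
phone.request_route(server, "home", "office")
phone.request_route(server, "office", "gym")

print(server.seen_tokens)
```

Because each token is random and used once, the two requests cannot be linked to each other or to the account from the server's records alone.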
But when a user stores their photos, notes, and/or files in iCloud, or when they perform an iCloud backup of their device, our servers must both identify the user account to which the data belongs and retain that data as long as the user wishes. In these cases, storage is not incidental—it is the very purpose of the service. As the provider, that means we carry a dual mandate: Users entrust Apple with a weighty responsibility to help them regain their data even if they lose or break all their devices for any reason, whether from theft or disaster. At the same time, we must prevent all unauthorized access to the data, including in the case of a breach. To our knowledge there are no instances where a user’s iCloud data was stolen through a compromise of iCloud servers. In large part, that is because we work tirelessly to stay ahead of attackers. But in part there’s also an undeniable element of luck, as is true for all organizations and systems that haven’t been breached.
The Critical Role of End-to-End Encryption
Attacks on user data are relentless, and our users expect us to use every tool at our disposal to keep their data safe—as they should. For sensitive data, such as health information and passwords in iCloud Keychain, our defense of choice has long been end-to-end encryption, where only the user’s trusted devices have access to the keys protecting the data.
As privacy and cybersecurity expert Susan Landau put it previously in Lawfare, “In a world in which securing communication bits is equivalent to securing money, ideas, and business and personal information, end-to-end encryption is integral to public safety and national security.” At Apple, we wholeheartedly agree, which is why at the end of last year we launched a new, opt-in feature called Advanced Data Protection for iCloud. Fourteen categories of sensitive data in iCloud, including health information and iCloud Keychain, were already protected with end-to-end encryption by default. For users who choose to enable Advanced Data Protection, this protection expands to 23 data categories including Photos, Notes, and iCloud Backup—allowing users to secure the vast majority of their iCloud data with end-to-end encryption.
With all iCloud data that’s end-to-end encrypted, only the user’s trusted devices have the keys—Apple does not retain a copy. In the event of a hypothetical breach of Apple’s servers, a user who enabled Advanced Data Protection is protected by Apple’s highest level of cloud data security; an attacker could steal the user’s encrypted data, but the keys simply aren’t there to be stolen. It is conceivable in theory to attempt to break the encryption by trying every possible key, but we can quantify how long this would take: The attacker has virtually no chance of success before our sun runs out of hydrogen, sputters, and extinguishes.
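The scale of that brute-force claim can be checked with back-of-the-envelope arithmetic. The Python sketch below is illustrative only: the key lengths (128 and 256 bits), the attacker's guessing rate, and the rough figure of five billion years for the Sun's remaining hydrogen-burning lifetime are all assumptions chosen for the example, not figures from this article.

```python
SECONDS_PER_YEAR = 31_557_600      # Julian year: 365.25 days
GUESSES_PER_SECOND = 10**18        # assumption: a very generous attacker
SUN_REMAINING_YEARS = 5 * 10**9    # assumption: rough, commonly cited figure

def expected_years(key_bits: int) -> int:
    """Average years to find a key: on average half the key space is searched."""
    expected_guesses = 2 ** (key_bits - 1)
    return expected_guesses // (GUESSES_PER_SECOND * SECONDS_PER_YEAR)

for bits in (128, 256):
    years = expected_years(bits)
    ratio = years // SUN_REMAINING_YEARS
    print(f"{bits}-bit key: about {years:.2e} years, "
          f"or {ratio:.2e} times the Sun's assumed remaining lifetime")
```

Under these assumptions, even a 128-bit key takes trillions of years on average to guess, roughly a thousand times the assumed remaining lifetime of the Sun, and each additional key bit doubles that figure.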
End-to-end encryption, in our view, is without a doubt the strongest approach organizations can take to protect user data. And as with all powerful technologies, a successful deployment requires thoughtful consideration for the user. We have made Advanced Data Protection an option for our users, rather than the default, because the feature requires the user to take ultimate responsibility for managing their cryptographic keys. If a user loses all access to their Apple ID account, Apple will not have the encryption keys to help them recover their data.
To mitigate this concern, before introducing Advanced Data Protection, we considered it necessary to launch multiple security protocols that can help a user regain access on their own. Users can recover their iCloud data with their device passcode or password, but we require that they also set up an alternative recovery method—one or more recovery contacts or a recovery key.
Users who opt in to Advanced Data Protection can change their mind. They can elect to restore Apple as their ultimate method of data recovery by disabling the feature, and we believe it’s important they have that option. But we also believe all our users deserve the highest protection from data breaches in the cloud, and access to end-to-end encryption is the best way we know to curtail the threat. Even as attackers become more sophisticated, and criminals find new avenues to monetize breached data, there’s every reason to believe encrypted data stolen without keys will remain protected.
Motivated largely by law enforcement-related concerns, lawmakers and regulators around the world have been considering legislation that may instead require providers like Apple to retain encryption keys for user data. Though intended to help detect child abuse and terrorist recruitment, such a requirement would outlaw end-to-end encryption. We at Apple value public safety highly, and we are committed to helping law enforcement prosecute crimes within the law. But it would be a catastrophic error to undermine the data security of all citizens to improve the prospects of prosecuting specific, individual crimes—no matter how heinous. Although end-to-end encryption does not directly concern the U.S. Constitution, Justice Antonin Scalia provided a prescient guide for reasoning about its costs. As he wrote in 1987 in Arizona v. Hicks, there is “nothing new in the realization that the Constitution sometimes insulates the criminality of a few in order to protect the privacy of us all.” Scalia may well have been a cloud security expert.
End-to-end encryption is a pivotal capability that protects the privacy of journalists, human rights activists, and diplomats, and helps defend people around the world from surveillance, identity theft, fraud, and data breaches in the cloud. Companies and organizations should offer this protection to users worldwide. At a time when digital attackers are breaking into more systems than ever before, end-to-end encryption is one of our most powerful and precious defenses, and one that will stand the test of time. For the benefit of all, it must be protected.