NSA’s Yottabytes of Data

Aerial view of NSA HQ at Fort Meade, via NSA

The NSA is building two new storage facilities to house yottabytes of data, one in Utah and one in Texas. The scale is staggering. There are a thousand gigabytes in a terabyte, a thousand terabytes in a petabyte, a thousand petabytes in an exabyte, a thousand exabytes in a zettabyte, and a thousand zettabytes in a yottabyte. A yottabyte is therefore 1,000,000,000,000,000 GB¹.
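The unit ladder above is easy to check with a few lines of arithmetic. Here is a minimal sketch in Python, assuming decimal (SI) prefixes where each step is a factor of 1,000; the names `PREFIXES` and `bytes_in` are my own, made up for illustration.

```python
# Decimal (SI) prefixes, smallest to largest; each step up is a factor of 1,000.
# Assumption: SI units (powers of 1,000), not binary units (powers of 1,024).
PREFIXES = ["kilo", "mega", "giga", "tera", "peta", "exa", "zetta", "yotta"]


def bytes_in(prefix: str) -> int:
    """Return the number of bytes in one <prefix>byte under the SI assumption."""
    return 1000 ** (PREFIXES.index(prefix) + 1)


# How many gigabytes fit in one yottabyte?
gigabytes_per_yottabyte = bytes_in("yotta") // bytes_in("giga")

print(f"1 YB = {bytes_in('yotta'):,} bytes")
print(f"1 YB = {gigabytes_per_yottabyte:,} GB")
```

Five steps of a thousand separate the gigabyte from the yottabyte, so the ratio is 1000^5 = 10^15, which matches the figure quoted above.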

This data is mostly pocket litter: trillions of phone calls, email messages, web searches, parking receipts, bookstore visits, and other data trails.

The issue is critical because at the NSA, electrical power is political power. In its top-secret world, the coin of the realm is the kilowatt. More electrical power ensures bigger data centers.

Does this scare you? As for myself, no. As a mathematician, I’ve often fantasized about working at the NSA. From the article, it seems like they don’t have enough supercomputing power to deal with the vast amounts of data that they are harnessing and storing.

I like the term “infoweapons” used in this article. Infoweapons are supercomputers running complex algorithmic programs.

On a remote edge of Utah’s dry and arid high desert, where temperatures often zoom past 100 degrees, hard-hatted construction workers with top-secret clearances are preparing to build what may become America’s equivalent of Jorge Luis Borges’s “Library of Babel,” a place where the collection of information is both infinite and at the same time monstrous, where the entire world’s knowledge is stored, but not a single word is understood. At a million square feet, the mammoth $2 billion structure will be one-third larger than the US Capitol and will use the same amount of energy as every house in Salt Lake City combined.

Unlike Borges’s “labyrinth of letters,” this library expects few visitors. It’s being built by the ultra-secret National Security Agency—which is primarily responsible for “signals intelligence,” the collection and analysis of various forms of communication—to house trillions of phone calls, e-mail messages, and data trails: Web searches, parking receipts, bookstore visits, and other digital “pocket litter.” Lacking adequate space and power at its city-sized Fort Meade, Maryland, headquarters, the NSA is also completing work on another data archive, this one in San Antonio, Texas, which will be nearly the size of the Alamodome.

Just how much information will be stored in these windowless cybertemples? A clue comes from a recent report prepared by the MITRE Corporation, a Pentagon think tank. “As the sensors associated with the various surveillance missions improve,” says the report, referring to a variety of technical collection methods, “the data volumes are increasing with a projection that sensor data volume could potentially increase to the level of Yottabytes (10²⁴ Bytes) by 2015.”[1] Roughly equal to about a septillion (1,000,000,000,000,000,000,000,000) pages of text, numbers beyond Yottabytes haven’t yet been named. Once vacuumed up and stored in these near-infinite “libraries,” the data are then analyzed by powerful infoweapons, supercomputers running complex algorithmic programs, to determine who among us may be—or may one day become—a terrorist. In the NSA’s world of automated surveillance on steroids, every bit has a history and every keystroke tells a story.