Every Six Hours, the NSA Gathers as Much Data as Is Stored in the Entire LoC



lapis
11th May 2011, 08:07 PM
LoC=Library of Congress.

http://www.popsci.com/technology/article/2011-05/every-six-hours-nsa-gathers-much-data-stored-entire-library-congress

The National Security Agency is, by nature, an extreme example of the e-hoarder. And as the governmental organization responsible for things like, say, gathering intelligence on such Persons of Interest as Osama bin Laden, that impulse makes sense--though once you hear the specifics, it still seems pretty incredible. In a story about the bin Laden mission, the NSA very casually dropped a number: Every six hours, the agency collects as much data as is stored in the entire Library of Congress.

That data includes transcripts of phone calls and in-house discussions, video and audio surveillance, and a massive amount of photography. "The volume of data they're pulling in is huge," said John V. Parachini, director of the Intelligence Policy Center at RAND. "One criticism we might make of our [intelligence] community is that we're collection-obsessed — we pull in everything — and we don't spend enough time or money to try and understand what do we have and how can we act upon it."

NSA's budget is not disclosed by law, but we'd imagine it would be awfully expensive and difficult to even listen to such vast quantities of data, let alone analyze it intelligently. They mostly listen for keywords now--bits that don't make sense (and thus might be code), certain red-flag words (like, well, "bomb," which seems kind of unsubtle but I guess we're talking about terrorists here and of course it's possible there are intricacies of language that are missing in translation), and any conversation between principals like bin Laden. Still, next time you're aghast at how much space the entire series of Blue Planet takes on your hard drive, just be glad you're not the NSA.

[Baltimore Sun (http://www.baltimoresun.com/news/maryland/bs-md-nsa-bin-laden-20110507,0,2220255,full.story)]

Ponce
11th May 2011, 08:32 PM
What they call "for your safety" today will be their weapon of choice tomorrow, to be used against you.

Gaillo
11th May 2011, 09:25 PM
The Library of Congress, by many accounts, holds approx. 11 terabytes (a terabyte being a million megabytes, or a thousand gigabytes) of data. Currently, a 1TB drive costs approx. $100 FRN. So, this means that if the NSA is actually archiving all that data, they are spending somewhere around $1,000 every 6 hours... or $4,000 per day. Of course, they are probably compressing the HELL out of the data before storage, which MIGHT give them a 10:1 reduction on photographic/video/voice data and maintain enough fidelity for later detailed analysis. So, let's assume $400 FRN per day; that STILL amounts to $146,000 FRN per year - and that's assuming no backups (UNTHINKABLE!). Double or triple that figure to account for redundant backups, and you're getting awfully close to half a million per year JUST FOR STORAGE MEDIA... no wages for governazis to install the drives, retrieve and catalog them, etc. I know... that's mere chickenfeed in governmental budget terms, but a colossal waste of money nonetheless!
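For anyone who wants to check the arithmetic, here is a rough sketch in Python using only the figures from the post above (11 TB per intake, $100 per 1TB drive, one intake every 6 hours). The function name and its parameters are made up for illustration. The post rounds 11 drives down to about $1,000 per intake, so the unrounded numbers come out roughly 10% higher.

```python
# Back-of-the-envelope storage media cost, using the figures quoted in the post.
# All constants are the poster's estimates, not verified numbers.
LOC_TB = 11                  # assumed size of the Library of Congress, in TB
DRIVE_COST_PER_TB = 100      # dollars for one 1TB drive
INTAKES_PER_DAY = 24 // 6    # one Library of Congress every six hours


def yearly_media_cost(compression_ratio=1, copies=1):
    """Dollars per year spent on drives alone (hypothetical helper)."""
    raw_cost = LOC_TB * INTAKES_PER_DAY * DRIVE_COST_PER_TB * copies * 365
    return raw_cost / compression_ratio


print(yearly_media_cost())                                # no compression
print(yearly_media_cost(compression_ratio=10))            # 10:1 compression
print(yearly_media_cost(compression_ratio=10, copies=3))  # plus two backups
```

With 10:1 compression and three copies this lands a little under half a million dollars per year, in line with the ballpark above.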

platinumdude
11th May 2011, 09:34 PM
You think we still use hard drives for storage? We use something better, but it is a lot costlier.

midnight rambler
11th May 2011, 09:37 PM
Gaillo wrote: "a colossal waste of money nonetheless!"

I'm thinking you may not get it. The whole point is to waste as much money as possible.

Gaillo
11th May 2011, 09:42 PM
platinumdude wrote: "You think we still use hard drives for storage? We use something better, but it is a lot costlier."


Oh... DO tell! ;D

vacuum
11th May 2011, 09:48 PM
Gaillo wrote: "... Of course, they are probably compressing the HELL out of the data before storage, which MIGHT give them a 10:1 reduction on photographic/video/voice data ... you're getting awfully close to half a million per year JUST FOR STORAGE MEDIA ..."

When you compress media (audio/video) generally you don't even get 10% compression, much less 10:1.

Anyway, I'd guess the data storage cost is probably much higher than your estimate.

Looking at a storage solution similar to what they'd use, Amazon data storage costs, we'll say, 5 cents/GB/month for more than 5000 terabytes of storage. (Obviously government costs for the same storage level would be much higher than private industry.) So 11 TB of data per 6 hours is 44 TB/day, or about 1,320 TB/month. Assuming they have a 5 year window that they retain data, that is a storage amount of 79,200 TB total stored. Amazon would store it redundantly for about $47.5 million/year.

Since this is NSA, obviously there'd be much heavier encryption and protection of the data than Amazon would provide. For example, they have to build their own buildings to house the data, with proper security. Probably redundant power systems, etc. They probably have written all their own custom software for the database and other things. My guess is at least an order of magnitude more than $47.5 million.
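The cloud-storage estimate can be worked through the same way. This is only a sketch under the assumptions stated in the post (11 TB every 6 hours, a 5 year retention window, 5 cents/GB/month), plus two assumptions of my own: a 30-day month and 1 TB = 1000 GB. The function names are hypothetical.

```python
# Steady-state cloud storage cost once the retention window is full.
TB_PER_INTAKE = 11            # from the thread: one Library of Congress
INTAKES_PER_DAY = 4           # every six hours
DAYS_PER_MONTH = 30           # assumption, for round numbers
RETENTION_MONTHS = 5 * 12     # 5 year retention window
PRICE_PER_GB_MONTH = 0.05     # dollars, the "we'll say" price from the post
GB_PER_TB = 1000              # assumption: decimal terabytes


def tb_per_month():
    """Terabytes collected in one 30-day month."""
    return TB_PER_INTAKE * INTAKES_PER_DAY * DAYS_PER_MONTH


def retained_tb():
    """Terabytes held once five years of data have accumulated."""
    return tb_per_month() * RETENTION_MONTHS


def yearly_cost():
    """Dollars per year to keep the full retention window stored."""
    return retained_tb() * GB_PER_TB * PRICE_PER_GB_MONTH * 12


print(tb_per_month(), "TB collected per month")
print(retained_tb(), "TB retained")
print(round(yearly_cost()), "dollars per year")
```

Under these assumptions, four intakes a day over a 30-day month come to 1,320 TB per month, 79,200 TB retained, and roughly $47.5 million per year before any of the NSA-specific overhead.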

Gaillo
11th May 2011, 10:05 PM
(My original post deleted for brevity...)

vacuum wrote: "When you compress media (audio/video) generally you don't even get 10% compression, much less 10:1."


That depends on whether you are using lossless or lossy compression. True, if you take an image file or video file and run it through something like WinZip (lossless), you will typically get VERY little compression (not even the 10% you cite... my experience is that if you get 2 or 3% you are doing VERY well!). However, if you do something like a JPEG DCT compression, you can take a scanned image and get 20:1 over the raw data with almost ZERO perceptible loss in image quality, and push 40:1 or 50:1 with very few noticeable artifacts. Video is a little worse... but I routinely take DVD MPEG-2 video files and run them through an MPEG-4 ASP compressor (XviD) and achieve 4:1 compression with almost ZERO loss in video quality (at least to my eyes). H.264 compression does even better... though I rarely compress with that codec due to increased compression time and because not many standalone decompressor devices support it. Audio will give you INCREDIBLE compression when using something lossy like MP3 or AAC... you can achieve 100:1 compression and EASILY maintain everything you would need to hear in a phone conversation for detailed analysis, probably even voice "fingerprinting" recognition of an individual for later electronic identification.

Based on the above, I just kind of "averaged" it all and picked 10:1 ratio for my analysis just to be conservative... and to assume that the .gov intelligence agencies probably have access to the best audio/video/image compression technology known to man.
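The lossless half of this point is easy to try at home. Below is a small Python sketch using the standard zlib module. The random bytes are only a stand-in (my assumption) for already-encoded media such as a JPEG or MPEG file, which looks close to random to a general-purpose compressor. It says nothing about what lossy codecs can achieve.

```python
import os
import zlib


def lossless_ratio(data: bytes) -> float:
    """Original size divided by zlib-compressed size (higher is better)."""
    return len(data) / len(zlib.compress(data, 9))


# Highly redundant input: the same line of text repeated many times.
redundant = b"the quick brown fox jumps over the lazy dog\n" * 20_000

# High-entropy input: random bytes standing in for already-compressed media.
high_entropy = os.urandom(len(redundant))

print(f"redundant text:  {lossless_ratio(redundant):.1f}:1")
print(f"random stand-in: {lossless_ratio(high_entropy):.3f}:1")
```

The redundant text shrinks enormously, while the random stand-in barely changes size (it can even grow slightly), which is why zipping media files gains almost nothing and the big ratios have to come from lossy codecs.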

Ares
12th May 2011, 03:37 AM
Gaillo wrote: "... Currently, a 1TB drive costs approx. $100 FRN. ..."


You're thinking regular SATA drives. Problem is, those drives are not fast enough: typically only 7200 RPM, where SCSI drives have speeds of 10,000 or 15,000 RPM. Most likely they are using SAS drives (Serial Attached SCSI), for which there is no 1TB capacity as of yet. Just a LOT of them. Of course they could be using SSDs (Solid State Drives), but those are extremely expensive. So knowing government and its propensity for waste, I'd bet they are using Solid State Drives. And forget compression; they'll store it uncompressed because they'll just send a bill to you and me for more storage.

Agrippa
12th May 2011, 03:52 AM
The NSA has been around for quite a while. Knowing how quickly government responds to change, I'd be unsurprised to find that they were storing all of this data on punch cards.

Glass
12th May 2011, 04:36 AM
I remember a story about some young computer whiz kid who developed an algorithm for storing something like 1GB of data on a floppy disk. He sold the technology and it hasn't been heard of since. Here's an article about it. (http://groups.google.com/group/comp.compression/browse_thread/thread/be66aba853709c9e/596f80a2234c80a3)

It doesn't say how good it was so I'm guessing the numbers as best I can recall. They were big numbers though.
With the social networks, including the blogs and forums, you can see there is so much noise to sift through; it's an impressive task. To trawl through all that chatter for important stuff needs serious power and storage. It's worth it to them because people map out who they know in those things and it would save them a lot of work. The rest is just crunched down, revealing the important info. Keeps the space demands as low as possible. At least that's the theory. I wonder how long they keep it for. I think they have some impressive tech at their disposal.

gunDriller
12th May 2011, 07:26 AM
Ares wrote: "... Of course they could be using SSD (Solid State Drive) but those are extremely expensive. ..."


they might use SSD on the front end, where the info is coming in like a fire hose.

but for massive data storage needs, they probably use humongous banks of SATA drives. much cheaper than SAS, and even the NSA has a budget, and that's a LOT of hard drives.

i just wish they would share their reliability data.

lapis
12th May 2011, 04:51 PM
Sounds like a high-tech version of the USDA job of taking pictures of and monitoring kids' school lunches.

gunDriller
12th May 2011, 06:48 PM
Agrippa wrote: "The NSA has been around for quite a while. Knowing how quickly government responds to change, I'd be unsurprised to find that they were storing all of this data on punch cards."


good. i have a legacy copy of DOS 3.3. i'd love to get $10K for it.

StreetsOfGold
12th May 2011, 07:25 PM
The NSA is child's play compared to the one who made them.
Examples:

*Matthew 10:30 But the very hairs of your head are all numbered.
*Psalm 147:4 He telleth the number of the stars; he calleth them all by their names.
*Psalm 139:4 For there is not a word in my tongue, but, lo, O LORD, thou knowest it altogether.

The worry over men and their puny inventions PALES in comparison to God.

Proverbs 29:25 The fear of man bringeth a snare: but whoso putteth his trust in the LORD shall be safe.