May 25th, 2009
05:03 PM ET

What if someone stole the Library of Congress?

Editor's Note: Last week, The National Archives - a repository of important government documents, including the U.S. Constitution - announced it had lost a computer hard drive containing large volumes of Clinton administration records, including the names, phone numbers and Social Security numbers of White House staff members and visitors.  Officials at the Archives say they don't know how many confidential records are on the hard drive. But congressional aides briefed on the matter say it contains "more than 100,000" Social Security numbers and Secret Service and White House operating procedures. David Gewirtz tells us why we should be concerned.

[Photo caption: The National Archives has lost a hard drive containing large volumes of Clinton administration records.]

David Gewirtz | BIO
AC360° Contributor
Editor-in-Chief, ZATZ Publishing

What if thieves broke into the Library of Congress one night and stole 10 percent of all its books? I'm not saying that happened, but it'd be a pretty big theft, wouldn't it? The Library of Congress houses one of almost every book ever published, so if someone broke in and stole one out of ten, that'd be a lot of books to haul away.

But what if someone just stole a $250, five-pound hard drive, about the size of half a box of Wheaties, from the National Archives? Everyone needs more storage space these days, and a nice 2 terabyte hard drive sitting on a table might have been a juicy target for someone walking by - a janitor, an IT tech, a secretary. It's small and easy to walk off with: stick it in a book bag, a lunch bag, or even a trash bag.

It's not really a big deal, is it? So, somebody stole a hard drive. Happens all the time, right?

Well, it does. People steal things and hard drives are nice. After all, there's a limit to how many YouTube videos of farts lit on fire you can store on your own computer without some extra storage space. But when the drive that goes missing contains hundreds of thousands of records of private citizens' personally identifiable data, such as Social Security numbers, as well as security procedures at the White House, it might be a bit more serious. That's what happened last week.

But that's not even the real issue. The real issue is just how much data is stored on these teeny-weeny devices and how much information might get into the wrong hands if one is acquired with the five-finger discount.

What does 2 terabytes really mean?

So what does 2 terabytes really mean? That's the size of the drive that was stolen from the National Archives. The best way to answer that is through some analogies. And I gave a hint at the beginning of this article.

The entire text content (the words) in the Library of Congress, from all books ever published, is about 20 terabytes. The drive that was stolen is 2 terabytes. Put another way, if you had 10 of these drives, you could store every printed word ever published - and those ten drives would together weigh less than one big dog.

But what does that really mean? One way to picture it is to go all biblical. No, I'm not talking fire and brimstone. Instead, I'm talking actual bibles. The King James Bible to be precise. One King James Bible is about 2.5 megabytes - and a megabyte is a thousandth of a thousandth of a terabyte.

In other words, you could fit 800,000 King James Bibles on that stolen drive.

But what if Bibles aren't your thing? What if Harry Potter's more your speed? Well, you could fit 31,250 copies of all seven of the Harry Potter novels on that drive.
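The arithmetic behind those figures is easy to check. What follows is only a rough sketch, assuming decimal units (1 terabyte = 1,000,000 megabytes). The function name is made up for illustration, and since the size of a seven-book Harry Potter set isn't given above, it is worked backward from the 31,250 figure rather than measured.

```python
# Rough check of the storage figures, assuming decimal units.
MB_PER_TB = 1_000_000    # a megabyte is a thousandth of a thousandth of a terabyte

DRIVE_TB = 2             # capacity of the missing drive
LIBRARY_TEXT_TB = 20     # estimated text content of the Library of Congress
BIBLE_MB = 2.5           # approximate size of one King James Bible


def copies_that_fit(drive_tb, item_mb):
    """Hypothetical helper: whole items of item_mb megabytes that fit on drive_tb terabytes."""
    return int(drive_tb * MB_PER_TB // item_mb)


bibles = copies_that_fit(DRIVE_TB, BIBLE_MB)

# Number of 2 terabyte drives needed to hold the library's text.
drives_for_library = LIBRARY_TEXT_TB / DRIVE_TB

# Implied size of one seven-book Harry Potter set (an inference, not a stated figure).
potter_set_mb = DRIVE_TB * MB_PER_TB / 31_250

print(bibles, drives_for_library, potter_set_mb)
```

Under those assumptions, the Bible count and the ten-drive figure both fall straight out of the division, and the Harry Potter number implies a set of roughly 64 megabytes.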

Still can't picture it? What if it was paper?

Stacked one on top of the other, those 800,000 bibles would reach 21.5 miles high, 3.9 times the height of Mount Everest, 3.2 times the greatest ocean depth - the height of 372.2 Statues of Liberty, stacked on top of each other.
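For anyone who wants to check the stack, here is a back-of-the-envelope sketch. The reference heights are assumptions not given above (Mount Everest at about 29,029 feet, the Statue of Liberty at about 305 feet from the ground to the tip of the torch), and the per-Bible thickness is simply what the 21.5-mile figure implies.

```python
# Back-of-the-envelope check of the paper stack.
FEET_PER_MILE = 5280

STACK_MILES = 21.5       # stack height quoted above
BIBLES = 800_000         # copies that fit on the drive

EVEREST_FT = 29_029      # assumed height of Mount Everest
LIBERTY_FT = 305         # assumed height of the Statue of Liberty, ground to torch

stack_feet = STACK_MILES * FEET_PER_MILE

# Thickness of a single Bible implied by the quoted stack height.
inches_per_bible = stack_feet * 12 / BIBLES

everests = stack_feet / EVEREST_FT
liberties = stack_feet / LIBERTY_FT

print(round(inches_per_bible, 2), round(everests, 1), round(liberties, 1))
```

With those assumed heights, the numbers line up with a Bible about 1.7 inches thick.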

Put another way, you could give a King James Bible to each of the residents of Crawford, Texas - and to another thousand towns the same size. Or, if you're in Kennebunkport, Me., you could give a full set of the Harry Potter novels to every resident - and every resident of 372 towns the same size.

I could go on and on with the fun examples, but you get the idea. You can store a lot of data on a 2 terabyte hard drive.

It's infuriating that security was so lax at the National Archives that a drive with that much important information got stolen. But the real issue is this - our data storage devices are becoming so large (in capacity) while getting cheaper and cheaper and physically smaller and smaller that this sort of data loss is going to become more and more common.

And while you can't easily walk off with a huge file cabinet of super-secret government data, it's apparently incredibly easy to walk off with a hard drive containing a thousand or more file cabinets worth of equally juicy, secret, important information.

The lesson here is pretty simple: security needs to get better. And that doesn't apply just to our government employees (your tax dollars at work), but to all of us. We keep our entire lives on these little electronic marvels. But they're so little, they're hard to keep track of - and when they go missing, a whole lot of information can go with them at the same time.

OK, OK, one more example. Lined up end-to-end, 800,000 copies of the King James Bible (the number of copies one of those hard drives can store) would run for 113.6 miles - or 66.794 Golden Gate Bridges lined up, end-to-end.

That's a lot of data to go missing. Keep an eye on your hard drives.

Follow David on Twitter at http://www.twitter.com/DavidGewirtz

Editor’s note: David Gewirtz is Editor-in-Chief, ZATZ Magazines, including OutlookPower Magazine. He is a leading Presidential scholar specializing in White House email. He is a member of FBI InfraGard, the Cyberterrorism Advisor for the International Association for Counterterrorism & Security Professionals, a columnist for The Journal of Counterterrorism and Homeland Security, and has been a guest commentator for the Nieman Watchdog of the Nieman Foundation for Journalism at Harvard University. He is a faculty member at the University of California, Berkeley extension, a recipient of the Sigma Xi Research Award in Engineering and was a candidate for the 2008 Pulitzer Prize in Letters.

soundoff (12 Responses)
  1. John Williams

    Have we run out of real news?

    May 26, 2009 at 12:23 am |
  2. David, Indiana

    I wonder if you measured each of those books in terms of the time it took to write them, how long would that be? It's interesting that in putting the data on the hard drive that info becomes more vulnerable. Also, I feel people forget that "generating content" is, in the case of writing a book, a labor of love. Sometimes, an author's deepest emotions are laid bare and they show what has shaken them to the core. I wouldn't want to lose that record of emotions. That's closeness and intimacy in action. Not only the glory of letters and language, but a practical everyday communication that people need.

    May 25, 2009 at 11:45 pm |
  3. William of Iowa

    Interesting news. I never imagined that the Library of Congress would be the repository for classified information. Maybe I do not understand the mission of the Library. Next visit I'll ask for some old Civil War pictures, a Teddy Roosevelt speech and the tapes of the President's morning security briefing. Jeez.

    May 25, 2009 at 10:40 pm |
  4. lisaonline

    I remember the first time I saw a reference to a "terabyte". It was quite a few years ago, in the back of a computer supplies and services magazine. It was a 1 TB W.O.R.M (Write Once Read Many) drive, and it was about the size of a large footstool.

    That would have almost been more effective! Not only were those old units read-only, but also just a little more difficult to haul away discreetly. Somehow it would make a funnier story if it wasn't a security breach.

    May 25, 2009 at 8:42 pm |
  5. Annie Kate

    There are various methods for locking down hard drives so they don't walk off. I'm surprised that computer security had not taken these precautions with the hard drive that disappeared. It may not completely deter a determined thief but it slows them down. Another question that needs to be looked at is whether they had a backup of the data on this hard drive, or was that the only source of the data, and can it be reconstructed? Hopefully, this theft will inspire the Archives to really put some security and backup measures with teeth into effect so it does not happen again.

    May 25, 2009 at 6:14 pm |
  6. earle,florida

    We have a new man in charge at the nat'l archives for 4 1/2 months and what happens? There are eleven, soon to be twelve, presidential libraries with a budget of several hundred million (not quite sure on budget, or functioning libraries). Perhaps we need to close the barn door now...?

    May 25, 2009 at 4:58 pm |
  7. eunice

    Actually, the question should be "Would anyone even care?" Most public libraries are fighting for funding because no one apparently uses them anymore. There are high school students who have never stepped into their school's library. If the public can't get juiced up about the library in their own neighborhood, how could anyone convince them that the Library of Congress is worth saving? My guess is they would just use Google or Wikipedia to get the information that's already posted.

    May 25, 2009 at 4:19 pm |
  8. Mona Sarrette

    Shouldn't these storage devices have some kind of LoJack device attached? This is very frightening. In the wrong hands, someone else might be able to access my big Social Security account before I have a chance to.

    May 25, 2009 at 2:52 pm |
  9. john

    When I saw "stole library of congress" I thought of a fun joke about Carmen Sandiego. But man, this is no joke, there could be tons of info on that drive that could be exploited. Uh oh.

    May 25, 2009 at 12:50 pm |
  10. Evelyn

    I am concerned. Yes. Now, do we know what is missing? If we know the degree of sensitivity of the data that is on the stolen device, we need to develop worst-case scenarios where this information, being in the wrong hands, is misused and abused. Our focus can be both on retrieval and on protocol to face consequences.

    May 25, 2009 at 12:41 pm |
  11. Jim Platt

    I would imagine, IF there was anything to be gained by stealing the archives, one of our esteemed congressional members, or more, would have done it a long time ago...

    May 25, 2009 at 12:00 pm |