RISKY THINKING - February 2010
by Michael Z. Bell
Principal Consultant, Albion Research Ltd.
A free newsletter providing essays, analysis, insights, and oddities related to Business Continuity, Disaster Recovery, and Risk Management.
To subscribe, visit: http://www.RiskyThinking.com/newsletter
Some essays may also be available from the RiskyThinking website at http://www.RiskyThinking.com/.
- Of Backups and Bare Metal Restore
- Not Quite a Word From Our Sponsor
- News Of The World
- Administrivia, Subscribing and Unsubscribing
Your data's safe, isn't it? If a disaster happened, you could simply buy new computers, restore from backups, and continue working. Or could you? Welcome to Mike's First Rule of Real World Backups: backups don't exist unless you test them.
Your company keeps regular backups, doesn't it? Buried in an air-conditioned storeroom on the first floor of the building (safe from hurricanes and flooding) there's a shelf (or better still, a fireproof media safe) full of old backup tapes and DVDs. Each day or each week you back up everything and send the media off site to be stored safely in a secret location, which you imagine is some underground fortress surrounded by barbed wire fences, patrolled by armed guards in military uniforms with ferocious Rottweilers. Even a James Bond villain would envy the mighty fortress in which your data is kept.
Or perhaps your company is past all that. Through the wonders of the internet, your data is automatically replicated in three different locations, scattered around the globe, heavily encrypted so that even NSA scientists working with secret quantum computers would require many millennia to read your email. Once again the backup locations are in top secret bunkers protected by armed guards from the ravages of war, civil insurrection, and all the Acts of God listed in loving detail in your insurance policies. Even if the world is annihilated by a nuclear war or an alien invasion, you know your data will survive.
Of course, we know that in reality the data is stored in some anonymous warehouse patrolled by a bored security guard on minimum wage, but it's still off-site and safe. And we can dream. If the worst happens we can restore everything easily, can't we?
It's a sad truth about backups that although we make them, we rarely test them. And we rarely test them under the conditions that would occur following a disaster.
As T. S. Eliot puts it in the poem The Hollow Men:
Between the idea
And the reality
Between the motion
And the act
Falls the shadow.
I first came across The Shadow in my first job. I was working for the research and development division of a software house. In those days, off-site backup consisted of sending a removable hard disk (probably holding all of 10 Mbytes) off to a service bureau each week. The company would then copy the contents of the disk to magnetic tape, store the tape, and then return the disk drive to us.
Fortunately we never needed those off-site backups.
About a year after I started working there we received a check and an apology from the service bureau. It transpired that an operator had been omitting one vital step from this process: actually copying the files to tape. The backup tapes were blank. (The rumor we heard was that the operator had observed our write-only use of tapes, and figured he could do his job more efficiently if he omitted the tape backup part of his job.)
Thus Mike's First Rule of Real World Backups: backups don't exist unless you test them.
The Shadow, of course, turns up regularly on a small scale. How many times have you heard the plaintive user's cry: "But I thought that was being backed up?" Typically users assume that the IT department is backing up their computer. In reality the IT department doesn't have the space or time to store copies of the MP3 files, games, and other personal stuff that has somehow found its way onto the work computer. So backup is limited to a network drive, and excludes stuff stored in "My Documents" and programs installed without the IT department's knowledge. The user probably got told about the backup policy during their first day at work along with how to complete a timesheet and where to find the photocopier. But the fact was never used and quickly forgotten.
This doesn't just happen at the user level. In a recent business continuity audit, we noted that responsibility for making backups had been transferred when the person who normally made the backups left suddenly. The unfortunate soul who had the job suddenly dumped upon him did the job as he supposed it should be done. There were no written procedures, and few records. While some things were being backed up, it was evident to us that other things were not.
Thus a corollary to the First Rule is: just because something is being backed up, you can't assume everything is being backed up.
In that same audit, we noted that the lack of records meant that if a backup tape needed to be recalled from off-site storage to rebuild a particular server, there was no method (apart from reading the catalog off the backup tapes) of determining which tape would need to be couriered from off-site storage. So if a restore was required, all recent tapes would have to be recalled and read to find the tape that was needed. (The client assured us that tapes could easily be retrieved by the date on which they were stored: we were never quite convinced of this.)
Another corollary to the First Rule is therefore: if you can't figure out how to get it back from off-site storage, you don't really have an off-site backup.
When you do get the tapes back (and assuming you also have the hardware necessary to read the tape) it should all be plain sailing, right?
Not so. You are now faced with doing a Bare Metal Restore: restoring your backups to a machine on which nothing is installed. This is theoretically easy if the machine you are restoring is the same as the machine you backed up. But it isn't. That machine is buried in a pile of smoking rubble. What you have is what your quick ship supplier promised would be a machine with an identical configuration, but isn't quite. It's got bigger disks (they don't make the type of disks you use any more), a different motherboard, a faster network adapter, and a different CPU.
We recently had an experience of the joys of a bare metal restore on a small scale. We had a full system backup made with Microsoft's NTBACKUP program. Here's what should have happened, according to Microsoft:
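To make that corollary concrete, here is a minimal sketch of the kind of record that was missing in that audit: a catalog, kept away from the tapes themselves, that says which tape to ask the courier for. Everything in it is an assumption for illustration: the tape labels, the server names, and the idea that a rebuild needs the latest full backup plus any later incrementals. Your backup software may well keep its own catalog; the point is only that somebody must be able to answer the question without reading every tape.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CatalogEntry:
    tape_label: str    # label on the physical tape (hypothetical scheme)
    server: str        # server whose data is on the tape
    backup_date: date  # date the backup was taken
    kind: str          # "full" or "incremental" (assumed backup scheme)


def tapes_to_recall(catalog, server, as_of):
    """Return the tape labels needed to rebuild `server` as of `as_of`:
    the latest full backup on or before that date, plus every later
    incremental up to that date. Returns [] if no full backup exists."""
    relevant = sorted(
        (e for e in catalog if e.server == server and e.backup_date <= as_of),
        key=lambda e: e.backup_date,
    )
    fulls = [e for e in relevant if e.kind == "full"]
    if not fulls:
        return []
    last_full = fulls[-1]
    needed = [last_full] + [
        e for e in relevant
        if e.kind == "incremental" and e.backup_date > last_full.backup_date
    ]
    labels = []
    for entry in needed:
        if entry.tape_label not in labels:  # one tape may hold several backups
            labels.append(entry.tape_label)
    return labels


# Made-up catalog entries, purely for illustration.
catalog = [
    CatalogEntry("T-0101", "mailserver", date(2010, 1, 3), "full"),
    CatalogEntry("T-0102", "fileserver", date(2010, 1, 3), "full"),
    CatalogEntry("T-0110", "mailserver", date(2010, 1, 10), "full"),
    CatalogEntry("T-0111", "mailserver", date(2010, 1, 11), "incremental"),
    CatalogEntry("T-0112", "mailserver", date(2010, 1, 12), "incremental"),
]

print(tapes_to_recall(catalog, "mailserver", date(2010, 1, 11)))
# prints ['T-0110', 'T-0111']
```

A spreadsheet would do the same job. What matters is that the catalog is stored somewhere that survives the disaster, and not only on the server it describes.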
- Install Windows XP
- Restore full system backup, overwriting system files.
- Repair the XP installation from an installation CD to update Windows to match the hardware. (Note: do NOT use the Recovery Console. Take the subsequent option to repair an existing installation instead of overwriting it.)
- Reactivate Windows
Here's what actually happens:
- You install Windows XP on the replacement hardware.
- You use NTBACKUP to restore everything, overwriting system files.
- You reboot.
- As instructed, you repair the XP installation from an installation CD to update Windows to match the new hardware.
- You find Windows won't run because it isn't activated. When you try to activate it, you get an error message telling you the program MSOOBE.EXE has crashed.
- You spend several hours searching the web. You find that MSOOBE.EXE is a Microsoft program called, ironically, "Microsoft Out of Box Experience". You find that Microsoft has a lot of information about making backups with NTBACKUP.EXE, but almost nothing about restoring a system from backups.
- Finally you find this helpful posting from Kunal Mudliyar of Pune, India http://social.microsoft.com/Forums/en-US/genuinewindowsxp/thread/9e2a22a4-429e-4abc-9f8e-1735e46fb0c4
- You boot Windows in Safe Mode, find the hidden spuninst.exe program in the hidden folder c:\windows\ie7\spuninst and uninstall Internet Explorer 7.
- You reboot Windows in normal mode and activate it.
- You reinstall Internet Explorer 7.
- A day has elapsed. You leave Windows updating itself and adjourn to the pub.
"Between the idea, And the reality ... Falls the Shadow".
Bare metal restore is not the same as a simple file restore. So the final corollaries are these: you don't know how long a bare metal restore to different hardware will take until you try it. And: just because it worked last week doesn't mean it will work this week.
So to conclude: Mike's First Rule of Real World Backups is: backups don't exist unless you test them. So don't assume: test. And test using a good approximation to the real world. Different hardware. Different people. Tapes recalled from off-site storage (or files transferred from an on-line vault). And just because it worked last time, don't assume that a system update won't screw you up this time.
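For the file-level part of such a test, one simple approach is to restore to a scratch machine and compare checksums against the live data. The sketch below is only that, a sketch: the function names are made up for this example, it assumes you can reach both directory trees from one machine, and it says nothing about whether the restored system will boot or activate (which, as described above, is the hard part).

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of one file, read in 1 MB chunks to keep memory use low."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root: Path) -> dict:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_restore(original: Path, restored: Path):
    """Compare a test restore against the original tree.

    Returns (missing, changed): files absent from the restore, and files
    whose contents differ. Files modified since the backup was taken will
    also show up as changed, so expect a little noise on live data."""
    src = tree_digests(original)
    dst = tree_digests(restored)
    missing = sorted(set(src) - set(dst))
    changed = sorted(n for n in src.keys() & dst.keys() if src[n] != dst[n])
    return missing, changed
```

Calling `verify_restore(Path("/srv/data"), Path("/mnt/restore-test/data"))` (both paths hypothetical) gives two lists to read through. A long `missing` list is the First Rule's first corollary in action: something was being backed up, but not everything.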
[You can comment on this article at the Risky Thinking blog at http://blog.riskythinking.com/ ]
A reminder that if you still don't have your pandemic plan in place, you can get software to assist you with writing one. See http://www.riskythinking.com/bpps for more details.
In this section we look at the interesting, the instructive, and the downright odd from the world of Business Continuity, Disaster Recovery, and Risk Management.
Fire and Ice
An apartment building floods after sprinklers freeze in uncommonly low temperatures. Ironically, the measure put in place to prevent one risk creates another.
(Note that there are dry sprinkler systems for use in potentially freezing conditions: it was unseasonably cold in Kansas.)
How Not To Keep a Secret Very Well
Want to keep something really confidential? Here's how not to do it.
Copy some files from the company president's laptop onto your web server, let them get indexed by Google, and then leave them there for five years.
I won't provide a direct link, but if you Google "email@example.com" and poke around you'll have no trouble identifying the site and the owner of the laptop. (I found the site from a Google search for another email address on the same email list.)
The files in question are all dated 13 April 2005. No, it wasn't a Friday.
Exploding Air Beds
Well here's a risk you don't hear about every day: exploding air beds. This one devastated a German couple's apartment. Guess it's not a good idea to repair your air bed with one of those puncture repair kits that are designed for automobile tires and use inflammable gases.
Not Scared Enough of Flying?
Just before Christmas in December 2001 we had the Shoe Bomber. Then on Christmas Day 2009 we had the Underpants Bomber. Think better airport screening will solve everything? Then consider this story from last year. A suicide bomber injured Saudi Prince Mohammed with a bomb apparently concealed inside himself. Not only does this create a difficult problem for airport security (MRI anyone?), it also leaves us with the problem of what to call the first person to get caught using this tactic. The Stomach Bomber? The Gut Bomber? Or do those both sound too much like Aussie slang for a bad meal?
Underpants Bomber: http://en.wikipedia.org/wiki/Underpants_bomber
Not Scared Enough of Flying - The Sequel
The solution proposed by governments to counter bombs concealed in people's nether regions is the full body scanner, a device which uses extremely high frequency radio waves to obtain an image of a person without their clothes on. This is safer than using x-rays. But will it work in practice?
Not surprisingly, the operational effectiveness of full body scanners is open to question. This German video demonstrates that enough things won't show up to make an interesting hazard.
The demonstration clearly doesn't match an airport scenario (more images would be used), but at the same time there's probably much more scrutiny of the image than there would be in everyday use.
The 10 minute video is in German with occasional English. No subtitles. Terrorist attack on frying pan with smuggled items is at 7:30.
German Video: http://www.youtube.com/watch?v=nrKvweNugnQ
Body Scanners: http://en.wikipedia.org/wiki/Security_scan
Visa Card Programmers Ready for the Future
This is the story about the man charged $23,148,855,308,184,500 on his Visa debit card for a pack of cigarettes due to an unspecified programming error.
What's interesting is that Visa accounts will apparently happily process such large numbers without triggering any internal checks or alarms. Do they know something about future US inflation that we don't? Or perhaps the US deficit is already being financed by a government credit card.
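The error itself was not specified, so what follows is observation rather than explanation. Still, the arithmetic is suggestive: expressed in cents and written in hexadecimal, the charge is mostly the byte 0x20, which is the ASCII space character. The sketch below shows that, together with the sort of plausibility check the paragraph above finds missing. The limit value and the function name are assumptions invented for this example; nothing here describes how Visa's systems actually work.

```python
# The charge from the story, converted to cents.
amount_cents = 23_148_855_308_184_500 * 100

print(hex(amount_cents))
# prints 0x2020202020201250

# The same value as eight bytes: six ASCII spaces (0x20), then 0x12 0x50.
raw = amount_cents.to_bytes(8, "big")
print(raw[:6] == b" " * 6)
# prints True


def is_plausible(cents: int, limit_cents: int = 1_000_000) -> bool:
    """Reject negative amounts and anything above a limit.

    The default limit ($10,000) is an arbitrary figure for this example."""
    return 0 <= cents <= limit_cents


print(is_plausible(amount_cents))
# prints False
```

Any upper limit at all, even a generous one, would have stopped a charge of this size before it reached a customer's account.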
But German Credit Card Programmers Not Ready for 2010
In a story which could be a what-could-have-happened-in-Y2K case study, millions of Germans were without working credit or debit cards at the beginning of 2010. A coding error misinterpreted the date as being 2016 rather than 2010. Speculation is that the programmers confused Binary Coded Decimal format and Binary format in the date field.
Contracting Out - Make Sure Redundancy is in the Spec
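If that speculation is right, the arithmetic works out neatly. In Binary Coded Decimal each decimal digit occupies four bits, so the two-digit year 10 is stored as the byte 0x10. Read that same byte as a plain binary number and you get 16. The sketch below assumes a one-byte, two-digit year field, which is my assumption for illustration; the actual card data format was not described.

```python
def to_bcd(n: int) -> int:
    """Encode a two-digit number (0-99) as one BCD byte, e.g. 10 -> 0x10."""
    if not 0 <= n <= 99:
        raise ValueError("BCD byte holds two decimal digits only")
    return ((n // 10) << 4) | (n % 10)


def from_bcd(b: int) -> int:
    """Decode one BCD byte back to a number, e.g. 0x10 -> 10."""
    return (b >> 4) * 10 + (b & 0x0F)


year_byte = to_bcd(10)                 # the year 2010 stored as BCD: 0x10

print(2000 + from_bcd(year_byte))      # decoded correctly as BCD
# prints 2010
print(2000 + year_byte)                # same byte misread as plain binary
# prints 2016

# For the years 2000 to 2009 the two encodings are identical, so the
# confusion would have gone unnoticed for a decade.
print(all(to_bcd(y) == y for y in range(10)))
# prints True
```

This would also explain the timing: code that passed every test from 2000 through 2009 would fail for the first time on 1 January 2010. Just because it worked last year doesn't mean it will work this year.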
What can happen if you contract out your IT, and don't read the contract carefully? Apparently network redundancy is one of the things that can be omitted from the specification, as the Virginia Department of Motor Vehicles found out.
Don't Worry, We've Got Backup Generators
Except they don't always work. You have to test these things.
Some random stories from a growing list.
But Sometimes Backup Generators Don't Make Sense
It's hard to make the case for not buying backup generators after a power outage has just occurred. It's still quite a rational, and probably the correct, decision however. It looks like there was a single point of failure which affected primary and backup power lines to Cleveland Hopkins International Airport. Backup generators were not designed to keep all the airport systems operating: in particular third parties, such as airline check-in desks, were responsible for their own backup power.
Too often organizations don't make a rational decision to accept a risk: it is good to see one being made here. (Unfortunately for the passengers, most airlines didn't have their own backup generators, but that's the price you have to pay if you want cheaper airport landing fees.)
Note that all the safety-critical systems did have backup generators which worked. However, in this case a greater independence of primary and backup utility feeds may be warranted.
I wonder if public reaction following an outage was factored into the cost / benefit trade-off analysis.
Conficker Worm - Still Out There
The Conficker worm, while forgotten, is still very much alive and kicking. It's an example of a widespread botnet that people know about, but haven't yet managed to kill. Tracking a worm is difficult (unless you are the owner!). The best estimates put the current number of infected computers at between 1.75 million and 5.25 million. Worms and viruses (of the machine kind) aren't news at the moment so it's easy to become complacent.
Incidentally, the reason why it's difficult to get accurate numbers has to do with the use of IP addresses. On 3 February 2010 there were
Presumably there are less well-known but equally well-designed botnets out there which we just never hear about. http://www.confickerworkinggroup.org/wiki/pmwiki.php/ANY/InfectionTracking
Although the Conficker worm's main purpose appears to be spamming, every organization which relies on the internet needs to plan for the possibility of a distributed denial of service (DDoS) attack by a botnet. Here's a quick selection of recent attacks from the website The Register:
Russian newspaper suffering from DDoS attack
CIA and Paypal suffering from DDoS attack
British ISP suffering from Latvian DDoS attack
Amazon suffers from Christmas DDoS attack
Swedish Signals Intelligence agency (Försvarets Radioanstalt) DDoS attack
Amazon Cloud under DDoS attack
Dowsing for Explosives
An amusing diversion: a company is apparently selling a bomb detecting device, based upon the sound scientific principle of dowsing. Is using such a device worse than putting up fake CCTV cameras to discourage criminals? It all depends upon how it is used, what it is really being expected to do, and who is being fooled by it.
More info on the company and its remarkable products at:
(Their main website is currently down...)
Nightmare Product Recalls
You have to feel sorry for Toyota. They made a car. A dangerous potential problem was found with their accelerator pedals after they had been in production for four years. They apologized and recalled the affected vehicles. No doubt compensation will be paid for any accidents that were caused. But the lawyers and media are still baying for their blood.
What would happen if a defect was found in one of your products? Would you handle the problem differently?
Here's a video discussing what was actually wrong with the accelerator
And a good statement from the president of Toyota Motor Sales U.S.A. Jim
A cynic might note how convenient the news coverage is for the remnants of the North American car industry (who buy a lot of media advertising) and for the U.S. government (who are trying to keep the American car manufacturers solvent). Me, I'm just hoping the coverage will let me buy another reliable Toyota Corolla at a lower price.
Pinning the Blame with Chip and PIN Cards
Do you have one of those extra secure “chip and PIN” credit cards where the card is supposedly unusable by a thief without the PIN? Turns out that a thief with a bit of engineering skill can make a PIN-verified transaction using your stolen card without knowing your PIN. You then have the problem of convincing your bank that you didn't make the transaction.
Although the attack shown currently uses a card with a cable running up someone's sleeve for the computer-in-the-middle attack (where communications between the stolen card and the card reader are intercepted and modified), it's clearly not inconceivable to produce a much slicker set-up.
Some good work by the Security Research team at Cambridge University's computer laboratory. [Disclaimer: Cam U. is my alma mater]
Also pay attention to comment 19 and comment 22 on the blog entry: these illustrate how not to do damage control against a well-educated audience. Apparently an employee of the UK payments administration would prefer it if the Security Research group found vulnerabilities in other targets.
Some people claim attacks like this are impractical because of the wire running up the sleeve. They obviously are unaware (or choose to be unaware) of how such technology can be miniaturized. See, for example, the Turbo SIM which can be used to modify communications between a GSM phone and its SIM card. You can buy these on eBay for a couple of dollars.
From the Copywrong Department
I'm always amazed at how foolish some organizations' copyright policies are.
This month we feature ISACA.org, who have recently published a Risk Framework which they claim is designed to fill the gap between generic risk frameworks and detailed ones.
I wish I knew what was in it.
But if I did know, I wouldn't be able to tell you or make use of the information.
I kid you not. That's what the isaca.org copyright policy states. Specifically, the second paragraph starts with the words:
"The Material that you are downloading may only be used for personal, informational, non-commercial purposes."
What the intended use is of an IT risk framework that cannot be used for commercial purposes, I do not know. (Perhaps I could use it to manage my kid's Wii. Or would that not be informational?)
Anyway, if you are willing to agree not to use this framework in your organization, ISACA.org have kindly condescended to make it available to you for the cost of your email address and contact information.
Just don't tell me what is in it. I might use that information for corporate, commercial purposes. I wouldn't want you to break their copyright policy.
ISACA Copyright Policy
You can comment on this newsletter on the Risky Thinking blog at
RISKY THINKING is a free newsletter providing essays, analysis, insights, and oddities related to Business Continuity, Disaster Recovery, and Risk Management. You can subscribe on the web at http://www.RiskyThinking.com/newsletter.
Please feel free to forward RISKY THINKING to colleagues or friends who will find it valuable. You may reprint this newsletter providing it is reprinted in its entirety.
Copyright Michael Z. Bell / Albion Research Ltd. 2010