What IT has to know about declaring disasters is that it doesn’t really matter what kind of incident is affecting your business. It might be something cataclysmic like a hurricane, but more often it’s a mundane incident like a power outage or simple human error. What makes it a disaster is the extent of the impact on your business. So to determine how to respond, IT needs to ask, “How long will it take to restore the systems and/or data affected by this incident?” Only a portion of your systems may be compromised, but if a full deployment of your DR plan would take 24 hours, and restoring the individually compromised parts of your business would take just as long, it might be wise to declare a disaster and go ahead with the full DR execution.
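To make that comparison concrete, here is a minimal sketch of the declare-or-not decision in Python. The function name, the 24-hour figure, the per-system estimates, and the assumption that individual restores can run in parallel are all hypothetical illustrations; real numbers would come from your own monitoring data and recovery history.

```python
# Minimal sketch of the "declare a disaster or recover piecemeal" decision.
# All names and numbers here are hypothetical, not a real DR tool.

FULL_DR_EXECUTION_HOURS = 24.0  # assumed time to run the full DR plan end to end


def should_declare_disaster(affected_systems: dict[str, float],
                            full_dr_hours: float = FULL_DR_EXECUTION_HOURS) -> bool:
    """Return True if piecemeal recovery would take at least as long as full DR.

    affected_systems maps each impacted system to its estimated individual
    restore time in hours (estimates drawn from monitoring and past recoveries).
    """
    # Piecemeal recovery is bounded by the slowest affected system,
    # assuming individual restores can run in parallel.
    piecemeal_hours = max(affected_systems.values(), default=0.0)
    return piecemeal_hours >= full_dr_hours


# Example: a storage array and the ERP database are down.
impacted = {"storage-array": 30.0, "erp-db": 12.0}
if should_declare_disaster(impacted):
    print("Declare a disaster and execute the full DR plan.")
else:
    print("Recover the affected systems individually.")
```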
This is why careful, ongoing monitoring is so critical. You have to know what is affected and how long it’ll take to restore, and that process actually begins well before disaster ever strikes. You need a thorough assessment of what optimal performance looks like under normal circumstances; only then can you judge the damage in an emergency and estimate what it’ll take to recover.
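As a rough illustration of that baselining idea, the sketch below averages a few metrics captured under normal load and then flags which ones deviate during an incident. The metric names, sample values, and 25% tolerance are assumptions for the example; in practice the baseline would come from your monitoring platform.

```python
# Minimal sketch of baselining "normal" so you can judge damage later.
# Metric names, sample values, and the tolerance are hypothetical.
import statistics


def build_baseline(samples: list[dict[str, float]]) -> dict[str, float]:
    """Average each metric over snapshots collected during normal operation."""
    keys = samples[0].keys()
    return {k: statistics.mean(s[k] for s in samples) for k in keys}


def assess_impact(baseline: dict[str, float],
                  current: dict[str, float],
                  tolerance: float = 0.25) -> list[str]:
    """Return the metrics that deviate from baseline by more than `tolerance`."""
    degraded = []
    for metric, normal in baseline.items():
        if normal and abs(current.get(metric, 0.0) - normal) / normal > tolerance:
            degraded.append(metric)
    return degraded


# Example: snapshots gathered under normal load vs. a reading during an incident.
normal_samples = [
    {"db_latency_ms": 12.0, "iops": 9000.0, "error_rate": 0.001},
    {"db_latency_ms": 15.0, "iops": 8800.0, "error_rate": 0.002},
]
baseline = build_baseline(normal_samples)
incident = {"db_latency_ms": 480.0, "iops": 1200.0, "error_rate": 0.09}

print("Degraded metrics:", assess_impact(baseline, incident))
```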
As for the lack of communication among recovery staff, I have to say, it’s disappointing to see this listed as a top challenge. I have written multiple posts about why and how to draft a complete, rigorous incident response plan. With numerous stakeholders and compliance requirements to satisfy during recovery, communication breakdowns simply can’t be allowed to occur.
And just as with an incident response plan, testing your DR plan is critical. You want to identify issues and gaps during a test, not during a real disaster. The pressure will be high enough during a real event, and you want to be certain that roles and responsibilities are clear and that no underlying technical issues exist.