Cloud risk management: Managing the risk of cloud outages

Companies need to prepare for the eventuality of cloud service interruptions.

This tip is part of the learning guide, Cloud computing risk management: Assessing key risks of cloud computing.

You’ve loaded your data onto a cloud service network, tested your connections, beta tested response time and ease of use, and trained your users. You’re looking forward to enjoying the services and flexibility of the service offering.

However, have you done your best to reduce the risk of harm to your company in the event of a service interruption? Recent cloud outages at Microsoft, Amazon and others are sad reminders that -- just like most other things -- cloud services are not perfect. They can be interrupted, despite the promises of skilled advertisers. What can a company do to prepare for this eventuality? Many precautions are possible, and they apply at different stages of cloud risk management.

1. Do not entrust the cloud with highly time-sensitive data.

The cloud is not suited to every type of data or every type of application. Some applications are so time sensitive that they cannot tolerate any interruption. Before venturing into the cloud, you should assess the effect a service interruption would have on your company’s operations. What would be the effect on the sales process, or on the finance department? If the service were down unexpectedly for three or five hours, would your company be able to continue to operate? What if the interruption lasted for several days? You should conduct your own internal due diligence and assessment before spending any time shopping for cloud services.
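One way to make this assessment concrete is a rough, back-of-the-envelope estimate. The sketch below is illustrative only: the function name, the cost model and every dollar figure are assumptions, not figures from any real company, and a real assessment would also weigh contractual penalties, reputational harm and recovery effort.

```python
def outage_cost(hours_down, revenue_per_hour, idle_staff_cost_per_hour, dependency=1.0):
    """Rough cost of an outage for one business function.

    dependency is the assumed share (0.0 to 1.0) of the function's work
    that stops while the cloud service is unavailable.
    """
    if hours_down < 0 or not 0.0 <= dependency <= 1.0:
        raise ValueError("hours_down must be >= 0 and dependency within 0..1")
    hourly_exposure = revenue_per_hour + idle_staff_cost_per_hour
    return hours_down * hourly_exposure * dependency


# Hypothetical inputs: a sales team that loses 60% of its capacity
# when the service is down.
for hours in (3, 5, 72):  # three hours, five hours, three days
    cost = outage_cost(hours, revenue_per_hour=2_000,
                       idle_staff_cost_per_hour=500, dependency=0.6)
    print(f"{hours:>3} h outage: roughly ${cost:,.0f}")
```

Running the same estimate for each department (sales, finance and so on) gives a simple way to rank which applications can tolerate an interruption and which cannot.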

2. Conduct appropriate due diligence of the cloud provider.

You should also have a clear understanding of the capabilities and stability of the potential service provider. Thorough due diligence is highly recommended.

In most cases, it will be difficult or impossible to conduct due diligence of the security measures or other safeguards in place at the service provider. If the volume or expected income from the transaction is not sufficiently substantial to warrant special attention from the cloud vendor, the vendor will object to any form of due diligence.

Consider alternatives. For example, the users of most services have created communities where they share tips and tricks, as well as frustrations, on the use of the service. User groups and other postings on the Web are an excellent source of information, which should allow you to gauge the nature of the service provided. Read existing comments, and ask questions on user forums, to help you assess the quality and reliability of the service through user responses and comments.

3. Assess the cloud provider’s commitment to provide uninterrupted service.

It’s not unusual to find marketing material claiming the service will be uninterrupted, 100% available, with redundant capabilities, etc. Marketers, advertisers and aggressive sales representatives have a lot of enthusiasm and use a great deal of hyperbole to describe the superb qualities of their offering.

However, the actual terms of the contract -- which are the only ones enforceable between the vendor and the customer -- do not reiterate these promises. Carefully read the service-level agreement (SLA) -- the actual commitment made with respect to the service and the service levels.

Understand as well that service providers are not insurance companies and they cannot and will not commit to 100% availability all the time, unless you pay a very high premium.
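To put availability figures in perspective, it can help to convert a percentage into the downtime it still permits. The short sketch below does that arithmetic for a 365-day year; the function name and the sample percentages are chosen for illustration and do not come from any particular provider’s SLA.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def allowed_downtime_hours(availability_percent, period_hours=HOURS_PER_YEAR):
    """Hours of downtime that still satisfy the stated availability."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    return period_hours * (100 - availability_percent) / 100


# Sample availability levels (illustrative, not quoted from any contract).
for pct in (99.0, 99.9, 99.99, 100.0):
    hours = allowed_downtime_hours(pct)
    print(f"{pct:>6}% available: up to {hours:.2f} hours down per year")
```

Even a commitment that sounds close to perfect, such as 99.9%, still leaves room for several hours of interruption a year, which is why the contractual figure matters more than the marketing claim.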

4. Prepare for the inevitable service interruption: Use backup or alternate services.

Consider redundant services or workarounds, in order to ensure your own operations will not be interrupted, or will not suffer from the interruption of the cloud service. For example, if your email service relies on a third-party provider, such as Google or Microsoft, you should have in place alternate means of communication so your personnel can continue communicating as needed in the event of an email blackout. Consider developing an “emergency plan” so that your employees know what to do, and how to communicate, if they lose access to their email or their documents.

If having access to your data is essential, such that any interruption would cause significant harm, consider the use of redundant systems with different service providers in different locations. While this alternative may not be feasible, practicable or affordable for some companies, it might be worth the expenditure for others.
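As a rough illustration of the redundancy idea, the sketch below tries a primary provider first and falls back to a second one if the first fails. The provider names and fetch functions are hypothetical stand-ins, not a real vendor API; an actual setup would also need to keep the copies of the data synchronized.

```python
def fetch_with_failover(providers, key):
    """Try each (name, fetch) pair in order and return the first success."""
    errors = []
    for name, fetch in providers:
        try:
            return name, fetch(key)
        except Exception as exc:  # broad on purpose: any failure triggers failover
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Hypothetical stand-ins for two independent services.
def primary_fetch(key):
    raise ConnectionError("service unavailable")  # simulate an outage


SECONDARY_COPY = {"q3-forecast": "contents of the replicated document"}


def secondary_fetch(key):
    return SECONDARY_COPY[key]


source, document = fetch_with_failover(
    [("primary", primary_fetch), ("secondary", secondary_fetch)],
    "q3-forecast",
)
print(f"served by {source}: {document}")
```

The design choice here is deliberately simple: providers are tried in a fixed order, and an error is raised only when every one of them has failed.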

5. Understand the relevant clauses in your contract.

A well-drafted and well-negotiated contract may provide some relief in the event of a service interruption, in the form of compensation for certain losses. Make sure the contract addresses service interruption. These provisions, if any, are likely to be found in service-level agreements and are likely to limit the service provider’s liability to a certain range. For example, if the service is interrupted for less than X minutes, there is no indemnification or compensation for the user. In addition, the compensation may not be more than a certain maximum amount.

Most important, the compensation is likely to be limited to reimbursement of the fee -- or a portion thereof -- for the service. There will be no compensation for the loss of business, or damage to property that results from the interruption of the service.

Make sure you understand what you would receive in the event of a blackout. Try to obtain specific examples that show how the compensation provided for in the contract would be computed. In most cases, you are likely to see that the damages you might receive, if any, are only a pittance compared to the actual cost -- mostly consequential damages -- caused by the service blackout.
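A worked example of the kind suggested above might look like the following sketch. Every term in it is an assumption made for illustration (a minimum outage threshold, a pro-rata refund of the monthly fee, and a cap at half the fee); actual contracts compute credits in many different ways, so substitute the terms of your own SLA.

```python
MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200 minutes in a 30-day billing month


def sla_credit(downtime_minutes, monthly_fee,
               threshold_minutes=30, max_credit_fraction=0.5):
    """Service credit under hypothetical SLA terms.

    No credit below the threshold; otherwise a pro-rata share of the
    monthly fee, capped at max_credit_fraction of that fee.
    """
    if downtime_minutes < threshold_minutes:
        return 0.0
    pro_rata = monthly_fee * downtime_minutes / MINUTES_PER_MONTH
    return min(pro_rata, monthly_fee * max_credit_fraction)


# Hypothetical scenario: a five-hour outage on a $1,000-per-month service,
# set against an assumed $7,500 of lost business.
credit = sla_credit(downtime_minutes=300, monthly_fee=1_000)
estimated_loss = 7_500
print(f"SLA credit:     ${credit:,.2f}")
print(f"Estimated loss: ${estimated_loss:,.2f}")
print(f"Credit covers {credit / estimated_loss:.2%} of the loss")
```

Under these assumed terms the credit is a few dollars against thousands in losses, which is the gap between fee reimbursement and consequential damages that the contract review should bring to light.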

6. Understand the scope of “force majeure.”

Contracts usually contain a “force majeure” clause, which allows the parties to cease performing under the contract if certain events occur that are deemed out of the control of the affected party. Read this clause, and understand the exceptions. While many clauses are limited to acts of God, riots, war and similar events, other clauses are much more comprehensive, and carve out events that are more likely to occur. Depending on the cause of an outage, the force majeure clause is likely to be very important.

7. Would transitioning to a different service provider be feasible?

Many contracts offer, as a sole remedy for service interruption, the ability to terminate the contract and transfer to another service provider. Would this be the right alternative for you? If outages become frequent, you may find the best alternative for your company’s viability is to move on and change providers. However, while terminating a contract in case of poor performance seems appealing, it may actually create more problems than it solves. How much time and effort would it take to relocate to a different service? How would your databases adapt to new applications? Would it even be possible? If you plan to take advantage of a clause that would allow your company to terminate the contract, make sure this will be possible, and ensure you are well prepared for this move well in advance.

As more companies embrace the use of cloud computing, more examples of disappointments, mishaps, poor performance, or service interruption are being made public. While some cloud outages are the result of unreliable, deceitful or incompetent providers, others occur at reputable, experienced companies because cloud technology has some fragility and vulnerabilities. These examples, however, should not discourage companies from using the technology when appropriate. Rather, they should serve to increase companies’ awareness of the limitations and risks of using the service. Equipped with this better awareness of the risk and potential exposure, companies can better approach a cloud computing deal and perform successful cloud risk management.

About the author:
Francoise Gilbert is the managing director of the IT Law Group and serves as the general counsel of the Cloud Security Alliance. She focuses on information privacy and security, cloud computing, and data governance. She has been named one of the country’s top privacy advisors in a recent industry survey and has been recognized by Chambers USA and Best Lawyers in America as a leading lawyer in the field of information privacy and security. Gilbert is the author and editor of the two-volume treatise Global Privacy & Security Law, which analyzes the data protection laws of 60-plus countries on all continents. She serves on the Technical Board of Advisors of the ALI-ABA and co-chairs the PLI Privacy & Security Law Institute. This article only reflects her personal opinion and not that of her clients or the Cloud Security Alliance.

1 comment

Point #2, particularly the sentence below, implies that the probability of service interruption of a cloud implementation is higher compared to the probability of service interruption of an on-premises implementation. Is this a fact? If it's not, then this point was misplaced.

"....What would be the effect on the sales process, or on the finance department? If the service were down unexpectedly for three or five hours, would your company be able to continue to operate? What if the interruption lasted for several days?..."