Security monitoring isn't something most organizations have historically done well, for a number of reasons: large and increasing log data volumes, competing priorities from business-facing initiatives, lack of resources and review fatigue, among others. But in the cloud, monitoring becomes even more complicated and difficult because the same forces that make cloud possible can have a negative impact on monitoring controls and erode an organization's ability to take action in response to events.
This means that taking a laissez-faire attitude toward monitoring after a cloud service is implemented will result in either serious security blind spots or a near-complete inability to monitor at all. To avoid this, organizations need to think strategically about monitoring in a cloud environment: they need to think through how they approach monitoring in new cloud deployments, and they also need to review existing deployments to make sure monitoring capabilities are what they expect. You'd be surprised at how often organizations fail to do this, and how often they get burned as a result.
Cloud security monitoring challenges
Most security practitioners are probably familiar with some of the challenges that impact detective controls when they intersect cloud-enabling technologies like virtualization.
Specifically, consider what happens when traditional log management, log correlation or security information and event management (SIEM) tools are used to support a dynamic virtual environment where virtual machines are spawned on the fly to meet spikes in demand and recycled when no longer required. Will existing monitoring tools provide relevant information in that bursting scenario? Unless you plan specifically for it, they probably won't. Even if you do configure transient or ephemeral hosts to send logs to a central log correlation and archival tool, trying to tie together log data originating from multiple ephemeral hosts (for example, hosts using IP addresses that have been recycled multiple times over) can be an exercise in futility. Technologies like vMotion that dynamically change where virtual hosts reside on the network can have unintended consequences as well -- for example, by relocating an image to a location where log data can no longer reach log collection and forwarding agents.
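The IP-recycling problem can be illustrated with a minimal sketch. The log records, field names and instance identifiers below are hypothetical, not drawn from any particular product; the point is that grouping events by a stable instance ID keeps activity from two different hosts separate, while grouping by a recycled IP address blends them together:

```python
from collections import defaultdict

# Hypothetical log records from ephemeral hosts. The IP address 10.0.0.5
# has been recycled: it was assigned to instance i-aaa111, then reused
# for a completely different instance, i-bbb222.
logs = [
    {"ip": "10.0.0.5", "instance_id": "i-aaa111", "event": "login"},
    {"ip": "10.0.0.5", "instance_id": "i-aaa111", "event": "logout"},
    {"ip": "10.0.0.5", "instance_id": "i-bbb222", "event": "login"},  # new host, same IP
]

def group_by(records, key):
    """Group log events by the value of the given field."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record[key]].append(record["event"])
    return dict(grouped)

by_ip = group_by(logs, "ip")                 # one bucket: two hosts' activity blended
by_instance = group_by(logs, "instance_id")  # two buckets: one per actual host
```

In practice this means making sure whatever unique, non-recycled identifier your platform exposes ends up in every forwarded log record, so the correlation tool has something stable to key on.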
In addition to technical challenges, there are business and process challenges. Aside from the obvious -- for example, shifts in scope where more sensitive applications are moved to an existing environment that was only evaluated for low-sensitivity use -- there are also service provider issues. Is your service provider really keeping the types of records you think it is? Have you checked? And what support do you have for visibility into the lower levels of the stack that are (by design) a black box in the cloud model you're using -- for example, operating system logs in a Software as a Service (SaaS) or Platform as a Service (PaaS) deployment?
Cloud security monitoring guidance
To develop a cloud security monitoring strategy, you'll need two pieces of data: anticipated scope of coverage (i.e. what systems you want to monitor) and a list of the monitoring capabilities you expect to be in place.
Your list of monitoring capabilities should include those implemented by your service provider (if you're using one), those you rolled out along with your cloud deployment, and those you currently rely upon (i.e. existing capabilities). With this list plus the scope, you can determine gaps between what you expect and what's actually in place. Closing these gaps should be included in your intermediate-term planning efforts. Options for how to address the gaps will depend on your cloud strategy; it can involve pushing back on service providers to implement more/better monitoring, deployment of new products, or expansion of the scope of existing controls.
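The gap analysis itself is straightforward set arithmetic: compare the systems in scope against the union of what each capability actually covers. The system and capability names below are hypothetical placeholders for whatever your own inventory contains:

```python
# Hypothetical inventory: systems in scope for monitoring, and the
# systems each existing capability (provider-supplied or in-house)
# actually covers.
scope = {"web-app", "db-cluster", "auth-service", "payment-api"}

capabilities = {
    "provider-flow-logs": {"web-app", "db-cluster"},
    "in-house-siem": {"web-app", "auth-service"},
}

# Union of all coverage, then the set difference against scope.
covered = set().union(*capabilities.values())
gaps = scope - covered  # systems you expect to monitor but currently don't
```

Each entry in `gaps` is a candidate for your intermediate-term plan: push the provider to cover it, deploy something new, or extend an existing control's scope.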
To be most effective, the inventory of monitoring capabilities should also lead to a tactical review of each capability. Test each one specifically and in a routine (periodic) fashion; you'd be surprised how often seemingly subtle changes can directly impact monitoring controls. When that happens, failures can persist indefinitely unless someone actively looks for them and fixes them. The tactical review should also include drills -- simulated investigations that attempt to actively trace events to their source. These should involve requesting log artifacts from your service provider and sifting through application and OS log artifacts. The goal is to discover -- and ideally fix -- any problems before they have an opportunity to impact a real-life security event like a breach.
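One simple periodic test of this kind is checking that every in-scope system is still actually delivering logs. The sketch below assumes a central collector tracks a "last event received" timestamp per host (the hosts, timestamps and 24-hour threshold are illustrative, not prescriptive):

```python
from datetime import datetime, timedelta, timezone

# Hypothetical "last event received" timestamps, as a central log
# collector might track them for each monitored host.
now = datetime(2012, 9, 1, 12, 0, tzinfo=timezone.utc)
last_seen = {
    "web-app": now - timedelta(minutes=5),
    "db-cluster": now - timedelta(hours=36),   # silent for a day and a half
    "auth-service": now - timedelta(minutes=1),
}

def stale_sources(last_seen, now, max_silence=timedelta(hours=24)):
    """Flag hosts whose log feed has been quiet longer than the threshold."""
    return [host for host, ts in last_seen.items() if now - ts > max_silence]
```

A host that goes quiet -- here, the one silent for 36 hours -- is exactly the kind of subtle failure that otherwise persists until a drill or a real incident exposes it.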
About the author:
Ed Moyle is a senior security strategist with Savvis as well as a founding partner of Security Curve.
This was first published in September 2012