Microsoft gave a behind-the-scenes look at Azure security during RSA Conference 2015 and shared some best -- and worst -- practices for the cloud.
In a presentation titled "Assume Breach: An Inside Look at Cloud Service Provider Security," Mark Russinovich, CTO of Microsoft Azure, discussed several real-world customer case studies and also detailed the company's overall approach to cloud security.
Russinovich said Azure security is split into three categories. The first is protection, which features components such as identity and access and vulnerability management. The second is detection, which includes auditing, logging, monitoring and penetration testing. And the third category is response, which involves breach containment and customer notification.
Cloud services are appealing targets for hackers and cybercriminals, according to Russinovich. "It's easy to get a free trial," he said. "And once you get a free trial, you have at your disposal huge network pipes, lots of compute power, and you've got a concentration of vulnerable assets in the cloud, which are the other customers in that cloud."
As a result, Microsoft stresses the "shared responsibility" model between the customer and the cloud provider, Russinovich told the audience. "We can only go so far in what we do to protect you," he said. "We go as far as we can, and we're working on going even further. But it's ultimately your responsibility for some of the things that you do to make sure you're securing your own assets."
For example, he said, Microsoft protects the cloud infrastructure itself, monitors the service looking for anomalies, and provides DDoS attack mitigation and fraud and abuse detection. But the customers themselves are also responsible for preventing fraud and abuse by protecting their credentials and identities, he said.
"Turns out [cloud] fraud and abuse affects everybody because the people who are doing the fraud and abuse are attacking other customers," Russinovich said, "whether they're running inside the cloud or attacking other clouds or out in the wild."
The importance of cloud monitoring and logging
To illustrate such issues, Russinovich offered several real-world customer examples that occurred in the last year. The first customer example began with Microsoft getting reports that some virtual machines in Azure were attacking customers outside the cloud. Russinovich said that while Microsoft does not monitor customer VMs and applications without their express permission, it does monitor overall traffic to look for abnormal spikes in activity or suspicious connections.
"If we see a burst in network flow coming from some place, then that could be a problem in our infrastructure, it could be a software bug, or it could be a malicious attacker that has infiltrated a customer cloud or our cloud infrastructure," he said.
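The kind of traffic-burst check Russinovich describes can be illustrated with a short sketch. This is not Microsoft's detection logic, which the presentation did not detail; it is a minimal example that assumes per-interval packet counts are already available as a list, and the function name, window size and threshold are all hypothetical choices.

```python
from statistics import mean, pstdev


def find_spikes(samples, window=5, threshold=3.0):
    """Return indices of samples that sit far above their trailing baseline.

    A sample is flagged when it exceeds the mean of the previous `window`
    samples by more than `threshold` times the baseline spread.
    """
    spikes = []
    for i in range(window, len(samples)):
        baseline = samples[i - window:i]
        mu = mean(baseline)
        sigma = pstdev(baseline)
        # Floor the spread so a nearly flat baseline does not flag
        # small, ordinary fluctuations (illustrative 5% floor).
        spread = max(sigma, 0.05 * mu, 1.0)
        if samples[i] - mu > threshold * spread:
            spikes.append(i)
    return spikes


# Hypothetical packets-per-second readings with one obvious burst.
readings = [100, 102, 98, 101, 99, 100, 5000, 101]
print(find_spikes(readings))
```

A flagged interval only says that something unusual happened. As the quote notes, the cause could be an infrastructure problem, a software bug or an attacker, so a check like this is a trigger for investigation rather than a verdict.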
In this case, Microsoft determined the attack code was inside the customer's VMs and did not originate from Microsoft's own Azure infrastructure. Microsoft therefore notified the customer, which was extremely pleased to be contacted, and investigated the incident. During that investigation, Microsoft discovered that while the client had logging in place, it hadn't analyzed the data, set up any alerts or incorporated the data into its on-premises SIEM system.
"It turns out that the customer had not really been paying attention to the logs and alerts coming out of their virtual machines," Russinovich said.
The customer also discovered that the antivirus engine in its VMs had been disabled. Russinovich said that when the customer finally looked at the Windows event logs for those VMs, it noticed other suspicious activity in the days before it was notified by Microsoft. That activity included the creation of a new user account, which was then added to the administrators group for that company.
"This is a case that shows when you move to the cloud, you need to keep in place the best practices you follow in your on-premise environments -- otherwise, you're going to miss things like this," Russinovich said. "Who knows how long it would have gone on unless we notified them."
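The pattern the customer missed, a newly created account that is then added to an administrators group, is the sort of thing a simple log alert can catch. The sketch below is illustrative only. It assumes events have already been exported from the Windows Security log into plain dictionaries with hypothetical field names (`time`, `event_id`, `account`, `group`); real events identify accounts by SID and carry many more fields. The event IDs are the standard Windows ones for account creation (4720) and addition to a security-enabled local group (4732).

```python
ACCOUNT_CREATED = 4720       # a user account was created
ADDED_TO_LOCAL_GROUP = 4732  # a member was added to a security-enabled local group


def flag_new_admins(events, admin_group="Administrators"):
    """Return accounts that were created and later added to the admin group.

    `events` is a list of dicts using the hypothetical fields described
    above; ISO 8601 timestamps are assumed so string sorting is chronological.
    """
    created = set()
    alerts = []
    for ev in sorted(events, key=lambda e: e["time"]):
        if ev["event_id"] == ACCOUNT_CREATED:
            created.add(ev["account"])
        elif (ev["event_id"] == ADDED_TO_LOCAL_GROUP
              and ev.get("group") == admin_group
              and ev["account"] in created):
            alerts.append(ev["account"])
    return alerts


# Made-up sample records for illustration.
sample = [
    {"time": "2015-01-10T02:14:00", "event_id": 4720, "account": "svc_tmp"},
    {"time": "2015-01-10T02:15:30", "event_id": 4732, "account": "svc_tmp",
     "group": "Administrators"},
    {"time": "2015-01-10T09:00:00", "event_id": 4732, "account": "alice",
     "group": "Remote Desktop Users"},
]
print(flag_new_admins(sample))
```

In practice a rule like this would live in the SIEM system rather than a standalone script; the point is that the logs only help if something is actually reading them.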
Addressing vulnerabilities in the cloud
Russinovich also discussed the issue of addressing vulnerabilities and offered two examples to illustrate how the Azure security approach can differ depending on who is ultimately responsible.
First, Russinovich looked at an example that involved the Bash shell vulnerability known as Shellshock, which was disclosed on September 24th last year. Russinovich said attacks on Azure VMs exploiting the vulnerability began almost immediately. "As we were scanning our systems, we saw numerous Linux virtual machines had become attack zombies," he said.
Microsoft immediately sent breach notifications to customers whose VMs had been compromised through Shellshock. The notification, which Russinovich displayed on screen during the presentation, gives the customer 48 hours to fully patch the OS and mitigate Shellshock.
"We give them time to clean it up," he said, "but they're really putting everybody at risk. So we feel a responsibility to take action if the customer isn't going to take action."
If the customer doesn't take action, Russinovich said Microsoft will suspend the Azure deployment and disable the customer's subscription until the matter is resolved. In an interview with SearchSecurity following the presentation, Russinovich said Microsoft hasn't been forced to take that kind of action often but such cases do occasionally happen. "Customers are entitled to do what they want to do as long as it's not harming other customers or the cloud," he said. "It's not a pleasant conversation to have with a client, but it's also not pleasant to see them using Azure in a way that puts others at risk."
Russinovich also discussed an example when Microsoft was at fault, specifically for introducing a bug in a software update for a Microsoft service. If exploited, the bug could have allowed one Azure customer to see another customer's FTP uploaded data. A customer notified Microsoft, and the software giant quickly enacted its security incident response lifecycle -- a nine-step process that includes event detection, DevOps and security team engagements, confirmation of the security event, identifying affected customers, determining impact, and notifying customers.
Russinovich showed a timeline for the FTP bug response, starting with the first customer report of the FTP bug. Once a second customer reported the same bug, the Azure security team was engaged, as was DevOps; within a few hours, a security event was declared and the mitigation process was started, which included a complete disablement of FTP capabilities in Azure. The Azure security team then determined the scope of the problem and notified impacted customers while issuing a fix for the bug.
Russinovich said the process, from the second customer's report to the rollout of the bug fix, took about 24 hours. Thankfully, none of the customers experienced any malicious activity from the bug, he said.
Cloud vs. cloud attacks
Last, Russinovich discussed security issues coming from other clouds and how they can affect Azure. Specifically, he detailed a case where Azure was hit with a DDoS attack from another cloud provider that had been compromised. "In this case, we saw spikes of incoming traffic into our cloud -- 35 million packets per second of attack traffic," he said.
Luckily, Azure's edge systems were designed to take on DDoS attack traffic.
"90% of [DDoS attack traffic] doesn't even make it to our cloud," he said.
Russinovich said the Azure security team began investigating the traffic spike and discovered the attack was coming from not just one but two other clouds, which he did not name. Once Microsoft began working with the two other cloud providers, it discovered they had been compromised in completely different ways. In the first cloud, several customer VMs had been exploited through the Shellshock vulnerability and turned into DDoS botnets. But the other cloud provider, Russinovich said, had experienced large-scale abuse of customer account credentials that allowed attackers to take control of portions of its cloud infrastructure and use it as a DDoS platform.
Speaking with SearchSecurity after the presentation, Russinovich said the majority of security incidents Azure faces are a result of customer or third-party errors. "A lot of those breaches we've seen are because of compromised account credentials," he said, "so we are really preaching things like two-factor authentication."
Indeed, Russinovich told the audience weak credentials were one of the most common causes for Azure tenant breaches, along with VMs missing security patches, insufficient security monitoring, and remote desktop protocol or Secure Shell endpoints that are exposed on the Internet.
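One of those causes, remote desktop protocol or Secure Shell endpoints open to the whole Internet, lends itself to an automated check. The following sketch is not tied to any Azure API: it assumes firewall or endpoint rules have been exported into dictionaries with hypothetical fields (`name`, `port`, `source`, `action`) and simply flags allow rules on the standard SSH (22) and RDP (3389) ports whose source range covers every address.

```python
import ipaddress

# Standard default ports for the two remote administration protocols.
REMOTE_ADMIN_PORTS = {22: "SSH", 3389: "RDP"}


def exposed_admin_endpoints(rules):
    """Return (rule name, service) pairs for admin ports open to any source.

    `rules` uses the hypothetical export format described above.
    """
    findings = []
    for rule in rules:
        service = REMOTE_ADMIN_PORTS.get(rule["port"])
        if service is None or rule["action"] != "allow":
            continue
        # A prefix length of 0 (0.0.0.0/0 or ::/0) matches every address.
        if ipaddress.ip_network(rule["source"]).prefixlen == 0:
            findings.append((rule["name"], service))
    return findings


# Made-up rules for illustration.
rules = [
    {"name": "rdp-any", "port": 3389, "source": "0.0.0.0/0", "action": "allow"},
    {"name": "ssh-office", "port": 22, "source": "10.0.0.0/8", "action": "allow"},
    {"name": "web", "port": 443, "source": "0.0.0.0/0", "action": "allow"},
]
print(exposed_admin_endpoints(rules))
```

The check only covers default ports and a fully open source range; services moved to other ports or opened to very large ranges would need broader rules.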
Overall, Russinovich said the amount of cloud abuse in Azure is rising and displayed a chart showing abuse incidents more than doubling in the second half of 2014, a statistic that included phishing and DDoS attacks as well as copyright infringement and other illegal activities. He urged the audience to employ best practices like consistent monitoring, timely patching, and better protection for credentials.
"[Abuse] continues to rise because, of course, cloud usage continues to rise," Russinovich said. "It's also rising because attackers are getting cleverer about abuse."