The disclosure of a CPU vulnerability known as Meltdown has major implications for cloud services. According to researchers, the Meltdown vulnerability affects cloud providers that use Intel-based hardware and Xen-based paravirtualization, and patches have been released to mitigate the effects.
Other hardware and virtualization providers and users could be affected by the Meltdown vulnerability, as well, depending on their specific environments and the virtual machines and containers they use. For example, we now know that AMD processors are also affected by some of the announced vulnerabilities -- although they are not specifically vulnerable to Meltdown as described -- and virtualization vendors like Microsoft and VMware have issued their own patches to address possible impacts from these vulnerabilities.
What the Meltdown CPU flaw does
The Meltdown vulnerability enables applications and workloads running on affected systems to potentially access the memory of other applications and operating systems that share the same hardware. This is done by manipulating the chip's built-in logic for speculative execution, where the chip guesses which instructions and data will be needed and processes them before it knows whether they will be used at all. In short, passwords, private keys and any other sensitive data could be exposed across multi-tenant platforms that include the world's leading hypervisors and almost every major container technology, including Docker, Linux Containers (LXC) and OpenVZ.
This is of particular concern for cloud customers because cloud service providers are wholly responsible for managing and maintaining hardware and hypervisor patches and updates. Meltdown represents another variation of the side-channel attack, which leverages shared computing assets to access other tenants' data.
There are two attack vectors for the Meltdown vulnerability. First, virtual machine instances in infrastructure as a service (IaaS) provider environments can attack other guest VMs running on the same hypervisor platform. In IaaS environments where multiple tenants share hardware resources, this could enable a malicious co-tenant to read any kind of sensitive data from the target's memory. Many of the leading providers, including Amazon, Microsoft and Google, addressed this attack vector fairly quickly.
The second attack vector enables software applications and services in containers to attack other container services and software by leveraging a shared OS kernel. This attack is conceptually the same as the first, but leverages a different shared resource -- the OS kernel versus the hypervisor.
By applying firmware updates and patches during announced maintenance windows, providers have mitigated some of the threats the Meltdown vulnerability poses. Unfortunately, these fixes are degrading performance for many tenants and service providers.
Amazon has acknowledged some performance degradation in its environment, and it has committed engineers to work directly with affected customers to optimize resources and minimize cost increases. Google announced a software-based binary modification technique called Retpoline -- return trampoline -- that keeps indirect branches in critical OS and hypervisor binaries from being speculatively executed, which mitigates the related Spectre flaw while minimizing performance impacts.
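Tenants who want to confirm that a patched kernel is actually running inside their own Linux guest can read the status files that Linux kernels 4.15 and later publish under /sys/devices/system/cpu/vulnerabilities. The following Python snippet is a minimal sketch of that check, not a provider-endorsed tool. The function names classify_status and check_flaw are hypothetical helpers invented for illustration, and the sketch assumes the kernel's usual status prefixes ("Not affected", "Mitigation", "Vulnerable"). It reports only what the guest kernel says about itself and cannot see the state of the underlying hypervisor.

```python
from pathlib import Path

# Linux kernels 4.15 and later expose one status file per flaw in this directory.
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def classify_status(text: str) -> str:
    """Map a kernel status line to a coarse label (hypothetical helper)."""
    status = text.strip()
    if status.startswith("Not affected"):
        return "not_affected"
    if status.startswith("Mitigation"):
        return "mitigated"
    if status.startswith("Vulnerable"):
        return "vulnerable"
    # Anything else (empty file, unexpected wording) is treated as unknown.
    return "unknown"


def check_flaw(name: str, vuln_dir: Path = VULN_DIR) -> str:
    """Read the status file for one flaw, e.g. 'meltdown' or 'spectre_v2'."""
    try:
        return classify_status((vuln_dir / name).read_text())
    except OSError:
        # Older kernels, non-Linux systems or restricted containers lack the file.
        return "unknown"


if __name__ == "__main__":
    for flaw in ("meltdown", "spectre_v1", "spectre_v2"):
        print(f"{flaw}: {check_flaw(flaw)}")
```

A result of "unknown" usually means the kernel predates the reporting interface, which is itself a hint that the guest has not been updated.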
What Meltdown means for the cloud
For cloud providers, there will likely be a series of ongoing patching and tuning exercises required in the future to fully protect against speculative execution flaws, and there will be varying degrees of performance impact to cloud customers along the way.
Containers will continue to be affected, as well, but given that customers have more control over what's running in the containers themselves, there are several recommended mitigation options:
- Minimize the files and file systems in the container. An attack from one container on another relies on the binaries, libraries and other files available inside the container. By removing compilers, script interpreters and login shells, cloud customers can reduce the attack surface.
- Use a read-only file system if possible. This will help prevent malicious actions within the container.
- Avoid running containers as the root or admin user. This limitation of privilege will help prevent some attacks that rely on administrative privileges.
- Limit the repositories from which source code can be downloaded. This will help control how container-based applications are built and, ideally, prevent any malicious repositories from being used.
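For teams that launch containers with Docker, several of these recommendations can be expressed as command-line options. The following Python snippet is a rough sketch under stated assumptions: hardened_run_args and image_allowed are hypothetical helpers, the registry name registry.example.com is a placeholder, and the allowlist is applied to image registries as an analogue of limiting source repositories. The --cap-drop ALL option goes slightly beyond the list above, in the same spirit of limiting privilege. The --read-only, --user and --cap-drop options themselves are standard docker run flags.

```python
from typing import Iterable, List


def hardened_run_args(image: str, uid: int = 1000, gid: int = 1000) -> List[str]:
    """Build a `docker run` argument list with a read-only root file system
    and a non-root user (hypothetical helper for illustration)."""
    if uid == 0:
        raise ValueError("refusing to run the container as root (UID 0)")
    return [
        "docker", "run", "--rm",
        "--read-only",             # mount the container's root file system read-only
        "--user", f"{uid}:{gid}",  # run the process as an unprivileged user
        "--cap-drop", "ALL",       # drop all Linux capabilities
        image,
    ]


def image_allowed(image: str, allowed_registries: Iterable[str]) -> bool:
    """Accept only images whose name starts with an approved registry host."""
    if "/" not in image:
        # No registry prefix at all, so the image would come from the default registry.
        return False
    registry = image.split("/", 1)[0]
    return registry in set(allowed_registries)


if __name__ == "__main__":
    image = "registry.example.com/team/app:1.0"  # placeholder image name
    if image_allowed(image, ["registry.example.com"]):
        print(" ".join(hardened_run_args(image)))
```

The script only prints the command, which makes it easy to review before anything is executed. The resulting list could be passed to subprocess.run once the options have been checked against the application's needs, since some workloads require writable paths or specific capabilities.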
It is likely that cloud consumers will see the impact of the Meltdown vulnerability and related flaws, such as Spectre, for some time.