At DockerCon 2016, held in Seattle, USA, Aaron Grattafiori presented “The Golden Ticket: Docker and High Security Microservices”. Core recommendations for running secure container-based microservices included enabling User Namespaces, configuring application-specific AppArmor or SELinux, using an application-specific seccomp whitelist, hardening the host system (including running a minimal OS), restricting host access and considering network security.
Grattafiori, Technical Director at NCC Group and author of “Understanding and Hardening Linux Containers” (PDF), began the talk by introducing the principles of defense in depth, which consist of presenting a layered defense, shrinking attack surfaces, and hardening those that remain. Although microservices may add overall complexity to a system architecture (particularly when operated at scale), they can be implemented so that there is no single point of security failure, which provides an advantage over a typical monolithic application.
The principle of least privilege, e.g. not running an application process as root, is vitally important to securing a system. Because a monolithic application provides the majority of its functionality via a single process, it is difficult to apply this principle to it. The principle of least surprise - “sane defaults, isolate by trust” - and the principle of least access are also essential to providing defense in depth. Grattafiori noted that “‘least’ is common to all these principles”, as this fights against excess and complexity, and allows system builders to:
- Establish trust boundaries
- Identify, minimise, and harden attack surfaces
- Reduce scope and access
- Layer protections and defenses
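The least-privilege example mentioned above (not running an application process as root) can be sketched in a few lines. This is a minimal illustration, not something shown in the talk: the function name `drop_privileges` and the default uid/gid of 65534 (commonly the `nobody` account) are assumptions, and it relies only on Python's standard `os` module on a Unix-like system.

```python
import os


def drop_privileges(uid: int = 65534, gid: int = 65534) -> bool:
    """Drop root privileges permanently.

    Returns True if privileges were dropped, False if the process
    was not running as root in the first place.
    """
    if os.getuid() != 0:
        return False  # already unprivileged, nothing to do

    os.setgroups([])  # clear supplementary groups while still root
    os.setgid(gid)    # change group first; it is not possible after setuid
    os.setuid(uid)    # sets real, effective and saved uid when called as root

    # Sanity check: regaining root must now fail.
    try:
        os.setuid(0)
    except PermissionError:
        return True
    raise RuntimeError("privileges were not dropped")
```

A service that needs root only to bind a low port would call this once the socket is open and before it handles any request.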
The benefits of monolithic application security (AppSec) include the fact that building and operating such applications is well understood and a ‘known known’: the architecture is often relatively simple, and there is typically a well-established culture (and associated responsibilities) around development, business, compliance, etc. The downsides of monolithic AppSec are that a compromise of a single point often means compromise of the entire application or network, authentication requirements are global in scope, and security is often difficult to tailor.
The upsides of microservices AppSec include: the Unix single responsibility principle is known to work well, the publicly exposed attack surface is typically reduced, services can be independently patched, applications (and their associated runtime containers) make it easier to practice the principle of least privilege and to tailor security to each service, and it is typically easier to establish a Trusted Computing Base (TCB). The downsides include: microservice security is a ‘known unknown’, cultural changes may be required for successful operation (such as DevOps and DevOpsSec), a good understanding of the overall system functionality is required, legacy systems are not easily adapted, and complexity (at scale) breeds insecurity.
Grattafiori continued the presentation by examining the security implications of individual areas of a microservice system in more depth, first focusing on network security. Although the majority of software systems offer authentication at the OSI model layer 7 (application), it was argued that this provides limited benefits, and the implementation of layer 4/5, transport layer security (TLS), is highly desirable. If additional network security is required, then layer 3 IPsec can be implemented.
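As a rough illustration of TLS between services, the sketch below builds a server-side TLS context with Python's standard `ssl` module. Requiring client certificates (mutual TLS) and a minimum of TLS 1.2 are assumptions made for the example rather than recommendations quoted from the talk, and the function name and file parameters are hypothetical.

```python
import ssl
from typing import Optional


def make_server_context(cert_file: Optional[str] = None,
                        key_file: Optional[str] = None,
                        ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Build a server-side TLS context that requires client certificates."""
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse older protocol versions
    ctx.verify_mode = ssl.CERT_REQUIRED           # callers must present a certificate
    if cert_file:
        # The service's own certificate and private key.
        ctx.load_cert_chain(cert_file, key_file)
    if ca_file:
        # The internal CA used to verify calling services.
        ctx.load_verify_locations(cafile=ca_file)
    return ctx
```

The returned context can be passed to `ctx.wrap_socket(sock, server_side=True)` or to an HTTP server that accepts an `SSLContext`.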
Many organisations are packaging microservices within Linux containers like Docker, rkt or LXC, and there are clear security analogies between microservices and containers.
It is essential to prune the application and container threat model attack tree. This includes limiting the damage that can be done via application vulnerabilities by utilising defensive coding and also implementing container security, for example: capabilities, user namespaces, a read-only rootfs, immutable files, mount flags, and mandatory access control (MAC).
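Several of these container-level controls map onto `docker run` options. The sketch below assembles such a command line; `--read-only`, `--cap-drop`, `--cap-add`, `--user` and `--security-opt` are existing Docker CLI flags, while the function name, the default user `1000:1000`, and the image and profile names in the test are hypothetical placeholders.

```python
from typing import Iterable, List, Optional


def hardened_run_args(image: str,
                      user: str = "1000:1000",
                      add_caps: Iterable[str] = (),
                      apparmor_profile: Optional[str] = None,
                      seccomp_profile: Optional[str] = None) -> List[str]:
    """Assemble a `docker run` argument list with restrictive defaults."""
    args = [
        "docker", "run",
        "--read-only",        # mount the container root filesystem read-only
        "--cap-drop", "ALL",  # start from no capabilities at all
        "--user", user,       # do not run the process as root
    ]
    for cap in add_caps:
        args += ["--cap-add", cap]  # add back only what the service needs
    if apparmor_profile:
        args += ["--security-opt", f"apparmor={apparmor_profile}"]
    if seccomp_profile:
        args += ["--security-opt", f"seccomp={seccomp_profile}"]
    args.append(image)
    return args
```

The list can be passed directly to `subprocess.run`, which avoids shell quoting issues.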
The damage done via a container escape can be limited by using user namespaces, which allow a root user in a container to be mapped to a non-root (non-uid-0) user outside of the container. As of Docker 1.10, user namespaces are supported directly by the Docker daemon, but this facility is not enabled by default. Kernel and syscall exploits can be limited by utilising seccomp, kernel hardening and MAC. The damage done via a compromised kernel or host operating system can be limited using network hardening, isolation by trust, least privilege, least access, and logging and alerting.
Grattafiori stated that container security begins with the host system’s base operating system, and using a minimal Linux distribution such as CoreOS, Alpine Linux, Project Atomic or RancherOS was highly recommended. It is also important for an operator to understand how a distribution handles updates, binary package compilation, security defaults (e.g. MAC), and default kernel options and sysctl settings.
Container images should also be minimal. Images are typically built ‘FROM debian:jessie’ or ‘FROM debian:wheezy’; however, this may not be minimal enough, as even before an apt-get install of application-specific software there are still many libraries, executables and language files that the application process does not need - this means more patching, disk space, attack surface and post-exploitation utilities.
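The user namespace mapping can be made concrete with the format the Linux kernel exposes in `/proc/<pid>/uid_map`, where each line holds three numbers: the first uid inside the namespace, the first uid outside it, and the length of the range. The sketch below translates a container uid to a host uid; the function name and the sample range starting at 100000 are assumptions chosen for illustration.

```python
from typing import Optional


def map_container_uid(container_uid: int, uid_map: str) -> Optional[int]:
    """Translate a uid inside a user namespace to the uid seen on the host.

    `uid_map` uses the /proc/<pid>/uid_map layout: one
    "inside_start outside_start length" triple per line.
    Returns None if the uid is not mapped.
    """
    for line in uid_map.strip().splitlines():
        inside, outside, length = (int(field) for field in line.split())
        if inside <= container_uid < inside + length:
            return outside + (container_uid - inside)
    return None


# Assumed example mapping: 65536 container uids backed by host uids from 100000.
sample_map = "0 100000 65536"
print(map_container_uid(0, sample_map))     # container root on the host
print(map_container_uid(1000, sample_map))  # an ordinary container user
```

With a mapping like this, a process that escapes the container as "root" holds only the rights of an unprivileged host uid.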
Examples for building minimal containers using Docker and runC were demonstrated, and several examples of using a scratch container for the execution of statically compiled binaries (for example, applications built using Golang) were also provided.
Grattafiori stated that the use of Mandatory Access Control (MAC) is highly recommended for enforcing the principle of least access via the OS. MAC refers to a type of access control by which the operating system constrains the ability of a subject (typically a process or thread) to access or otherwise perform some operation on an object (e.g. files, TCP/UDP ports, shared memory segments). On Linux, MAC is implemented as a Linux Security Module (LSM) via AppArmor and SELinux; grsecurity, a set of patches for the Linux kernel that emphasise security enhancements, is also recommended and includes a MAC solution. MAC is implemented on Mac OS X via TrustedBSD, and Microsoft has Mandatory Integrity Control (MIC).
The default Docker AppArmor policy is very good, but as this policy is generic in nature it must include a large number of files, permission grants and complexity. Microservices again allow for the practice of more specific security using custom profiles. Custom AppArmor profiles can be generated with aa-genprof, or Jessie Frazelle’s Bane can be utilised for Docker-based applications. However, profiling of the target application is required (as is understanding and using the application), and common mistakes include providing too much access, dealing with wildcards and the use of path-based Access Control Lists (ACLs).
Grattafiori cautioned that AppArmor deny lists (blacklists) should be avoided because of the limited value they provide. Other “gotchas” presented included the fact that profiles must be loaded by AppArmor first, the abstractions used in the profiles may be overly verbose, and it can be challenging to fully exercise the target application to ensure full functionality is enabled when running in production (although running unit and regression test suites may help here).
Although highly valuable, MAC is still vulnerable to kernel attacks, and the kernel attack surface is huge. ‘Secure computing mode’ (seccomp) is a computer security facility that provides an application sandboxing mechanism in the Linux kernel (however, seccomp is not a sandbox per se). In its original (strict) mode, seccomp allows a process to make a one-way transition into a "secure" state where it cannot make any system calls except exit(), sigreturn(), read() and write() to already-open file descriptors. Should it attempt any other system calls, the kernel will terminate the process with SIGKILL. seccomp-bpf is an extension to seccomp that allows filtering of system calls using a configurable policy implemented using Berkeley Packet Filter rules.
The seccomp default filter is enabled by default in Docker Engine version 1.10 onwards, but due to the generic requirements 304 syscalls are enabled (approximately 75% of all available syscalls). The principle of least privilege suggests that microservice applications should have a minimal syscall set made available to them, and accordingly custom profiles can be created. Methods for generating seccomp profiles include strace/ltrace, kernel help (sysdig or systemtap), auditd/auditctl and seccomp itself with SECCOMP_RET_TRACE and PTRACE_O_TRACESECCOMP. A custom seccomp profile can be specified in Docker via the ‘--security-opt seccomp=<profile>’ flag. Grattafiori noted that seccomp profiles are architecture dependent, which limits container portability.
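One of the profile-generation approaches listed above, tracing with strace, can be sketched as follows. The code collects syscall names from strace output and emits a whitelist in the JSON layout used by Docker seccomp profiles. The `names` list form is an assumption that should be checked against the Docker version in use (older releases used one `name` entry per syscall), and the function name and sample trace are hypothetical.

```python
import json
import re

# Matches the syscall name at the start of a strace line, with an optional
# "[pid 1234]" prefix as produced by `strace -f`.
SYSCALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+)?([a-z_][a-z0-9_]*)\(")


def profile_from_strace(trace: str) -> dict:
    """Build a deny-by-default seccomp profile from strace output."""
    names = set()
    for line in trace.splitlines():
        match = SYSCALL_RE.match(line.strip())
        if match:  # skips signal and exit lines such as "+++ exited with 0 +++"
            names.add(match.group(1))
    return {
        "defaultAction": "SCMP_ACT_ERRNO",  # anything not listed fails
        "syscalls": [{"names": sorted(names), "action": "SCMP_ACT_ALLOW"}],
    }


sample_trace = """\
execve("/bin/true", ["true"], 0x7ffd12345678) = 0
brk(NULL) = 0x55d0c0000000
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
[pid  4242] read(3, "", 4096) = 0
+++ exited with 0 +++
"""
print(json.dumps(profile_from_strace(sample_trace), indent=2))
```

A trace only covers the code paths that were exercised, and the container runtime needs some syscalls of its own to start the process, so a profile generated this way is a starting point to test and extend, not a finished policy.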
Grattafiori began summarising the talk by stating high-level recommendations for running secure Docker-based microservices:
- Enable user namespaces
- Use application-specific AppArmor or SELinux if possible
- Use application-specific seccomp whitelist if possible
- Harden host system
- Restrict host access
- Consider network security
- Use immutable containers
The problem of managing build and runtime secrets can be solved by using temporary secret injection via the process of temporary bind mounting, followed by loading the secrets into memory only, and unmounting; or ideally using an open source secret management tool like HashiCorp Vault or Square Keywhiz. Secrets should not be injected via environment variables or flat files, as it is very easy for these secrets to leak into container layers, logs or error reports.
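The application side of temporary secret injection can be sketched as below: the secret is read once from a file that an operator or orchestrator has bind mounted for a short time, kept only in memory, and overwritten when it is no longer needed. The function names and the idea of zeroing the buffer are assumptions for illustration, and mounting and unmounting happen outside this code.

```python
def load_secret(path: str) -> bytearray:
    """Read a secret from a temporarily mounted file into a mutable buffer."""
    with open(path, "rb") as handle:
        # A bytearray can be overwritten in place later, unlike bytes or str.
        return bytearray(handle.read().rstrip(b"\n"))


def wipe(secret: bytearray) -> None:
    """Best-effort overwrite of the in-memory copy once it has been used."""
    for index in range(len(secret)):
        secret[index] = 0
```

Python may still hold other copies of the data (for example, the intermediate `bytes` object returned by `read()`), so the wipe is best effort only; a dedicated secret management tool remains the preferable option.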
Final security recommendations include creating a security specification, generating application-specific and overall threat models, ensuring any applications/services are secure, and ensuring orchestration frameworks and associated service discovery are secure.
Containers and microservices can’t help if your application itself is still vulnerable.
Microservice logging and accountability are also important: logs should be collected and kept centrally, and reviewed regularly. Security is much easier to implement if it is made part of the software application development lifecycle, and verification should be included within a standard build pipeline using tooling like OWASP’s ZAP, bdd-security, Brakeman, and gauntlt.
The DockerCon video of Grattafiori’s “The Golden Ticket: Docker and High Security Microservices” talk can be found on the conference YouTube channel, and the slides are located on Docker’s SlideShare account. Grattafiori is also the author of the NCC Group whitepaper “Understanding and Hardening Linux Containers” (PDF), which is essential reading for anyone looking to develop a detailed understanding of container security.