Atlassian recently described how it performs Application Level Encryption at scale on AWS while achieving high cache hit rates and keeping costs low. Atlassian's solution runs across more than 12,500 instances and manages over 1,540 KMS keys. It performs over 11 billion decryptions and 811 million encryptions daily, at a cost of $2,500 per month versus a potential $1,000,000 per month for a naive solution.
Cryptor is an encryption library developed by Atlassian to suit its specific Application Level Encryption (ALE) needs at scale in multi-region environments. It is a thin wrapper over the AWS Encryption SDK. Atlassian engineers designed it to offer automated key management, high availability (similar to Atlassian's Tenant Context Service), distributed caching, and the enforcement of soft limits to enable high-scale operations. Developers can integrate Cryptor as a library or as a sidecar that exposes its functionality through HTTP and gRPC APIs.
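The scale of the naive figure can be sanity-checked with simple arithmetic. The sketch below assumes that a naive solution makes one KMS request per encryption or decryption, that a month has 30 days, and that KMS requests cost $0.03 per 10,000 (an assumed price, not a figure from Atlassian's post); only the daily operation counts and the $2,500 figure come from the article.

```python
# Back-of-the-envelope estimate of a naive monthly KMS bill.
DECRYPTIONS_PER_DAY = 11_000_000_000  # from the article ("over 11 billion")
ENCRYPTIONS_PER_DAY = 811_000_000     # from the article
DAYS_PER_MONTH = 30                   # assumption
PRICE_PER_10K_REQUESTS = 0.03         # assumption: USD per 10,000 KMS requests
REPORTED_MONTHLY_COST = 2_500         # USD, from the article


def naive_monthly_cost() -> float:
    """Monthly cost if every operation made its own KMS request."""
    requests_per_month = (DECRYPTIONS_PER_DAY + ENCRYPTIONS_PER_DAY) * DAYS_PER_MONTH
    return requests_per_month / 10_000 * PRICE_PER_10K_REQUESTS


naive = naive_monthly_cost()
ratio = naive / REPORTED_MONTHLY_COST
print(f"naive estimate: ${naive:,.0f} per month, about {ratio:,.0f}x the reported cost")
```

Under these assumptions the estimate lands slightly above $1,000,000 per month, which is consistent with the figure quoted above.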
David Connard, principal developer at Atlassian, explains why Atlassian chose to implement ALE wherever possible:
With ALE, sensitive data is encrypted before storage and only decrypted when required (i.e. at the point of use, in the application code). An attacker who gains access to the datastore (or, more commonly, who gains access to a historic replica of it, for example, a backup stored in a less secure location) does not automatically gain access to your sensitive data.
Connard explains that implementing ALE creates significant operational concerns. Implementors should never lose the ability to decrypt the data, the integrity of encryption keys should always be protected, and engineers should consider the performance impact of adding encryption, as ALE adds significant computational work to the application.
At the heart of Atlassian's ALE is Envelope Encryption, a cryptographic technique for securing data. It works by encrypting the data with a unique key called a "data key". Engineers then encrypt the data key itself with another key, the "root key". Finally, they bundle the ciphertext and the encrypted data key into an "envelope encrypted payload" and persist this payload to the data store.
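The steps above can be sketched in a few lines of Python. This is an illustration of the envelope structure only, not Cryptor's or the AWS Encryption SDK's implementation: to stay within the standard library it uses a toy, unauthenticated stream cipher built from HMAC-SHA256, where a real system would use an authenticated cipher such as AES-GCM from a vetted library and would keep the root key inside a KMS. All function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """TOY cipher core: XOR data with an HMAC-SHA256 counter-mode keystream."""
    out = bytearray()
    for start in range(0, len(data), 32):
        counter = (start // 32).to_bytes(8, "big")
        pad = hmac.new(key, nonce + counter, hashlib.sha256).digest()
        out.extend(a ^ b for a, b in zip(data[start:start + 32], pad))
    return bytes(out)


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt under a fresh random nonce; the nonce is prepended to the output."""
    nonce = secrets.token_bytes(16)
    return nonce + _keystream_xor(key, nonce, plaintext)


def toy_decrypt(key: bytes, blob: bytes) -> bytes:
    """Split off the nonce and reverse the XOR."""
    return _keystream_xor(key, blob[:16], blob[16:])


def envelope_encrypt(root_key: bytes, plaintext: bytes) -> dict:
    """Encrypt data with a fresh data key, then wrap that key with the root key."""
    data_key = secrets.token_bytes(32)
    return {
        "encrypted_data_key": toy_encrypt(root_key, data_key),
        "ciphertext": toy_encrypt(data_key, plaintext),
    }


def envelope_decrypt(root_key: bytes, payload: dict) -> bytes:
    """Unwrap the data key with the root key, then decrypt the data."""
    data_key = toy_decrypt(root_key, payload["encrypted_data_key"])
    return toy_decrypt(data_key, payload["ciphertext"])


root_key = secrets.token_bytes(32)  # assumption: in practice this lives in a KMS
secret = b"sensitive field value"
payload = envelope_encrypt(root_key, secret)
recovered = envelope_decrypt(root_key, payload)
```

Only the payload is persisted. Without access to the root key, a reader of the data store holds the ciphertext and a wrapped data key, neither of which reveals the plaintext.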
Envelope encryption has several benefits over direct encryption with the root key: each data key is used for only a small subset of the data, the encryption materials can be cached and reused across multiple encryption requests, and the data itself can be encrypted with fast symmetric algorithms.
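The caching benefit is what keeps the number of key service calls small. The following is a minimal sketch of a data key cache bounded by a use count and a maximum age. The class names, the limits, and the counting stand-in for a KMS are all hypothetical and do not reflect Cryptor's actual design.

```python
import secrets
import time


class CountingKeyService:
    """Hypothetical stand-in for a KMS that counts how often it is called."""

    def __init__(self) -> None:
        self.calls = 0

    def generate_data_key(self) -> tuple:
        self.calls += 1
        plaintext_key = secrets.token_bytes(32)
        # Placeholder bytes; a real KMS would return the key wrapped by the root key.
        wrapped_key = b"wrapped:" + secrets.token_bytes(32)
        return plaintext_key, wrapped_key


class DataKeyCache:
    """Reuse one data key until it reaches a use limit or an age limit."""

    def __init__(self, key_service, max_uses: int = 1000, max_age_seconds: float = 300.0):
        self.key_service = key_service
        self.max_uses = max_uses
        self.max_age_seconds = max_age_seconds
        self._entry = None
        self._uses = 0
        self._created = 0.0

    def get(self) -> tuple:
        now = time.monotonic()
        expired = (
            self._entry is None
            or self._uses >= self.max_uses
            or now - self._created >= self.max_age_seconds
        )
        if expired:
            # Cache miss: one call to the key service, then reuse the result.
            self._entry = self.key_service.generate_data_key()
            self._uses = 0
            self._created = now
        self._uses += 1
        return self._entry


kms = CountingKeyService()
cache = DataKeyCache(kms, max_uses=1000)
requests = 10_000
for _ in range(requests):
    cache.get()
hit_rate = 1 - kms.calls / requests
print(f"key service calls: {kms.calls}, cache hit rate: {hit_rate:.1%}")
```

The limits trade cost against exposure: a higher use limit means fewer key service calls, but more data protected by any single data key.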
Envelope Encryption is well supported by the AWS Encryption SDK. However, the SDK is mainly designed for single-region scenarios, whereas Atlassian has a heavily multi-region use case, with KMS keys stored and services running in multiple regions. The AWS SDK also enforces strict correctness, which makes sense at lower scales, but Atlassian had to loosen some restrictions and enforce them softly to handle its high-scale operations.
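The article does not detail which restrictions Atlassian loosened or how. One possible reading of "soft" enforcement is sketched below, purely as an assumption: when a cached data key passes its use limit, the cache tries to replace it, but if the key service is unreachable it keeps serving the existing key and records the overrun instead of failing the request. The class names and behaviour are hypothetical.

```python
import secrets


class FlakyKeyService:
    """Hypothetical key service that can be switched into a failing state."""

    def __init__(self) -> None:
        self.available = True

    def generate_data_key(self) -> bytes:
        if not self.available:
            raise ConnectionError("key service unreachable")
        return secrets.token_bytes(32)


class SoftLimitKeyCache:
    """Treat the use limit as a refresh trigger rather than a hard stop."""

    def __init__(self, key_service, soft_max_uses: int) -> None:
        self.key_service = key_service
        self.soft_max_uses = soft_max_uses
        self.overruns = 0  # how often a key was served past its limit
        self._key = None
        self._uses = 0

    def get(self) -> bytes:
        if self._key is None or self._uses >= self.soft_max_uses:
            try:
                self._key = self.key_service.generate_data_key()
                self._uses = 0
            except ConnectionError:
                if self._key is None:
                    raise  # nothing cached to fall back on
                self.overruns += 1  # soft limit: keep serving, note the overrun
        self._uses += 1
        return self._key


service = FlakyKeyService()
soft_cache = SoftLimitKeyCache(service, soft_max_uses=3)
first_key = soft_cache.get()
soft_cache.get()
soft_cache.get()                  # limit of 3 uses reached
service.available = False
stale_key = soft_cache.get()      # refresh fails, cached key is still served
service.available = True
fresh_key = soft_cache.get()      # refresh succeeds, key is rotated
```

A strict implementation would raise an error on the fourth call. The soft variant favours availability and leaves it to monitoring of the overrun counter to flag keys that were used past their limit.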
Atlassian also encrypts all of its data at rest. However, encryption at rest provides no defence against many forms of data exfiltration, such as a failure to restrict access to the data store, an authorised application doing something unsafe with restricted data at runtime, or legitimate access to data stores by staff for debugging or incident resolution.