In a whitepaper entitled Cryptographic Cloud Storage (PDF), Seny Kamara and Kristin Lauter of the Microsoft Research Cryptography Group propose a “virtual private storage service” that public clouds could offer using new cryptographic techniques.
Cloud computing has gained traction lately, and some consider 2010 to be the year of the cloud. While the benefits of cloud computing are well known, its adoption is hindered by security concerns. Individuals may have no problem using an online storage service from a company with a good security track record, but companies and government agencies are very reluctant to entrust their data to public clouds.
Kamara and Lauter propose a virtual private storage service which would satisfy the following requirements:
- confidentiality: the cloud storage provider does not learn any information about customer data
- integrity: any unauthorized modification of customer data by the cloud storage provider can be detected by the customer
- non-repudiation: any access to customer data is logged
At the same time, the service should retain the main benefits of a public storage service:
- availability: customer data is accessible from any machine and at all times
- reliability: customer data is reliably backed up
- efficient retrieval: data retrieval times are comparable to a public cloud storage service
- data sharing: customers can share their data with trusted parties.
Most of these requirements are met by encrypting the documents stored in the cloud, but encryption makes it very hard to search through such documents or to collaborate on them in real time. The Cryptographic Cloud Storage whitepaper proposes an architecture for a cryptographic storage service that would solve the security problems of “back-ups, archival, health record systems, secure data exchange and e-discovery”.
The architecture is based on three components:
- Data Processor (DP) – processes data before sending it to the cloud
- Data Verifier (DV) – verifies data’s integrity
- Token Generator (TG) – generates tokens allowing the service provider to retrieve documents
The consumer solution uses a local application that contains the three components above. Before uploading data to the cloud, Alice uses the DP to encrypt and encode the documents along with their metadata (tags, time, size, etc.), then sends them to the cloud. When she wants to download some documents, Alice uses the TG to generate a token and a decryption key. The token is sent to the storage provider, which uses it to select the encrypted files to be downloaded. The DV is then invoked to verify the integrity of the data using a master key, and the documents are decrypted with the decryption key.
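The retrieval flow can be illustrated with a minimal Python sketch. It is not the whitepaper's construction: the names `make_token`, `make_tag`, `verify` and `StorageProvider` are hypothetical, keyword tokens are plain HMAC values (a simplification that reveals when the same keyword is queried twice), and a plain HMAC tag stands in for the proof of storage the whitepaper relies on. The blob is a placeholder for ciphertext produced by the data processor.

```python
import hashlib
import hmac
import os


def make_token(search_key: bytes, keyword: str) -> bytes:
    """Token Generator (hypothetical): keyed hash of a keyword.

    The provider can match tokens but cannot recover the keyword.
    """
    return hmac.new(search_key, keyword.lower().encode("utf-8"),
                    hashlib.sha256).digest()


def make_tag(master_key: bytes, blob: bytes) -> bytes:
    """Integrity tag computed before upload (stand-in for proof of storage)."""
    return hmac.new(master_key, blob, hashlib.sha256).digest()


def verify(master_key: bytes, blob: bytes, tag: bytes) -> bool:
    """Data Verifier (hypothetical): constant-time check of the tag."""
    return hmac.compare_digest(make_tag(master_key, blob), tag)


class StorageProvider:
    """Untrusted storage: it only ever sees tokens, blobs and tags."""

    def __init__(self):
        self.index = {}   # token -> list of blob ids
        self.blobs = {}   # blob id -> (blob, tag)

    def upload(self, blob_id, blob, tag, tokens):
        self.blobs[blob_id] = (blob, tag)
        for token in tokens:
            self.index.setdefault(token, []).append(blob_id)

    def search(self, token):
        """Return (id, blob, tag) for every blob stored under this token."""
        return [(blob_id, *self.blobs[blob_id])
                for blob_id in self.index.get(token, [])]


# Alice's local keys (assumed to be random 32-byte secrets).
search_key, master_key = os.urandom(32), os.urandom(32)

provider = StorageProvider()
blob = b"<ciphertext of report.txt>"  # placeholder, not real ciphertext
provider.upload("doc-1", blob, make_tag(master_key, blob),
                [make_token(search_key, w) for w in ("budget", "2010")])

# Download: send a token, then verify whatever the provider returns.
for blob_id, found, tag in provider.search(make_token(search_key, "budget")):
    print(blob_id, verify(master_key, found, tag))
```

In this sketch the provider never sees a keyword or a key, and any change it makes to a stored blob causes `verify` to return `False`.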
To collaborate, Alice generates a new token and a decryption key and sends them to Bob, who uses them to retrieve documents from the cloud and decrypt them.
For the enterprise, the whitepaper proposes a similar approach.
This solution introduces an additional Credential Generator (CG) that generates credential tokens for any user involved in storing documents in, or retrieving them from, the cloud. A credential token establishes what rights a user has over a specific document and is used to manage access to documents. The rest of the process is similar to that of the consumer architecture.
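As a rough illustration of a credential that records a user's rights over a document, the following Python sketch issues and checks an HMAC-authenticated claim. This is an assumption made for illustration only: the function names `issue_credential` and `allows` and the claim layout are hypothetical, and the sketch does not use the attribute-based encryption that the whitepaper describes for enforcing access policies.

```python
import hashlib
import hmac
import json
import os


def _sign(cg_key: bytes, claims: dict) -> str:
    """Authenticate the claims with the credential generator's secret key."""
    payload = json.dumps(claims, sort_keys=True).encode("utf-8")
    return hmac.new(cg_key, payload, hashlib.sha256).hexdigest()


def issue_credential(cg_key: bytes, user: str, doc_id: str, rights) -> dict:
    """Hypothetical CG: bind a user to a document and a set of rights."""
    claims = {"user": user, "doc": doc_id, "rights": sorted(rights)}
    return {"claims": claims, "tag": _sign(cg_key, claims)}


def allows(cg_key: bytes, credential: dict, doc_id: str, right: str) -> bool:
    """Accept only an untampered credential that grants this right."""
    expected = _sign(cg_key, credential["claims"])
    if not hmac.compare_digest(expected, credential["tag"]):
        return False
    claims = credential["claims"]
    return claims["doc"] == doc_id and right in claims["rights"]


cg_key = os.urandom(32)  # secret held by the enterprise's CG (assumed)
bob = issue_credential(cg_key, "bob", "doc-42", {"read"})
print(allows(cg_key, bob, "doc-42", "read"))
print(allows(cg_key, bob, "doc-42", "write"))
```

The first check succeeds and the second fails, because the credential grants Bob only the `read` right on `doc-42`.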
In order to prepare the data for the cloud, the data processor:
begins by indexing it and encrypting it with a symmetric encryption scheme (e.g., AES) under a unique key. It then encrypts the index using a searchable encryption scheme and encrypts the unique key with an attribute-based encryption scheme under an appropriate policy. Finally, it encodes the encrypted data and index in such a way that the data verifier can later verify their integrity using a proof of storage.
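The order of these steps can be sketched in Python using only the standard library. Every primitive here is a labeled stand-in rather than the scheme the whitepaper names: a toy SHA-256 keystream replaces AES, keyed hashes of keywords replace searchable encryption, and a single symmetric `policy_key` replaces attribute-based encryption. The proof-of-storage encoding is omitted, and `keystream_xor` and `process_document` are hypothetical names.

```python
import hashlib
import hmac
import os
import re


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy cipher: XOR data with a SHA-256 counter-mode keystream.

    ASSUMPTION: stands in for AES only to keep the sketch stdlib-only.
    It is unauthenticated and must not be used for real data.
    Applying it twice with the same key and nonce returns the input.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + nonce + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def process_document(text: str, search_key: bytes, policy_key: bytes) -> dict:
    """Hypothetical data processor: returns what would be uploaded."""
    # 1. Index the document: collect its distinct lowercase keywords.
    keywords = set(re.findall(r"[a-z0-9]+", text.lower()))

    # 2. Encrypt the data under a fresh key unique to this document.
    data_key, nonce = os.urandom(32), os.urandom(16)
    ciphertext = keystream_xor(data_key, nonce, text.encode("utf-8"))

    # 3. Protect the index: keyed hashes replace the keywords themselves
    #    (a simplification, not a real searchable encryption scheme).
    index = sorted(
        hmac.new(search_key, w.encode("utf-8"), hashlib.sha256).hexdigest()
        for w in keywords
    )

    # 4. Wrap the unique key. The whitepaper uses attribute-based
    #    encryption under a policy; one symmetric key stands in here.
    key_nonce = os.urandom(16)
    wrapped_key = keystream_xor(policy_key, key_nonce, data_key)

    return {"ciphertext": ciphertext, "nonce": nonce, "index": index,
            "wrapped_key": wrapped_key, "key_nonce": key_nonce}


search_key, policy_key = os.urandom(32), os.urandom(32)
record = process_document("Quarterly budget report", search_key, policy_key)

# A holder of policy_key can unwrap the data key and decrypt.
data_key = keystream_xor(policy_key, record["key_nonce"], record["wrapped_key"])
print(keystream_xor(data_key, record["nonce"], record["ciphertext"]).decode())
```

Only the fields of `record` would leave the client; the plaintext, the keywords and the per-document key stay local.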
The Microsoft Research Cryptography Group and other research organizations have developed techniques for searchable encryption, but the main problem is that they are unacceptably slow, taking tens of seconds for a single-word search. More research and advances in searchable encryption are necessary before this approach becomes a viable solution for a virtual private storage service.