Data security (key management, encryption)
Encryption transforms readable data (commonly referred to as plaintext) into an output (ciphertext) that reveals little or no information about the plaintext. The encryption algorithm used, such as the Advanced Encryption Standard (AES), is public, but its execution depends on a secret key. The key is required to decrypt the ciphertext back to its original form. At Google, encryption is usually combined with integrity protection, so that data stays private and unauthorized modification can be detected.
Why encryption helps secure customer data
- Firstly, encryption is one piece of a broader security strategy. Encryption adds a layer of defense in depth for protecting data.
- Secondly, if data accidentally falls into an attacker’s hands, encryption ensures that they cannot access it without also having access to the encryption keys. Even if an attacker obtains the storage devices containing your data, they won’t be able to understand or decrypt it.
- Thirdly, encryption at rest reduces the surface of attack by effectively “cutting out” the lower layers of the hardware and software stack. Even if these lower layers are compromised, the data on those devices is not compromised if adequate encryption is deployed.
- Lastly, encryption provides an important mechanism in how Google ensures the privacy of customer data.
Google’s default encryption
Encryption of data at rest
Without any action required from the customer, Google Cloud encrypts all customer content stored at rest using one or more encryption mechanisms.
Layers of encryption
To secure data, Google uses several layers of encryption. Layering provides redundant data protection and allows Google to choose the best approach based on application requirements.
Figure 1: Data saved in Google Cloud is protected using many levels of encryption. For practically all files, either distributed file system encryption or database and file storage encryption is in place, as well as storage device encryption.
Encryption at the storage system layer
- To understand how Cloud Storage encryption works, it helps to understand how Google stores customer data. Data is broken into subfile chunks for storage, and each chunk can be up to several GB in size. Each chunk is encrypted at the storage level with an individual encryption key. No two chunks share the same encryption key, even if they are part of the same Cloud Storage object, owned by the same customer, or stored on the same machine.
- Secondly, each data chunk has a unique identifier. Access control lists (ACLs) ensure that each chunk can be decrypted only by Google services operating under authorized roles, which are granted access at that point in time.
- Next, each chunk is distributed across Google’s storage systems and replicated in encrypted form for backup and disaster recovery. A malicious individual who wanted to access customer data would need to know about, and be able to access:
- (1) all storage chunks corresponding to the data they want, and
- (2) the encryption keys corresponding to those chunks.
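The chunking scheme described above can be sketched in a few lines of Python. This is a minimal illustration, not Google’s implementation: the names `store_object`, `read_object`, `keystream_xor`, the role name `storage-service`, and the 16-byte chunk size are all hypothetical, and the cipher is a standard-library stand-in (an HMAC-SHA256 keystream), not AES. It only shows the structure: every chunk gets its own identifier and its own 256-bit key, and keys are released only to an authorized role.

```python
import hashlib
import hmac
import secrets

CHUNK_SIZE = 16  # bytes; tiny for illustration (real chunks can be several GB)
AUTHORIZED_ROLES = {"storage-service"}  # hypothetical role name


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Stand-in cipher (NOT AES): XOR data with an HMAC-SHA256 counter keystream.

    Applying it twice with the same key and nonce returns the original data.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hmac.new(key, nonce + counter, hashlib.sha256).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def store_object(data: bytes):
    """Split data into chunks and encrypt each one under its own random key."""
    order, ciphertexts, keys = [], {}, {}
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk_id = secrets.token_bytes(8)  # unique identifier per chunk
        key = secrets.token_bytes(32)      # fresh 256-bit key per chunk
        chunk = data[offset:offset + CHUNK_SIZE]
        ciphertexts[chunk_id] = keystream_xor(key, chunk_id, chunk)
        keys[chunk_id] = key               # would live in a separate key store
        order.append(chunk_id)
    return order, ciphertexts, keys


def read_object(order, ciphertexts, keys, role: str) -> bytes:
    """Reassemble the object; keys are used only on behalf of authorized roles."""
    if role not in AUTHORIZED_ROLES:
        raise PermissionError(f"role {role!r} may not decrypt chunks")
    return b"".join(
        keystream_xor(keys[cid], cid, ciphertexts[cid]) for cid in order
    )
```

Reading the object back requires every ciphertext chunk and every matching key, which mirrors the two conditions an attacker would have to meet.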
Furthermore, Google encrypts data at rest using the Advanced Encryption Standard (AES) algorithm. With the exception of a small number of Persistent Disks created before 2015, which use AES128, all data at the storage level is encrypted with AES256 by default. AES is widely used because (1) the National Institute of Standards and Technology (NIST) recommends both AES256 and AES128 for long-term storage (as of March 2019), and (2) AES is often included in customer compliance requirements.
Encryption at the storage device layer
In most cases, in addition to the storage system level encryption described above, data is also encrypted at the storage device level with AES256 for hard disk drives (HDD) and solid state drives (SSD). A small number of older HDDs use AES128. The SSDs used by Google Cloud employ AES256 exclusively for user data.
Encryption of backups
During the backup process, Google’s backup system ensures that data remains encrypted, which avoids exposing plaintext data unnecessarily. In addition, the backup system encrypts each backup file independently with its own data encryption key (DEK). The DEK is derived from a key stored in Google’s Key Management Service (KMS) and a per-file seed generated randomly at backup time. A separate DEK, which is also stored in Google’s KMS, is used for all metadata in backups.
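The per-file key derivation can be sketched as follows. The text does not say which derivation function is used, so HMAC-SHA256 is an assumption made here purely for illustration, and the function names `derive_backup_dek` and `new_backup_file_key` are hypothetical.

```python
import hashlib
import hmac
import secrets


def derive_backup_dek(kms_key: bytes, seed: bytes) -> bytes:
    """Derive a 256-bit DEK from a KMS-held key and a per-file seed.

    HMAC-SHA256 is an assumed derivation function, not a documented one.
    """
    return hmac.new(kms_key, seed, hashlib.sha256).digest()


def new_backup_file_key(kms_key: bytes):
    """Generate a fresh random seed at backup time and derive that file's DEK."""
    seed = secrets.token_bytes(16)
    return seed, derive_backup_dek(kms_key, seed)
```

Under this sketch only the seed needs to be stored alongside the backup file. The DEK can be re-derived later by anyone holding both the seed and the KMS key, and two files with different seeds get different DEKs.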
Encryption key hierarchy and root of trust
- Google’s KMS is protected by a root key called the KMS master key, which wraps all the key encryption keys (KEKs) in KMS. This KMS master key is AES256 and is itself stored in another key management service, called the Root KMS. Root KMS stores a much smaller number of keys.
- Secondly, Root KMS in turn has its own root key, called the root KMS master key, which is also AES256 and is stored in a peer-to-peer infrastructure, the root KMS master key distributor, which replicates these keys globally. The root KMS master key distributor holds the keys only in RAM on the same dedicated machines as Root KMS, and uses logging to verify proper use.
- Further, to address the scenario where all instances of the root KMS master key distributor restart simultaneously, the root KMS master key is also backed up on secure hardware devices stored in physical safes in highly secured areas in two physically separated, global Google locations. This backup would be needed only if all distributor instances went down at once, for example in a global restart. Fewer than 20 Google employees are able to access these safes.
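The wrapping chain in this hierarchy can be illustrated with a small sketch. It assumes the usual envelope-encryption arrangement in which a KEK in turn wraps a DEK, and it uses a standard-library stand-in cipher (an HMAC-SHA256 keystream) in place of AES256. All function names are hypothetical, and a real key-wrapping scheme would also authenticate the wrapped key.

```python
import hashlib
import hmac
import secrets


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Stand-in cipher (NOT AES): XOR data with an HMAC-SHA256 counter keystream."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hmac.new(key, nonce + counter, hashlib.sha256).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def wrap(wrapping_key: bytes, key: bytes) -> bytes:
    """Encrypt one key under another; the random nonce is stored as a prefix."""
    nonce = secrets.token_bytes(16)
    return nonce + keystream_xor(wrapping_key, nonce, key)


def unwrap(wrapping_key: bytes, wrapped: bytes) -> bytes:
    """Reverse wrap(): split off the nonce and decrypt the remainder."""
    nonce, body = wrapped[:16], wrapped[16:]
    return keystream_xor(wrapping_key, nonce, body)


def build_hierarchy() -> dict:
    """Create one key per level and wrap each under the level above it."""
    root_kms_master_key = secrets.token_bytes(32)  # held by the distributor
    kms_master_key = secrets.token_bytes(32)       # stored in Root KMS
    kek = secrets.token_bytes(32)                  # stored in KMS
    dek = secrets.token_bytes(32)                  # encrypts the data itself
    return {
        "root_kms_master_key": root_kms_master_key,
        "wrapped_kms_master_key": wrap(root_kms_master_key, kms_master_key),
        "wrapped_kek": wrap(kms_master_key, kek),
        "wrapped_dek": wrap(kek, dek),
        "dek": dek,  # kept only so the example can check its own result
    }


def recover_dek(root_key: bytes, hierarchy: dict) -> bytes:
    """Walk down the chain from the root key to the DEK."""
    kms_master_key = unwrap(root_key, hierarchy["wrapped_kms_master_key"])
    kek = unwrap(kms_master_key, hierarchy["wrapped_kek"])
    return unwrap(kek, hierarchy["wrapped_dek"])
```

The point of the structure is that only the key at the top needs the strongest protection: every key below it is stored in wrapped form and is useless without the key one level up.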
Reference: Google Documentation