A multipart exploration into cloud encryption. Part 2: Encryption in Google Cloud Storage.
Introduction
This is the second installment of a multipart dive into encryption in the public cloud. Part 1 focused on encryption in Amazon S3. Part 2 will focus on encryption in GCP’s Google Cloud Storage (GCS). Unless otherwise specified, the options mentioned herein use AES-256 encryption. Topics that will be covered:
- Cloud KMS Primer
- Encryption at Rest
- Server-side Encryption
- Client-side Encryption
- GCS Default Encryption
- Envelope Encryption
- Summary and Next Installment
See Appendix A for abbreviations and a key to the cloud terminology used here.
1.0 Cloud KMS Primer
Before diving into encryption in GCS, we should go over a few terms and high-level details about GCP’s Cloud KMS offering.
Cloud KMS uses a hierarchical approach to KMS keys, and permissions can be assigned on the project, the key ring, or the key itself.
- A KMS key resource id (with version #) displays the hierarchy from top to bottom:
projects/$PROJECT_ID/locations/$LOCATION/keyRings/$KEYRING/cryptoKeys/$KEYNAME/cryptoKeyVersions/$VersionNumber
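To make the hierarchy concrete, here is a minimal shell sketch that assembles a key resource ID and a key version resource ID from their parts. The function names and the sample values (my-project, us-east1, my-keyring, my-key) are made up for illustration; only the path layout comes from the format above.

```shell
# Build Cloud KMS resource IDs from their parts.
# Function names and sample values are hypothetical placeholders.

# key_resource_id PROJECT LOCATION KEYRING KEYNAME
key_resource_id() {
  printf 'projects/%s/locations/%s/keyRings/%s/cryptoKeys/%s\n' "$1" "$2" "$3" "$4"
}

# key_version_resource_id PROJECT LOCATION KEYRING KEYNAME VERSION
key_version_resource_id() {
  printf '%s/cryptoKeyVersions/%s\n' "$(key_resource_id "$1" "$2" "$3" "$4")" "$5"
}

# strip_key_version RESOURCE_ID: drop a trailing /cryptoKeyVersions/N if present.
strip_key_version() {
  printf '%s\n' "${1%/cryptoKeyVersions/*}"
}

key_version_resource_id my-project us-east1 my-keyring my-key 3
```

Stripping the version suffix yields the same key resource ID no matter which version you start from, which mirrors the point that the key resource ID does not change on rotation.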
A key ring is a location-centric collection of encryption keys.
- Keys must live inside of a key ring in Cloud KMS.
- Permissions can be assigned to a key ring and propagate to all keys in the ring. This eases permission management and allows you to group like-keys in the same place.
A key is the cryptographic material used to encrypt / decrypt. A key can have many different versions, but will always have at least one (the primary version).
A key version represents the cryptographic bits of the key for that point in time.
- When a key is rotated, a new version is created.
- The KMS key resource ID stays the same when a key is rotated.
- When specifying a CMEK (covered later), the version number is not required because Cloud KMS determines which key version was used previously.
- Key versions can exist in four states: enabled, disabled, scheduled for destruction or destroyed. Key versions can only be used for cryptographic operations when they are in the enabled state.
Cloud KMS does not allow key rings or key resources to be deleted.
- Key material (the cryptographic bits) can be destroyed, which moves the key / key version to the destroyed state. By not allowing key resources to be deleted, GCP ensures that the resource IDs will always point to the original resource. If a key were allowed to be deleted, it could be recreated with the same name and same resource ID in the future.
2.0 Encryption at Rest
GCP is unique in the public cloud industry in that it encrypts all customer data at rest by default using Google-managed keys. There is no action needed from the end user to toggle this feature on, and you cannot turn it off. We will cover default encryption (at rest) later on.
In addition to GCS default encryption, you can add another layer of server-side encryption using either customer-managed encryption keys (CMEKs) or customer-supplied encryption keys (CSEKs).
3.0 Server-side Encryption
Server-side encryption (SSE) is encryption that occurs after GCP / GCS receives the data but before that data is written to disk.
3.1 Customer-managed Encryption Keys
Customer-managed encryption keys are generated and stored within Cloud KMS and can be managed and governed in a project completely separate from the one holding your GCS buckets or objects. There are caveats to this approach, which are outside the scope of this blog, but generally this is a recommended best practice for separation of duties and easier project / IAM management.
Service accounts play an important role in using customer-managed keys in GCS (and many other GCP services). These Cloud IAM accounts are where you assign permissions for services to use the Cloud KMS keys. The accounts are then leveraged by the service on your behalf to automate the process of encryption and decryption. There are a couple of important prerequisites and scenarios to keep in mind regarding GCS service accounts and CMEKs:
In order for a CMEK to decrypt an object for a user:
- The user must have permission for the bucket or object (sometimes object level ACLs are in place to block bucket level permissions)
- The customer-managed key that originally encrypted the object must still exist. This is critical: even after you rotate a CMEK, you still need your old key versions.
If the CMEK does not exist or the service account no longer has permissions to decrypt the key, the object cannot be read.
If for some reason you delete a CMEK or lose access to the encryption key for your objects, you can view the objects’ metadata for “kmsKeyName” which is NOT an encrypted value. The value of that metadata key will provide insight into which object was encrypted with the lost key. Updating this key/value pair will NOT re-encrypt the object — you must re-upload the object with a new CMEK.
Not all information about an object (commonly referred to as metadata) is encrypted with a CMEK. The CMEK encrypts the object’s data along with its MD5 hash and CRC32C checksum; you can read more about those last two in the GCS documentation. The rest of an object’s metadata is left unencrypted because there needs to be a human-readable value somewhere for identification. Metadata should not hold sensitive information unless that information is encrypted.
Note: The CMEK you use for a bucket MUST be in the same location as the bucket.
If you need to abide by FIPS 140-2 (up to level 3) but still want to use CMEKs, GCP offers its Cloud HSM solution. GCP manages the HSM cluster for you, and key creation and operation are generally the same as in Cloud KMS. In addition, Cloud HSM uses Cloud KMS as its front end, so visually it is a similar experience. Cloud HSM CMEKs can be used like Cloud KMS CMEKs for GCS, except that HSM keys are FIPS 140-2 validated.
3.1.1 Code Examples
The following exercise will authorize a GCS service account to encrypt and decrypt with your specified Cloud KMS key, and then configure the CMEK as your bucket’s default encryption. If your project does not currently have a GCS service account, the command will create one and assign it the Cloud KMS CryptoKey Encrypter/Decrypter role on the specified key. If one already exists, that service account will be assigned the same role on the specified key.
gsutil kms authorize -p $PROJECT_ID -k \
  projects/$PROJECT_ID/locations/$LOCATION/keyRings/$KEYRING/cryptoKeys/$KEYNAME
This command alone will not assign your key as the default CMEK for a bucket. To complete this exercise, use the encryption subcommand of the same gsutil kms command:
gsutil kms encryption -k \
  projects/$PROJECT_ID/locations/$LOCATION/keyRings/$KEYRING/cryptoKeys/$KEYNAME \
  gs://$BUCKET_NAME
As noted previously, the key ring / key you specify for your GCS CMEK must be in the same location as your bucket. For example, you cannot use a global key ring or key with a us-east1 bucket. Neither the console nor the CLI will give you any errors when you configure the key, but if you try to upload an object you will receive the following:
BadRequestException: 400 Cloud KMS region 'global' not available for use with GCS region 'US-EAST1'
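Since the mismatch only surfaces at upload time, a local pre-check can catch it earlier. The sketch below pulls the location segment out of a key resource ID and compares it to a bucket location, ignoring case. The function names are hypothetical, and the bucket location is passed in by hand rather than looked up. The comparison is a plain string match, so treat it as a simplification.

```shell
# key_location RESOURCE_ID: print the location segment of a Cloud KMS key resource ID.
# Returns non-zero if the ID does not start with projects/<project>/locations/.
key_location() {
  rest=${1#projects/*/locations/}
  [ "$rest" != "$1" ] || return 1
  printf '%s\n' "${rest%%/*}"
}

# same_location KEY_RESOURCE_ID BUCKET_LOCATION
# Succeeds only if the key location equals the bucket location (case-insensitive).
same_location() {
  key_loc=$(key_location "$1") || return 1
  key_lc=$(printf '%s' "$key_loc" | tr '[:upper:]' '[:lower:]')
  bucket_lc=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
  [ "$key_lc" = "$bucket_lc" ]
}

# Placeholder values mirroring the failing example above.
KEY=projects/my-project/locations/global/keyRings/my-keyring/cryptoKeys/my-key
if same_location "$KEY" "US-EAST1"; then
  echo "locations match"
else
  echo "location mismatch: key is in $(key_location "$KEY")"
fi
```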
3.2 Customer-supplied Encryption Keys
Customer-supplied encryption keys (CSEKs) are Base64-encoded AES-256 keys that you provide during each and every GCS operation. These keys share some of the same behaviors and functionality as customer-managed encryption keys (such as what GCS object metadata they encrypt), but there are several important differences you should be aware of. For starters, you are responsible for generating and managing the CSEK outside of Cloud KMS AND providing the correct CSEK for every GCS call. CSEKs are still used once GCP receives the data, but they are not stored in Cloud KMS. Instead, the keys are removed from GCP’s servers as soon as the operation completes.
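Because generating the key is on you, here is one minimal way to produce a 256-bit key and Base64-encode it, using only common command-line tools. This is a sketch rather than a recommended key-management process; the variable name YOUR_KEY simply matches the placeholder used in this post, and where you store the key afterwards is left open.

```shell
# Generate 32 random bytes (256 bits) and Base64-encode them for use as a CSEK.
YOUR_KEY=$(head -c 32 /dev/urandom | base64)

# Sanity check: the key must decode back to exactly 32 bytes.
KEY_BYTES=$(printf '%s' "$YOUR_KEY" | base64 -d | wc -c | tr -d ' ')
if [ "$KEY_BYTES" -ne 32 ]; then
  echo "unexpected key length: $KEY_BYTES bytes" >&2
  exit 1
fi

# Never print or log the key itself; report only its size.
echo "generated a CSEK of $KEY_BYTES bytes (${#YOUR_KEY} Base64 characters)"
```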
GCS will still perform the encryption and decryption of your objects with a CSEK, but you must specify the key on every call because it is not stored in GCP. CSEKs require much more management overhead than CMEKs because you are now responsible for the storage and rotation of your encryption keys. You cannot retrieve a CSEK from a GCS bucket or object. GCS tracks which object was encrypted with a specific CSEK by storing a one-way hash of the key as metadata on each object. That hash cannot be reversed into the encryption key that was used. If you lose your CSEK, your data is unreadable and gone for good, which is why you must keep track of your keys.
In the Google Cloud Console, CSEKs are not supported for GCS. You cannot upload, download, or view an object’s data in the console with a CSEK because each operation requires specifying the CSEK that was used. CSEKs also can’t be configured as the default encryption option for a GCS bucket, because Cloud KMS does not store your CSEKs. If you rotate your CSEK, you have to rewrite the object with the new CSEK — there is no automated rotation functionality in GCP for CSEKs.
You may be asking yourself, “why even bother with CSEKs?” after reading all of those restrictions. The majority of the reasons point to compliance frameworks or regulatory requirements that require the customer to manage their own cryptographic key material.
Both CMEKs and CSEKs are viable options for server-side encryption, but it is important to know the pros and cons of each and also any industry regulations or compliance requirements your organization must adhere to before making your decision.
3.2.1 Code Examples
If you want to pass in your own generated Base64 encoded key when you upload an object to a GCS bucket:
gsutil -o "GSUtil:encryption_key=$YOUR_KEY" cp \
  /some/local/file gs://$BUCKET_NAME
The “-o” option needs to be passed in for each operation on a CSEK GCS object when a cryptographic operation is involved. You can also bypass the need to specify a CSEK for each operation by adding it to your local boto configuration file.
Note: In the above command, the encryption_key value passed with the “-o” option can also be a key that’s managed in Cloud KMS, in which case the object will be encrypted with a CMEK. In other words, the option is not limited to locally generated CSEKs.
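For the boto configuration approach mentioned above, the relevant settings live in the [GSUtil] section. The sketch below writes them to a throwaway file so you can see the shape. In practice you would more likely add the same lines to your existing boto configuration file. The key is generated inline only so the example runs on its own, and BOTO_SKETCH is a made-up variable name.

```shell
# Write a throwaway boto-style config file holding the CSEK settings.
YOUR_KEY=$(head -c 32 /dev/urandom | base64)
BOTO_SKETCH=$(mktemp)
chmod 600 "$BOTO_SKETCH"   # the file contains key material

cat > "$BOTO_SKETCH" <<EOF
[GSUtil]
# Key used to encrypt new objects and to decrypt objects written with it.
encryption_key = $YOUR_KEY
# Older keys still needed to read existing objects can be listed as well, e.g.:
# decryption_key1 = <previous Base64-encoded key>
EOF

echo "wrote sketch config to $BOTO_SKETCH"
```

Keeping previous keys listed as decryption keys matters after a rotation, since objects written with an old CSEK can only be read with that same key.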
4.0 Client-side Encryption
Client-side encryption behaves as the name suggests: you encrypt the data on your end before sending it to GCP. Similar to customer-supplied encryption keys, you are responsible for the management and rotation of your keys. An important difference, however, is that you also have to perform the encryption and decryption yourself. You cannot leverage a GCP managed service to offload the cryptographic operations. This method keeps all cryptographic operations entirely outside of GCP, which may be a requirement from your security department. Neither GCS nor GCP will know anything about your keys, or even that your data was encrypted, so do not lose your keys or your data will be unrecoverable.
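As a rough illustration of that flow, the sketch below encrypts a file locally with the openssl command-line tool, leaves the upload as a comment, and then decrypts the ciphertext again. It assumes OpenSSL 1.1.1 or newer (for the -pbkdf2 flag), and every file name is a placeholder. It uses AES-256-CBC because that is what openssl enc readily offers; a production setup would likely want authenticated encryption and a proper key store, neither of which is shown here.

```shell
# Client-side encryption sketch: encrypt locally, upload only the ciphertext.
WORKDIR=$(mktemp -d)
printf 'example payload\n' > "$WORKDIR/plain.txt"

# Generate a local secret and keep it out of GCP entirely.
head -c 32 /dev/urandom | base64 > "$WORKDIR/local.key"
chmod 600 "$WORKDIR/local.key"

# Encrypt with AES-256-CBC; the cipher key is derived from the secret via PBKDF2.
openssl enc -aes-256-cbc -pbkdf2 -salt \
  -in "$WORKDIR/plain.txt" -out "$WORKDIR/plain.txt.enc" \
  -pass "file:$WORKDIR/local.key"

# Upload the ciphertext only (not run here):
#   gsutil cp "$WORKDIR/plain.txt.enc" gs://$BUCKET_NAME

# After downloading, decrypt locally with the same secret.
openssl enc -d -aes-256-cbc -pbkdf2 \
  -in "$WORKDIR/plain.txt.enc" -out "$WORKDIR/roundtrip.txt" \
  -pass "file:$WORKDIR/local.key"

cmp "$WORKDIR/plain.txt" "$WORKDIR/roundtrip.txt" && echo "round trip ok"
```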
5.0 GCS Default Encryption
As mentioned previously, by default all objects are encrypted with Google-managed encryption keys. It does not matter if you use customer-managed keys, customer-supplied keys, or even client-side encryption — they will all receive an additional layer of encryption with Google-managed encryption keys. No configuration or permission management is required for the end user and all of the cryptographic operations happen seamlessly behind the scenes. Google-managed encryption keys are stored and managed using Google’s internal key management service and NOT Cloud KMS.
Note: All GCP products utilize default encryption at rest with Google-managed encryption keys, not just GCS.
Customer-managed encryption keys can be configured as an additional default layer of encryption at the bucket level. Once configured, all objects uploaded will be encrypted via Google-managed encryption and then with the CMEK specified on the bucket. If you pass in an encryption key during an upload, the supplied key takes precedence over the CMEK specified at the bucket level, but not over the Google-managed key. This allows a mix of customer-managed and customer-supplied encrypted objects in the same bucket, all of them also encrypted by Google-managed keys.
Note: CSEKs are not supported for GCS default encryption.
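The precedence described above can be written down as a tiny decision function. This is only a model of the rule as stated in this section, not something GCS exposes. The function name and its return labels are invented for illustration, and the Google-managed layer is always present regardless of the result.

```shell
# effective_encryption SUPPLIED_KEY BUCKET_DEFAULT_CMEK
# Prints which customer-controlled layer applies to an upload.
# Pass an empty string for a value that is not set.
effective_encryption() {
  if [ -n "$1" ]; then
    echo "supplied-key"          # a key passed with the request wins
  elif [ -n "$2" ]; then
    echo "bucket-default-cmek"   # otherwise the bucket default CMEK applies
  else
    echo "google-managed-only"   # no customer-controlled layer at all
  fi
}

effective_encryption "" "projects/my-project/locations/us-east1/keyRings/my-keyring/cryptoKeys/my-key"
```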
5.1 Code Examples
To configure the default CMEK on a GCS bucket (remember, bucket + key must be in the same location):
gsutil kms encryption -k \
  projects/$PROJECT_ID/locations/$LOCATION/keyRings/$KEYRING/cryptoKeys/$KEYNAME \
  gs://$BUCKET_NAME
Remove the default CMEK on a bucket:
gsutil kms encryption -d gs://$BUCKET_NAME
6.0 Envelope Encryption
Envelope encryption is the process of encrypting a key with another key. This chain can continue as long as you’d like (keeping in mind performance issues) until there is one plaintext key that needs to be locked away and heavily protected. At its simplest level, envelope encryption involves a Data Encryption Key (DEK) and a Key Encryption Key (KEK). DEKs are used to encrypt the data and KEKs are used to encrypt (wrap) the DEKs. If you are using server-side encryption, GCS creates your DEKs for you; with a CMEK, it then sends them to Cloud KMS (which manages your KEKs) to be wrapped for storage alongside your object chunk. For client-side encryption, you are responsible for managing the entire envelope encryption flow, including the KEKs and DEKs.
When an object is uploaded to GCS, it is broken up into subfile “chunks” and stored across different devices on GCP servers. Each chunk has its own unique wrapped DEK that is stored nearby. Storing the wrapped DEK near the data it encrypts allows for low latency response times.
Important notes about DEKs, KEKs, and object chunks:
- No two chunks will have the same DEK, even if they belong to the same object, are owned by the same end user, or are stored on the same hardware (some exclusions apply).
- This chunk & DEK configuration greatly reduces the blast radius if a DEK is compromised, because the malicious individual would need to know EACH chunk’s location and crack EACH chunk’s DEK.
- Each DEK belonging to the same object is encrypted with the same KEK.
- All of the chunking and encrypting happens before the data is written to disk.
- KEKs cannot be exported from Cloud KMS. All encryption and decryption of DEKs happens inside Cloud KMS, and the results are then passed to GCS.
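The notes above can be sketched locally to show how the pieces fit together: one KEK per object, a fresh DEK per chunk, and only wrapped DEKs kept next to the encrypted chunks. This is an illustration of the pattern, not of how GCS implements it. It assumes OpenSSL 1.1.1 or newer, uses an arbitrary 1024-byte chunk size, and treats each "key" as a secret from which openssl derives the actual AES-256 key.

```shell
# Envelope encryption sketch with per-chunk DEKs, run entirely locally.
WORK=$(mktemp -d)
head -c 3000 /dev/urandom > "$WORK/object.bin"

# One KEK for the whole object. With a CMEK this key would stay inside Cloud KMS.
head -c 32 /dev/urandom | base64 > "$WORK/kek"

# Break the object into chunks (1024 bytes each here).
split -b 1024 "$WORK/object.bin" "$WORK/chunk."

for chunk in "$WORK"/chunk.??; do
  # A fresh DEK for every chunk.
  head -c 32 /dev/urandom | base64 > "$chunk.dek"
  # Encrypt the chunk with its DEK.
  openssl enc -aes-256-cbc -pbkdf2 -salt \
    -in "$chunk" -out "$chunk.enc" -pass "file:$chunk.dek"
  # Wrap the DEK with the KEK, then discard the plaintext DEK and chunk.
  openssl enc -aes-256-cbc -pbkdf2 -salt \
    -in "$chunk.dek" -out "$chunk.dek.wrapped" -pass "file:$WORK/kek"
  rm -f "$chunk.dek" "$chunk"
done

# Reading back: unwrap each DEK with the KEK, decrypt each chunk, reassemble in order.
: > "$WORK/restored.bin"
for enc in "$WORK"/chunk.??.enc; do
  base=${enc%.enc}
  openssl enc -d -aes-256-cbc -pbkdf2 \
    -in "$base.dek.wrapped" -out "$base.dek" -pass "file:$WORK/kek"
  openssl enc -d -aes-256-cbc -pbkdf2 \
    -in "$enc" -pass "file:$base.dek" >> "$WORK/restored.bin"
  rm -f "$base.dek"
done

cmp "$WORK/object.bin" "$WORK/restored.bin" && echo "restored object matches"
```

Only the wrapped DEKs and the encrypted chunks remain on disk after the first loop, so reading the object back requires the KEK, which is the property the envelope scheme is after.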
GCP takes envelope encryption even further: all of the KEKs in KMS are wrapped with a key called the KMS master key. This key is stored in another key management service called Root KMS. Root KMS is in turn protected by a key called the root KMS master key, which is stored on a peer-to-peer infrastructure called the root KMS master key distributor. For more information on GCP’s use of envelope encryption, view the official documentation.
7.0 Summary and Next Installment
To summarize, we covered the encryption options in GCS: customer-managed and customer-supplied encryption keys, default encryption, and client-side encryption, along with data chunking, DEKs, KEKs, and envelope encryption. For a deeper dive into any of the topics covered, reference Appendix B. If there’s a topic or area you’d like us to cover, ping us on all socials!
Appendix A — Abbreviations and Keys
This section contains abbreviations and keys referenced in this blog.
- AES — Advanced Encryption Standard
- CMEK — Customer-managed encryption key
- CSEK — Customer-supplied encryption key
- DEK — Data Encryption Key
- GCP — Google Cloud Platform
- GCS — Google Cloud Storage
- KEK — Key Encryption Key
- KMS — Key Management Service
Appendix B — Resources
- https://cloud.google.com/storage/docs/encryption
- https://cloud.google.com/security/encryption-at-rest/default-encryption/
- https://cloud.google.com/storage/docs/encryption/using-customer-supplied-keys
- https://cloud.google.com/storage/docs/encryption/using-customer-managed-keys
- https://cloud.google.com/kms/docs/envelope-encryption
- https://cloud.google.com/storage/docs/encryption/default-keys
- https://cloud.google.com/storage/docs/encryption/customer-supplied-keys
- https://cloud.google.com/storage/docs/encryption/customer-managed-keys
- https://cloud.google.com/kms/docs/object-hierarchy