Chris Leibl | Apr 29, 2019 | 5 min read

Event Driven Security on Google Cloud Platform

Event Driven Programming

Event driven programming has grown increasingly popular with the rise of cloud computing. As application architectures begin to embrace Serverless, many applications are being run purely by event driven services. AWS Lambda, Azure Functions, and Google Cloud Functions are all popular cloud services which allow for applications to execute arbitrary code based on specific events.

Reference: https://cloud.google.com/images/products/functions/how-it-works.svg


Event Driven Programming + Security

At ScaleSec, automation is one of our guiding principles when working with customers to improve their security posture in the cloud. As DevOps continues to take over the application development ecosystem, we consistently see security programs struggle to keep pace. To close this gap, the concepts of event driven programming can be integrated into DevOps processes to build automated responses to security-related events.

Event Driven Security on Google Cloud

A common problem InfoSec teams encounter when running applications in the cloud is virtual machines (VMs) whose firewall rules allow unrestricted access from the internet. These VMs can create serious problems in a matter of minutes, not hours or days. Internet-exposed machines are far more vulnerable to adversarial takeover, simply because of their larger attack surface.

To combat internet-exposed VMs, Google Cloud Platform users can combine the following services to respond automatically whenever a firewall rule is created or updated, ensuring that open access from the internet is not allowed:

  • Admin Activity Logs
  • Aggregate Export Sinks
  • Pub/Sub
  • Cloud Functions

Solution Overview

In this section, we describe how the above services fit together to create an automated security workflow that remediates firewall rules allowing SSH access from the internet.

Visual Diagram of Automated Workflow


Admin Activity Logs

Every time an API call modifies the configuration or metadata of a GCP resource, the call is logged in Google Stackdriver as an Admin Activity Log.

In our solution, we use these logs to alert us when certain security-related events occur, such as an update to an existing firewall rule. Admin Activity logs are always enabled, but to use them in a workflow we must first create an aggregate export sink.

Aggregate Export Sinks

Aggregate Export Sinks are set at an organization or folder level and include all child objects (i.e., all projects in a folder). These sinks allow users to select log entries based on a filter and send them to BigQuery, Cloud Pub/Sub, or Cloud Storage.
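
As a rough sketch of what such a sink looks like, the snippet below builds a request body in the shape of the Logging API's LogSink resource (name, destination, filter, includeChildren). The helper name build_sink_body and the sink, project, and topic names are hypothetical placeholders, not values from our deployment.

```python
def build_sink_body(sink_name, project_id, topic_id, log_filter):
    """Build a LogSink-shaped dict that routes matching logs to a Pub/Sub topic.

    Hypothetical helper for illustration; it only assembles the request body
    and does not call the Logging API.
    """
    return {
        "name": sink_name,
        # Pub/Sub destinations are written as a URI-style path to the topic.
        "destination": "pubsub.googleapis.com/projects/%s/topics/%s"
        % (project_id, topic_id),
        "filter": log_filter,
        # includeChildren is what makes the sink "aggregate": logs from all
        # child folders and projects are considered, not just the parent.
        "includeChildren": True,
    }


# Placeholder names, assumed for this example only.
body = build_sink_body(
    "firewall-events-sink",
    "my-security-project",
    "firewall-events",
    "resource.type:gce_firewall_rule",
)
print(body["destination"])
```

A body like this would be submitted against an organization or folder parent so that the sink covers every project underneath it.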

In our solution, the Stackdriver filter we will use to capture logs which relate to firewall rule creation or update is:

logName:logs/compute.googleapis.com%2Factivity_log
resource.type:gce_firewall_rule
jsonPayload.event_subtype: (compute.firewalls.insert OR compute.firewalls.update OR compute.firewalls.patch)
jsonPayload.event_type:GCE_API_CALL

This filter can be translated into English as the following criteria:

  • Match the logname “logs/compute.googleapis.com%2Factivity_log”
  • Have the resource type equal to “gce_firewall_rule”
  • Have an event subtype of “compute.firewalls.insert” or “compute.firewalls.update” or “compute.firewalls.patch”
  • Have an event type equal to “GCE_API_CALL”
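
To make the criteria concrete, here is a minimal local approximation of the filter in Python. It is only a sketch: the ":" operator in a Stackdriver filter is a "has" comparison, which we approximate with plain substring checks, and the function name and sample entries are made up for illustration.

```python
FIREWALL_SUBTYPES = (
    "compute.firewalls.insert",
    "compute.firewalls.update",
    "compute.firewalls.patch",
)


def matches_firewall_filter(entry):
    """Return True if a log entry dict satisfies all four filter criteria.

    Approximation only: each ":" comparison is modeled as a substring test.
    """
    payload = entry.get("jsonPayload", {})
    return (
        "logs/compute.googleapis.com%2Factivity_log" in entry.get("logName", "")
        and "gce_firewall_rule" in entry.get("resource", {}).get("type", "")
        and any(s in payload.get("event_subtype", "") for s in FIREWALL_SUBTYPES)
        and "GCE_API_CALL" in payload.get("event_type", "")
    )


# Hand-written sample entries (hypothetical project and values).
insert_entry = {
    "logName": "projects/demo-project/logs/compute.googleapis.com%2Factivity_log",
    "resource": {"type": "gce_firewall_rule"},
    "jsonPayload": {
        "event_subtype": "compute.firewalls.insert",
        "event_type": "GCE_API_CALL",
    },
}
delete_entry = dict(
    insert_entry,
    jsonPayload={
        "event_subtype": "compute.firewalls.delete",
        "event_type": "GCE_API_CALL",
    },
)

print(matches_firewall_filter(insert_entry))
print(matches_firewall_filter(delete_entry))
```

A firewall deletion is deliberately not matched, since removing a rule cannot open new access from the internet.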

Additional information on how to create advanced Stackdriver filters can be found in the Stackdriver Logging documentation.

Pub/Sub

Cloud Pub/Sub is a real-time messaging service which allows for independent systems to publish or subscribe to messages in a queue. In our demonstration workflow, a Pub/Sub topic is used by the aggregate export sink as a destination to send logs which have matched our filter.

Our Pub/Sub topic also serves as the trigger for our Cloud Function. The topic triggers the Cloud Function only when it receives a log entry, which is more efficient and costs less compute than earlier implementations, where persistent VMs would periodically poll APIs to check configuration settings.

Cloud Functions

Cloud Functions allow us to write small, purpose-built functions which are triggered when a specified event occurs. In our solution, a Cloud Function will be triggered whenever a firewall rule is created or updated.

Let’s step through the Cloud Function to understand how it works.

The activity log generated during the firewall event is passed to the Cloud Function in the data field of the Pub/Sub message. The data field is base64-encoded and, in our case, contains the log entry as a JSON object, so we must decode the message and parse the JSON.

This is shown in the first lines of our process_log_entry function in main.py.

import base64
import json
import googleapiclient.discovery
import string
import time


def process_log_entry(data, context):
    # base64 decode the data field
    data_buffer = base64.b64decode(data['data'])
    # Load JSON so we can parse it
    log_entry = json.loads(data_buffer)
    # Get Firewall Name by Parsing Log
    firewall_name = log_entry['jsonPayload']['resource']['name']
    # Get Project ID by Parsing Log
    project_id = log_entry['resource']['labels']['project_id']
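
To try the decoding step without deploying anything, a local round trip can help. The sketch below wraps a hand-written log entry the way a Pub/Sub-triggered function would receive it and then extracts the same two fields. The helper names and the sample values are hypothetical, and the entry contains only the keys the function reads.

```python
import base64
import json


def make_pubsub_event(log_entry):
    """Wrap a log entry dict as a Pub/Sub-style event (hypothetical test helper)."""
    raw = json.dumps(log_entry).encode("utf-8")
    return {"data": base64.b64encode(raw).decode("utf-8")}


def extract_firewall_and_project(event):
    """Decode the event and return (firewall_name, project_id)."""
    log_entry = json.loads(base64.b64decode(event["data"]))
    firewall_name = log_entry["jsonPayload"]["resource"]["name"]
    project_id = log_entry["resource"]["labels"]["project_id"]
    return firewall_name, project_id


# Minimal sample entry with made-up values.
sample_entry = {
    "jsonPayload": {"resource": {"name": "allow-ssh-demo"}},
    "resource": {"labels": {"project_id": "demo-project"}},
}

event = make_pubsub_event(sample_entry)
firewall_name, project_id = extract_firewall_and_project(event)
print(firewall_name, project_id)
```

Building the event by hand like this makes it easy to unit test the parsing logic before wiring it to a real topic.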

With the firewall name and the project ID, we can use the Google Cloud Python SDK to describe the firewall.

    # Create Client for the Compute Engine API
    service = create_service()
    print('Describing Firewall')
    # Checks if the Firewall is disabled. Returns True/False
    disabled = check_for_disabled(project_id, service, firewall_name)
    # Gets the Allowed Source Ranges for the firewall rule. Returns a list of CIDR Blocks
    source_ranges = get_source_ranges(project_id, service, firewall_name)
    # Checks if the Firewall is an "Allow All" Rule. Returns True/False
    allow_all = check_for_allowed_all(project_id, service, firewall_name)
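
The helper functions themselves are not shown in this article. As an illustration only, the sketch below shows one way the same three facts could be read from a firewall resource dict, such as the one returned by service.firewalls().get(project=..., firewall=...).execute() in the Compute Engine API. The function names are hypothetical, and we assume that "Allow All" means a rule whose IPProtocol is "all".

```python
def is_disabled(firewall):
    """True if the firewall resource is marked as disabled."""
    return bool(firewall.get("disabled", False))


def source_ranges_of(firewall):
    """List of source CIDR blocks on the rule (empty if none are set)."""
    return list(firewall.get("sourceRanges", []))


def allows_all_protocols(firewall):
    """True if any allow entry uses the "all" protocol (our assumed meaning of "Allow All")."""
    return any(rule.get("IPProtocol") == "all" for rule in firewall.get("allowed", []))


# Hand-written firewall resource with made-up values.
sample_firewall = {
    "name": "allow-ssh-demo",
    "disabled": False,
    "sourceRanges": ["0.0.0.0/0"],
    "allowed": [{"IPProtocol": "tcp", "ports": ["22"]}],
}

print(is_disabled(sample_firewall))
print(source_ranges_of(sample_firewall))
print(allows_all_protocols(sample_firewall))
```

Fetching the resource once and passing the dict around avoids repeating the same API call for each check.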

After we have the necessary information about the firewall resource we can assess if the firewall allows SSH from the internet.

    # If Firewall Rule Allows all then call the disable firewall function
    if allow_all:
        # Require short sleep for demo as it takes a few seconds between log creation and firewall resource completion. Will get 400 Error if we don't
        time.sleep(20)
        disable_firewall(project_id, service, firewall_name)
        print("Firewall %s Disabled" % firewall_name)
    else:
        # Function to get all of the allowed Ports for the Firewall Rule. Returns list of ports and ranges
        allowed_ports = get_allowed_ports_list(project_id, service, firewall_name)
        # Function checks if SSH is allowed. Returns True or False
        ssh_allowed = check_for_port_22(allowed_ports)
        # If TCP Port 22 is allowed and 0.0.0.0/0 is in the Source Ranges List and the Firewall is not disabled, then disable the firewall
        if ssh_allowed and '0.0.0.0/0' in source_ranges and not disabled:
            # Require short sleep for demo as it takes a few seconds between log creation and firewall resource completion. Will get 400 Error if we don't
            time.sleep(20)
            disable_firewall(project_id, service, firewall_name)
            print("Firewall %s Disabled" % firewall_name)
        # If TCP Port 22 is allowed and 0.0.0.0/0 is in the Source Ranges list and the firewall is disabled. Do nothing as Firewall is already disabled
        elif ssh_allowed and '0.0.0.0/0' in source_ranges and disabled:
            print("Firewall %s allows SSH from the Internet but is disabled" % firewall_name)
        # If any of these are false do nothing as SSH is not allowed from the internet
        else:
            print('Firewall %s does not allow SSH inbound from the internet' % firewall_name)
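
The port check has one subtlety worth spelling out: a firewall rule can list single ports ("22") as well as ranges ("20-25"), and a tcp entry with no ports list applies to every port. The sketch below is one possible way to handle these cases; the function names are hypothetical and this is not the implementation of check_for_port_22 used above. For brevity it only recognizes the protocol written as "tcp".

```python
def port_spec_includes(port_spec, port):
    """True if a port string such as "22" or "20-25" covers the given port."""
    if "-" in port_spec:
        low, high = port_spec.split("-", 1)
        return int(low) <= port <= int(high)
    return int(port_spec) == port


def allows_tcp_port(firewall, port=22):
    """True if any tcp allow entry on the firewall resource covers the port."""
    for rule in firewall.get("allowed", []):
        if rule.get("IPProtocol") != "tcp":
            continue
        ports = rule.get("ports")
        # A tcp entry without a ports list applies to all ports.
        if not ports or any(port_spec_includes(p, port) for p in ports):
            return True
    return False


# Hand-written examples with made-up values.
range_rule = {"allowed": [{"IPProtocol": "tcp", "ports": ["80", "20-25"]}]}
udp_rule = {"allowed": [{"IPProtocol": "udp", "ports": ["22"]}]}
open_tcp_rule = {"allowed": [{"IPProtocol": "tcp"}]}

print(allows_tcp_port(range_rule))
print(allows_tcp_port(udp_rule))
print(allows_tcp_port(open_tcp_rule))
```

Checking ranges matters because a rule that opens ports 20 through 25 exposes SSH just as much as one that names port 22 directly.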

Try it out!

We have provided a Terraform module on our GitHub which will provision all the resources necessary for a demonstration of the concepts discussed in this article.

 


The information presented in this article is accurate as of 7/19/23. Follow the ScaleSec blog for new articles and updates.