Automating IAM policy remediation is no walk in the park. A development team must:
- identify the controls to be remediated.
- build a program that pulls down and parses out the roles and policies.
- validate the permission and trust policies against the necessary controls.
- perform the required remediation (which may include checking action history).
- rebuild the JSON documents.
- deploy the remediated artifacts per your company’s or client’s workflow.
Several languages can handle these tasks, but this article focuses on Python. Python is a strong fit for this kind of software: Boto3, the AWS SDK for Python, does the heavy lifting against the AWS APIs, and the language's simple syntax keeps the automation easy to read and maintain.
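As a rough illustration of the "pull down" step, here is a minimal sketch that walks every role and collects its trust policy and inline permission policies. It assumes the object passed in behaves like boto3.client("iam") and relies on the list_roles and list_role_policies paginators and the get_role_policy call. The function name and the shape of the returned dictionaries are hypothetical, and attached managed policies are deliberately left out to keep it short.

```python
from typing import Any, Dict, Iterator


def iter_roles_with_policies(iam_client: Any) -> Iterator[Dict[str, Any]]:
    """Yield each role with its trust policy and inline permission policies.

    `iam_client` is assumed to behave like boto3.client("iam").
    """
    for page in iam_client.get_paginator("list_roles").paginate():
        for role in page["Roles"]:
            name = role["RoleName"]
            inline: Dict[str, Any] = {}
            policy_pages = iam_client.get_paginator("list_role_policies").paginate(
                RoleName=name
            )
            for policy_page in policy_pages:
                for policy_name in policy_page["PolicyNames"]:
                    response = iam_client.get_role_policy(
                        RoleName=name, PolicyName=policy_name
                    )
                    inline[policy_name] = response["PolicyDocument"]
            yield {
                "role_name": name,
                # The trust policy is returned alongside the role itself.
                "trust_policy": role["AssumeRolePolicyDocument"],
                "inline_policies": inline,
            }
```

Passing the client in as an argument, rather than creating it inside the function, makes it easy to swap in a stub while testing and to reuse the same code across accounts or sessions.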
Modules
TL;DR: Modularize everything to be proactive.
Since automating IAM remediation requires parsing JSON documents over and over, modularizing your parsing function(s) will save you time as you work on each control. If you have controls that pertain to trust policies, you will need a separate function to parse those, since their format differs from that of permission policies.
Modularizing the parsing of each policy format also allows you and your team to add new controls in the future without having to rewrite the JSON parsing for the given policy type. For example, you may only have one trust policy control at the moment, but somewhere down the line you or your team may have additional controls for trust policies.
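To make the split concrete, here is a minimal sketch of two parsing functions that share a couple of helpers. The function names and the flattened output format are hypothetical; the sketch only handles the common Action, Resource and Principal keys and ignores NotAction, NotResource, NotPrincipal and Condition.

```python
import json
from typing import Any, Dict, List, Union

PolicyInput = Union[str, Dict[str, Any]]


def _as_list(value: Any) -> List[Any]:
    """IAM accepts a single value or a list in most fields; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def _statements(document: PolicyInput) -> List[Dict[str, Any]]:
    """Accept a JSON string or an already-decoded dict and return its statements."""
    policy = json.loads(document) if isinstance(document, str) else document
    return _as_list(policy.get("Statement"))


def parse_permission_policy(document: PolicyInput) -> List[Dict[str, Any]]:
    """Flatten a permission policy into one record per statement."""
    return [
        {
            "sid": s.get("Sid"),
            "effect": s.get("Effect"),
            "actions": _as_list(s.get("Action")),
            "resources": _as_list(s.get("Resource")),
        }
        for s in _statements(document)
    ]


def parse_trust_policy(document: PolicyInput) -> List[Dict[str, Any]]:
    """Flatten a trust policy; it has principals instead of resources."""
    parsed = []
    for s in _statements(document):
        principal = s.get("Principal", {})
        if isinstance(principal, str):
            # "Principal": "*" is legal; store it under a "*" key.
            principals = {"*": [principal]}
        else:
            principals = {kind: _as_list(v) for kind, v in principal.items()}
        parsed.append(
            {
                "sid": s.get("Sid"),
                "effect": s.get("Effect"),
                "actions": _as_list(s.get("Action")),
                "principals": principals,
            }
        )
    return parsed
```

Because every control consumes the same flattened records, a new trust policy control only needs to read the output of parse_trust_policy rather than touch the raw JSON again.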
Modules will also save you time on remediation and validation for the same reason: the compliance standard is, most likely, always going to be changing. Taking a proactive approach is not costly and has the potential to save human hours down the road while also producing organized and legible software.
Historical Data
TL;DR: Use historical data to implement least privilege.
Having any way to access the historical data for the roles and/or policies you are remediating is a huge win when implementing least privilege. A stakeholder will need to decide the cutoff for the last time an action was used or a resource was accessed, whether that be a few months or a year prior to remediation. While it may take some time for this decision to be made, this information allows you to cut out unnecessary actions and resources from permission policies that are not used regularly enough to warrant retaining these privileges.
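Once the cutoff is agreed on, applying it can be fairly mechanical. The sketch below assumes you have already reduced your historical data to a mapping of lowercase action name to the last time it was used; that mapping, the function name and the choice to leave wildcard actions for separate review are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Iterable, List, Optional


def prune_unused_actions(
    actions: Iterable[str],
    last_used: Dict[str, datetime],
    cutoff_days: int,
    now: Optional[datetime] = None,
) -> List[str]:
    """Keep only the actions used within the last `cutoff_days` days.

    `last_used` maps lowercase action names to timezone-aware datetimes.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=cutoff_days)
    kept = []
    for action in actions:
        if "*" in action:
            # A wildcard cannot be matched to one usage record, so it is
            # kept here and left for a separate control or manual review.
            kept.append(action)
            continue
        # IAM action names are case-insensitive, hence the lowercase lookup.
        used_at = last_used.get(action.lower())
        if used_at is not None and used_at >= cutoff:
            kept.append(action)
    return kept
```

Making the cutoff a parameter means the stakeholder's decision can change later without any change to the code.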
Using usage history to determine current permission needs applies to all kinds of AWS IAM policies. For example, if a principal doesn’t assume a role as part of its current practical function, then removing that principal helps to strengthen your environment.
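The trust policy case could look something like the sketch below. It assumes you have already derived, from your own historical data, the set of principal ARNs that actually assumed the role within the agreed window; the function name and that input are hypothetical. Only the AWS principal type is pruned, while service, federated and wildcard principals are passed through untouched.

```python
import copy
from typing import Any, Dict, Set


def prune_trust_principals(
    trust_policy: Dict[str, Any], active_principals: Set[str]
) -> Dict[str, Any]:
    """Return a copy of the trust policy without unused AWS principals."""
    remediated = copy.deepcopy(trust_policy)
    statements = remediated.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    kept_statements = []
    for statement in statements:
        principal = statement.get("Principal")
        if not isinstance(principal, dict) or "AWS" not in principal:
            # Service, federated and wildcard principals are out of scope here.
            kept_statements.append(statement)
            continue
        arns = principal["AWS"]
        arns = arns if isinstance(arns, list) else [arns]
        kept = [arn for arn in arns if arn in active_principals]
        if kept:
            principal["AWS"] = kept if len(kept) > 1 else kept[0]
        else:
            del principal["AWS"]
        if principal:
            # Drop the statement entirely once it has no principals left.
            kept_statements.append(statement)
    remediated["Statement"] = kept_statements
    return remediated
```

The original document is never mutated, so the before and after versions can be compared during review. If the result has no statements left, that is better treated as a signal to review the role itself than as a policy to deploy.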
Automating remediation based on a role’s or policy’s history comes with caveats, but the benefits will continue to flow with a properly built pipeline. You may be able to satisfy requirements with IAM Access Analyzer Policy Generator, but the method you use to acquire historical data is going to depend on where CloudTrail data is stored. It may be as simple as querying an S3 bucket via Athena, but the data could also be in a database outside of AWS. One of the benefits of Python is the breadth of support available. Many databases publish SDKs for Python so that you can write query templates in your automation and populate them with the necessary keywords pertaining to the policy or role being remediated.
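For the Athena case, a query template might look like the sketch below. It assumes a CloudTrail table whose columns follow the layout AWS documents for Athena (eventsource, eventname, eventtime and the nested useridentity struct); the table name cloudtrail_logs and the function name are hypothetical. Since the values are interpolated into SQL text, the sketch validates them first.

```python
import re

# Conservative patterns so that interpolated values cannot alter the query.
_ROLE_ARN = re.compile(r"^arn:aws[a-z-]*:iam::\d{12}:role/[\w+=,.@/-]+$")
_TABLE_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_.]*$")


def build_role_usage_query(
    role_arn: str, days: int, table: str = "cloudtrail_logs"
) -> str:
    """Build a query listing the actions a role used in the last `days` days."""
    if not _ROLE_ARN.match(role_arn):
        raise ValueError(f"not a valid role ARN: {role_arn!r}")
    if not _TABLE_NAME.match(table):
        raise ValueError(f"not a valid table name: {table!r}")
    if days <= 0:
        raise ValueError("days must be positive")
    return (
        "SELECT eventsource, eventname, max(eventtime) AS last_used\n"
        f"FROM {table}\n"
        f"WHERE useridentity.sessioncontext.sessionissuer.arn = '{role_arn}'\n"
        "  AND from_iso8601_timestamp(eventtime) "
        f"> current_timestamp - interval '{int(days)}' day\n"
        "GROUP BY eventsource, eventname"
    )
```

The resulting string could then be submitted with the Athena client's start_query_execution call in Boto3, or adapted to whichever database holds your CloudTrail data.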
AWS CloudTrail is a behemoth of a topic that won’t be fully covered in this article, but one important distinction to make is data events versus management events, especially when remediating overprovisioned resources. Here’s an AWS Whitepaper on the two event types. Another important note is that parsing key information out of data event logs can be arduous, because the field path for the ARN of a given resource varies with the action that triggered the event. To remediate resources, you will therefore need to write automation that scans the entire log for the ARN, have the ARN parsed out into a database table, or pass a field path to the query based on the action that triggered the data event (just to list a few possible solutions).
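The first of those options, scanning the whole log record, can be sketched as a recursive walk over the decoded JSON that collects anything shaped like an ARN, wherever it sits. The regular expression is a deliberately loose approximation of the ARN format, and the function names are hypothetical.

```python
import re
from typing import Any, Iterator, Set

# Loose ARN shape: arn:partition:service:region:account-id:resource
_ARN = re.compile(
    r"arn:aws[a-z-]*:[a-z0-9-]+:[a-z0-9-]*:(?:\d{12}|aws)?:[^\s\"',]+"
)


def iter_strings(node: Any) -> Iterator[str]:
    """Yield every string value found anywhere in a nested JSON structure."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_strings(item)


def find_arns(event: Any) -> Set[str]:
    """Collect every ARN-like string in an event, regardless of field path."""
    found: Set[str] = set()
    for text in iter_strings(event):
        found.update(_ARN.findall(text))
    return found
```

This trades precision for coverage: it does not tell you which field an ARN came from, so it suits a first pass that is later narrowed with per-action field paths.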
Final Validation
TL;DR: AWS Access Analyzer may be able to satisfy your requirements; if not, add validation modules for all of the necessary controls to validate remediated artifacts.
Policy validation is another vital step in automating IAM policy remediation. From the AWS Access Analyzer documentation:
“Access Analyzer validates your policy against IAM policy grammar and best practices. You can view policy validation check findings that include security warnings, errors, general warnings, and suggestions for your policy.”
While Access Analyzer’s policy validation is robust, you may have additional controls to remediate against that AWS’s native tool does not yet cover. If your enterprise doesn’t have a security tool to run this validation, you will need to add those controls to your validation modules so that you are validating against all controls in place, not just the ones you are specifically remediating.
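A validation module for custom controls can be as simple as a list of small functions that each inspect a policy and return findings. The two controls below (no wildcard actions, no wildcard resources) are only examples; your own controls, the finding format and all of the names here are hypothetical.

```python
from typing import Any, Callable, Dict, List, Sequence

Finding = Dict[str, str]
Control = Callable[[Dict[str, Any]], List[Finding]]


def _as_list(value: Any) -> List[Any]:
    """Normalize a single value or a list to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def _allow_statements(policy: Dict[str, Any]):
    """Yield (index, statement) for every Allow statement in the policy."""
    for index, statement in enumerate(_as_list(policy.get("Statement"))):
        if statement.get("Effect") == "Allow":
            yield index, statement


def no_wildcard_actions(policy: Dict[str, Any]) -> List[Finding]:
    """Flag Allow statements granting '*' or 'service:*' actions."""
    return [
        {"control": "no_wildcard_actions", "statement": str(i), "detail": action}
        for i, statement in _allow_statements(policy)
        for action in _as_list(statement.get("Action"))
        if action == "*" or action.endswith(":*")
    ]


def no_wildcard_resources(policy: Dict[str, Any]) -> List[Finding]:
    """Flag Allow statements that apply to every resource."""
    return [
        {"control": "no_wildcard_resources", "statement": str(i), "detail": resource}
        for i, statement in _allow_statements(policy)
        for resource in _as_list(statement.get("Resource"))
        if resource == "*"
    ]


CONTROLS: Sequence[Control] = (no_wildcard_actions, no_wildcard_resources)


def run_controls(
    policy: Dict[str, Any], controls: Sequence[Control] = CONTROLS
) -> List[Finding]:
    """Run every control against a policy and collect the findings."""
    findings: List[Finding] = []
    for control in controls:
        findings.extend(control(policy))
    return findings
```

An empty list means the remediated artifact passed every registered control. Adding a control is a matter of writing one more function and appending it to CONTROLS. The native checks can run alongside these, for example through the validate_policy call on Boto3's accessanalyzer client.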
To Sum It Up
Automating IAM remediation is no simple task, but automated security frees up human hours to focus on the long-term solution instead of applying band-aids to problems that will continue to arise in your ever-changing environment. To build a lasting solution, you could also implement the automation within your security pipeline to allow your engineers to focus on improvements to the pipeline and automation rather than wasting hours on issues that your automation can handle in seconds. As your infrastructure continues to expand, scalable security will keep pace.