Using HashiCorp Sentinel to validate Terraform configuration/plan
Let me preface this by saying that I really like HashiCorp, their products and their open core business model. Their Enterprise stack is great, with awesome features (Vault selective intercluster replication, Terraform team collaboration tooling, Sentinel Policy as Code, etc.). But if it’s for personal use or a small business case that doesn’t justify the full Terraform Enterprise feature set and the associated budget, using the freely available Sentinel Simulator in a somewhat hacky way could be a viable option (it certainly beats having to write a custom tool to achieve the same thing).
Use case
I have some terraform configuration files and, mostly, plan files that I’d like to validate before applying: that the terraform version is appropriate, that resources to add aren’t too big/expensive, that there’s always a tag with some key, etc. I could write a custom tool to achieve that, or I could take advantage of Sentinel, HashiCorp’s Policy as Code framework/tool, which is sadly closed source and available only in Enterprise versions of HashiCorp’s stack. However, HashiCorp makes Sentinel Simulator freely available to download, which “enables you to develop and test Sentinel policies locally on your own machine or in a CI environment”, which sounds great for what I need.
Getting started with Sentinel Simulator
Start by downloading the latest version of Sentinel Simulator for your OS and architecture, setting the appropriate permissions (chmod +x sentinel on *nix) and executing the binary, which should produce output similar to this:
./sentinel
Usage: sentinel [--version] [--help] <command> [<args>]
Available commands are:
apply Execute a policy and output the result
doc Show documentation for an import from a doc file
fmt Format Sentinel policy to a canonical format
test Test policies
version Prints the Sentinel Simulator version
An example policy is the following:
# file: basic.sentinel
hour = 4
main = rule { hour >= 0 and hour < 12 }
With it, we declare a static variable hour, which is 4, and in the main rule (each policy needs to have a main rule) we check whether hour is between 0 and 12. Since it is, we’ll get a true/pass with exit code 0 when running sentinel apply:
sentinel apply basic.sentinel
Pass
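Since a passing policy exits with code 0, the result can gate a script or CI step. Below is a minimal sketch of that idea. The helper name run_policy_gate is made up for illustration, the only assumption about Sentinel is that a failing policy exits non-zero, and true/false stand in for the real command so the snippet runs without the binary:

```shell
# run_policy_gate is a hypothetical helper, not part of Sentinel.
# It runs the command it is given and reports pass/fail from the exit status.
run_policy_gate() {
    if "$@"; then
        echo "policy check passed"
        return 0
    else
        echo "policy check failed"
        return 1
    fi
}

# Stand-ins for a passing and a failing policy run:
run_policy_gate true
run_policy_gate false || echo "would stop the pipeline here"
```

In real use the call would look like run_policy_gate ./sentinel apply basic.sentinel, with Sentinel’s own output (including the execution trace on failure) printed before the summary line.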
Rules can be very complex (Sentinel’s rule getting started guide), and most interestingly, can use dynamic data sources with imports, which are, roughly speaking, plugins that communicate with an external data source like time, terraform plan output, etc.
An example use of the time import and more complex rules (which do the same thing, checking if the hour is between 0 and 12), with a print:
# file: time.sentinel
import "time"
hour = time.now.hour
minute = time.now.minute
print("current time is", hour, minute)
past_midnight = rule { hour >= 0 }
before_noon = rule { hour < 12 }
main = rule {
past_midnight and
before_noon
}
Here we’re importing the “time” plugin, and using the time.now.hour attribute (which gives us the current hour) to compare against 0 and 12, allowing us to do time-based rules (e.g. enforce “read-only Friday”).
In my case, since it’s 15:25, it will fail:
date
Thu May 23 15:25:29 UTC 2019
./sentinel apply time.sentinel
Fail
Execution trace. The information below will show the values of all
the rules evaluated and their intermediate boolean expressions. Note that
some boolean expressions may be missing if short-circuit logic was taken.
Print messages:
current time is 15 25
FALSE - time.sentinel:10:1 - Rule "main"
TRUE - time.sentinel:11:5 - past_midnight
TRUE - time.sentinel:7:24 - hour >= 0
FALSE - time.sentinel:12:5 - before_noon
FALSE - time.sentinel:8:24 - hour < 12
FALSE - time.sentinel:8:1 - Rule "before_noon"
TRUE - time.sentinel:7:1 - Rule "past_midnight"
You can do much more than simple comparisons (contains, is, not contains, is not, is not undefined, etc. - full spec of the Sentinel language here)
Advanced reading
- Sentinel Language Specification
- Builtin Functions
- Standard imports
- Writing Sentinel Tests, including mocking data
The good stuff
Sentinel, when used as part of Terraform Enterprise, has some terraform-related plugins like tfplan, which allow validations based on planned values. Sentinel Simulator, however, doesn’t have those closed source plugins, so the only way to use it in a similar fashion is via its JSON parser plugin, since terraform configuration and plans can be rendered as JSON (Note: terraform show -json, for plan or state files, exists only from terraform 0.12; for prior versions you’d need to use a tool like tfjson).
If you are already on terraform 0.12, skip to the relevant section.
Using tfjson
First you need to install a (functional) version of tfjson, like tapirs/tfjson via go get:
go get github.com/tapirs/tfjson
tfjson
usage: tfjson terraform.tfplan
As an example, I manage my Digital Ocean Kubernetes cluster via terraform. I’ll increase the node_count to 3 and run terraform plan -out terraform.plan:
terraform plan -out terraform.plan
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.
digitalocean_kubernetes_cluster.my_digital_ocean_cluster: Refreshing state... (ID: f8ee8505-af76-437e-864f-3751371374d8)
------------------------------------------------------------------------
An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
~ update in-place
Terraform will perform the following actions:
~ digitalocean_kubernetes_cluster.my_digital_ocean_cluster
node_pool.0.node_count: "2" => "3"
Plan: 0 to add, 1 to change, 0 to destroy.
------------------------------------------------------------------------
This plan was saved to: terraform.plan
To perform exactly these actions, run the following command to apply:
terraform apply "terraform.plan"
Running tfjson on the terraform.plan file produces JSON listing the resources that will be added/modified, with the appropriate values:
tfjson terraform.plan
{
"destroy": false,
"digitalocean_kubernetes_cluster.my_digital_ocean_cluster": {
"destroy": false,
"destroy_tainted": false,
"node_pool.0.node_count": "3"
}
}
Sentinel can read JSON, either inline via the json plugin or via the global values configuration, which can be supplied as a configuration file (sentinel apply -config config.json) or as a command-line flag (sentinel apply -global key=value).
Given tfjson’s output, here is an example policy that checks that destroy is true (meaning that applying the plan will destroy resources; normally you’d want to check the opposite, but this is just a test):
# file: tf.policy
print("input.destroy is",input.destroy)
main = rule {input.destroy is true}
Now we just need to pass the output of tfjson to Sentinel as the global variable input:
sentinel apply -global input="`tfjson terraform.plan`" tf.policy
Fail
Execution trace. The information below will show the values of all
the rules evaluated and their intermediate boolean expressions. Note that
some boolean expressions may be missing if short-circuit logic was taken.
Print messages:
input.destroy is false
FALSE - tf.policy:2:1 - Rule "main"
As we can see, the JSON output of tfjson was successfully read, parsed and evaluated.
Sadly, this only works for the simplest of checks (destroy is false) because of the way tfjson’s output is structured and the way Sentinel parses JSON (you can’t iterate over root-level keys). To do more complex checks, a hacky workaround is required: wrap tfjson’s output in an outer object before passing it to Sentinel, and reference it through that wrapper in the policy:
sentinel apply -global input="{\"data\": `tfjson terraform.plan`}" tf.policy
# file: tf.policy
print("input.data.destroy is", input.data.destroy)
main = rule {input.data.destroy is true}
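The escaped quotes in that one-liner are easy to get wrong, so the wrapping step can be pulled into a small shell function. This is only a sketch: wrap_plan_json is a hypothetical helper name, and the sample input is a trimmed stand-in for tfjson’s real output so the snippet runs without tfjson installed:

```shell
# wrap_plan_json is a hypothetical helper: it reads JSON on stdin and
# nests it, unchanged, under a top-level "data" key.
wrap_plan_json() {
    # The JSON is passed as a printf argument (not as the format string),
    # so any % or backslash characters inside it are left untouched.
    printf '{"data": %s}\n' "$(cat)"
}

# Trimmed stand-in for tfjson's output:
printf '{"destroy": false}' | wrap_plan_json
# prints {"data": {"destroy": false}}
```

With the helper defined, the call becomes sentinel apply -global input="$(tfjson terraform.plan | wrap_plan_json)" tf.policy, which is equivalent to the backtick version above.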
Now a more complex policy with multiple advanced checks:
# file: tf.policy
print("input.data.destroy is", input.data.destroy)
no_destroys = rule {
input.data.destroy is false
}
dont_touch_my_kube = func(data) {
for data as i, j {
// skipping the destroy value since there is a dedicated rule for it
if i == "destroy" {
continue
} else {
print(i)
if i contains "digitalocean_kubernetes_cluster" {
print("Refusing modification of digitalocean_kubernetes_cluster resources")
return (false)
}
}
}
return(true)
}
main = rule { no_destroys and dont_touch_my_kube(input.data) }
Result:
sentinel apply -global input="{\"data\": `tfjson terraform.plan`}" tf.policy
Fail
Execution trace. The information below will show the values of all
the rules evaluated and their intermediate boolean expressions. Note that
some boolean expressions may be missing if short-circuit logic was taken.
Print messages:
input.data.destroy is false
digitalocean_kubernetes_cluster.my_digital_ocean_cluster
Refusing modification of digitalocean_kubernetes_cluster resources
FALSE - tf.policy:23:1 - Rule "main"
TRUE - tf.policy:23:15 - no_destroys
TRUE - tf.policy:4:2 - input.data.destroy is false
FALSE - tf.policy:23:31 - dont_touch_my_kube(input.data)
TRUE - tf.policy:3:1 - Rule "no_destroys"
As expected, the policy fails because the plan modifies a digitalocean_kubernetes_cluster resource, which isn’t allowed.
Terraform show -json (starting from terraform 0.12)
Since terraform version 0.12 (which includes some very cool new features; if you haven’t checked it out, here’s the announcement blog post), terraform show has a -json option that works on terraform plans, so you can view them in JSON format, perfect for checks via Sentinel or other tools. Let’s see what that looks like with Sentinel:
terraform -v
Terraform v0.12.0
+ provider.digitalocean v1.3.0
terraform plan -out terraform.plan
terraform show -json terraform.plan
...
This should result in a pretty sizeable JSON, including:
- terraform version
- providers’ configuration
- all current resources
- planned changes
- output changes
- yet unknown values
- variables
Overall, it’s pretty cool and you can get wild with policies. Off the top of my head:
- providers’ configuration doesn’t have static values, only variables / Vault data source
- terraform version is good
- there are no variables with X or Y (like passwords or generic admin username)
- no resources are used at the root level (only modules)
- and the plain old (value X isn’t bigger than Y), like AWS/GCP/Azure instance types, Lambda memory settings, security groups SSH open from 0.0.0.0/, etc.
A few examples for inspiration:
# file: tf.policy
allowed_node_sizes = [
"s-1vcpu-2gb",
]
allowed_regions = [
"ams3",
]
allowed_tf_versions = [
"0.12.0",
]
node_pool_max_size = 3
tf_version = rule {
input.terraform_version in allowed_tf_versions
}
no_destroys = func() {
for input.resource_changes as change {
if change.change.actions contains "destroy" {
print("No destroys allowed")
return (false)
}
}
return (true)
}
max_nodepool_size = func() {
for input.resource_changes as change {
if change.address contains "digitalocean_kubernetes_cluster" {
for change.change.after.node_pool as pool {
if pool.node_count > node_pool_max_size {
error("Node pool", pool.name, "shouldn't exceed max node size", node_pool_max_size, "it's currently", pool.node_count)
}
}
}
}
return (true)
}
region_check = func() {
for input.configuration.root_module.resources as resource {
if resource.expressions.region.constant_value not in allowed_regions {
return (false)
}
}
return (true)
}
node_size = func() {
for input.resource_changes as change {
if change.address contains "digitalocean_kubernetes_cluster" {
for change.change.after.node_pool as pool {
if pool.size not in allowed_node_sizes {
return (false)
}
}
}
}
return (true)
}
main = rule {
tf_version and no_destroys() and region_check() and max_nodepool_size() and node_size()
}
This example policy checks the following:
- node types
- number of nodes
- no destroys
- terraform version hasn’t been updated
- no resources are created outside of the allowed region (AMS3)
To run it, pass the output of terraform show to sentinel apply:
sentinel apply -global input="`terraform show -json terraform.plan`" tf.policy
It took me roughly 15 minutes to write and test the basic functionality, and I have a decent chunk of stuff covered (for my limited use case).
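To make the check repeatable, the two steps (render the plan as JSON, then evaluate the policy) can be wrapped in a function. This is a rough sketch rather than a finished tool: check_plan and the TERRAFORM/SENTINEL override variables are names invented here, and it only uses the terraform show -json and sentinel apply -global invocations shown above:

```shell
# check_plan is a hypothetical helper. TERRAFORM and SENTINEL are assumed
# override variables so the binaries can live anywhere (or be stubbed).
TERRAFORM="${TERRAFORM:-terraform}"
SENTINEL="${SENTINEL:-./sentinel}"

check_plan() {
    plan_file="$1"
    policy_file="$2"

    # Render the saved plan as JSON; return 2 if that step itself fails.
    plan_json="$("$TERRAFORM" show -json "$plan_file")" || return 2

    # Evaluate the policy with the plan JSON as the global "input".
    # The function's exit status is whatever sentinel returns.
    "$SENTINEL" apply -global input="$plan_json" "$policy_file"
}
```

A typical call would be check_plan terraform.plan tf.policy && terraform apply terraform.plan, so the apply only happens when the policy passes. Keep in mind that the whole plan JSON travels as a single command-line argument, so very large plans could hit the operating system’s argument length limit.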
TL;DR
Sentinel is an awesome Policy As Code tool from HashiCorp which can be run locally via Sentinel Simulator and be used to validate any sort of JSON, like the output from a terraform plan.
Coming up next
While searching for a Policy/Validation as code tool, I asked around the DevOps’ish Telegram group and someone proposed conftest, which seems more flexible, advanced and complicated (with a Go-like syntax compared to Sentinel’s HCL-like DSL), not to mention free, open source and usable directly without workarounds. I’d like to give it a try and compare the two tools.