Security and the Language of Intent

Summary

Tracy Holmes, Petros Kolyvas discuss why the language of security for infrastructure is often lost in translation and how policy as code can help.

Bio

Tracy Holmes is a developer advocate at HashiCorp. Petros Kolyvas is product manager for Terraform Core at HashiCorp.

About the conference

QCon Plus is a virtual conference for senior software engineers and architects that covers the trends, best practices, and solutions leveraged by the world's most innovative software organizations.

Transcript

Holmes: Hi, everyone, and welcome to "Security and the Language of Intent: What's the answer?" My name is Tracy P Holmes. My pronouns are she/her. I'm a developer advocate at HashiCorp. You can find me at @tracypholmes all over the place.

Background

Let me give you a little background on why I chose this talk. This was a discovery talk for me. The idea for this talk came from a discussion with a wonderful teammate of mine. We were discussing the language of intent. Here's what stuck out to me. One, infrastructure is hard because we don't capture intent easily. Two, Terraform tries to make it easier to express intent with infrastructure. Three, language translation is broken - there isn't a language that can express security easily.

A little bit about me...

While we're at it, let me tell you a little bit about me. Before I started at HashiCorp as a software engineer, my background was in systems administration, rack and stack, data centers, stuff like that. That's a lot of 16-character passwords for a background, a lot of replaced [inaudible 00:01:23] cards, security guards, and keyholding, but to me that's security. So locked down, you almost locked yourself out, and I have done that more than once. We don't really have anything like that in the cloud. Most times, there's a bundle of things that got hacked or cobbled together to get the security you need because a prepackaged solution didn't fully work, it couldn't work, or it might even have just been too expensive at the time.

That said, I don't think we should have to lockpick our security to feel secure anyway.

Commonly Asked Questions

Let's take a look at a couple of commonly asked questions from practitioners. The first question could be, "Can Terraform make a really secure configuration for AWS or Azure?" My response would be, yeah, if you configure it the right way. Then that would probably be followed up with, "It's pretty well known that managing secrets in Terraform can be a pain in the you know what. Plaintext can get committed and pushed to your VCS by mistake because you didn't know it was exposed. Or the secrets end up in your .tfstate and cause all kinds of issues." Let's go through a list. We'll start with secrets management.

A few things to work on so your database is secure

There are a few things to work on so your database is secure. The first one is to make sure your secrets are, well, secret. This includes passwords, by the way. Here's one way you can do it. Create a terraform.tfvars file, add the variables and values to that file, and let Terraform pick it up that way. Just don't forget to add the tfvars file to [inaudible 00:03:04] your file, so you can keep it out of version control. Another would be to use the Vault provider for your secrets injection. Secondly, make sure your subnets and ports are locked down. Third, make sure privileges are virtually non-existent.
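As a rough illustration of the "keep secrets secret" point, here is a minimal Python sketch that scans Terraform-style text for secret-looking keys assigned a quoted literal. It is not part of the talk: the key patterns, the function name `find_plaintext_secrets`, and the sample configuration are all assumptions, and a regex pass like this is no substitute for a real secrets scanner.

```python
import re

# Hypothetical list of key names treated as secrets for this sketch.
SECRET_KEY_PATTERN = re.compile(r"(password|secret|token|api_key)", re.IGNORECASE)

# Matches simple `name = "literal"` assignments in Terraform-style text.
ASSIGNMENT_PATTERN = re.compile(r'^\s*([A-Za-z0-9_]+)\s*=\s*"([^"]*)"\s*$')


def find_plaintext_secrets(config_text):
    """Return (line_number, key) pairs where a secret-looking key has a literal value."""
    findings = []
    for line_number, line in enumerate(config_text.splitlines(), start=1):
        match = ASSIGNMENT_PATTERN.match(line)
        if not match:
            continue
        key, value = match.groups()
        # Interpolations such as "${var.db_password}" are references, not literals.
        if SECRET_KEY_PATTERN.search(key) and value and "${" not in value:
            findings.append((line_number, key))
    return findings


# Made-up sample: the password is hard-coded, which is what we want to catch.
sample = '''
resource "aws_db_instance" "example" {
  username = "admin"
  password = "hunter2hunter2hunter2"
}
'''
print(find_plaintext_secrets(sample))
```

A line such as `password = var.db_password` is not flagged, because the value is a variable reference that can be supplied from a tfvars file kept out of version control.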

While I was looking for things to put into this talk, I came upon terraform-compliance. You could use something like that. terraform-compliance follows behavior-driven development syntax. After installing the tool, you could say something pretty user friendly like, "Here's my scenario. Given this, when this happens, then it should not do this." For the example that you all see, the scenario is no publicly open ports. Given I have an AWS Security Group defined, when it contains ingress, then it must not have the tcp protocol on the ports and subnet you give it. Then you just run the test suite. You'll get some pass/fail results, red or green. What you want to do in this case is go make the red things green. I had no idea this tool existed, and I can't wait to play around with it more. Now that I think about it, it would also work for the other things on our list, namely encrypted data, ACLs, and access.
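The "no publicly open ports" scenario can be sketched in plain Python to show the logic such a check performs. This is not how the tool itself is implemented: the rule dictionaries are a simplified, hypothetical shape, and the forbidden ports are assumed for illustration.

```python
# Hypothetical, simplified rule shape for this sketch; real tools read Terraform plan data.
FORBIDDEN_PORTS = {22, 3389}
OPEN_CIDR = "0.0.0.0/0"


def publicly_open_ports(ingress_rules):
    """Return the forbidden TCP ports that any rule exposes to the whole internet."""
    exposed = set()
    for rule in ingress_rules:
        if rule["protocol"] != "tcp" or OPEN_CIDR not in rule["cidr_blocks"]:
            continue
        # A rule covers every port from from_port through to_port, inclusive.
        covered = range(rule["from_port"], rule["to_port"] + 1)
        exposed.update(port for port in FORBIDDEN_PORTS if port in covered)
    return sorted(exposed)


rules = [
    {"protocol": "tcp", "from_port": 22, "to_port": 22, "cidr_blocks": ["0.0.0.0/0"]},
    {"protocol": "tcp", "from_port": 443, "to_port": 443, "cidr_blocks": ["0.0.0.0/0"]},
    {"protocol": "tcp", "from_port": 3389, "to_port": 3389, "cidr_blocks": ["10.0.0.0/16"]},
]
print(publicly_open_ports(rules))
```

An empty result is the "green" case. Anything in the list is a red result to go fix.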

Straight from the docs...

I used AWS for that example. Say I was using Azure. To be perfectly fair, you really could use the Azure portal for configuring the majority of this, but it takes time. I'll give you some info straight from the docs. In Azure, databases in SQL Database are protected by firewalls. By default, all connections to the server and database are rejected. You can use the portal to set "Allow access to Azure services" to OFF for the most secure configuration. Then, create a reserved IP (in the classic deployment model) for the resource that needs to connect, such as an Azure VM or cloud service, and only allow that IP address access through the firewall. If you're using the Resource Manager deployment model, a dedicated public IP address is required for each resource.
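The firewall behavior described here (reject everything by default, then allow only one known address) is a default-deny allowlist. A minimal Python sketch of that idea follows. It does not call any Azure API, and the function name and addresses are illustrative assumptions.

```python
import ipaddress


def is_allowed(source_ip, allowed_networks):
    """Default deny: a connection passes only if its source is in an allowed network."""
    address = ipaddress.ip_address(source_ip)
    return any(address in ipaddress.ip_network(network) for network in allowed_networks)


# 203.0.113.10 is a documentation-range address standing in for a reserved IP.
allowed = ["203.0.113.10/32"]
print(is_allowed("203.0.113.10", allowed))   # the one resource we allowed
print(is_allowed("198.51.100.7", allowed))   # anything else is rejected
print(is_allowed("203.0.113.10", []))        # an empty allowlist rejects everything
```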

You got tired of me reading halfway through that, didn't you? That's a perfectly valid response, but think about it. There are guides upon guides for doing this kind of stuff. It's a lot, isn't it? A GUI doesn't make sense for the majority of organizations, because it doesn't scale well.

Now back to languages...

Let's get back to languages for a second. If I asked you the best way to express, let's say, front end, you might tell me JavaScript, or one of the various 31 other flavors of frameworks and libraries like React, Angular, or Vue, because JavaScript is a little bit like Baskin-Robbins in that way. HTML/CSS, which would be your twofer. TypeScript, that's another. What if I asked you about data manipulation? You might say R is the best for this. Someone else will say Python. There are others that may say MATLAB. What about the back end? Java, Go, Ruby, which was my first love. That's all great, but what happens if I ask you the best way to express security?

I asked this question to a recent HashiCast guest, Sarai Rosenberg, who was absolutely fabulous. Their answer was something along the lines of, "It depends. It depends on the application." Python could be used for threat detection or log analysis, because you need something. That's fine. Python also has great data analysis libraries. Other questions you should ask yourself before looking for something like this are, "What's the app? What's the security? How easy is it to use? What other open source tools and dependencies could you use?" Go is another that can be used in this instance, simply because of the amount of support behind it and the available integrations. The list goes on. Truthfully, I've never thought about that. I never, ever would have considered that Python could be used for security purposes. It was a great thing to add to my list for research purposes, because I'm also going back to learn Python.

The question still remains: how close can we really get to expressing security in a real natural language? Let's look at some policy as code tooling.

Policy As Code

What exactly is policy as code? Policy as code is the idea of writing code in a high-level language to manage and automate policies. By representing policies as code in text files, proven software development best practices can be adopted such as version control, automated testing, and automated deployment.

Let's go over the benefits of policy as code, and I've got six on the slide. The first is sandboxing. Policies provide the guardrails for other automated systems. As the number of automated systems grows, there's also a growing need to protect those automated systems from performing dangerous actions. Manual verification is too slow. Policies need to be represented as code to keep up with other automated systems.

Codification. By representing policy logic as code, the information and logic of a policy is directly represented in code and can be augmented with comments, rather than relying on oral tradition to learn about the reason for policies. Write it down, always write it down.

Version Control. Policies are encouraged to be stored as simple text files, managed by a version control system like GitLab, GitHub, Bitbucket, etc. This lets you gain all of the benefits of a modern VCS, such as histories, diffs, pull requests, and more.

Number four is testing. Policies really are just code. Their syntax and behavior can be easily validated with Sentinel, which is something we'll explore in a little bit. This also encourages automated testing, such as through a CI. Paired with a VCS, this allows a pull request workflow to verify that a policy keeps the system behavior as expected before it even gets merged.

Automation. With all the policies as code in simple text files, various automation tools can also be used. For example, it's trivial to create tools to automatically deploy the policies into a system.

Lastly, number six, Balance Developer Experience and Security. Simply put, empower teams to create environments and develop quickly while communicating expectations for security.

Enforcement

What levels of enforcement should you look for when using policy as code? As an organization, you'll want to have some version of the following at minimum. One is a policy that is allowed to fail. However, a warning should be shown to the user or logged if it does, in fact, fail, which it should for this particular policy. Another is a policy that must pass unless an override is specified. The semantics of override are specific to each Sentinel-enabled application. The purpose of this level is to provide a level of [inaudible 00:10:38] separation for behavior. Additionally, the override provides non-repudiation since at least the primary actor was explicitly overriding a failed policy.

I've mentioned Sentinel a couple of times. As I said, we're going to talk about that in a minute. Any policy as code system that you would look for should have these types of things. It's not just for Sentinel purposes. Lastly, there's the policy that must pass no matter what. The only way to override a hard mandatory policy is to explicitly remove the policy. Hard mandatory is the default enforcement level for Sentinel. I believe it will probably be the default for any other policy as code tool you would attempt to use. It should be used in situations where an override is not possible.
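The three levels just described can be sketched as a small decision function. Sentinel names them advisory, soft-mandatory, and hard-mandatory. The Python below only illustrates the semantics and is not Sentinel's implementation, and `Level` and `decide` are hypothetical names.

```python
from enum import Enum


class Level(Enum):
    ADVISORY = "advisory"
    SOFT_MANDATORY = "soft-mandatory"
    HARD_MANDATORY = "hard-mandatory"


def decide(level, policy_passed, override=False):
    """Return (allowed, message) for one policy result at a given enforcement level."""
    if policy_passed:
        return True, "passed"
    if level is Level.ADVISORY:
        # Allowed to fail, but the failure is surfaced as a warning.
        return True, "warning: advisory policy failed"
    if level is Level.SOFT_MANDATORY and override:
        # The explicit override is recorded, which is what gives non-repudiation.
        return True, "failed, overridden explicitly"
    # Hard-mandatory failures, and soft-mandatory failures without an override, block.
    return False, "blocked: policy failed"


print(decide(Level.ADVISORY, policy_passed=False))
print(decide(Level.SOFT_MANDATORY, policy_passed=False, override=True))
print(decide(Level.HARD_MANDATORY, policy_passed=False, override=True))
```

Note that the override flag has no effect at the hard-mandatory level, which matches the point that the only way around such a policy is to remove it.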

What is Sentinel?

I kept talking about Sentinel. What exactly is Sentinel? Sentinel is a language and an embedded policy framework, which restricts Terraform actions to defined, allowed behaviors. It can be extended to use information from external sources too.

Let's talk a little bit more about it. Sentinel has a language, Sentinel language. An example of this is on the screen. Sentinel defines and uses its own policy language. The language itself was designed to be approachable by non-programmers, since there are many use cases where the individual defining policy may not, in fact, be a developer. However, the language includes constructs that are familiar to developers to enable powerful policies.

Why should it have its own language? To explain why Sentinel defines and uses its own language, the question can be more easily split into roughly two types of languages, configuration languages and programming languages.

Configuration Languages

Configuration languages are formats such as JSON, YAML, XML, etc. Many applications use configuration languages as their ACL format. Configuration languages are good for static, declarative information. ACL systems are good for this also. ACL systems are focused, and they provide the limited options necessary to restrict access to a system. Configuration languages are not good for dynamic or logical rules. They typically only contain limited conditions, loops, or functions. Further, configuration is often declarative, which can introduce complexities to logical statements that depend on ordering. The last thing about configuration languages is that the use cases that Sentinel was built for require the ability to perform complex behavior, and that didn't model well into a configuration format or a declarative form.
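To make the contrast concrete, here is a small Python sketch. A static ACL is just data plus a lookup, while a logical rule needs conditions and iteration over the thing being checked. The roles, tag names, and function names are made up for illustration.

```python
# Static, declarative data: which roles may perform which actions.
ACL = {"admin": {"read", "write"}, "viewer": {"read"}}


def acl_allows(role, action):
    """A plain lookup is all a static ACL needs."""
    return action in ACL.get(role, set())


def tags_policy(resources):
    """A logical rule: every production resource must carry a non-empty owner tag."""
    return all(
        resource["tags"].get("owner")
        for resource in resources
        if resource["tags"].get("env") == "production"
    )


print(acl_allows("viewer", "write"))
print(tags_policy([{"tags": {"env": "production"}}]))
```

The first function could be replaced by a JSON or YAML file. The second cannot be expressed as pure data without inventing condition and loop syntax, which is the gap a policy language fills.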

Programming Languages

The Sentinel language was designed with the following goals. First, it's non-programmer friendly. Sentinel was built to be used by non-programmers. For the use cases we discovered here at HashiCorp, non-programmers needed the ability to enforce certain rules within a system. For example, a person responsible for compliance may need to insert rules into a system. Next, it's programmer friendly. At the same time, the language needed to support programmer-friendly constructs such as conditionals, loops, and functions for complex policies that a programmer may be writing. It also needed to be embeddable and safe.

Because Sentinel is embedded in existing software, the language itself needs to be easily embedded. Further, Sentinel was designed to be embedded into HashiCorp software written in Go, so it needed to be easily embeddable in Go. The language is used in highly security-sensitive environments. It must not be able to crash its host system or access system resources without explicit approval. Saying all of that, the Sentinel team really did evaluate other languages. They didn't just jump at the chance to build their own language. Some work well, but because the language itself didn't need to be a general purpose language, they felt building a language focused on policy proved to be the best route for what was wanted. It also allows for language extension as needed in the future.

Some more about Sentinel

Here are some things that I found out about Sentinel while I was going through this discovery. I'll just call them quirks. This is an issue, as I see it, and it's number one. If you don't have the right password length or a conformant password, Terraform will plan it, but it won't apply it. If you plan it, you won't know anything is wrong until you either YOLO apply, or already have planned it. I don't really care for this. I understand why, because applying is what feeds Terraform real data and starts generating resources and the like that we have to pay for. Terraform won't really know if something is wrong until it applies. To me, this is an issue. It's a validation issue, but it's an issue nonetheless.

The other thing is something I'm not particularly fond of, but I kind of understand why. Sentinel has a CLI to enable you to develop policies, but Sentinel policy enforcement is only available in Terraform Cloud or Terraform Enterprise. That said, if you use it with Terraform Cloud, you can do the following. Defining policies. Policies are defined using the policy language with imports for parsing the Terraform plan, state, and configuration. Managing policies for organizations. Users with permission to manage policies can add policies to their organization by configuring VCS integration or uploading policy sets through the API.

Remember earlier I mentioned that the non-programmers will need to use this. Terraform Cloud is user friendly, for the most part, for non-programmers. Enforcing policy checks on runs. Policies are checked when a run is performed after the Terraform plan, but before it can be confirmed or the Terraform apply is executed. Lastly, mocking Sentinel Terraform data. Terraform Cloud provides the ability to generate mock data for any run within a workspace. This data can be used with the Sentinel CLI to test policies before deployment. On the screen, you see an example of a policy. Sentinel then enforces a good password, and also has firewall rules.
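The "check after plan, before apply" flow, tested against mock data, can be sketched in Python as follows. This is not Sentinel and does not use Terraform Cloud's mock format: the `mock_plan` structure is a simplified, hypothetical shape, and the 16-character minimum is an assumed threshold.

```python
MIN_PASSWORD_LENGTH = 16  # assumed threshold for this sketch


def check_plan(mock_plan):
    """Return addresses of planned resources whose password is too short."""
    violations = []
    for change in mock_plan["resource_changes"]:
        password = change["after"].get("password")
        if password is not None and len(password) < MIN_PASSWORD_LENGTH:
            violations.append(change["address"])
    return violations


def gate_apply(mock_plan):
    """Run the check after plan; apply may proceed only when nothing is violated."""
    violations = check_plan(mock_plan)
    return len(violations) == 0, violations


# Hypothetical mock data standing in for a generated plan.
mock_plan = {
    "resource_changes": [
        {"address": "aws_db_instance.main", "after": {"password": "short"}},
        {"address": "aws_db_instance.audit", "after": {"password": "a-much-longer-passphrase"}},
    ]
}
print(gate_apply(mock_plan))
```

Because the check runs on plan data, a bad password is caught before anything is created, and the same function can be exercised against mock data in a test suite.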

Conclusion

In conclusion, that's the end of my journey thus far. It was a really good discovery talk. I wish it had been longer and I'm going to tell you why. I will admit Sentinel is great. It's not for everyone, and that's understandable. Outside of finding terraform-compliance, I also discovered fan favorite, OPA. What is OPA? OPA stands for Open Policy Agent, and it's an open source, general purpose policy engine that unifies policy implementation. You can use OPA to enforce policies in microservices, Kubernetes, CI/CD pipelines, API gateways, data protection, SSH and sudo, container exec control, Terraform risk analysis, and a whole load of other stuff. It's open source.

That said, my main issue with OPA so far is, at first glance, it can be very hard to understand and it doesn't capture intent very well. It also has its own DSL. It also has a CLI, and people use it because it's open source and just works. There are several open source projects that integrate with OPA to implement fine-grained access control, like Docker or Istio, or you can use it as a general purpose policy engine in Kubernetes, for example. I feel like I'll like it way more when I get to play around with it. Just like anything, the more you use something, most times the easier it gets. I think the fact that I don't really know the DSL just yet is probably what's irking me a little bit. That said, there's a ton of community support for it. I shouldn't have any issue finding examples and the like.
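For a sense of how an application asks OPA for a decision, here is a minimal Python sketch that builds a query for OPA's REST Data API, which accepts a JSON body with an `input` field and answers with a `result` field. The server address and the policy path `terraform/allow` are assumptions, and the request is built but never sent, so no running OPA server is needed.

```python
import json
import urllib.request

# Assumptions for this sketch: a local OPA server and a hypothetical policy path.
OPA_URL = "http://localhost:8181/v1/data/terraform/allow"


def build_opa_request(input_document, url=OPA_URL):
    """Build (but do not send) a POST request for OPA's Data API."""
    body = json.dumps({"input": input_document}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_decision(response_body):
    """Treat a missing or non-true result as a denial."""
    return json.loads(response_body).get("result") is True


request = build_opa_request({"resource_type": "aws_security_group", "open_ports": [22]})
print(request.get_method(), request.full_url)
print(parse_decision('{"result": true}'))
print(parse_decision("{}"))
```

Treating an absent `result` as a denial keeps the caller fail-closed when the policy path is undefined.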

I didn't get to show much in the way of code and/or demos during this talk. Like I said earlier, I only had 20 minutes. If it had been longer, I would really like to do another talk or even a blog post or both comparing the two and how they've worked out for my personal projects, because I think this is going to be really interesting. That's it.

Resources

Here are the resources. I've got the Sentinel docs. Some of this was verbatim, some of the text on the pages, because I did not want to say the wrong thing. If you look at these resources, you will see that they came straight from the docs. The Vault Provider is something that I have not had a chance to play with yet, but my co-workers, Rob and Rosemary, are awesome, so I should have no problem in case I have questions. The compliance testing with Terraform and Azure doc makes mention of terraform-compliance, and that's how I found out about it. That particular set of docs on Microsoft's site actually has a really good walkthrough and tutorial. Terraform Compliance, that links straight to terraform-compliance. They have tons of examples on that site. Lastly, the Sentinel Policy Guide, go check that out. There's a full tutorial for that as well.

 

Recorded at:

Aug 06, 2021
