Event Recording

Securing Workload Identities: Best Practices for Tokenizing Third-Party API Keys and Access Tokens

Uploaded: 2024-06-07T12:00:00+02:00
Duration: 20 min 52 s

Vincenzo Iozzo

CEO

SlashID

Posted on Jun 07, 2024

Stolen secrets and credentials are one of the most common ways for attackers to move laterally and maintain persistence in cloud environments.

Modern cloud deployments employ secrets management systems such as KMS to protect key materials at rest and avoid leaking keys or credentials in source code or other build artifacts. However, secrets are unprotected at runtime, so any vulnerability or compromise of a service could lead to credential theft.

This talk will propose an architecture that, in conjunction with a secret manager, tokenizes secrets and rewrites requests at runtime. Through this approach, application code never directly interacts with key material. Additionally, it enforces stringent access control rules based on Open Policy Agent (OPA) policies for accessing secrets, significantly reducing the blast radius in the event of a security breach.

Transcript

Hi everyone. I know it's probably a bit tough on a Friday, but I'll try to keep this not super technical and as entertaining as possible.

So yeah, we're going to talk a bit about NHIs (non-human identities) and workload identities generally. First, though, I'd like to explain why we think securing NHIs matters, why they're targeted, especially recently, what the state of the art is today in terms of best practices, what we think the ideal solution would be in the long run, and how we get to that place.

So the reason why it matters: I'm sure that, if you haven't lived under a rock, you've seen in the past few years how identities are becoming more and more targeted in a lot of attacks, whether state-sponsored or not. They're constantly front and center. The latest Verizon DBIR, which is one of the most comprehensive reports just in terms of the number of breaches it tracks, mentions that 49% of breaches originated from either stolen credentials or credentials that were misused in some capacity.

The other interesting data point from that report is that in about 80% of the cases, even if the initial attack vector wasn't necessarily credential-related, virtually any kind of lateral movement within the network ended up being through some form of identity-based escalation. And just in the past few months, we've seen a few examples that I'd like to briefly mention. The first one is Cloudflare.

They've been extremely vocal about it on their blog. They were one of the victims of the Okta breach, and in the blog post they mention that they had tracked about 5,000 NHIs in their environment and rotated all of them except for four, and they still got compromised through the four that they had forgotten to rotate. The second breach, which is more recent, is Dropbox. I'm sure all of you are aware of this, but as of last year, if you are breached and you are a public company in the US, you're supposed to report the breach to the SEC. So Dropbox filed with the SEC.

They didn't disclose the full details of the breach, but they did mention that one of the causes was misuse of an NHI token of some form. And then the last one, and probably the highest-profile one, is Microsoft. Microsoft got breached twice recently. The second breach is somewhat less exciting, in the sense that it was just a credential stuffing attack, but the first one was more exciting for two reasons.

One is the way the breach happened: through a stolen memory dump of an application that contained a token. The attacker used the token to get into the Microsoft environment, and from there they were able to get access to a signing key that was valid for signing tokens for Office 365.

And they basically stayed stealthy in the Microsoft environment for several months. The breach was actually detected by the State Department, because they realized that somebody was accessing State Department mailboxes without authorization. The reason why I think this is particularly interesting from an NHI standpoint is, one, it highlights the fact that it's really hard to keep track of a lot of these tokens. The signing key had been around since 2016. They hadn't rotated it since 2016.

There was a CSRB report that interviewed Microsoft executives, and from the CSRB report it wasn't clear that they were actually aware of the level of access that that certificate had. The other reason why it's interesting is the initial access: the ability to exfiltrate tokens from a memory dump shows another problem with NHIs, which is the fact that they are stored everywhere.

As a result, they're very hard to rotate, and it's very hard to do an asset inventory to even know where they are. The last breach I'll mention briefly, because it just happened a few days ago, is Hugging Face. Hugging Face is an ML code repository and model repository. They announced that their API keys had been compromised. They weren't sure how many of their customers were affected, but their recommendation was to rotate everything.

And then they promised that they were going to move to OAuth 2.0 client credentials instead, to try to reduce the odds of master keys being compromised going forward. Now, before I started to work in this industry, I spent a lot of time on the offensive security side, in the vulnerability research and reverse engineering world. What you learn in that world pretty quickly is that attackers tend to go for the easiest possible target. Back in the day, it used to be very simple vulnerabilities that you could exploit.

Today we're mostly looking at either human identities that are not properly secured through MFA and conditional access, or non-human identities, because they present a few specific challenges. One is, as I mentioned earlier, they're sprawled everywhere. You might have a secret manager, you might have service accounts in your cloud provider, you might still have AD in your environment, and I'm sure that at any reasonable scale you're dealing with third-party credentials from your SaaS vendors as well. It's practically impossible to do any form of lifecycle management.

When we talk to customers, we've heard things like: we create users in Okta or our IdP, we then use those users to create service accounts in the SaaS applications, and then we suspend the Okta users and tag them in a specific group to try to track the NHIs that we have. As you know, the countermeasures that we've developed to stop a lot of these breaches, such as MFA, step-up and conditional access, don't really work in scenarios where credentials are not interactive.

So we don't have an easy way to actually detect and stop a lot of these breaches. And lastly, because a lot of NHIs are used in production environments, even when you are aware of a breach, it becomes really hard to remediate it quickly, because you can't easily rotate a credential without potentially risking downtime. So what's the state of the art today?

For the sake of brevity, I'm just going to focus on third-party credentials here. The state of the art today is that you use some form of credential manager, whether it's HashiCorp Vault, the built-in secret storage of your cloud vendor, or an open source solution. The idea from here is that the secrets are accessed directly by the application. Some of the secret managers have some form of RBAC, but they're generally quite hard to configure.

As a result, developers often end up bypassing the secret manager altogether and just hard-code credentials in source code, and this is how a lot of the leaks happen. And lastly, as I mentioned earlier, one of the problems with this approach is that, because you lose track of the secret, you're not in a position to know what happens if you rotate, I don't know, this Stripe key or this other credential.

So what we think would be the ideal solution here is to turn NHIs into something as close as possible to human interactive credentials, where we know the countermeasures that we can use. Ideally, we have the ability to issue minimally scoped, just-in-time credentials with conditional access based on things like machine integrity. So we are able to say: I have deployed EDR, or whatever kind of product, on this cloud workload.

The EDR is telling me that the workload has potentially been breached. When the workload tries to access a secret, that access is gated or prevented. The problem is that it's obviously going to take a while to get there, because NHIs are so prevalent in your environment already. So I'm going to briefly talk about the framework that we adopt to think about the problem space, and then specifically one of the solutions that we are adopting for prevention and remediation. The framework is relatively simple.

You want to first make sure that you have the ability to observe all of your NHI credentials, whether you're using a dedicated product for it or dumping the credential information into something like Snowflake. The first part is that you need the ability to observe the credentials and the provenance of each credential. The second thing is that, just as we have cloud detections, we should start having identity detections, specifically identity detections for NHIs.

Things like: this NHI was used from an IP address that is anomalous compared to the baseline of what we've seen before; this NHI was used at a point in time that is anomalous compared to the baseline; it was used from a compromised machine; and so on and so forth. Once you have the set of detections, then it's time to think about prevention and remediation.
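As an illustration of the three detections just listed, here is a minimal Python sketch. It is not taken from the talk: the `Baseline` structure, the `detect` function, the finding names, and the idea of a precomputed set of compromised hosts (for example, fed by an EDR) are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Baseline:
    """What has been observed so far for one NHI (hypothetical structure)."""
    known_ips: set[str] = field(default_factory=set)
    usual_hours: set[int] = field(default_factory=set)  # UTC hours, 0-23


def detect(baseline: Baseline, source_ip: str, used_at: datetime,
           host: str, compromised_hosts: set[str]) -> list[str]:
    """Return the list of findings for a single use of an NHI credential."""
    findings = []
    # Detection 1: the source IP was never seen for this credential.
    if source_ip not in baseline.known_ips:
        findings.append("anomalous_ip")
    # Detection 2: the credential was used at an unusual time of day.
    if used_at.hour not in baseline.usual_hours:
        findings.append("anomalous_time")
    # Detection 3: the calling machine is flagged as compromised.
    if host in compromised_hosts:
        findings.append("compromised_machine")
    return findings


baseline = Baseline(known_ips={"10.0.0.5"}, usual_hours={9, 10, 11})
night = datetime(2024, 6, 7, 3, 0, tzinfo=timezone.utc)
print(detect(baseline, "203.0.113.9", night, "node-7", {"node-7"}))
```

A real baseline would be learned from logs and would be fuzzier than exact set membership; the sketch only shows the shape of the checks.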

The idea for prevention is that, above all, you need to minimize the attack surface, in the sense that you need to minimize the number of services that actually touch key material. And lastly, on the remediation front, you want to make sure that if a credential is compromised, you have an easy way to revoke or rotate that credential. Now, our idea for doing some of this, in particular the remediation and prevention piece, is similar to what we do for credit card numbers. We call it credential tokenization.

The idea in a nutshell is that, just as we don't let credit card numbers spread around your environment and we tokenize them, we should do the same with third-party credentials. So the process is: you deploy a sidecar in your environment (I'm going to talk about the topology of this; it doesn't necessarily need to be a sidecar, but that's one of the ways to do it) and you replace secrets in your environment with non-privileged, non-confidential identifiers.

What then happens is that, at runtime, the sidecar intercepts the requests that need to go to a third-party service or another machine, replaces the identifier with the actual key material, and forwards the request. This is very simple as an approach, but what it gives you is three important primitives.

One, you are automatically reducing the attack surface by doing this, because you're keeping the secret out of the application code. This means two things.

One, even if there is a vulnerability in an application, an attacker won't be able to exfiltrate secrets from the application, even in the case of a breach. And two, you reduce the risk of accidental exposure: the risk of a developer accidentally hard-coding a credential somewhere, or your CI/CD having credentials in the wrong place, or a config file like a .env file with credentials. The second thing is that you can easily tie tokenization to authorization.

So you can enforce finer-grained authorization policies even on credentials that are not necessarily properly scoped. Even if you are dealing with an environment where you have one API key and it's basically admin access for everything, you can enforce your own level of fine-grained authorization on top of that. And lastly, you have separation of duties between the application team and the team, whether it's the platform, application security or identity team, that is in charge of managing secrets and credentials.

This not only gives you a better sense of where your credentials are, so it's easier to have an asset inventory, but it also makes remediation significantly easier, because the sidecar is able to safely rotate credentials without causing any downtime. You would know at runtime that rotating, I don't know, a Stripe API key or another third-party API key is not going to cause downtime, because the sidecar is in charge of making sure that the rotation happens when no requests are ongoing. So let me talk for a second about the topology.

As I mentioned, it could be a sidecar, it could be an external authorizer for something like a service mesh, like Istio, or Envoy, or it could be a Lambda function next to your API gateway. So you have multiple ways to deploy this, depending on the topology of your environment. We call our version of the sidecar Gate. In this example, let's assume that it's a Kubernetes cluster. You deploy Gate as a sidecar in the pod; the container wants to make a request to a remote server; Gate intercepts the request and can modify it as needed.

So let's see what happens when we make a request. Let's say that the container is trying to make a GET request to a remote service. In the past, you would see the authorization token in that request. With credential tokenization, what happens is that we add a header to the request that contains a UUID corresponding to the actual secret that the service needs in order to talk to the remote server. Gate retrieves the secret from your secret store and swaps out the proxy header with the actual authorization header.
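The header swap described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the header name `X-Secret-Id`, the `rewrite_headers` function, the in-memory dictionary standing in for the secret store, and the `Bearer` scheme are all hypothetical and are not taken from the talk or from the product.

```python
# Hypothetical name for the proxy header that carries the secret's UUID.
PROXY_HEADER = "X-Secret-Id"


def rewrite_headers(headers: dict[str, str],
                    secret_store: dict[str, str]) -> dict[str, str]:
    """Replace the non-confidential identifier with the real credential.

    `secret_store` is an in-memory stand-in for a secret manager lookup.
    """
    outgoing = dict(headers)  # never mutate the caller's headers
    secret_id = outgoing.pop(PROXY_HEADER, None)
    if secret_id is None:
        return outgoing  # nothing to swap, forward unchanged
    secret = secret_store.get(secret_id)
    if secret is None:
        raise KeyError(f"unknown secret identifier: {secret_id}")
    # Assumption: the remote service expects a Bearer token.
    outgoing["Authorization"] = f"Bearer {secret}"
    return outgoing


store = {"0b9f4c7e-2d1a-4e5b-9c3f-7a6d8e1f2b34": "example-api-key"}
request_headers = {
    "Accept": "application/json",
    PROXY_HEADER: "0b9f4c7e-2d1a-4e5b-9c3f-7a6d8e1f2b34",
}
print(rewrite_headers(request_headers, store))
```

The application only ever sees the UUID; the real key exists solely in the secret store and in the outgoing request built by the proxy.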

And here Gate can modify your headers as required in order to make the request. As I mentioned, the other thing you can do is enforce fine-grained authorization on top of it. Specifically, in the case of Gate, you can pass an OAuth 2.0 access token together with the secret's UUID. Gate verifies the validity of the access token and the scopes on the access token.

If the request is valid, in the sense that the token is valid and the scopes are valid, then Gate retrieves the secret from the secret manager, swaps out the header as we've seen earlier, and makes the request to the remote server. Now, again for the sake of time, I spoke here about a remote server.

You can do something very similar within your internal environment if you want. This is a simple concept. Obviously we have it as a product, but you can actually implement this yourself using some open source libraries if you want. And again, with this you are able to achieve the three primitives that I mentioned. So developers don't have access to secrets anymore.

You can do fine-grained authorization on third-party API keys even if they don't have scopes built in, and you have separation of duties, which means that prevention and remediation become a lot easier and you don't have to worry about downtime when you rotate credentials. I only have a minute left. Thank you for being here. If you have any questions, please don't hesitate. I'm also going to be outside for a few more minutes. Thank you again for being here.

There was no question in the chat, but I do have one question.

Oh, there's one question here, so just a second. If you, yeah, if you swap that.

Okay, so just for the remote audience. Very quickly: you show the authorization there using the OAuth token. Obviously, before that OAuth token can be obtained, there needs to be some credential exchange for the token. So you kind of have the same problem again.

Well, the idea in this specific case is to use the token purely for authorization purposes. And I should have mentioned this: the way we've implemented it, Gate also has a whitelist of certificates that you can use for specific remote hosts. In this specific example, we don't take care of credential zero.

That is, how you provision the original credential on the machine. But the idea is to reduce the number of credentials that that specific service has access to. So you're going from however many it would normally have access to, to just a single one, which is the OAuth client secret.

Does that make sense? It's sort of like risk reduction.

Okay. Great. Thank you very much, and thank you for being here on a Friday afternoon for such an interesting talk. Thank you.
