Hi everybody. Thanks for joining me and being here. You guys are the troopers for this esoteric subject. So we're gonna talk about, yeah, data breaches, zero trust and graphs, and we'll start with zero trust. I'm testing the clicker, this is gonna be interesting.
Yeah, what do you think? Press again. So the next slide is a nice infographic that shows all kinds of data breaches. It's a great start. The idea is that there are more and more data breaches, as we all know, right? Not only that, but they are also more impactful. It still doesn't work, which is interesting. Shall I point it to something?
15 seconds, just,
Okay, so more and more data breaches. You would see a nice infographic showing bubbles, and all these data breaches are even more impactful. There are different sizes of bubbles, so, oh, there we are.
See, I promised you, I told you. Now, some of you may remember the MGM casino resorts data breach from last year, last October I think. The reason I bring it up is because ERs was held at MGM casinos for a couple of years.
Now, if you were at ERs last year, your data was probably part of that. Anyway, we now know that that breach cost them around a hundred million dollars, and that's just a tiny speck on this infographic here. So it just shows you the scale of what we have to fight.
Now, we also know that 88% of data breaches target identity, that's us. I mean everybody here, but also there an identifier.
So whatever we've been doing so far hasn't been working great, has it? We need to do something, something, but what can we do? An interesting thing here would be to understand why there are so many data breaches and why they target identity.
Now, there are many sources about that. Many people talk about various reasons. I'm just gonna focus on three reasons that I think are quite telling. The first one, as we know, is ransomware. Ransomware as a service: crime is very well organized these days. Doesn't work.
Yeah, see, you have, you know, exploit creators that are partnering with social engineering affiliates, and together they've been really successful at penetrating our identity frameworks with such things as, you know, simple calls to the help desk. For example, you know, "reset my password" or "I've lost my device". So that's one reason. There's nothing much we can do about organized crime per se, but let's see what other reasons there are.
Second reason, as David told us at VIR last week, is that developers haven't been doing a great job at access control. Have they?
The Open Web Application Security Project (OWASP) lists broken access control as the number one security issue in software development. So, should we entrust our developers with our access policies? The third reason is, I hate to break it to you, but authentication is broken. Now I know we solved it, but hear me out. I know we have those three factors, right?
What you know, what you have, and what you are. We even have passkeys now, great. There's actually data that shows that passkeys have been quite helpful, but there always seems to be a way around this: through phishing, because your devices can get lost or stolen, because your biometrics can be skimmed or replayed, or they can just call support again saying that you lost your device.
There's actually scams now where, you know, people kind of entice their victims to just log in with all those factors. So what now, what can we do now against this?
Well, it's time for a different approach. Like Einstein used to say, insanity is doing the same thing over and over again and expecting different results.
Actually, I'm told that Einstein didn't actually say that. Let me rephrase.
Alex says, insanity is authorizing the same way over and over again and expecting different results. We need to do something else, because what we've been doing so far just doesn't work, right? So I'd like to introduce a fourth factor, what you do, because what you do, in the end, will probably be the only way that we can tell that an attacker is an attacker. Once they take over an account, an attacker will try to elevate their privileges, they'll try to access systems that that account would not generally access.
Whereas a genuine user, a normal user, will simply go about their day. You know, so, what you do. This is where zero trust can actually help us.
Now, I know, I hate buzzwords like anybody else, but I truly, truly believe that zero trust is kind of our only hope here. Let's see how. The main concept behind zero trust is, of course, never trust, always verify, right? This means never trust an authenticated identity. Let me say it again.
Oh, interesting. Never trust an authenticated identity. No matter how many factors were used to authenticate that identity, don't trust it. I know it's kind of wild. So how to go about doing that? Thankfully NIST, the National Institute of Standards and Technology in the US, has provided us with a reference architecture.
Now, I know this is a US-based, kind of governmental agency, but still, they do produce some publications that are read and, you know, followed by even Fortune 500 companies.
So it's kind of a good place to start. So this is the reference architecture for zero trust from NIST. And as you can see in the center here, we have something interesting. We have a policy decision point and a policy enforcement point. Now you'll recognize an authorization system. Authorization is at the very center of zero trust.
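To make the policy decision point and policy enforcement point split concrete, here is a minimal sketch in Python. It is not taken from the NIST publication; the class names, the tuple-based policy store, and the example subjects and resources are all assumptions made up for illustration. The one idea it tries to show is that the enforcement point asks the decision point on every single request and keeps no trust from earlier answers.

```python
from dataclasses import dataclass
from typing import Callable, Set, Tuple


@dataclass(frozen=True)
class AccessRequest:
    subject: str   # who is asking
    action: str    # what they want to do
    resource: str  # what they want to do it to


class PolicyDecisionPoint:
    """Answers grant/deny for one request; it does not remember earlier answers."""

    def __init__(self, allowed: Set[Tuple[str, str, str]]) -> None:
        # Hypothetical policy store: a flat set of (subject, action, resource).
        self._allowed = allowed

    def decide(self, request: AccessRequest) -> bool:
        key = (request.subject, request.action, request.resource)
        return key in self._allowed


class PolicyEnforcementPoint:
    """Sits in front of the resource and consults the PDP on every call."""

    def __init__(self, pdp: PolicyDecisionPoint) -> None:
        self._pdp = pdp

    def handle(self, request: AccessRequest, operation: Callable[[], str]) -> str:
        if not self._pdp.decide(request):
            return "denied"
        return operation()


pdp = PolicyDecisionPoint({("bob", "read", "finance-report")})
pep = PolicyEnforcementPoint(pdp)

read_ok = pep.handle(AccessRequest("bob", "read", "finance-report"), lambda: "report contents")
write_blocked = pep.handle(AccessRequest("bob", "write", "finance-report"), lambda: "written")
print(read_ok, write_blocked)
```

In a real deployment the decision point would be a separate service and the policy store would be much richer than a set of tuples; the sketch only shows the division of responsibilities.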
So let's have a closer look at authorization. Just as a little high-level overview, we have static and dynamic authorization. Now some call it runtime, admin time, tangible, intangible. Let's not get caught up with terminology. What I mean by static is that you essentially place your subjects or your users into buckets such as, you know, roles or access control lists, or you have this spaghetti code that you put in your apps, if-then clauses that use those buckets. That's what I refer to as static.
On the other hand, dynamic uses rules engines and several types of conditions based on subject, resource, context. And based on these conditions, the rules engines provide a grant or deny decision. Some rules engines can also react to real-time events. So I'm here to tell you that static authorization is bad. Why is it bad? Because all those buckets tend to multiply over time. You have more and more roles, for example, and you can't tell who has access to what anymore.
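The dynamic style described above, where conditions on subject, resource and context are evaluated at request time, can be sketched roughly like this. The attribute names (`department`, `device_managed`, `hour`) and the three rules are hypothetical, chosen only to show the shape of a rules-based decision; this is not any particular product's rules engine.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Request:
    subject: Dict[str, object]   # attributes of the user
    resource: Dict[str, object]  # attributes of the thing being accessed
    context: Dict[str, object]   # attributes of the situation (time, device, ...)


Rule = Callable[[Request], bool]


def same_department(req: Request) -> bool:
    return req.subject.get("department") == req.resource.get("department")


def managed_device(req: Request) -> bool:
    return req.context.get("device_managed") is True


def business_hours(req: Request) -> bool:
    hour = req.context.get("hour")
    return isinstance(hour, int) and 9 <= hour < 17


# Hypothetical policy: every rule must hold for a grant.
RULES: List[Rule] = [same_department, managed_device, business_hours]


def decide(req: Request, rules: List[Rule] = RULES) -> str:
    return "grant" if all(rule(req) for rule in rules) else "deny"


daytime = Request(
    subject={"name": "bob", "department": "finance"},
    resource={"name": "ledger", "department": "finance"},
    context={"device_managed": True, "hour": 10},
)
night = Request(daytime.subject, daytime.resource, {"device_managed": True, "hour": 3})

print(decide(daytime), decide(night))
```

The same user and the same resource get different answers depending on context, which is exactly what a static bucket such as a role cannot express.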
People also tend to get overprivileged over time, which actually explains the impact of those data breaches. Once an account is, you know, taken over, they have access to pretty much a lot of things. So dynamic authorization, then, is what we should be doing. Externalized dynamic authorization. So there are two ways to go about it. You can use graphs or no graph. I'm here to talk about the graph part, which is also referred to as, you know, policy as data. But you have the proponents of policy as code. Some of them are in the room.
Now there are pros and cons for both approaches, right? But I'm here to tell you that graphs are really a hidden gem that's often not really used to its full potential. So because I'm talking about graphs, the first thing is you will need a graph. So in your environment, you'll need to build this graph, ideally in a graph database, a real graph database, we can talk about that later. You don't necessarily need to bring all your data into this graph, just all the data that pertains to your authorization or your security. So you'll find in there your identities.
You'll be able to model your access policies using relationships, you'll have your resources, your assets, and you can even use it to track user behavior. Now, once you have your data in this graph, what happens is that now you have knowledge, because this is a labeled property graph.
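As a rough picture of what "labeled property graph" means here, the sketch below builds a tiny in-memory one: nodes carry labels and properties, relationships carry a type and properties. The classes, labels such as `Identity` and `Resource`, and relationship types such as `HAS_ROLE` are all assumptions for illustration; a real graph database provides this model natively, along with a query language.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Node:
    node_id: str
    labels: Set[str]
    properties: Dict[str, object] = field(default_factory=dict)


@dataclass
class Relationship:
    source: str
    rel_type: str
    target: str
    properties: Dict[str, object] = field(default_factory=dict)


class PropertyGraph:
    def __init__(self) -> None:
        self.nodes: Dict[str, Node] = {}
        self.relationships: List[Relationship] = []

    def add_node(self, node_id: str, *labels: str, **properties: object) -> None:
        self.nodes[node_id] = Node(node_id, set(labels), dict(properties))

    def relate(self, source: str, rel_type: str, target: str, **properties: object) -> None:
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before relating them")
        self.relationships.append(Relationship(source, rel_type, target, dict(properties)))

    def outgoing(self, node_id: str, rel_type: Optional[str] = None) -> List[Relationship]:
        return [
            r for r in self.relationships
            if r.source == node_id and (rel_type is None or r.rel_type == rel_type)
        ]


graph = PropertyGraph()
graph.add_node("bob", "Identity", "Employee", department="finance")
graph.add_node("finance-role", "Role")
graph.add_node("ledger", "Resource", sensitivity="high")

# Policy, identity data and behavior all live side by side as relationships.
graph.relate("bob", "HAS_ROLE", "finance-role", since="2023-01-01")
graph.relate("finance-role", "CAN_READ", "ledger")
graph.relate("bob", "ACCESSED", "ledger", at="2024-06-04T10:15:00Z")

for rel in graph.outgoing("bob"):
    print(rel.rel_type, "->", rel.target, rel.properties)
```

Identities, policies, resources and activity end up in one structure, which is what makes the later traversals and analytics possible.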
You're shifting from data to knowledge. And that's actually quite a paradigm shift. The 20th century was all about data. This century is all about knowledge, as I'm sure you've come to realize. So then let's put our graph in the center here and let's have our authorization system use it.
So how do we do that? First thing we need to do is write access policies, right? With this authorization system. So let's compare two approaches, policy as data on the left and policy as code on the right. We have here an example written in Rego. These two things actually do the same thing. They say the same thing. I'll let you go through this and tell me which one is easier to read or understand. Who thinks it's the Rego?
Okay, just David. Here, without looking at the graph, no cheating, can you tell me what user Bob has access to? Right?
Okay, time is up. How about finance? How about you follow the path in the graph from the user to the resource? Like, for example, here, finance. But you can also see that, if you follow the other path here, Bob also has access to cat and dog.
Same thing for Eve here. And you can just follow the path in the graph. So the first thing that we can see is that it's pretty easy to read a graph. Anybody can do it. You can make sense of and prove your rules fairly easily, just by looking at it. And so for auditing and compliance, it becomes kind of a breeze, because your data always shows the access policies and it's always kind of up to date.
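Following the path from a user to a resource is something a program can do as easily as a person. The sketch below does a breadth-first walk and keeps the path it took, so each answer comes with its own audit trail. The slide's actual graph is not reproduced in this transcript, so the groups (`Accounting`, `Pet Owners`) and the edges are invented; only the names Bob, Eve, finance, cat and dog come from the talk.

```python
from collections import deque
from typing import Dict, List

# Hypothetical access graph: an edge means "leads to" (a membership or a grant).
EDGES: Dict[str, List[str]] = {
    "Bob": ["Accounting", "Pet Owners"],
    "Eve": ["Pet Owners"],
    "Accounting": ["finance"],
    "Pet Owners": ["cat", "dog"],
}
RESOURCES = {"finance", "cat", "dog"}


def access_paths(user: str) -> Dict[str, List[str]]:
    """Breadth-first walk from the user; returns one path per reachable resource."""
    paths: Dict[str, List[str]] = {}
    queue = deque([[user]])
    seen = {user}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node in RESOURCES:
            paths[node] = path
        for nxt in EDGES.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return paths


bob_paths = access_paths("Bob")
for resource, path in bob_paths.items():
    print(resource, "<-", " -> ".join(path))
```

Because the path is returned together with the answer, the same query serves both the authorization decision and the compliance question of why the access exists.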
So right off the bat, because we've placed our graph in the middle there, we can, you know, have compliance with it, we can do the access policies, but we can also do identity management, because now your identities and whatever they're related to are visible and easy to maintain. Great. So what about that fourth factor I was talking about, the "what you do"? This is where we need to talk about threat intelligence. As you see, there are many other components in this architecture, I didn't mention that, but threat intelligence is what we're going to talk about next.
But you also need to look at activity logs and also manage your keys properly. And SIEM systems are also useful. So threat intelligence will actually help us with the fourth factor, the what you do. So, a closer look then: how do you use graphs for threat intelligence?
The idea here is to detect abnormal elements and patterns, abnormal behavior. So you have a baseline and you try to find the outliers. This will tell us what you do. In a graph, you use the relationships between the various nodes to, you know, highlight the correlations between events and things.
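The baseline-and-outliers idea can be sketched in its simplest possible form: treat past user-to-resource accesses as the baseline, and flag any new access that has no counterpart in it. The event shape `(user, resource)` and the sample data are assumptions for illustration; real threat intelligence would weigh frequency, time, peers and much more.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Event = Tuple[str, str]  # (user, resource): a hypothetical activity-log shape


def build_baseline(history: Iterable[Event]) -> Dict[str, Set[str]]:
    """Record, per user, the resources they have touched before."""
    baseline: Dict[str, Set[str]] = defaultdict(set)
    for user, resource in history:
        baseline[user].add(resource)
    return baseline


def find_outliers(baseline: Dict[str, Set[str]], new_events: Iterable[Event]) -> List[Event]:
    """Return the events whose user has never touched that resource before."""
    return [(user, resource) for user, resource in new_events
            if resource not in baseline.get(user, set())]


history = [("bob", "finance"), ("bob", "cat"), ("eve", "cat")]
today = [("bob", "finance"), ("bob", "hr-db"), ("eve", "finance")]

baseline = build_baseline(history)
suspicious = find_outliers(baseline, today)
print(suspicious)
```

An account that has been taken over tends to show up exactly this way: the credentials are valid, but the edges it creates in the graph are new.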
Now using a graph, there are several ways to do threat intelligence. I'm not gonna go in detail through all of this, but from a high-level standpoint, you can use the nodes themselves, you can use the relationships, the edges between the nodes. You can look at subgraphs or changes in the graph over time. There are several ways to look at changes over time, or you can look at the behavior of the overall graph. Let's have a look at some examples here. The first example of things you can do is deep link analysis.
This is where you look at nodes in the graph, for example the users, and follow the relationships and see what they're connected to.
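Deep link analysis boils down to following relationships as far as they go. As a small sketch, the code below computes everything reachable from a user and then reuses that one function for two different questions: who can reach a given entitlement, and who holds a conflicting pair of entitlements. The users, roles and entitlement names are hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical grants: user -> roles, role -> entitlements.
GRANTS: Dict[str, List[str]] = {
    "alice": ["payments-creator"],
    "bob": ["payments-creator", "payments-approver"],
    "payments-creator": ["create-payment"],
    "payments-approver": ["approve-payment"],
}
USERS = ["alice", "bob"]


def reachable(start: str) -> Set[str]:
    """Everything reachable from start by following relationships, at any depth."""
    found: Set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in GRANTS.get(node, []):
            if nxt not in found:
                found.add(nxt)
                stack.append(nxt)
    return found


def who_can_access(entitlement: str, users: List[str]) -> List[str]:
    return [user for user in users if entitlement in reachable(user)]


def sod_violations(users: List[str], conflicting: Set[str]) -> List[str]:
    """Users who hold every entitlement in a set that should be kept apart."""
    return [user for user in users if conflicting <= reachable(user)]


approvers = who_can_access("approve-payment", USERS)
violations = sod_violations(USERS, {"create-payment", "approve-payment"})
print(approvers, violations)
```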
This is super useful for answering questions such as: who can access what? Who can access resource one? What does a user have access to? You can do segregation of duties with this, you can ensure the users have the least privilege that they should have. You can even track user behavior like that. And really the sky's the limit here. You follow the relationships and you can find all kinds of answers. Another example here is pattern matching.
So graphs, and graph databases in particular, are very good at matching patterns of data. So for example, if you already know a fraud pattern, it's pretty easy to find that same pattern in your graph. For example, here or here, even though they don't necessarily look similar. Very useful. Another one is community detection. This is where you use clustering or k-nearest-neighbor algorithms in your graph to find nodes that form clusters or that are grouped together. They are semantically close to each other. This is really useful to capture context.
You can actually highlight any node that is hyperconnected, for example. This might be suspicious. Or any outliers. You can also do things such as finding the most influential roles or entitlements in your environment. You can find identity hubs, you can do role mining. You can do 360-degree views, because now, if you try to correlate your users and your identities, they might be close to each other. So this will really help for that.
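To give a feel for the hyperconnected-node and grouping ideas, here is a deliberately simple sketch. It flags nodes whose degree is far above average, and it groups the remaining nodes by connected components. Connected components are a much cruder stand-in for the clustering algorithms a graph database ships with, and the two-standard-deviation threshold, the account names and the edges are all assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

Edge = Tuple[str, str]

# Hypothetical undirected "has access to" edges.
EDGES: List[Edge] = [
    ("alice", "crm"), ("bob", "crm"), ("carol", "wiki"), ("dave", "wiki"),
    ("svc-admin", "crm"), ("svc-admin", "wiki"), ("svc-admin", "ledger"),
    ("svc-admin", "hr-db"), ("svc-admin", "vault"), ("svc-admin", "repo"),
]


def degrees(edges: Iterable[Edge]) -> Dict[str, int]:
    counts: Dict[str, int] = defaultdict(int)
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return dict(counts)


def hyperconnected(edges: List[Edge], sigmas: float = 2.0) -> List[str]:
    """Nodes whose degree exceeds the mean by more than `sigmas` standard deviations."""
    counts = degrees(edges)
    cutoff = mean(counts.values()) + sigmas * pstdev(counts.values())
    return sorted(node for node, degree in counts.items() if degree > cutoff)


def components(edges: List[Edge], ignore: FrozenSet[str] = frozenset()) -> List[Set[str]]:
    """Connected components, skipping any edge that touches an ignored node."""
    adjacency: Dict[str, Set[str]] = defaultdict(set)
    for a, b in edges:
        if a in ignore or b in ignore:
            continue
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen: Set[str] = set()
    groups: List[Set[str]] = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        group: Set[str] = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(adjacency[node] - group)
        seen |= group
        groups.append(group)
    return groups


hubs = hyperconnected(EDGES)
groups = components(EDGES, ignore=frozenset(hubs))
print(hubs, groups)
```

Taking the hub out before grouping is a design choice made for this sketch: with the over-connected account left in, everything collapses into one component and the natural groups disappear.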
And those are just three examples of things you could do. So right off the bat, because you're using a graph, you can check that checkbox also.
But now, no talk here these days would be complete without the cherry on top: AI. Now, as we know, AI is the thing, but how do you go about it in your organizations?
Well, a good place to start is that AI needs data, right? And where do you think the data that AI uses comes from? Does it come from LDAP directories? Does it come from SQL databases? It comes from knowledge graphs, you guessed it, through a technique called graph embeddings. This is where you actually transform the nodes and relationships in your graph into vectors of numbers. These vectors of numbers retain the semantics of the original graph, but they are simplified. They are lower level, which allows for faster computation.
Here's what that would look like.
You have a graph on the left, and through some kind of magical function you create those vectors of real numbers, which are the same representation that can then be input into an LLM, into an AI, for training. So now if you have your knowledge graph on the left, which is your company's data, your knowledge, you can actually train an LLM with this.
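The "magical function" is normally a learned embedding model, which is well beyond a short example. Purely to illustrate the idea of turning nodes into short vectors that keep some of the graph's structure, the sketch below uses a crude stand-in: each node's vector is its closeness to a couple of hand-picked anchor nodes. The anchors, the edges and the `1 / (1 + distance)` formula are all assumptions, not the technique the talk refers to.

```python
from collections import defaultdict, deque
from math import sqrt
from typing import Dict, List, Set, Tuple

# Hypothetical undirected identity graph.
EDGES: List[Tuple[str, str]] = [
    ("alice", "finance-role"), ("bob", "finance-role"), ("finance-role", "ledger"),
    ("carol", "eng-role"), ("eng-role", "repo"),
    ("finance-role", "company"), ("eng-role", "company"),
]
ANCHORS = ["ledger", "repo"]  # hand-picked landmark nodes; one dimension each


def build_adjacency(edges: List[Tuple[str, str]]) -> Dict[str, Set[str]]:
    adjacency: Dict[str, Set[str]] = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def distances_from(source: str, adjacency: Dict[str, Set[str]]) -> Dict[str, int]:
    """Breadth-first hop counts from source to every node it can reach."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adjacency[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def embed(edges: List[Tuple[str, str]], anchors: List[str]) -> Dict[str, List[float]]:
    """Map each node to a vector of closeness values, one per anchor."""
    adjacency = build_adjacency(edges)
    dist = {anchor: distances_from(anchor, adjacency) for anchor in anchors}
    return {
        node: [1.0 / (1 + dist[a][node]) if node in dist[a] else 0.0 for a in anchors]
        for node in adjacency
    }


def cosine(u: List[float], v: List[float]) -> float:
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


vectors = embed(EDGES, ANCHORS)
print(cosine(vectors["alice"], vectors["bob"]), cosine(vectors["alice"], vectors["carol"]))
```

Eight nodes become eight two-number vectors, and users who sit in the same place in the graph end up with similar vectors, which is the property the downstream similarity and training steps rely on.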
What you can do with this is all kinds of things. Just using the embeddings alone, you can find hidden patterns or similarities. But if you have an LLM, you can do retrieval-augmented generation applications. These are applications that use a generic, you know, LLM, large language model, such as ChatGPT or Mistral or whatever. And you can augment it with your own knowledge graph, as I just mentioned.
So now you can have your identity or authorization specific LLM that uses your own knowledge, your own company or organization's knowledge.
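The retrieval half of such an application can be sketched without any model at all: find the facts in the graph that relate to the question, expand one hop along the relationships, and place them in the prompt. Everything here is hypothetical, including the triples, the keyword matching and the prompt wording, and the call to an actual LLM is deliberately left out.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject, relationship, object)

# Hypothetical slice of an identity knowledge graph.
KNOWLEDGE: List[Triple] = [
    ("Bob", "HAS_ROLE", "Accounting"),
    ("Accounting", "CAN_READ", "finance"),
    ("Eve", "HAS_ROLE", "Pets"),
    ("Pets", "CAN_READ", "cat"),
]


def retrieve(question: str, triples: List[Triple]) -> List[Triple]:
    """Keyword-match the question against node names, then expand one hop."""
    words = {word.strip("?.,!").lower() for word in question.split()}
    seeds = [t for t in triples if t[0].lower() in words or t[2].lower() in words]
    frontier = {t[2] for t in seeds}
    expanded = [t for t in triples if t[0] in frontier and t not in seeds]
    return seeds + expanded


def build_prompt(question: str, facts: List[Triple]) -> str:
    lines = [f"- {s} {rel} {o}" for s, rel, o in facts]
    return "Use only these facts:\n" + "\n".join(lines) + f"\nQuestion: {question}"


question = "What can Bob read?"
facts = retrieve(question, KNOWLEDGE)
prompt = build_prompt(question, facts)
print(prompt)
```

The one-hop expansion is where the graph earns its keep: the question only mentions Bob, but the relationship from his role to the resource is what the model needs in order to answer.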
Once you have that, you can use AI to secure AI. You can suggest entitlements, you can use natural language to ask any security question, or anything, to your actual AI. The sky's the limit at this point. So there you have it. I'm just adding AI to this diagram here. One last thing is that all these components will somehow need to communicate with each other in real time. Because as you see, this PEP/PDP pattern requires you to authorize every single request.
So to do that, a good idea would be to look at the OpenID Shared Signals Framework, CAEP and RISC events, which I think could be used for this purpose. So to conclude, it's time to bring AI... yeah, it's time to bring IAM into the 21st century. To do that, you can only trust in zero trust, because zero trust will actually help you implement that fourth factor and help you detect those intruders. And graphs are probably the only way you would achieve that, because graph databases have all these algorithms and features embedded at the platform level. Other systems simply don't.
That's it. Thank you.
Thank you, Alex. Thank you for the two minutes.
Yeah, very nicely done. Do we have any questions from people who are in the audience here? I didn't see anything in the app. Alright. I wanna ask a question that came up in a conversation earlier today. Sure.
About, well, but can't you just implement graphs in whatever technology you want? What's special about these labeled property graphs?
Yeah, so there are various ways to do graphs. There can be a logical layer on top of a more traditional system, but as I showed in that previous slide, those regular legacy systems don't have all these capabilities, all these analytics features like embeddings and k-nearest-neighbor and clustering and all of that stuff. So you'll be missing out on all that, basically. Additionally, path traversal and pattern matching are really core features of these graph databases, and you'd have to implement those things yourself somehow.
Got it.
So you'd have to roll your own if you used other technology bases.
Yeah, imagine doing your own SQL joins, you know, instead of having the Oracle database do it for you.
All right. Well help me thank Alex once again. Thank you. Awesome. All right.