All right, good morning and welcome. My name's John Pritchard, I'm Chief Product Officer at Radiant Logic. I'm joined by my colleague Sebastien Faivre, Chief Technology Officer at Brainwave GRC. And today we're gonna talk about identity data. As an industry, I think we've spent a lot of time talking about policy: policy as code, policy enforcement, policy information. What you may learn today is that data is becoming a very important component of how we manage and govern policy in the enterprise as you move to things like Zero Trust and identity-first security practices.
So we're gonna tell you a little bit about the changing industry. I'm gonna show you some examples of data in action and then give you some constructs and some architectural patterns to consider for your enterprises. So there's a bit of framing about how we got here in the first place.
We're about two years out of the global pandemic.
And as identity professionals, I would say our experience in the world has really fundamentally changed, because the number of digital experiences that are now pretty common in our everyday lives has tripled, not only in your work life but also your personal life. So think about how you interact with the physical world. You may have a profile in your vehicle; you may have an application through which you configure that vehicle with different preferences.
You get in and out of buildings with digital ID cards, you've got all the applications that we use during the day, and then you have all of the brands and services that you interact with on a day-to-day basis. From our perspective, there are three major decisions that happen in all of those interactions. Authentication: are you who you say you are? Authorization: can you do what you're trying to do? And personalization: what is the specific experience that you should have, based on the knowledge and the data that we have about you? As an industry, we've implemented a set of layered technologies to make these experiences possible. Things all the way out at the edge, like firewalls and API gateways. All the application-level security around single sign-on and multi-factor authentication, really fine-grained decision making around data. And then, in the personalization space, all the marketing technologies at play.
If you think about it, every one of these technologies has a rule or a policy that's making one of those three decisions: authentication, authorization or personalization. So from our perspective, decision making is becoming largely decentralized, because I think it's fair to say, for most of you, that you do not own all of these underlying technologies.
They are managed or executed by different members of your organization.
So this idea of decentralized decision making raises a question: the one thing used across all these different decision points is data, so how do you govern that kind of data in the enterprise? This phenomenon isn't abating. If anything, we're seeing more and more low-level, fragmented decision points. I've got an example here from our colleagues at the United States Department of Defense. This is their reference architecture for Zero Trust, and what you see is many, many micro decision points.
So the takeaway here is that as a governance function, as identity professionals, as you think about policy and how policy is being executed, you have to consider the one thing that's being used by all of these policy points, which is the underlying data. And this is probably the one thing that you do own in the organization.
How do you manage and govern that data? That's really what our talk is gonna be about today: the role of that process and how it's done. For most organizations, step one is federating that data.
And we're specific in saying federation and not consolidation, because in most organizations you have several identity stores holding different aspects of the data that need to stay in place to support, say, legacy applications. Federation is connecting all of those sources, modeling that data in a common way and then delivering that data internally as a service. Some organizations will use the term identity data lake, or a common directory services store, or a metadirectory.
But the idea is that you've got investments in things like LDAP and Active Directory, you may have custom databases, and they all have attribute data that's important to you. You're bringing that together in a federated model, modeling it in a common way and then delivering it to all the applications inside.
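To make that concrete, here is a minimal sketch of what the federation step does. The store names, attribute mappings, and the email join key are purely illustrative assumptions, not any particular product's model:

```python
# Minimal sketch: federate identity records from heterogeneous stores
# into one common model, joined on a shared identifier (hypothetical).

# Records as they might come back from each store, with source-specific names.
AD_RECORDS = [
    {"sAMAccountName": "jdoe", "mail": "jdoe@example.com", "department": "Finance"},
]
HR_RECORDS = [
    {"employee_id": "E100", "work_email": "jdoe@example.com", "job_title": "Analyst"},
]

# Per-source mapping from native attribute names to the common model.
MAPPINGS = {
    "active_directory": {"mail": "email", "sAMAccountName": "login", "department": "department"},
    "hr": {"work_email": "email", "employee_id": "employee_id", "job_title": "title"},
}

def normalize(record, source):
    """Rename a record's attributes into the common model."""
    mapping = MAPPINGS[source]
    return {common: record[native] for native, common in mapping.items() if native in record}

def federate(sources):
    """Merge normalized records on email to build one global profile per identity."""
    profiles = {}
    for source, records in sources.items():
        for record in records:
            common = normalize(record, source)
            key = common["email"]  # the correlation key; real deployments vary
            profiles.setdefault(key, {}).update(common)
    return profiles

if __name__ == "__main__":
    view = federate({"active_directory": AD_RECORDS, "hr": HR_RECORDS})
    print(view["jdoe@example.com"])  # one profile combining AD and HR attributes
```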
With federation done, you can then get into some of the areas of understanding the data itself. And this is where the worlds of identity observability and identity analytics start to open up. Identity observability is around cleansing the data.
Think about identity data quality: you're looking at things like missing attributes in the data, or cases where you have multiple data sources, perhaps with the same identities, that are referenced by different unique identifiers. How do you correlate those identities as being the same? Things like anomaly detection: how do I run statistical analysis over my data sets and observe outliers that don't match the rest of the entries in the large federated data store? And then looking at change over time as the data evolves in a dynamic fashion. So let me give you a few examples of this in play.
What we have here is a histogram of the tenure of the employees in my organization, how long they've worked there, against the number of group memberships that they possess.
Now in a well governed organization, what you would expect is that this slope trends down. The idea is that as you become more senior in the organization, you do less and manage more, therefore you might need fewer accesses to systems.
We often see something that looks like this instead, which suggests that as you grow in the organization, as you get promoted, you seldom lose the accesses you had previously. And we see this type of analysis even in organizations that have a full IGA solution in play. So you're probably doing something like an annual attestation campaign, blasting out a bunch of emails to your managers saying, do you approve these accesses? And what they're probably saying is yes to all.
So even in a well governed organization, lacking some type of identity observability, it's hard to spot these types of issues.
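A toy version of that analysis might look like the following; it assumes you can already pull tenure and group-membership counts per employee out of the federated store (the data here is made up):

```python
# Toy tenure-vs-access analysis: bucket employees by years of tenure and
# report the average number of group memberships per bucket.
from collections import defaultdict

# (tenure_years, group_membership_count) per employee; illustrative data.
employees = [(1, 8), (2, 9), (5, 14), (6, 15), (11, 22), (14, 25)]

buckets = defaultdict(list)
for tenure, groups in employees:
    buckets[tenure // 5 * 5].append(groups)  # 5-year buckets: 0-4, 5-9, ...

for start in sorted(buckets):
    avg = sum(buckets[start]) / len(buckets[start])
    print(f"tenure {start:2d}-{start + 4:2d} yrs: avg {avg:.1f} groups")

# A well governed org should trend flat or down; a rising curve suggests
# accesses accumulate with promotions and are seldom revoked.
```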
Another approach is anomaly detection. What we have here is some analysis comparing the actual usage of systems that you are authenticating into or using against the rest of the people in the workforce that use those same systems, and then doing what's called a peer group analysis to see how similar people are to each other.
Again, what you'd expect is that people using applications should cluster together. They either work in the same department or they work in the same geography; some of the attributes about their identity should match. And so when I have outliers, when I see people using systems that don't match the rest of the peer group, that suggests I may have an issue, a potential insider threat, or at least something anomalous in the rest of the data. Using just data out of things like Active Directory, for example, can lead to a lot of false positives.
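As a rough illustration of that peer-group idea, one simple form is set overlap between a user's applications and what their department peers use. The departments, apps, and the 0.5 threshold below are assumptions for the sketch:

```python
# Sketch of a peer-group check: compare each user's set of used applications
# against the union of what their department peers use, and flag low overlap.

usage = {
    "alice": {"dept": "Finance", "apps": {"erp", "payroll", "mail"}},
    "bob":   {"dept": "Finance", "apps": {"erp", "payroll", "mail"}},
    "carol": {"dept": "Finance", "apps": {"mail", "ci-server", "prod-db"}},  # odd one out
}

def peer_overlap(user, records):
    """Jaccard overlap between a user's apps and their department peers' apps."""
    me = records[user]
    peer_apps = set().union(*(r["apps"] for u, r in records.items()
                              if u != user and r["dept"] == me["dept"]))
    union = me["apps"] | peer_apps
    return len(me["apps"] & peer_apps) / len(union) if union else 1.0

for user in usage:
    score = peer_overlap(user, usage)
    if score < 0.5:  # threshold is arbitrary; tune it on your own data
        print(f"{user}: overlap {score:.2f} -> anomalous vs peer group")
```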
How often do you actually deprovision group memberships in your directory? Is that at all dynamic? So we've done some other very interesting work looking at signals that are more real-time. One of the behaviors that we see in most enterprises is that when a new project starts, one of the first things that happens is a social channel is created on a tool like Slack or Teams. Once that channel's created, a bunch of people are invited to it, and then they're collaborating on an effort.
So peer group analysis techniques like this, used to identify what the groups really are and then correlated with actual usage data from, say, single sign-on systems, lead to a much richer insight into whether the accesses people are using are relevant to what they should have access to. The problem in industry, specifically in our identity space, is that the tools that provide this type of visibility are difficult to come by.
So the ideas of identity observability, and of the analytics we'll talk about, are really about visibility.
How do I connect identity data from several sources, ingest things like behavioral signals, and then visualize how that work is being done? A third example is data timeliness. Where you have multiple data sources that are synchronized or replicated at different times, you have these periods where the replication is not complete. We refer to this as data drift: a source system has an attribute that needs to land in another system at some point, and the synchronization that makes that happen may be done periodically or in batch.
This is an example from a financial customer we have here in Europe, where they're syncing some of their attribute data with their HR system, but that synchronization happens when they run their payroll, so about every two weeks. They have this lag that they observe, where the attributes landing in the downstream system are out of date, having drifted over a period of time.
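A minimal sketch of that drift check, assuming you can read the authoritative value and the replica's last sync time; the systems, fields, and dates here are hypothetical:

```python
# Sketch of a drift check: compare the authoritative source against a
# downstream replica and report attributes that have not yet landed.
from datetime import datetime, timezone

hr_source = {
    "E100": {"department": "Treasury", "updated": datetime(2024, 3, 1, tzinfo=timezone.utc)},
}
directory_replica = {
    "E100": {"department": "Finance", "last_sync": datetime(2024, 2, 20, tzinfo=timezone.utc)},
}

for emp_id, truth in hr_source.items():
    replica = directory_replica.get(emp_id)
    if replica and replica["department"] != truth["department"]:
        lag = truth["updated"] - replica["last_sync"]
        print(f"{emp_id}: replica says {replica['department']!r}, "
              f"source says {truth['department']!r} (drift window ~{lag.days} days)")
```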
And since that HR system is, in their case, used for things like joiner-mover-leaver provisioning, you have this risk where the data drift can lead to poor policy decisions. With observability in place, and with federation in place, you start to clean up the data, and then you're able to move into this next phase, which is around analytics. Sebastian?
Thank you, John. So once you have all the data, you can leverage this data to provide analytics as well. One example is risk-based scoring.
So if you have just a few minutes to spend analyzing your data, one idea is to build a risk score in order to identify what is most interesting to care about. To do this, you really need to have a lot of information around the identity. Otherwise you will only be able to leverage very basic attributes such as time and IP address. So this is really the idea: you need to correlate information in order to compute a very rich risk analysis model.
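As a rough sketch of such a score: a weighted sum over correlated risk signals. The signals and weights below are illustrative placeholders, not a recommended model:

```python
# Sketch of a composite risk score over a correlated identity profile.

WEIGHTS = {
    "privileged_entitlements": 5,   # count of admin-level entitlements
    "dormant_days": 0.1,            # days since last authentication
    "orphaned": 20,                 # account with no matching HR record
    "sod_violations": 15,           # segregation-of-duties conflicts
}

def risk_score(profile):
    """Weighted sum of risk signals; higher means look at this identity first."""
    return sum(WEIGHTS[signal] * float(value) for signal, value in profile.items())

identities = {
    "svc-backup": {"privileged_entitlements": 4, "dormant_days": 180, "orphaned": True, "sod_violations": 0},
    "jdoe":       {"privileged_entitlements": 1, "dormant_days": 2, "orphaned": False, "sod_violations": 1},
}

# Triage: ride the list from riskiest to least risky.
for name, profile in sorted(identities.items(), key=lambda kv: -risk_score(kv[1])):
    print(f"{name}: score {risk_score(profile):.1f}")
```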
Another idea is to correlate the information in order to clean up your role model or group model.
To do this, most of the time you run what is called role mining. Role mining is nothing but comparing or aggregating access rights and entitlements based on user attributes. To do this, you need to correlate, on one hand, the identities and their attributes, in order to identify what those people, who seem to do the same job in the company, actually have in common.
We have visualization techniques here that then help you identify very easily what those people are sharing, in order to build or reshape roles in the company.
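A very small sketch of the role mining idea, assuming user records carry a job-title attribute and a set of entitlements; the support threshold is an arbitrary choice for the example:

```python
# Sketch of basic role mining: group users by a job attribute and propose,
# as a candidate role, the entitlements shared by most of the group.
from collections import Counter, defaultdict

users = [
    {"title": "Accountant", "entitlements": {"erp:read", "erp:post", "mail"}},
    {"title": "Accountant", "entitlements": {"erp:read", "erp:post", "mail", "prod-db"}},
    {"title": "Accountant", "entitlements": {"erp:read", "mail"}},
]

def mine_roles(population, support=0.66):
    """Entitlements held by at least `support` of each title group form a candidate role."""
    groups = defaultdict(list)
    for user in population:
        groups[user["title"]].append(user["entitlements"])
    roles = {}
    for title, ent_sets in groups.items():
        counts = Counter(e for ents in ent_sets for e in ents)
        roles[title] = {e for e, c in counts.items() if c / len(ent_sets) >= support}
    return roles

print(mine_roles(users))
# {'Accountant': {'erp:read', 'erp:post', 'mail'}} -- 'prod-db' stays an outlier to review
```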
So at the end of the day, the idea here is that trusted identity data enables identity-first security. This is really the very first step that you need in order to be able to build your Zero Trust approach in the company. Once you have the data and the authentication is done, then you have to take care of authorization.
For a long time, role-based access control was the only model available to deploy. Everybody knows role-based access control, right? I mean, it's very interesting because it helps you to enforce the least privilege principle and it aligns permissions with job functions. But you need to build roles, you need to maintain roles, and they're actually hard to build and hard to maintain. It's only coarse-grained, at an application level only most of the time, and because it's hard to maintain, you end up with some kind of permission drift.
So actually, not 100% of the permissions are granted through roles, but it's okay. I mean, at least it's okay.
For the auditors, this is a must-have for compliance. Whenever you have a compliance-driven application or system, you need to have a role model; this is part of the SOC 2 compliance framework, for instance. If you want to have a more, let's say, risk-mitigation approach for your systems, then you need to switch from this role model to a more just-in-time, risk-based approach. This is where we have policy-based authorization models, or attribute-based access control systems. There is a panel this afternoon where we will discuss this in more detail, by the way.
So here you have all those systems, such as OPA and Zanzibar, that we will discuss. It's risk-based, it's just-in-time.
You can manage permissions at a very fine-grained level; it's not limited to an application, basically. But bad data quality ultimately leads to bad decisions. These models leverage user attributes and environment attributes, so it actually shifts the governance from the role to the attribute: you need to have some kind of attribute lifecycle in place in order to be able to deploy this model. It's nevertheless very interesting for the most critical assets, for cloud-based systems, and for infrastructure-level security as well.
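To show the shape of such an attribute-based decision, here is a minimal sketch; real deployments use a policy engine rather than hand-written functions, and every attribute name below is a hypothetical:

```python
# Sketch of an attribute-based access decision: the policy reads user and
# environment attributes at request time instead of a pre-assigned role.
from datetime import datetime

def can_access(user, resource, env):
    """Allow if department matches, clearance suffices, and it is business hours on a managed device."""
    return (
        user["department"] == resource["owning_department"]
        and user["clearance"] >= resource["required_clearance"]
        and 8 <= env["request_time"].hour < 18
        and env["device_managed"]
    )

user = {"department": "Finance", "clearance": 3}
resource = {"owning_department": "Finance", "required_clearance": 2}
env = {"request_time": datetime(2024, 3, 1, 10, 30), "device_managed": True}

print(can_access(user, resource, env))  # True; flip any attribute and it denies
```

Note how a stale department or clearance value silently flips the decision, which is exactly why the attribute lifecycle matters here.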
So in the end you have role-based access control; you have attribute-based access control, mainly for the most risky applications, I would say; and then you have the rest. For the rest, most of the time what you want to do is improve operational efficiency.
So really, you just want to unleash your applications in order to improve your operational efficiency.
In order to do this, what you can do is adopt some kind of trust-but-verify approach. A trust-but-verify approach is nothing but what we can call autonomous identity. The idea here is to move to a posture where we are more in some kind of self-service model, where the user can directly ask or query for new application or system access and, based on analytics, the access will be automatically granted. But it's trust-but-verify: it's trusted, but verified.
It means that, for this to work properly, we need to correlate all the systems, all the information, in order to continuously check whether this is actually a normal situation or an abnormal situation. This is where we need both observability and analytics in place. At the end of the day, all these new authorization models, attribute-based access control, policy-driven, and autonomous identity, all need identity data plus analytics. In order to do this, we shaped a framework.
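A sketch of what that trust-but-verify loop could look like; the peer-prevalence heuristic, threshold, and review queue are illustrative assumptions:

```python
# Sketch of a trust-but-verify flow: grant a self-service request immediately
# when analytics say it matches the peer norm, but queue it for verification.

def handle_request(user, app, peer_usage, review_queue):
    """Auto-grant if most of the user's peers already use the app; always verify later."""
    peers = peer_usage.get(user["dept"], [])
    prevalence = sum(app in apps for apps in peers) / len(peers) if peers else 0.0
    granted = prevalence >= 0.5  # threshold is illustrative
    review_queue.append((user["name"], app, granted, prevalence))  # the "verify" part
    return granted

peer_usage = {"Finance": [{"erp", "mail"}, {"erp", "payroll"}, {"erp"}]}
queue = []
print(handle_request({"name": "jdoe", "dept": "Finance"}, "erp", peer_usage, queue))      # True
print(handle_request({"name": "jdoe", "dept": "Finance"}, "prod-db", peer_usage, queue))  # False
print(queue)  # every decision lands here for continuous checking
```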
No talk is complete without an architectural diagram. So let's bring this all together.
What we talked about is the importance of authorization in the enterprise and how decision making is becoming decentralized, with multiple policy decision points. What's common between all of that is the data, specifically the attributes that are used to drive those decisions. If we take this through as a sequence of steps, you start with connecting heterogeneous data sources: multiple Active Directories, LDAP investments, custom databases, SCIM servers, all the things that have attribute data that you're gonna use in your policy engines, and you federate those.
With that done, you then focus on cleaning that data. That's observability: missing attributes, overlapping identities, anomaly detection, cleansing the information that's gonna be used by the policy engines, the information the policies are executing decisions upon. From there you move into analytics.
So then you're transforming this data into meaningful information, understandable by business people, which means that you take this data and map it to a data model to understand things such as identities, attributes, applications and entitlements, to have this clean, 360-degree view of who is working for the company and who can access what. Based on this you can do both data enrichment and cleanup of the information, of the entitlements. This is really the idea of adopting some kind of a get-clean approach.
Next to this, you have prescriptive analytics. Prescriptive analytics is really the idea of, once you have this 360-degree view of the situation, being able to automatically enforce your security policies: running automated checks, starting to perform operations such as group mining, role mining, and continuous access reviews regarding changes as well.
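One example of such a prescriptive, stay-clean check, sketched with hypothetical role and user data: flag entitlements that no role justifies.

```python
# Sketch of one prescriptive check: flag entitlements held by a user that
# are not justified by any of the user's roles.

roles = {
    "accountant": {"erp:read", "erp:post", "mail"},
}

users = [
    {"name": "jdoe", "roles": ["accountant"],
     "entitlements": {"erp:read", "erp:post", "mail", "prod-db"}},
]

def out_of_role(user):
    """Entitlements granted outside the role model (candidates for review or revocation)."""
    justified = set().union(*(roles[r] for r in user["roles"])) if user["roles"] else set()
    return user["entitlements"] - justified

for user in users:
    extras = out_of_role(user)
    if extras:
        print(f"{user['name']}: unjustified entitlements {sorted(extras)}")
```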
So the idea here is really to adopt an approach which is more of a stay-clean approach, as compared to the previous get-clean approach. And finally you have predictive analytics.
Predictive analytics is the idea that you have to detect the unpredictable. This is where you have risk scoring; this is where you have machine learning and peer group analysis in order to identify discrepancies in your entitlement model. This is also where you start to have things such as user behavior analytics to enforce continuous authorization against the applications and systems. So this is really the idea of a framework, of an architectural diagram, which will help the whole company to be part of this security initiative, whatever the authorization model.
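As one toy example of the user behavior analytics piece, a simple z-score of today's activity against a user's own baseline; the data and the cutoff are assumptions:

```python
# Sketch of a behavior check: z-score of today's activity volume against the
# user's own history; large deviations can feed continuous authorization.
import statistics

history = [12, 15, 11, 14, 13, 12, 16]  # daily auth events for one user
today = 55

mean = statistics.mean(history)
stdev = statistics.stdev(history)
z = (today - mean) / stdev

if abs(z) > 3:  # common cutoff; tune per population
    print(f"z = {z:.1f}: far outside this user's baseline, step up or re-verify")
```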
Okay, 30 seconds. So, towards identity-first security, perhaps some takeaways.
Let's say next Monday, when you're back at the office after the conference, just take a little step back and think about this with regard to identity-first security. First, assess your current situation.
Really, ask yourselves: do you really know who is working for the company, and who can access what? Are you sure that your operational efficiency is good enough to manage all the entitlements? Are you struggling with your role-based access control model, for instance? And then ask yourself: in order to deploy or maintain roles and entitlements, do you have a clean and sanitized identity context in place?
Do you have this consolidated view of all the people, all the entitlements, all the attributes in place?
If not, then start by defining a path to create this identity data lake, because this is really what will fuel all those authorization models, whatever they are: role-based access control, attribute-based access control, policy-driven, autonomous identity. All of those authorization models will need clean and sanitized data first.
Then, step three, build three distinct lists of your IT assets: the most critical ones, the compliance-driven applications or systems, and the rest. And then choose the kind of authorization model that you want to deploy or enforce depending on those three different buckets. Thank you. Thank you for attending this presentation.
If you want to discuss this topic further with us, we're present at booth 19, and we also have a panel this afternoon where we'll discuss in more detail the authorization models, attribute-based access control and policy-driven access control.
Well, thank you very much, Sebastien and John, that was really insightful and practical, I would dare say. We actually have time for one last question. Anyone in the audience? Because I have one here on the tablet: how do you deal with access cloning when analyzing peer group access?
This is a very interesting question, because prior to role-based access control, most of the time the way accesses are granted in a company is by cloning access from one individual to another. In order to detect this, what you need to do is crosscheck the entitlements with the access logs, because most of the time, when you have access cloning, those people will ultimately never use those over-allocated entitlements. So you need to correlate this with access logs in order to detect those over-allocated entitlements.
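A minimal sketch of that crosscheck, with hypothetical entitlement and log shapes:

```python
# Sketch of the crosscheck described above: entitlements that never appear
# in the access logs are candidates for being cloned-in and over-allocated.

entitlements = {"jdoe": {"erp:read", "erp:post", "legacy-app", "prod-db"}}
access_logs = [("jdoe", "erp:read"), ("jdoe", "erp:post"), ("jdoe", "erp:read")]

used = {}
for user, resource in access_logs:
    used.setdefault(user, set()).add(resource)

for user, granted in entitlements.items():
    never_used = granted - used.get(user, set())
    if never_used:
        print(f"{user}: never used {sorted(never_used)} -> likely cloned/over-allocated")
```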
Okay, great. Well thank you very much gentlemen. Thank you. Thanks.