Good morning, good afternoon, or good evening, depending upon which geography you are in. My name is Graham Williamson, and I'm pleased to be moderating this webinar on data loss prevention and the application of user-driven data classification. I will be commencing the webinar, talking about the area of data loss prevention and secure information sharing, and then Paul Johnson from Boldon James will be talking about his experience with classification, going a little bit deeper into the issues around that.
As an analyst for KuppingerCole, I'd just like to make sure you're aware of the services that we provide, so just give me 30 seconds to go through this particular slide. KuppingerCole has three legs to the stool. One is research services. If you're not aware of the level of research that is available, please go onto the KuppingerCole site under reports and have a look at the reports that are available on virtually every aspect of identity and access management and cloud migration. And you can register for 30 days' free access to that research. We also offer advisory services.
If you would like some assistance with your data loss prevention strategy, we would be very happy to help you with that. The third leg is events. The major European Identity and Cloud Conference was held last month in Munich; it is scheduled every year, and you should make plans to be there if you can. It's well worth it.
Now, in terms of additional activity, we have another event coming up in March next year, which I'll tell you about in a moment. In terms of the rules for the webinar, you will all be muted. Unfortunately, there are just too many people to have open audio, but the webinar is being recorded, and you will get a notification of that and the ability to download the podcast to follow up on any items that you might have missed. At the end of the seminar, we will have a question and answer session, and I would encourage you to ask your questions.
I believe that the interaction is actually the most useful component of the webinar, because as presenters we get to understand where the important points are for participants, and as participants, hopefully the answers to your questions are going to be very pertinent to your understanding of the topic. In terms of the content, as I said, I'll go through the initial part of the webinar to talk about secure information sharing, data loss prevention, and how classification fits into that, sort of setting the stage.
Then Paul will go through best practices for data classification, and he'll also talk about some use cases to illustrate security practice. The last part of the webinar will be the question and answer session. I like to start with this slide; it illustrates the issues that we are discussing. Organizations are obviously made up of people, and a large part of the issues that we have when it comes to dealing with secure information sharing and data loss prevention is, of course, people.
We need to be able to manage the access that people have to protected resources; that's a given. People, though, like to work with their own choice of devices. Particularly these days with the millennials, those that are around 25 years of age at this point, they really don't know what to do with a 12-inch screen. They want to work with their tablets and smartphones.
So the access control that we put in place for data loss prevention must include working with devices, allowing people to work with the devices that they want to work with. And the last component there is things: we've all heard of the Internet of Things.
As people want to connect sensors and actuators to the corporate systems, for data collection and for controlling things, those things all have owners, and the owners need access to that data. They need to be able to share that data, and they need control over who can access those devices, particularly when it comes to very sensitive things. So this particular slide indicates the breadth of the issue that we're talking about when we come to data loss prevention.
Now, I just want to make some comments on how data loss prevention and secure information sharing are underpinned by classification. In order to share data, and in order to protect data, we need to know what classification it has, and we need to be able to manage those classifications.
What I'm trying to show on this slide is that there is a continuum, if you like, starting from standard data classification as we understand it. Typically, when we say DLP, we start thinking of network controls: devices that sit on the periphery of our network and control what can come in and out. We tend to think of content filtering, where we look at a data stream and determine whether that data can indeed exit the corporation, and then act accordingly.
But really data loss prevention goes all the way into secure information sharing. On that end of the spectrum, we're talking about things like rights management, where we might want to attach some metadata to a particular document that tells us what can be done with that information.
We assign some rights management to it, which allows us to control what access people will have to it and, once they've got access, what they can do with it. Secure storage is another aspect of secure information sharing; in some cases we will need to put in place a secure data repository to manage documents of a certain type.
But again, we need to have classified those documents in order to do that. In the middle we've got our API gateways, and those can be considered standard DLP devices, although they are getting into secure information sharing territory, because we can make them very fine-grained if we want to; I'll come back to that in a minute. And then there are the transfer controls: what can we do in terms of FTP, and what sort of network capabilities do we provide that allow people to share information while we protect corporate data from loss?
All of those are underpinned by a classification system; we need to have some mechanism for classifying our documents. We've got another slide here that shows the two main dimensions, if you like, of secure information sharing, and KuppingerCole does recommend that organizations have a strategy when it comes to how they are going to share their information.
And that's really, I guess, the crux of the matter, because if we didn't need to share information, then DLP would become a lot easier. If we could say, okay, we don't send anything outside the organization, that would be wonderful, because we'd have a secure on-premise environment that we don't need to give anybody access to. But unfortunately that means that we stop the business.
We can't give access to our business partners; we can't give access to our customers who might need to get access to systems. So we need a mechanism where we have a balance, if you like, between making sure the business can do what it needs to do and maintaining the policies that we have on what can be released. On this particular slide, we're saying there are two main dimensions.
One is where the data is. If it's all on premise, we can have a secure repository that we can lock up, saying only certain people can get into it. A shared folder is a typical example of that: typically we have an AD group that determines who can get access to that shared folder. So it's an on-premise installation and it's generic; whoever is in that group gets access to that shared folder.
Let's take an example. We might say we've got board-level documents that should only be accessible by board members. So we can have a secure repository that is protected, and only those people who are board members, i.e. whose identity record indicates that they are a member of the board, will get access to that secure repository. And you can see it's halfway up the Y axis between generic and personalized, because that secure repository might very well be personalized: only the board members can get to it.
With the rights management box, we're at the point of saying, okay, we are providing strong controls on documents, and we are only going to give access to those people that we allow to get access to those documents. But on top of that, we're going to say, well, they can view a document, or they can view it and print it, or they can view it, print it, and save it. We can control access to individual documents at a fine-grained level.
That's a very personalized approach. When we're talking about the cloud, there's the endpoint VPN. A virtual private network is very often used for remote personnel to get access to our corporate systems. VPNs are very personalized, because the person must download a client, and it must have the client keys in there.
We know explicitly when that person is connecting to our systems and can act accordingly. So it's a very personalized approach.
And it's very applicable in the cloud. I said I would talk a little bit more about gateways. API gateways are a very broad sector; they go all the way from the device that sits on the periphery of our network and provides a gateway into that network for people that are allowed in, which might be on the basis of a person's AD group or based on other attributes. And when we get up into policy-based API gateways, that's what we're looking at.
We're looking at individual attributes associated with a person coming in to get access to a protected resource. And that doesn't need to be just the user's identity attributes themselves, i.e. what department they're in; it could also be contextual attributes such as the time of day. So a person's access through an API gateway will vary depending upon what device they're connecting from, what time of day it is, or their geographic location.
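To make that concrete, here is a minimal sketch of a policy-based gateway decision. The attribute names (department, device, hour, country) and the policy itself are illustrative assumptions, not taken from any particular product.

```python
# Hypothetical policy-based API gateway decision: combines identity
# attributes with contextual attributes (device, time of day, location).
# All attribute names and allowed values are examples only.

def gateway_allows(user_attrs, context):
    """Return True if the request may pass the gateway."""
    # Identity attribute: only Finance staff may reach this resource.
    if user_attrs.get("department") != "Finance":
        return False
    # Contextual attribute: managed devices only.
    if context.get("device") not in ("managed-laptop", "managed-desktop"):
        return False
    # Contextual attribute: business hours only.
    if not (8 <= context.get("hour", -1) < 18):
        return False
    # Contextual attribute: request must originate from an allowed country.
    return context.get("country") in ("DE", "GB", "AU")
```

The same person gets different access depending on context: the second and third calls below fail on device and time of day, not on identity.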
So a policy-based API gateway can give us very personalized access to protected resources across that whole spectrum, all the way from on-premise to the cloud, and we can put strong controls in place. We do recommend that you look at your strategy and make sure that it matches your requirements. In terms of the classification process itself, Paul will be going into more detail on this, but basically you need to determine what sort of protective markings you're going to be using, and what sort of classification mechanism: is it automated?
Is it driven by the user? You need to determine when those protective markings are going to be applied. Do we say that the author is going to do that on creation of the document, or do we say no, we'll do that as it leaves the organization? So you need to know at what point you want to apply those controls, and then what happens if the document's classification needs to be altered; do you then allow that to happen?
Does it have to go back to the user, if you're using a user classification mechanism, or is there some sort of administration service that can override what the user's done? And lastly, what sort of duration do you want for the classifications?
Does a classification stay with a document for its life, or is there a statutory period for which it must be classified, or is it some other mechanism? So we do suggest that you document the classification process that you're going to actually use within your organization. In terms of document protection, most importantly, the security culture within your organization is paramount, and when we're talking about the culture within an organization, we're basically talking about a change management issue.
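One way to pin those decisions down is to record them as data rather than prose. The sketch below is hypothetical; the marking names and field values are examples only, not a recommended scheme.

```python
# A hypothetical, machine-readable record of the classification
# decisions discussed above: markings, mechanism, when labels are
# applied, who may alter them, and how long they last.
classification_process = {
    "protective_markings": ["Public", "Internal", "Confidential", "Secret"],
    "mechanism": "user-driven",        # or "automated"
    "applied_when": "on-creation",     # author labels the document at creation
    "reclassification": {
        "by_user": True,               # the original user may relabel
        "admin_override": True,        # an administration service can override
    },
    "duration": "life-of-document",    # or a statutory retention period
}

def marking_rank(marking):
    """Numeric sensitivity rank; higher means more sensitive."""
    return classification_process["protective_markings"].index(marking)
```

Ordering the markings in the list gives handling rules a simple way to compare sensitivity levels.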
So I recommend using something like the eight-step process that John Kotter uses. If you're familiar with change management, John Kotter is a guru in that space, and he has an eight-step process that you might want to look up and see how it fits your requirements. It's K-O-T-T-E-R if you want to Google him. But raising awareness is the most important thing: making sure that people within the organization are aware that there is a need for security, that there is intellectual property that must be protected.
And that can be at a system level, so we provide that control at the system level. It could be per document.
If it's individual documents that need to be protected, then we need to raise awareness among the people using those documents. It could be physical, too. I don't know if you've experienced it, but I've experienced the situation where I'm going into a controlled physical area, and the person in front of me has swiped their card. They don't know me, but they hold the door open for me. Those are the sorts of things that you need to raise awareness about within the organization.
If there is a need to secure and protect your resources, you need to have high-level involvement. Kotter would suggest you get a board-level person who's helping to drive the change management associated with secure document protection.
And it's recommended now to take a carrot rather than a stick approach. In the past, particularly in the military environment, it's very much been a stick approach, but when it comes to a commercial environment, the recommendation is that a reward-and-recognition process, the carrot, is used. In fact, I did read one research paper that suggested you should employ gamification: make a game of it, maybe have competitions between work groups or between departments. Taking that carrot approach has proved to be more beneficial.
In fact, I have a little slide here that I'd like to refer to.
So basically, when it comes to classification, we want to move from a legacy approach, which has very much been a mechanistic approach in the past. In a machine-decision environment, we'll do content filtering, and if we find credit card information, then we'll ensure that a classification of confidential gets put on the document: a very mechanistic approach.
If somebody does something wrong, we take a very punitive approach, where we take some retribution for the fact that a document has potentially left the business unprotected. But we want to move from this punitive, mechanistic approach to more of a cognitive, reward-and-recognition approach.
For progressive organizations, this means that we are going to rely on user classification of documents. And we could potentially assist with that, where we have a content filtering system suggesting a classification, but the user is applying that classification, applying their cognitive understanding of the security of a particular document as they do their classification.
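A suggestion engine of that sort can be very small. The sketch below assumes a two-label policy and uses a deliberately simplified card-number pattern (13 to 16 digits with optional separators, no checksum); it only proposes a label, and the user confirms or overrides it.

```python
import re

# Simplified content filter that *suggests* a classification.
# The pattern is an illustrative approximation of a card number,
# not a full PAN validator, and the label names are examples.
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,16}\b")

def suggest_classification(text):
    """Return a suggested label; the final decision stays with the user."""
    if CARD_RE.search(text):
        return "Confidential"   # likely payment card data found
    return "Internal"           # default suggestion for everything else
```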
We call that more of a progressive organization, and I think we all want to make sure that we're working for progressive organizations. So taking that sort of approach, where users are doing the classification rather than machines, and where we follow more of a reward-and-recognition process, is going to stand us in good stead when it comes to our classification systems. That is the end of my introduction.
I'd now like to turn over to Paul from Boldon James, who will be talking in more depth about classification systems and some user approaches to that. So thank you, Paul. Good day, everybody. My name is Paul Johnson. I'm the sales director at Boldon James, and I would like to talk for the next few minutes about putting users at the heart of your security policy. First, a quick introduction to who Boldon James are, so you have an understanding of what our pedigree is.
Then I'll talk about why we think users are important, how you select the right approach to data classification, and what some of the key success criteria are for a successful project. And then I'll share with you a few customer stories and some feedback that they've given us on how the project went and some of the gains that they've had out of it. So with no further ado, let me tell you a little bit more about Boldon James. You may not have heard of us; we've been going for over 30 years.
We originally started out writing military messaging, NATO secure military messaging, and within that environment they've always had the concept of putting a label on something, whether it's secret, top secret, or confidential. Over the years, we decided that that same type of classification could be used and could play a key part in the corporate sector. So much so that in 2007, a company called QinetiQ plc bought Boldon James.
Now, QinetiQ were originally the research and development arm of the UK Ministry of Defence. They saw both our military and our corporate products and saw that there was a great future for the company in acquiring us, which is what they did. That gives us a large parent company which we can fall back on and rely on for help when we need it. And building on that, it means we also have a global presence: we have operations all around the world, and we have customers all around the world.
We have many multinational customers working in many different languages and using data classification across all those desktops. Here's just a sample of some of the types of customers that we have. Some of them are in banking and insurance, some in pharmaceuticals, some in automotive and manufacturing, some in gas and oil. We have customers in many different vertical markets.
I wouldn't say data classification is particularly tied to any vertical market, because all organizations, whatever vertical market you sit in, will have data that you consider to be confidential or internal only. You'll have data that you want to share with some of your trusted partners, and you'll have data that you'd probably class as public, where you don't mind where it goes. So it's not particularly vertical-market related, although there does tend to be a higher concentration in some industries than in others. So, moving on to the user.
Why are users important? Well, mainly because human error is a massive factor in everything that goes on from a data classification or data breach perspective.
As you can see there, the worst breaches in 2015 were due to human error, as were the vast majority of breaches as well. So the human is important, because actually they're causing most of the breaches, and the vast majority of it is accidental.
So this is where we talk about user awareness, and to illustrate this, here's one of our favorite cartoons that we use in the office. As you can see, you can have as much automated equipment in your network as possible, but you're always going to have human error. You're always going to have the person sitting between the chair and the keyboard who's going to have some impact on something. And actually, when you think about it, the human brain is probably the most complex computer in your organization.
So why wouldn't you harness that to underpin your security policy? Really, we have here the concept of what we call a WIMP: a well-intentioned misguided person. As we say, most of these data classification breaches are human error, and most of them are accidental. These people are not doing this maliciously, so they need to be helped, and they need to be brought on board.
What you'll have is information stored all over your organization, and you probably don't know where it is. Even if you were to start looking, some of your information is going to be on USB sticks, it's going to be on servers and on the desks, and it's going to be sitting locally on people's laptops. So this information could be anywhere and everywhere, and at this moment in time you've got no idea where it is. More importantly, you've got no idea how to control it, or what would be the best way to move forward with it.
So, selecting the right classification approach. When we're talking to our customers, we tend to use the data that both Gartner and Forrester bring to the table, and this is a five-step process which, as you would imagine, is fairly straightforward. It starts with identifying where that sensitive data is and what it is.
Is it just board minutes? Is it merger and acquisition documents? Is it customer financial records? Is it personnel records? Is it all of those? So it's understanding and identifying what type of data you have and how you would classify it. Then you have to go out and discover where it is: where is it sitting, where is it being stored, on all those different types of devices, all over the place. Once you've found it, you can then classify it: you can apply labels to it.
We'll talk a little bit about that in a minute, but once you have classified it, you can then start to secure it. You can then control where it goes, how it goes, when it goes, who can see it and, more importantly, who can't see it. And then, as you would imagine, there's some level of reporting and analytics where you can continue to monitor what is happening, and if there are any developments, progressions, or differences that need to be added into the policy, you're able to do that based upon fact rather than just your best guess.
From a data classification perspective, we do three things with your documentation. We apply visual markings so that the user can see the label, and if somebody sends it to somebody outside of your organization that they're allowed to, at least it's got your security marking on it, and you would hope that those people receiving that information would treat it in the manner that that level of visual classification would suggest. We also put the label into the metadata, which becomes the most important piece going forward.
So we're putting that label into the metadata, giving that key identifier to many other types of technology and using that metadata to control the creation, the motion, and the at-rest stages of where that data goes: its life cycle. And lastly, we're using that metadata, and the handling rules, to decide how that all happens.
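As a rough illustration of those three actions (visual marking, metadata label, and a handling rule keyed off the metadata), here is a toy document model; the field names and the rule itself are invented for the example, not drawn from any product.

```python
# Toy sketch: stamp a visual marking, embed a machine-readable label
# in metadata, and drive a handling rule from that metadata.
# All field names ("body", "metadata", "classification") are invented.

def apply_label(document, label):
    """Add a visible marking and a metadata label to a document dict."""
    document["body"] = f"[{label.upper()}]\n" + document["body"]   # visual marking
    document.setdefault("metadata", {})["classification"] = label  # metadata label
    return document

def may_email_externally(document):
    """Example handling rule: only 'public' documents may leave."""
    return document.get("metadata", {}).get("classification") == "public"
```

The point of the metadata copy is that downstream technology (a mail gateway, a DLP engine) reads the label without parsing the visible marking.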
We look at it as a belt and braces approach. You could just have a very low-cost, paper-based approach with a stamp, which people used to do in the 1970s and early eighties. More recently, in the last 10 years, we've had a lot of automation, a lot of data loss prevention technology, a lot of security technology. What that does is sit there in the network and do things the user doesn't really know are going on, which also means the user has no real control over it.
And if you do have a data loss, your users can just turn around and say: well, it wasn't me; the system let that data leave; I had nothing to do with it; it was all automated. Then you can have the user-driven approach, where we're obviously able to put context into the label. We're able to build a little bit more awareness among the users of the data that they're creating or seeing, where they're sending it, and who they're sending it to. We're then able to increase the accuracy of those automated data loss prevention systems.
By putting context into the automatic scanning that the DLP engine would use, using the metadata and the information in our labeling for it to make a higher-level decision, we're able to keep individuals accountable. So if people are consistently labeling things with the wrong label, the system can identify that and help them relabel it before it even leaves their desktop.
But everything we do is audited, so you can pull down all that information and do with it what you will, from a training perspective or from a regulatory compliance perspective. What we're trying to do here is increase the trust that users have in the data that they're both receiving and sending. And typically we would look at this as a kind of combined approach.
If I start from the user-driven part at the top left: we're looking to empower those users to be able to put markings on the data that they're creating. You can also have what we call recommended labels, intelligent defaults that you can recommend to a user. You might have some particular documents that you already want to be templated with a particular label on them.
If a document is always going to contain a certain type of information that relates to one of your classification levels, we can also automatically label information, especially information coming out of a structured environment such as an SAP or an Oracle database. If files are regularly coming out and being dropped to a particular location, we can automatically label those files as they're created.
So as soon as they come out of the structured environment into the unstructured environment, they're already labeled, and the handling rules will allow them to be transported in the manner that you require. And the last type is that we can supplement the other approaches with a user-endorsed label: you might have an automated label that comes to particular users, who have the ability to either confirm it or relabel and reclassify the information.
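Automatic labelling of structured exports can be as simple as a rule table keyed on where a file came from. The filename prefixes and labels below are purely illustrative assumptions.

```python
# Hypothetical auto-labelling rules for files dropped by export jobs
# from structured systems into an unstructured store. Prefixes and
# label names are examples only.
EXPORT_RULES = [
    ("hr_export_", "Confidential"),   # HR extracts are always Confidential
    ("price_list_", "Internal"),      # price lists stay internal
]

def auto_label(filename, default="Internal"):
    """Label a freshly created export file based on its name."""
    for prefix, label in EXPORT_RULES:
        if filename.startswith(prefix):
            return label
    return default
```

In practice such a rule would run in whatever watches the drop folder, so the file is labeled before any user touches it.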
We would recommend that you do a blend of all of this for all the different types of information that you have flowing around your organization. Probably the most important question is: if I move forward and do this, how do I actually deliver a successful project, and how do I get data classification rolled out within my organization? The first thing you want to do is take a step back and ask: actually, why am I doing this?
And what do I want to achieve? There are certain influences that might be driving you this way. It could be some specific requirements like ISO 27001, or the new EU data protection law, which will be coming in in 2018. It could be payment card information. It could be ITAR or other export controls. So there are some specific drivers that might apply in one part or in all parts of your organization, and that might be the reason why you want to do data classification in that particular part.
And depending on which vertical market you're operating in, where there's a regulatory body, they may be requiring you to adhere to a set of regulations. But in order to adhere, you also need to be able to demonstrate that you are being compliant with those regulations, and the reporting from data classification is a great way of being able to do that.
So for us, moving on to a successful project, there are seven things that we always talk to customers about. The first is having a simple policy. When we start out, most organizations will have either three or four levels in their policy, and that's a really good way to start; let's not get overly complex to start off with. Starting with a simple policy makes the understanding and the deployment much, much easier. The second is that we say you should do a two-stage proof of concept.
Now, that first stage should involve as few users as possible, but in as many different departments as possible. If you've got 10 or 12 divisions across your organization and you can have one person from each of those divisions in an initial one-week trial, then you will probably get 70 to 80% of the initial tweaks that you need to make to your data classification policy. And it's much easier to get that feedback from 10 or 12 people than it is from a hundred or 150.
As for the sort of things we're looking at there: a lot of organizations use PowerPoint, and they use it with prescribed templates, with footers and headers on them. Simply put, we don't want our label to be applied across the top of one of those headers or footers.
So it's looking at simple things like that and understanding, in any templated material, how we move the labels about, so that when it goes out into a more general trial all those simple things have already been done and we've captured most of those tweaks. The most important thing is that we need to manage the cultural change for all those folks: they need to understand what is in it for them. Why are we doing this?
I'm adding an overhead of maybe asking you to click another button while you are creating a file or sending an email, so what's in it for me? Certainly one of the biggest paybacks we have is that this stops the embarrassment, at a small level, or the financial loss, at a larger level, of sending an email to the wrong person. I think we've probably all done that, with varying degrees of impact, and that's one of the biggest answers to the "what's in it for me" question.
So empowering your users to make a decision, and getting them to understand the impact of the data that they're creating and who they're sending it to, means you have a much more efficient and effective staff in the organization who can support your security policy and keep your information where you want to keep it. We would also suggest that during deployment you always involve the vendor's technical team.
You know, we have a wealth of knowledge from rolling out other classification projects. We know what the challenges are and what some of the risks are, so we know how to mitigate some of those, what sort of things to put in place, and how to do it.
So we can help by sharing our knowledge to make the deployment as smooth as possible. You also need to have a strategy for your legacy data. We tend to discuss this as a second point: it's fine to say, I'm going to deploy data classification today, from day one.
So from now on, we're going to classify everything. But you are going to have a huge amount of legacy data already sitting in various places, and you need to understand how you're going to deal with that. There are different techniques, processes and methodologies we have for doing that, and we can share with you the best approach for your environment. But that's a key question: what do I do with my legacy data?
One of the reasons most of our projects stall is an inconsistent level of project management from the client. You'd be amazed how many project managers are brought in as contracted staff whose contracts run out halfway through the project, or who get offered a different role somewhere else.
So that's an important thing to think about: who am I going to appoint as my project manager, and am I sure that they're there for the life of the project? And then the last one is something that is fairly fundamental, but it's being aware of other projects that the IT organization is rolling out.
We've seen customers in the past give the data classification project to a desktop team who were unaware that there was a server refresh going on. So it's about having a greater view across the rest of the IT organization: what other upgrades, changes, adds and moves are occurring at the same time. And then, once we start getting into a timeline, what are the sort of things we need to do? This is a typical timeline that we sit down and discuss with customers, and I'm going to start from the bottom.
So the key there is senior team management, and Graham touched on this earlier: having board-level sponsorship to make sure that this happens. Then we're able to work across from right to left through that process.
So once a decision has been made and you've got that senior sponsorship, you can move forward into understanding what classifications you want, how that policy is going to look, and how you think you're going to deploy it. Then you move into your two-stage trial of a smaller team followed by a larger rollout team. And once you've been through your proof of concept and you're happy with the policy, how it looks and how it feels, you can then move on to the most critical piece from my perspective, which is: how do we get those users involved?
How do we start to talk to the users? How do we get them to understand what it is we're doing and why? Once you've got that user awareness program and that cultural change piece in place, we can look at how you do the full rollout. From a technology perspective, we can deploy data classification to a million users overnight, but that isn't the challenge. The challenge is that when the users come in and turn on in the morning, they need to start classifying; they need to actually use the tool.
You don't need any training at all, because as long as you can use Windows, it's innate; in fact, most people think the tool is Microsoft's. What the users need to know is that you have a security policy, what that security policy looks like, and what type of information they should mark with which labels. I did mention metadata earlier on, and this is where metadata becomes very important: in considering what other integration you want to do with that labeling metadata.
So there's a vast array of different technologies in all those boxes, and all those vendors are able to recognize, make a decision, and trigger an action based upon our metadata streams. So you might find that there are many things you currently do, which may or may not involve the users, that can be automated going forward.
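As a rough illustration of what "triggering an action off the metadata" can mean, here is a minimal sketch in Python. The label names, the `X-Classification` header key, and the actions are all hypothetical; a real integration would use the specific metadata schema of the classification product and the downstream gateway.

```python
# Hypothetical sketch: route an outbound email based on a classification
# label carried in its metadata (here, a custom mail header). The header
# name, label set, and actions are illustrative, not any vendor's schema.

LABEL_HEADER = "X-Classification"

# Map each label to the automated action a downstream gateway might take.
ACTIONS = {
    "Public": "send",
    "Internal": "send-internal-only",
    "Confidential": "encrypt-and-send",
    "Secret": "block",
}

def route_message(headers: dict) -> str:
    """Decide what to do with a message from its classification metadata."""
    label = headers.get(LABEL_HEADER, "Unclassified")
    # Unlabeled or unknown mail falls back to the most cautious action.
    return ACTIONS.get(label, "quarantine-for-review")

print(route_message({"X-Classification": "Confidential"}))  # encrypt-and-send
print(route_message({}))                                    # quarantine-for-review
```

The point of the sketch is that the gateway never has to re-inspect the content: the human's contextual decision travels with the document as metadata.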
So it's understanding the full extent of how metadata can be used to trigger and automate other things that you are already doing in the business today. Going back to user awareness, and creating and maintaining that awareness, let's just look at a couple of things around pre-deployment and post-deployment: sending out those internal communications, getting the right tone of voice. How do you talk to your people? Do you have a change control process?
Do you have a comms department within your organization that should be front and center on that cultural change? Make sure that you communicate how you're going to roll it out, whether it's going to be a big bang, which some of our customers do, or a phased rollout, either by department or by country. And then, to keep it going post-deployment, it's continued awareness campaigns. And, you know, we have an out-of-the-box reporting tool.
You can look at how many documents are being classified and how many are being reclassified; everything that a user does on the desktop is stored in the Windows event log. We can pull out all that information and display it in customized reports showing whatever it is that you want to see. And it's really interesting: you can start to see behavioral analytics, which gives you an almost real-time view of what is happening in your business.
So if you have somebody in your business who routinely labels three or four confidential files, documents and emails a day, and today they've done 55, then you can look at that spike and investigate. Either they are genuinely doing 55 today, or there's a virus, or something else has happened, or they've been hacked. It enables you to react faster than the up to six months it sometimes takes to find out that you've had a breach.
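The spike detection Paul describes can be sketched very simply. This is an illustrative toy, not the product's actual analytics: the threshold factor and the baseline calculation are assumptions, and a real system would read the counts from the Windows event log.

```python
# Hypothetical sketch: flag a user whose daily count of "Confidential"
# classifications spikes far above their routine baseline (3-4 a day
# normally, 55 today, as in the talk). The factor of 5 is illustrative.

def is_spike(history: list[int], today: int, factor: float = 5.0) -> bool:
    """Flag today's count if it exceeds `factor` times the historical mean."""
    if not history:
        return False  # no baseline yet, nothing to compare against
    baseline = sum(history) / len(history)
    return today > factor * baseline

routine = [3, 4, 3, 4, 3]      # typical daily confidential-label counts
print(is_spike(routine, 4))    # False: a normal day
print(is_spike(routine, 55))   # True: investigate (bulk job? compromise?)
```

In practice the alert would feed a review queue rather than an automatic block, since a spike can be perfectly legitimate.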
So being able to act on that information quickly is very important, and it underpins that cultural change, driving it again. Here are some examples of campaigns that we've seen go out with some of our customers on how to classify documents and what level of information you should expect to see. We've even had one customer that changed everybody's corporate desktop background for three months to show the security policy. So it really depends what works best as a cultural fit.
And again, that could be different in different areas and different countries around the world if you are a multinational organization. So it's just understanding that the users are important, they need to be educated, and they need to be continually reassured about what it is they're doing. From a customer perspective, let me just run through three more customer slides and then some final slides. So, some feedback we've had from particular customers, starting with Allianz.
In doing a project with part of Allianz, they were absolutely amazed at how their users reacted to being empowered to start classifying data in line with the company security policy. And it stopped them failing security audits, which they said they had never passed until it had been rolled out. So they saw, as you can see there, a massive reduction in breaches of security policy, but also a massive improvement in user security awareness.
And that wasn't just technology: that was people stopping tailgating through security doors, people reporting phishing attacks. It just raised the whole level of awareness around the type of data that they were using and handling, who was going to see it, and where it was going to go.
At Delta Bank, we deployed data classification after they had deployed a data loss prevention tool. Once our labels and metadata could be interpreted by the data loss prevention tool, they were able to reduce their false positives by 80%. And that meant giving people back to the IT department, because when you've got information locked in a quarantine area, it has to be released by a human anyway. So you might as well employ that human in the first place to put a more contextual label on something.
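To make the false-positive point concrete, here is a hypothetical side-by-side sketch. The regex, message, and label names are invented for illustration; the idea is simply that a content-only DLP rule lacks the context a human label supplies.

```python
# Hypothetical sketch of why user-applied labels cut DLP false positives:
# a content-only rule quarantines anything matching a "sensitive" pattern,
# while a label-aware rule can defer to the human's contextual decision.
import re

ACCOUNT_PATTERN = re.compile(r"\b\d{8}\b")  # naive "account number" regex

def content_only_dlp(text: str) -> str:
    # Without context, any 8-digit number triggers quarantine.
    return "quarantine" if ACCOUNT_PATTERN.search(text) else "send"

def label_aware_dlp(text: str, label: str) -> str:
    # The user's label supplies the context the regex lacks.
    if label == "Public":
        return "send"  # e.g. a unit count in a press release
    return content_only_dlp(text)

msg = "Our product launch shipped 10000000 units."
print(content_only_dlp(msg))           # quarantine (a false positive)
print(label_aware_dlp(msg, "Public"))  # send
```

A real deployment is subtler than this, as Paul discusses later: the balance between what the label overrides and what the DLP engine still enforces is tuned per organization.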
So they got a massive benefit, because they got people back into their IT department who had been sitting there looking at false positives all day. And the last one is about tackling legacy data. Once Prudential had rolled out phase one, which was to classify new data and everything on the desktop, they ran a second project looking at legacy data. We have some bulk classification tools, and some other tools that we can work with, in order to search on keywords, bring the results up, and then use the bulk classification tool.
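A minimal sketch of that keyword-driven bulk pass might look like the following. The keywords, label names, and in-memory file map are all hypothetical; the real tools write labels into document metadata rather than building a manifest.

```python
# Hypothetical sketch of a bulk legacy-classification pass: search files
# for keywords, then record a label for every hit. Keywords and labels
# are illustrative; real products embed the label in the file's metadata.

KEYWORD_LABELS = {
    "salary": "Confidential-HR",
    "merger": "Confidential-M&A",
}

def bulk_classify(files: dict[str, str]) -> dict[str, str]:
    """Map filename -> label for every file containing a known keyword."""
    manifest = {}
    for name, text in files.items():
        lowered = text.lower()
        for keyword, label in KEYWORD_LABELS.items():
            if keyword in lowered:
                manifest[name] = label
                break  # first matching keyword decides the label
    return manifest

legacy = {
    "pay_review.docx": "Proposed salary bands for next year...",
    "canteen_menu.txt": "Monday: soup...",
}
print(bulk_classify(legacy))  # {'pay_review.docx': 'Confidential-HR'}
```

Files with no keyword hit are simply left for a later pass or for human review, which mirrors the phased approach described here.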
And they were able to relabel, or label, 26 million files, which is 15 terabytes of data, in just 12 months. That was a really interesting project for them, because they thought it was going to take years; they were quite pleased to do it in such a relatively short time. So, in summary, we're looking to increase users' awareness, and by doing that, by getting them to trust the data that they're being sent, that they're creating, and that they're being allowed to label.
It raises their level of trust in the data that they're receiving and sending. We can help to reduce those DLP errors, the number of emails getting trapped in quarantine. And remember, as a user sending an email, I don't know it's been trapped in quarantine, and the person I'm sending it to hasn't got it, so you can have some significant delays there if you don't know that that's going on. With the reporting that we have, we're able to demonstrate compliance to many different regulatory authorities, depending on what they're looking for. And once you've labeled your data, you can use the labels as part of your search criteria when looking for other pieces of data. That speeds up discovery, because you don't have to sift through a load of data with the wrong label.
And lastly, it improves accountability, because I'm sitting here creating that document and putting a label on it; I'm making a positive decision to do that. And if I'm doing it wrong, then that can be picked up, and I can be held to account for what I'm doing, or I can be retrained. So, key points: start with a key sponsor, somebody on the board who understands the importance of doing this.
Keep your policy simple, and have a plan for what you're going to do with your legacy data. Consider the integration opportunities, using metadata with other technologies to automate and improve some of the tasks you already do. Definitely run a two-stage process: a proof of concept using as few users as possible in as many departments as possible for the first week, so you're able to monitor activity and identify risks, and then enhance, maybe adding more labels or more descriptors as you go forward. So I'd like to hand back to Graham. Thank you very much for your time. Thank you, Paul.
That was excellent. I really appreciated the comments you made, particularly in the project management area, because that's the Achilles' heel, I believe. Is there anything that can be done in regard to that project management approach? Rather than have an external contractor come in and do it, would you recommend an internal resource for the project management task?
Well, it all depends on your policy and the bandwidth that you have in your organization. What I would say is, if you're going to bring in an external contractor just to do that project, you need to make sure that person is fully aware of what is going on in the project, and make sure that their temporary contract is going to last at least the lifetime of the data classification project. Right, okay. Understood. Okay.
We have a question here on the balance between machine and human decisions. In your experience, how do you balance that? How do you determine when to use machine-based decisions versus user-based ones?
Is there some rule of thumb for determining that sort of automated guidance approach? Well, one of them is to look at where the data is being created.
So if it's being created as an automatic file from a structured data output, from, you know, SAP or Oracle, then we can apply an automatic label as it's put out into a file. It can then go to somebody else to double-check that it has actually got the right label and the right level of classification. If I'm a user and I'm being sent some information that I'm then going to put into another document or another email and send on, then within the classification tool you can have some light scanning to look for particular keywords, and that will pop up and do what we call user-guided labeling, by saying: you've classified it as this, but we've found this particular information and we think it should be classified as that. So again, it's about understanding the type of data that you are using, but we would always recommend that somewhere along the line you have automated data labeling checked by a human. Okay.
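That user-guided pattern, where a light scan suggests a stricter label but the user keeps the final say, can be sketched as follows. The trigger words and label ranking are illustrative assumptions, not the product's actual rule set.

```python
# Hypothetical sketch of user-guided labeling: light keyword scanning
# suggests a stricter label than the one the user chose; the user still
# makes the final decision. Keywords and label ranks are illustrative.

LABEL_RANK = {"Public": 0, "Internal": 1, "Confidential": 2}
TRIGGER_WORDS = {"payroll": "Confidential", "draft accounts": "Confidential"}

def suggest_label(text: str, chosen: str) -> str:
    """Return a suggested label if the content implies something stricter."""
    suggestion = chosen
    lowered = text.lower()
    for word, label in TRIGGER_WORDS.items():
        # Only ever escalate; never suggest weakening the user's choice.
        if word in lowered and LABEL_RANK[label] > LABEL_RANK[suggestion]:
            suggestion = label
    return suggestion

# User picked "Internal", but the scan finds payroll data:
print(suggest_label("Q3 payroll export attached", "Internal"))  # Confidential
print(suggest_label("Team lunch on Friday", "Internal"))        # Internal
```

In the tool described, this surfaces as a prompt to the user rather than a silent change, which keeps the human accountable for the final label.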
So in terms of those legacy files, you mentioned one of your customers with a large repository of data. Would that have all been done via machine, versus the new documents being more user-classified as they're created? Yeah. So what they did in that case is they used a discovery tool looking for particular keywords and key documents. It would bring all the documents with those keywords into one location, and then they were able to just click and bulk-label all of that data with a particular label. Okay. Okay.
Yes, the economic benefit that you indicated for those customers was very impressive. Now, there's a question here about DLP stopping the business, in terms of how you approach your data loss prevention.
I mean, obviously, if you lock things up as tight as a drum, then you are hindering the business. How do you keep the balance between making sure we keep our security in place and enabling the business to do business, i.e. share data with business partners and so on?
Yeah, sure. And certainly, you know, the last thing we ever want to do is stop a user doing their job. From a data loss prevention perspective, we've seen two ends of the spectrum. We've seen some people just leave the data loss prevention tool in monitor mode, because it was too difficult to put in a set of rules that actually allowed the business to function. In that mode, all they were doing was using the output reporting to look at what information actually left the organization.
At the other end, we've seen another organization try to implement 500 different rules on a prevention engine, and that just locked up the whole system: no one was really able to create any information and send it out of the organization at all. So with all these things, it's a blend, and it's about having an understanding, working with your systems integrator, of the best balance between my DLP engine and my data classification tool, and which one do I turn up
and which one do I turn down in particular environments, to work in the way that's best for the organization. There aren't two customers out there that have exactly the same environment, exactly the same classification tools, and exactly the same DLP engine. So everything is slightly customized to each individual customer: the level of classification, what decisions it can override, the level of rules, and what the DLP tool is allowed to override. It just comes down to looking at what you have in your environment and what the desired outcome is,
and then setting both pieces of technology to work together in a coherent way. Okay, so a very customized approach is your recommendation. Yeah. The last question I have here: how many levels of classification is best? I think you mentioned three, or I did. To start off with, it's three or four, absolutely right. And I think, as you're starting out down that road, that's what you want to do,
because anything more than that is probably going to be too complicated. Once people have got the understanding, we can put on two types of labels: we can put an unlimited number of labels on documents and files, and we can add them as descriptors. So it might be Sensitive, External, HR, or it could be Finance, Internal, M&A. You can have as many labels and secondary descriptors as you want, and make it as complicated as you want, but I would start off with three or four and then move to further levels if you require it.
We do have one customer with a certain number of users, by no means the majority of their users, who have the ability to put up to 17 different labels onto documents, and they have a specific requirement for that. Those documents are being stored all around the world, and the labels record when the local regulatory requirements say they can delete those documents. So they may have a document stored in a country in Asia-Pac that might have to be kept for five years; they might have one in the UK
that needs to be kept for seven years, and they might have one in Australia that only needs to be kept for two years. And so those documents would be automatically deleted after two, five or seven years, depending on the location they're in. So you can make it as complicated as you want, though I don't know why you'd want to do that; start with three or four and then revise.
And it might just be particular departments where you want to increase the level of labels and descriptors they're able to use. Excellent.
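The jurisdiction-based retention example Paul describes could be sketched like this. The jurisdiction names and the idea of a flat lookup table are illustrative simplifications of what the 17-label scheme actually encodes.

```python
# Hypothetical sketch of label-driven retention: each document carries a
# jurisdiction label mapping to a local retention period, and deletion
# can be automated once that period has elapsed. Periods mirror the
# example in the talk (UK seven years, Asia-Pac five, Australia two).

RETENTION_YEARS = {"UK": 7, "AsiaPac": 5, "Australia": 2}

def can_delete(jurisdiction: str, age_years: int) -> bool:
    """True once a document has been kept for its local retention period."""
    return age_years >= RETENTION_YEARS[jurisdiction]

print(can_delete("Australia", 3))  # True: past the two-year requirement
print(can_delete("UK", 3))         # False: must be kept for seven years
```

Because the retention rule travels with the document as a label, the same cleanup job can run everywhere and still respect each country's requirement.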
Look, thanks so much, Paul. I found it fascinating, and I trust our participants have found it fascinating too. Thank you for raising awareness in our environment of the importance of data classification and the Boldon James Classifier product. Our time has gone. Thank you for your participation, all those who have stuck through to the end, and don't forget that you will get the reference to the podcast when it becomes available in the next couple of days.