Good morning, ladies and gentlemen, and welcome to our KuppingerCole webinar, "Classification: The Intelligent Way to Ensure Strong Data Protection." This webinar is supported by Boldon James. The speakers today are Paulson, who is VP EMEA of Boldon James, and me, Martin Kuppinger. I'm CEO, founder, and Principal Analyst at KuppingerCole. Before we start, some quick information about KuppingerCole and some housekeeping, and then we'll move directly into the content slides. KuppingerCole is an analyst company. We are headquartered in Germany, but also have subsidiaries in the US and Singapore.
We were founded back in 2004, and we deliver neutral advice, expertise, and thought leadership on a variety of topics centered around information security, identity and access, risk and compliance, and all the areas concerning digital transformation.
Our business areas are research, where we produce, for instance, our Leadership Compass documents comparing vendors and their products and services in certain market segments, along with many other types of research; events, such as the conference we held just a few weeks ago, our flagship European Identity & Cloud Conference, with more events and many webinars to come; and advisory.
The two, or in fact four or five, main upcoming events are, on the one hand, the Consumer Identity World, which will run in September in the US, in October in Europe, and in November in Singapore, and, on the other, our Cybersecurity Leadership Summit, which we will run in November in Germany and plan to run in early 2019 in the US. Have a look at these.
Before we continue, some guidelines for the webinar. You are muted centrally, so you don't have to care about muting or unmuting yourself; we are controlling these features. We will record the webinar and provide the recording, presumably by tomorrow, and we will also provide the slide decks for download.
So you don't need to note everything down: you will have access to the recording and the slide decks afterwards. There will be a Q&A session at the end, but you can enter questions at any time using the Questions feature in the GoToWebinar control panel.
As usual, the more questions we have, the more lively the Q&A session will be. With that, let's have a look at the agenda, which, as usual for our webinars, is split into three parts. In the first part, I will give an overview of why we need data classification and how it can be done, so it's a quick introduction to data classification.
In the second part, Paul will build on that and provide an in-depth perspective on the technologies that support classification, the ways to do classification, and how to increase business efficiency and streamline operational and security processes. Part three, as I've already said, will be the Q&A, where we will answer your questions. Again, the more questions we have, the better. So, data classification: I think it really is a need-to-have rather than a nice-to-have today.
I have to admit, I was reluctant about classification for quite a while, because it affects the way people work. But if I look at the needs that arise from various topics, at the potential benefits, and at today's situation in information security, then I believe it really is a need-to-have. Obviously, regulations are drivers. To name just a few things around regulations: we have, for instance, GDPR, and with GDPR there is a need to know where PII sits and which data is PII.
So classifying data, for instance to know that this is PII, is essential. We need to understand where PII resides. We also need to understand which data we need to archive in a specific way because certain tax and other business-related laws apply.
Sometimes there are archiving periods of eight or ten years, and for some data, if it is related to product liability, the period might be far longer than that. We need to understand which data is which.
So there are plenty of regulations, and many of them lead to a situation where we need to understand which type of data resides where, which data is PII, and which data is tax relevant. But from my perspective it's not only regulations. That's one driver; another driver is operational efficiency. When you look at all the numbers about the growth of unstructured data and how much data we have to handle: yes, we have some quite nice search capabilities, but obviously the better the metadata and the better the classification, the better the results will be.
So how do we find the data? I have to say, once I don't know where the data is and need to start using standard search capabilities in, say, SharePoint on Office 365, I tend to find what I want, but it takes more time than it should.
Classification, again, can help us get better at these things. And then there's security: many areas of security benefit from classified data.
The typical use cases are around DLP, data leakage prevention, and IRM, information rights management, which can be implemented far more efficiently when you have classified data. Your DLP policies and your IRM protection policies can be based on the classifications: more granular, more targeted, less friction.
That, in fact, is the consequence. On the other hand, it's also about adding a layer of security beyond the entitlements. I have a strong belief that we need both: good classification for our security management and security policies, and the entitlements. Just saying that something is classified as secret, or company confidential, or whatever, is obviously not enough.
You also need to say who is allowed to see which part of that classified data. And it's true the other way around as well.
You have the entitlements, but you could also say that data at a certain level of classification is supposed to be seen only by certain groups in the organization. So you can in fact add layers of security by using data classification the right way, and this goes well beyond information rights management. Classification helps with information rights management and with DLP, but it also helps with adaptive authentication.
When we look at policies for context- and risk-based authentication and access to data, having classified data helps, because we can then use that classification in the policies and decide, based on the risk of the context, the person, and other factors, who is allowed to access which type of information. It's also about archiving, a point I've already touched on.
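As an illustration of the layering described here (entitlements, plus classification, plus context risk), the following is a minimal Python sketch. The label names, the `User` and `Document` types, the `may_access` function, and the risk thresholds are all assumptions made for this example; they do not come from the webinar or from any particular product.

```python
from dataclasses import dataclass
from enum import IntEnum


class Label(IntEnum):
    """Hypothetical classification levels, ordered by sensitivity."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    SECRET = 3


@dataclass(frozen=True)
class User:
    name: str
    entitled_resources: frozenset  # resources the user holds an entitlement for
    clearance: Label               # highest label the user's group may see


@dataclass(frozen=True)
class Document:
    resource_id: str
    label: Label


# Assumed thresholds: the more sensitive the label, the less context risk is tolerated.
MAX_CONTEXT_RISK = {
    Label.PUBLIC: 1.0,
    Label.INTERNAL: 0.8,
    Label.CONFIDENTIAL: 0.5,
    Label.SECRET: 0.2,
}


def may_access(user: User, doc: Document, context_risk: float) -> bool:
    """Grant access only if every layer agrees. context_risk runs from 0.0 (trusted) to 1.0 (risky)."""
    # Layer 1: the entitlement must exist.
    if doc.resource_id not in user.entitled_resources:
        return False
    # Layer 2: the classification must not exceed the user's clearance.
    if doc.label > user.clearance:
        return False
    # Layer 3: the context risk must be acceptable for this label.
    return context_risk <= MAX_CONTEXT_RISK[doc.label]
```

With a user entitled to `doc-1` and cleared up to `CONFIDENTIAL`, a confidential `doc-1` would be readable at a context risk of 0.3 but not at 0.7, and a `SECRET` version of the same document would be refused regardless of context.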
Archiving requirements depend very heavily on the type of data. It might even be the other way around: you might be legally obliged to delete certain types of data earlier.
So either don't archive it, or delete it from the archive earlier. Email security, obviously, as well: classifying emails and understanding the risks associated with an email and its sensitivity is one of the areas where we find a lot of integrations of data classification technology. And there are other areas. So classification is not something that is only for certain industries.
It started off primarily in government and the military, but we now have so many requirements in every industry that it goes well beyond that. It also goes well beyond any single security technology such as IRM or DLP.
This is basically what we are seeing here. From my perspective we need this, and we have a lot of use cases around it. From here we move to the next part, which Paul will elaborate on far beyond what I touch on here.
There are different ways to do classification, and there isn't a single perfect way to do it. It's about understanding the approaches that are available. If we structure it at a relatively coarse-grained level, we have classification by manual experts, the librarian, for instance. We have manual classification in which every user is involved. We have automated classification that looks at the content and tries to analyze it.
And we have automated classification based on metadata, which could be attributes, current entitlements, security settings, and so on.
So we have different ways. They vary in user involvement, which is very high for manual experts and which, depending on the implementation, might be higher or lower for manual classification.
If we are honest, automated classification also has some user involvement in the sense of fine-tuning, but that is more an expert task if you look at the pure-play part of it. Paul will look at how these things come together and how we need to mix them; that will be one part of his presentation. Today, I think, we are also at a point where manual classification is no longer horribly intrusive for the user. There are some really smart approaches, based on many years of learning how to do it right.
As for quality: if experts do it, it tends to be higher.
These are dedicated experts who do more or less nothing other than classifying information.
With manual classification, quality might vary massively, and it depends somewhat on how you implement it. It also depends on training. Users need to understand why they need to do it; they need to understand the business purpose, which is easier these days, when everyone is aware of and scared of security risks. They need some training to understand how to do it, and it must be implemented in an efficient way. The quality of automated classification can be superb, or it can be pretty mediocre.
It depends on the technology, on what the data looks like, and on what you want to achieve. If you want a fairly fine-grained classification of data and the content of the documents you look at varies only slightly, then this is very hard to achieve.
Then metadata is probably the better way, because it lets you look at other aspects.
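To make the contrast between content-based and metadata-based automation concrete, here is a minimal Python sketch that first looks for patterns in the content and falls back to metadata when the content gives no signal. The rules, label names, the `department` attribute, and the `suggest_label` function are hypothetical; real products use far richer analysis, and the result is best treated as a suggestion for the user to confirm.

```python
import re

# Hypothetical content rules: (label, pattern). Deliberately simplistic.
CONTENT_RULES = [
    ("PII", re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")),               # looks like an email address
    ("TAX", re.compile(r"\b(invoice|VAT|tax return)\b", re.IGNORECASE)),
]

# Hypothetical metadata rules: owning department -> label.
METADATA_RULES = {"hr": "PII", "finance": "TAX"}


def suggest_label(text, metadata):
    """Return (label, source), where source says which approach produced the suggestion."""
    # Content-based classification: scan the document text.
    for label, pattern in CONTENT_RULES:
        if pattern.search(text):
            return label, "content"
    # Metadata-based classification: use attributes when the content is inconclusive.
    department = metadata.get("department", "").lower()
    if department in METADATA_RULES:
        return METADATA_RULES[department], "metadata"
    # No suggestion: leave the decision to the user or an expert.
    return None, "none"
```

A document whose text contains an email address would be suggested as `PII` from content, while plain meeting notes owned by the HR department would be suggested as `PII` from metadata alone.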
Again, it depends on choosing the right approach and the right combination, but it can be done far more smoothly and far more efficiently than was possible 10 or 12 or more years ago. As I've said before, I'm a strong believer that we need to shift to these approaches now, because the challenges we are facing in information security in the broader sense, and also in information management in the broader sense, are so large today that we can't avoid doing classification anymore.
And it can be done in a very efficient way, which again Paul will touch on in his talk. As for performance: obviously, when you only have manual experts, performance is low. You tend to have long queues of work that hasn't been done, and delays. Manual classification, if done right, is really not a big effort anymore. And automated classification, obviously, because you automate things, runs rather smoothly.
So at the end of my presentation, my first message, which I've already given, is: do it. Look at it and add it to your portfolio of information management and security capabilities. The second is: look for the right mix. What is the purpose of classification?
Why do you do it? What are the needs? What level of detail do you need? Then you can look at the approaches you choose: what is the right approach, or combination of approaches, to achieve that target? So first the purpose and the level of detail; then you have a technology; then you need to train users and bring them on board. As long as it's manual, there is always some intrusiveness. That can be minimized, but a small remainder always stays. People need to understand why, and they need to understand how to do it right.
That's super important in this process. You then need to look at the results of classification, and you need to improve and optimize. No system will deliver perfect results from the very beginning, unless you have a very simple classification requirement.
So there is usually some optimization required, which you should be aware of. But my perspective is: do it. With that, I hand over to Paulson, who will now talk in more detail about the different types of technologies, how this increases business efficiency, and how to streamline operational and security processes. Paul, it's your turn.
Good morning, everybody.
So, as I was saying, let me continue where Martin left off. When we talk to clients, we talk about their different business environments, and we look at them on two levels.
One is IT; the other is the actual business, the actual users: what is the core business of the company, what is the core business activity?
When we look at that, we typically talk about different use cases. What type of actions, activities, and processes are the users carrying out in their day-to-day work that actually make the business work? What are the core business activities that make the organization function? What is making it money? How is it dealing with its customers? And then obviously there is back-office activity as well.
That's HR, legal, and the typical business activities that all businesses have in common.
Then we look at the many and varied environments, just from an IT perspective. No two clients we talk to have the same environment. Every customer we go to has a different setup from the next, and every organization has some form of uniqueness in the processes and applications it uses, let alone the standard IT differences. They may or may not have some cloud.
They may or may not have some kind of thin or thick clients. Larger organizations with multiple sites in multiple countries will typically have different operating systems, typically from a Microsoft perspective. We also see that they might have different technology vendors providing services: a different partner, a different DLP vendor, or a different IRM vendor, sometimes varying by department and other times by country.
So there is no uniform setup by any means in any organization.
Every environment we go into has a level of uniqueness, and building on that uniqueness is where we start to talk about selecting the right blend of classification.
Again, every organization needs to look at what its processes are and what its use cases are. Then, from a Boldon James perspective, we look at what is needed along the range from automated to user-driven classification. Certain environments and certain processes may need automated classification.
Some may need a blend of both, suggestive classification, where we help the users make a decision. At other times we leave it to the user who is creating the data to make those classification choices. That's really important given some of the most recent regulatory compliance requirements, because there is a stipulation within GDPR that you have to have user awareness: you have to educate your users about the type of data they are holding and what type of information it is.
And then they have to handle it in the correct way. So when we're talking to clients, we look at the blend across their organization: which areas need what type of solution for classification. We also look at the whole data environment. We're all very familiar with a structured data environment, which could be SAP, Oracle, or any other structured database. That's really perfect for most businesses, certainly from an IT security perspective: if we can keep everything in a structured data environment, we know where it is.
We know what it is, and we can control who has access to it. That's a really nice, safe environment for us to work in.
However, most of that data gets taken out of the structured data environment and used in the unstructured data environment, which is where a lot of the challenges from a data classification perspective start to happen.
We have terms which I'm sure you're familiar with. There is data in use, where somebody is actually creating data at their desktop. They might be creating an Excel spreadsheet or some PDF documents.
Following on from that, we might archive the data or park it on a particular file server, where it's at rest until somebody needs to use it or do something with it. And then we have data in motion, where we're sending that data around, and it could be received on all sorts of devices, internal and external to the business. So we need to understand the flow of data right from its creation through to its deletion.
But if I stay focused on the structured data environment first: in a structured data environment, everything is very nice and safe.
Then, all of a sudden, somebody starts taking information out into the unstructured data environment. If we don't have any other processes and controls in place, and we don't know what the data is, then we can't control it. We can't treat it the way it's supposed to be treated, because it's flying around an unstructured data environment. It could be sitting on people's desktops. It could be sitting on USBs.
The number of times we go into clients and suddenly find people with file servers under their desks that central IT didn't know were there! So how do we deal with that? As data moves out of the structured environment, we are able to apply the corporate policy and add those labels and that classification to the data as it's taken out.
So as data comes from a structured data environment into the unstructured data environment, it's already labeled and classified, and it is then subject to the policies and structure you already have in place for your unstructured data environment. In many cases those labels reflect exactly what you have in the structured data environment. When customers say to us, "We need to have a classification policy and we've got a project," we start to talk to them about how that project is going to go and what is going to happen after go-live.
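One way to picture labeling at the point of export is the following minimal Python sketch, which derives the label for an extract from the sensitivity of the columns it contains. The column names, the three levels, and the `label_for_export` function are assumptions for illustration only; the webinar does not describe how any specific tool implements this step.

```python
# Hypothetical label levels, from least to most sensitive.
LEVELS = ["Public", "Internal", "Confidential"]

# Hypothetical sensitivity assigned to columns in the structured source.
COLUMN_SENSITIVITY = {
    "product_code": "Public",
    "order_total": "Internal",
    "customer_name": "Confidential",
}


def label_for_export(columns):
    """Return the label for an extract: the most sensitive column wins.

    Columns without a known sensitivity are treated as the most
    restrictive level, so unknown data is never under-labeled.
    """
    ranks = [LEVELS.index(COLUMN_SENSITIVITY.get(c, LEVELS[-1])) for c in columns]
    return LEVELS[max(ranks)] if ranks else LEVELS[0]
```

The resulting label would then be written into the exported file's metadata, so that the policies for the unstructured environment apply from the moment the file exists.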
In a typical environment, a lot of business requirements come in from the business to the security organization. These tend to be wide and varied, covering the different use cases they have, the different processes, and the different types of technology they're using.
Then what happens is that the security team says, "That's far too wide and varied, so we'll try to make it as simple as we can." Most organizations, the first time around, end up with between three and five labels that they decide will be part of their security policy.
They roll that out and it goes live. And then they immediately start to see that they have all sorts of other requirements that the current labeling policy, with its more simplistic approach, does not meet.
Now, the great thing from a Boldon James perspective is that we are very focused on being a great labeling company. All we want to do is understand classification and do it as deeply and richly as we can. So we are able to support those businesses when they come to us and say, "Our business users require more, and more complexity, than we thought when we drew up our initial, simpler labeling policy."
We're able to sit down with those folks and say, "That's no problem," because we can expand, change, and flex what our classification tool does in order to meet those requirements. There was a phrase Martin used several times, which we also use a lot: "it depends." You can only say "it depends" to a customer who asks about a particular user environment if the tool you're going to use is flexible, can scale, and can change.
Some of the examples we see: we have people who want a cloud and hybrid environment. They don't want to put everything in the cloud, so they can't use a tool that is 100 percent cloud based. A lot of the time we see environments with a mix of PCs and Macs.
So you need a tool that can work in both types of environment. We still come across clients who, through merger and acquisition, might have a division that is still using Notes rather than Outlook.
So we need the flexibility to include those users in the environment and make them subject to the corporate security policy. For us, it's about having a rich and granular ability to flex and give you what you need in order to meet the business drivers and the ever-changing challenges that IT security will be given by the business as it moves forward and changes processes. If you think simply about GDPR: GDPR is not about IT security. GDPR is about processes.
So IT security has to support the ever-changing governance of those business processes. We see ourselves as specialists in classification. The way we do this is that we help people take their security policy and turn it into visual labeling, so that users can see the type of data that they're either creating or receiving.
We're able to do that anywhere from a user-driven perspective, as we discussed earlier, right the way through to an automated scenario.
Then we take that policy and those visual labels and also add them to the metadata. The metadata becomes very important, because that's where we're able to drive value into the other security and application tools you already have within your organization, which can read that metadata, understand its business context, and then start making decisions that are useful to the business.
For example, if I'm a user and I create something that's confidential but is going to a legal department or legal partners outside my organization, I just have to put a label on it. The system will then know: okay, that's very confidential.
If it's going outside the organization, it will be encrypted automatically using some form of IRM tool. And because it's that type of data, the system also knows where to archive it and for how long.
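The label-driven behavior in this example can be sketched in a few lines of Python. Everything named here is hypothetical: the internal domain, the label names, the archive locations and retention periods, and the action strings stand in for whatever the downstream IRM and archiving tools would actually expect.

```python
from typing import List

INTERNAL_DOMAIN = "example.com"  # hypothetical company mail domain

# Hypothetical policy: label -> (archive location, retention in years).
ARCHIVE_POLICY = {
    "Confidential-Legal": ("legal-archive", 10),
    "Internal": ("general-archive", 2),
}


def actions_for(label: str, recipients: List[str]) -> List[str]:
    """Derive downstream actions from the label and the recipients alone."""
    actions = []
    # A recipient is external if the address is not in the internal domain.
    external = any(
        not r.lower().endswith("@" + INTERNAL_DOMAIN) for r in recipients
    )
    # Confidential data leaving the organization is encrypted automatically.
    if label.startswith("Confidential") and external:
        actions.append("encrypt-with-irm")
    # The label also determines where and for how long the item is archived.
    if label in ARCHIVE_POLICY:
        location, years = ARCHIVE_POLICY[label]
        actions.append(f"archive:{location}:{years}y")
    return actions
```

The point of the design is that the user makes a single decision, the label, and encryption and archiving follow from policy rather than from further user choices.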
So we're able to start adding business context to how we actually operate. This means we get security and business outcomes rather than just security outcomes. We have plenty of technology partners who already have some form of labeling built into their tools. All of them understand that you need to classify the data in order for their particular technologies to work. It doesn't matter whether it's IRM, DLP, encryption, or a cloud-based solution: they all understand that they need labeling.
The reason they still want to partner with Boldon James is that their classification covers particularly limited areas and is not as rich as what we have.
That's why they're still happy to partner with us. So we have a number of ecosystem partners, and on that wheel you can see not only the companies that are our technology partners, but also the types of technology they provide.
From our perspective, data governance is a very big part, as are information rights management and data loss prevention. But archiving and document management have also become very important, because we are able to change the label at a particular date and time, which will then trigger data to be deleted in order to comply with your archiving or regulatory requirements. In doing all of this, we're also able to do a lot of reporting, because it's not enough just to be able to do these things.
Sometimes you also need an audit trail from a governance perspective, to show what you're doing and how you're doing it.
So we're able to bring a rich and granular view of how your data and information move across your environment: from creation, through the structured data environment, controlling where data is allowed to go, who can see it, and what is allowed to happen to it in the unstructured data environment, right through to when it should be deleted.
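The combination of label-driven deletion and an audit trail can be illustrated with a minimal Python sketch. The retention periods, the document fields, and the `sweep` function are assumptions for this example; actual retention periods depend on the applicable regulation, and a real system would delete through the archive's own interface rather than filter a list.

```python
from datetime import date

# Hypothetical retention periods per label, in years.
RETENTION_YEARS = {"Tax": 10, "General": 2}


def expiry_date(created: date, years: int) -> date:
    """Date on which the retention period ends."""
    try:
        return created.replace(year=created.year + years)
    except ValueError:
        # Created on 29 February and the target year is not a leap year.
        return created.replace(year=created.year + years, day=28)


def sweep(documents, today, audit_log):
    """Return the documents still under retention; log every deletion decision."""
    kept = []
    for doc in documents:
        expires = expiry_date(doc["created"], RETENTION_YEARS[doc["label"]])
        if today >= expires:
            # Record what was deleted, under which label, and when.
            audit_log.append({
                "doc": doc["name"],
                "label": doc["label"],
                "action": "delete",
                "on": today.isoformat(),
            })
        else:
            kept.append(doc)
    return kept
```

Because every deletion is driven by the label and written to the log, the same data supports both the retention requirement and the governance reporting.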
So our view of the world is that you have two choices when you sit down and look at data classification technology. You can keep it simple and have just one level of tiering, or you can look at how your business actually works and how you want to use classification to add business benefits and get it right. Martin, thank you very much. That was my last slide.
Okay, Paul, thank you very much for your presentation. Let's move directly to the Q&A session. We already have a few questions here, but if you have further questions for Paul or me, please enter them right now. One of the questions I have here is: when in the process should I consider my data, Paul?
Yeah.
That's a very important question, because people come to us and say, "In my project, I have to look at the data I haven't yet created and the data that's being created today, versus all the data that I've historically got archived." Our counsel to clients is this: if you start with your archive, then every day more and more data will be created that's unclassified and unlabeled and that continues to go into your archive. So you've always got a funnel that keeps filling up with unclassified data.
Our suggestion is that you start by classifying your data from today, so you remove that issue.
As of day one, once you've rolled out a classification solution, you start classifying the data, so you are no longer filling up your archiving silos with unclassified data.
That gives you a clean cut-off date. Once you've done the first part, enabling your users and getting them to create classified data from day one, you can move on to stage two of the project, which gives you a clean run at looking at your archived data and starting to classify it.
And you do that in the way that suits your security policy.
Paul, another question I have here is: what if I have some Mac users in my organization? How do I include them in an overall security blanket?
Yeah. Mac users are very typical in most organizations. They tend to be senior members of the organization who would like to have a Mac, much of the time because they can. But obviously most of those folks use Office for Mac.
What you have to do is give them exactly the same user experience and access to the same security policy across the organization. One of the things we are able to do is extend our classification tool to include the Mac environment, so you have the same policy and the same experience right across your PC and Mac environments.
Another question I see here: we talk a lot about cloud, but also about on-premises. The reality is usually hybrid. Most organizations will never have a 100 percent cloud-based environment, but they will have some cloud. How does this matter for classification?
Yes. You're getting a lot of questions fired at me here, aren't you? That's another environment we see quite a lot. People want to use the cloud because of the benefits.
Much of the time it's a cost-saving benefit and an improvement in business process. But many organizations, especially if they're government, don't want to put things into the cloud if they think the data is going outside the country or, more importantly, is backed up outside the country. And many corporate organizations we talk to do not want to put their sensitive data into the cloud because they still think it's at risk, so they want to keep it on premises.
As soon as you require a hybrid approach, you want classification to be able to decide which type of data can go to the cloud and which type must remain on premises. And if you're going to do that, again, you need a tool that works in both a cloud and an on-premises environment.
Yeah.
And I think classification really helps here. You might not be, let's phrase it like that, overly interested in storing product-liability-related data on US servers, for instance. That might be one of those cases where you say: classification helps me prevent data from sprawling into areas where I don't want it to reside. So I'm absolutely with you on that. Another question I have here is: what are the typical industries that are moving to classification as a standard approach these days?
As a business, we have clients across many different vertical markets, but there are definitely three largest ones, and they're pretty even as far as adoption goes. The first is finance and insurance. You can probably understand that from the fact that they're all holding PII, and they're all holding client data in both insurance and banking scenarios.
So that's a very popular vertical market that has adopted classification. The next one is automotive, engineering, and manufacturing. They adopt it specifically to protect their future products, which typically exist in data form, either CAD drawings or other file types. They want to make sure they're sharing and controlling that information with their trusted partners in a manner that doesn't put their future technologies and future business plans at risk.
So automotive, engineering, and manufacturing is a key vertical for classification.
The last one for us is pharmaceuticals, for much the same reason as manufacturing and automotive: their future profits sit in their future drug releases. Typically they'll have a 20-year patent on a drug. They will usually go to market somewhere between year nine and year eleven, which leaves them only about nine years to make their money before it becomes a generic drug. Importantly, during the drug trials they will also hold patient data.
So we also see the pharmaceutical industry as a huge adopter of classification, to make sure information only goes to the right people.
Okay.
So you see far broader use cases these days, I would say, compared with a couple of years ago, when classification always had a little bit of the image of being mainly something for government and the military. Right now, it's really in the industries.
Most definitely.
If you look at it from a very simple standpoint: if I don't know what the information is in that file share or that document, how do I know how to treat it? Until I put a label on it, I have no contextual idea of what the value of that information is to the business.
Okay. And where do you see the main use cases among your customers?
Is it more around information rights management? Is it around data leakage prevention? You had this picture of all the partners you have: which of those technologies are the ones where you see your customers first implementing the integration with classification?
Well, certainly data loss prevention. A lot of the time we go to clients who deployed data loss prevention on its own. They initially set up a lot of rules, and then they see the business start to slow down and stop as the level of false positives rises. And then they more or less throttle it back to being an absolutely fantastic measure of the data they're losing, but one that's not able to prevent the loss.
In the data loss prevention world, we have a lot of partners we work very well with who understand that once we add the context of classification, they see a huge reduction in false positives, which then starts to generate real returns on the investment in the DLP tool and provides real business value. Another one is information rights management, where people can, based on the level of the label, apply particular attributes to a document, which then drive how and where it is shared.
Those attributes also drive with whom it is shared and how they maintain control over the document once it leaves their organization. And then, more and more, catching up with those two is definitely archiving and, more importantly, deletion of data once you no longer need it for any business requirement or regulatory use.
Okay, perfect. I think we are through all the questions we have.
Paul, thank you very much for your presentation. Thank you very much to all the attendees for listening to this webinar, and thank you to Boldon James for supporting it. I hope to see you again soon at one of our webinars or at one of our on-site events. Thank you. Bye.