Session at the European Identity & Cloud Conference 2013
May 16, 2013 12:00
And the last presentation in this session will be from Karsten Kinast, who will look at the data privacy aspects of big data. Thank you.
Big data, small privacy? I pose it as a question here. Maybe we'll approach this question a bit further with my input from the legal perspective, especially the privacy perspective, so maybe you can find out for yourself whether it's a smaller privacy or not. What I want to say about this: we don't have an answer yet whether big data as such will lead us to smaller privacy, or whether privacy as we know it today, and as it will be evaluated in the future, will remain reliable at the same time.
So first, I want to point out that we have a legal understanding of what big data is. We have heard a lot today about the definition of big data, so I don't want to go into this at any length here, but we need to understand that big data means creating new information, which is something very important from the data privacy and protection perspective, because we have to look at two points and assure that both parts of big data, as we see it from the legal perspective, are done in a licit way. The collection of the source data must be done in the correct legal way.
And then the following analysis again needs to be subject to legal revision. Whether the analysis itself is done in a licit way is the second question we always have to ask. So we don't throw it all together; we have to separate it and check both sides of these happenings in order to understand whether a concrete action to analyze big data is actually lawful or not. The question is whether big data is in conflict with law in general, and with privacy laws in particular. Data protection, just to remind us all of what this is about, really focuses on personal data.
A lot of people confuse that at some point with private data. That is not the term we are using. Personal data is data that tells us anything about a living person, maybe a person that is easy to identify, or not easy to identify at all for certain people.
So in principle, data protection laws in the EU are based upon a ban on handling this personal data. You may not use data on people. It may even be minor information: the color of your socks, the color of your hair, possibly other things like that, very minor information in daily life. But still, you may not use it unless you have a very good reason, because we have a permit reservation. This permit reservation has two parts; either one may lead to a perfectly legal situation for using data.
Even though we have the basic rule of a ban on using personal data, either you find a good reason in a legal statement (maybe a local one, a national one, or an international one; you just need to find a law that tells you that you may act with the data the way you intend), or, as an alternative, you may find an informed consent of the data subject, which of course is the person whom the data is about.
So we have this permit reservation, either by law or by informed consent, which is a very important piece of information, as you will see when we go a bit further into the question of whether the analysis of data in particular cases is legally okay or not. The general legal principle for handling personal data is the concept of purpose limitation. And this term is something very important.
If you think about big data from a legal perspective, the purpose limitation is not only applicable to big data; it is as important for the question of whether big data is all right from a privacy point of view as the idea of the basic ban on handling personal information. And it's so important that the Article 29 Working Party, which is the group where the national data protection officers of the member states meet and discuss current privacy issues on an international level, has focused very strongly on it in a recently published paper.
It's just some days old, on the purpose limitation. And it actually went very far into the question of whether big data keeps up with the idea of purpose limitation, which I will explain in a second. So to make this clear, the justification for big data is a very important question because of our general rule of the ban on handling data. But we really need to separate whether we are talking about anonymous data or personal data, because anonymous data is information that cannot be connected by anyone to a living person.
Whereas personal data is, as I just pointed out, information that can be connected with a living person; even one single person in the world, or any machine, is enough. So it doesn't need to be you who knows what this information tells me, or about whom this information is. Relating to personal data, the data serving as a basis for big data must be lawful. We need, as pointed out, a permission by law or consent.
And for the analytic process, we also need to ask ourselves whether the analysis is lawful or not. The original justification from back when I sourced the data, with respect to the purpose of the data, needs to continue to be the case. So we need to have continuity here. When we source the data in the first place, we need to have a good reason to have it, which is either the permission or the consent. And this needs to be a still-valid permission at the moment that I'm doing the analysis; it still needs to be there.
If it's not there anymore, the analytical process is not justified, and it's not licit to analyze this information. If I have anonymous data, of course, there is no need for justification. Take the same example Mike Small mentioned before, the weather forecast: there's no personal information, no personal content here. So the concept of purpose limitation is really the key issue when it comes to privacy and big data. And this is something which both the current directive and the future legislation, the data protection regulation that will come into force in 2016, tell us: the personal data must be collected for specified, explicit, and legitimate purposes, which we call purpose specification as a legal term. And also, personal data must not be further processed in a way incompatible with those purposes. So this is the continuation question I was talking about.
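The two-step test just described (lawful sourcing, plus a purpose at analysis time that is still covered by the original justification) can be sketched in code. This is purely an illustrative sketch: the names `DataSet` and `may_analyse` are mine, not legal terms, and in practice compatibility is a case-by-case legal judgment, not a set lookup.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DataSet:
    personal: bool                 # does the data relate to an identifiable living person?
    legal_basis: Optional[str]     # "statute", "consent", or None (no basis at all)
    collection_purpose: str        # the purpose specified before sourcing

def may_analyse(data: DataSet, analysis_purpose: str,
                compatible_purposes: set) -> bool:
    # Anonymous data needs no justification (the weather-forecast example).
    if not data.personal:
        return True
    # Step 1: the sourcing itself must have had a legal basis.
    if data.legal_basis is None:
        return False
    # Step 2: the original justification must still cover the analysis purpose.
    return (analysis_purpose == data.collection_purpose
            or analysis_purpose in compatible_purposes)
```

For example, weather data (not personal) passes with no justification, while customer data collected for billing would fail this check for an unrelated profiling purpose.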
So let's have a look at the purpose specification. The term "specific" is very important.
So I want to point out what this really means in a legal understanding. It means that we need to have a precise and fully identified definition of what my good reason to work with that information is. What is my purpose, my aim with that information? And of course the aim must be a legal one, and I not only need to explain that it's a legal one, I need to have very clearly and very specifically in mind what my actual aim is.
When I work with this information, when I source this information, when I pass it on, or whatever I'm planning on doing. Also, this needs to be done before the data is even sourced. So it's not that you may source data and then ask yourself what you are going to do with it, and you may not simply change your purpose later, of course. You need to have a stable reason and a stable purpose for working with your information, and you need to define that prior to doing so.
You also need to have an explicit purpose, which is clearly revealed; it must be explained explicitly what the purpose of using the data is. And here we come to a very important issue, the communication with the data subject. We really need to understand that with big data, we have a clear need for transparency in the future, especially once the regulation starts to unfold in 2016.
Given the implications that we're expecting, we really need a pretty individual understanding in any case of big data usage: our customer, or any other person affected by my big data search and analysis, needs to understand that he's in there and needs to understand explicitly what the purpose is. So that makes the need for transparency even higher. The legitimate reason is something I already explained, so I won't go into it further. The incompatible use is of course forbidden; that's the continuation factor.
The original purpose must not only be defined, it must still be there when I am working with my big data. A change of purpose, therefore, is something that shouldn't be occurring; you should plan what you want to do with your data. Otherwise you will have to re-specify the purpose, which again makes it necessary to make the whole process and the new purposes more transparent, even though you might not have the old purpose in mind anymore and want to switch to another one. But also if you have one or more additional purposes, you need to explain that.
And the question, of course, is what a purpose is. You must understand that the definition of a purpose here is a very narrow one. So the understanding is not that I want to do something with the data in terms of, let's say, understanding the habits of people who like to ride a bike; it must be much more clearly defined.
So the re-specification of a purpose is something that you for sure want to evade and get around. But if you cannot do that for whatever reason, because you think of a purpose later on, you really need to go into communication with the data subject again; maybe you need an additional opt-in. This is not necessarily the case in all situations, but it may even be that you need an explicit opt-in again, and not simply an information. The compatibility is to be determined on a case-by-case basis.
So that's what I mean by being very narrow in your understanding of what you're doing. When trying to find out if your future purpose is still compatible with the old one, you really need to go there and have a look: what is the relationship between the purpose that I used to have and the purpose that I'm now intending to realize? And you really have to look at it on a single-case basis and understand whether the purpose is very similar or something very different.
And you need to put yourself into the position of the data subject in order to understand whether your customer, whoever it is, would be interested in knowing about this change. The smaller the change of purpose, the more sure you can be that you don't need to go through the procedure again and that your use is still compatible. Also, the context in which the personal data has been collected, and the reasonable expectations, as I just mentioned, are supposed to be recognized and exercised. So the context is very important.
Do you have a customer relationship? Is it a non-customer relationship?
Is it a prospect that we are thinking about here? So you really need to know how close this person is. Is this maybe even an employee, and you have a long-term contractual relationship with that person? All of that you need to ask yourself if you want to answer: is the purpose a new one, and do I have to find an opt-in, maybe even, or is that something I'm still allowed to do? Then there is the nature of the personal data.
Of course, there's more sensitive information and less sensitive information; you need to find that out here as well. If it's really very sensitive information, of course, the probability that you're expected to find a new opt-in is higher than the other way around. And the safeguards that you adopt as a controller to ensure fair processing may never be underestimated. The more technical and organizational measures you have implemented in your big data analysis, the more likely it becomes that your use is still compatible with the old one.
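The four case-by-case factors above (relationship between old and new purpose, context and reasonable expectations, sensitivity of the data, and safeguards) can be turned into a rough mnemonic. To be clear, this is not a legal test; the function name and the weights are entirely my own illustration of how the factors pull in different directions.

```python
def likely_needs_new_opt_in(purpose_changed: bool,
                            outside_expectations: bool,
                            sensitive_data: bool,
                            strong_safeguards: bool) -> bool:
    """Illustrative mnemonic for the four compatibility factors:
    1) relationship between the old and the new purpose,
    2) context and reasonable expectations of the data subject,
    3) nature (sensitivity) of the personal data,
    4) safeguards adopted by the controller.
    The weights below are arbitrary; a real assessment is a legal judgment."""
    risk = 0
    risk += 2 if purpose_changed else 0        # a distant new purpose weighs heavily
    risk += 1 if outside_expectations else 0   # surprise for the data subject adds risk
    risk += 2 if sensitive_data else 0         # sensitive data raises the bar
    risk -= 1 if strong_safeguards else 0      # good safeguards lower the risk
    return risk >= 2
```

So a changed purpose on non-sensitive data already suggests a fresh opt-in, while an unchanged purpose with strong safeguards does not, which mirrors the "the smaller the change, the more sure you can be" point above.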
The working paper of the Article 29 Working Party has stated why big data may be incompatible in the sense I was just mentioning. These are the points that were pointed out. The analysis of a big data algorithm may simply be inaccurate. This is something Mike Small pointed out as pretty hindering for your business case as well, and it is a legal hindrance too. So this is something that the European Union really states here because of inaccuracy.
You might not be allowed to go as far with big data as you feel like, and the unfair and discriminatory results that follow from that are a good reason for really narrowing down the possibilities of big data here from the legislation side. The economic imbalance is something that the European Union always stresses very strongly and really tries to place everywhere where legal questions between customers and companies are in the picture. So here as well, the economic imbalance is a subject that was mentioned and clearly stated as something you need to keep in mind.
The unfair price discrimination, for example, is a point they're talking about here. This is nothing particular to big data, but they're really stressing it a lot. The sheer scale of data collection, tracking, and profiling is important. So in short words: they understood that big data is big, and because it's so big, it's difficult for the customer to understand.
And this is something that, from a legal perspective, is not really liked very much. The security of data, which was mentioned before on another slide: it is possible to level your actions in such a way that you undertake a higher level of security, and then you might be able to go for big data rather than go without those security measures. So if you come to a point where you believe that you possibly may not analyze the information you have, one of the measures you should undertake is to raise the IT security side.
And then intransparency is a point. We are always discussing transparency in data protection issues, so here as well. And this is what makes it such a difficult field for the consumer: it's very difficult to understand what happens there, because it's very difficult to understand the algorithms. So the intransparency clearly leads to an incompatible use. And transparency is the biggest part of the practical points made by the Article 29 Working Party. We have two scenarios.
We have the first scenario, where big data is supposed to help with trends and correlations. This is something that in familiar wording we wouldn't really call a big data privacy issue. But then again, in scenario two, we have personal preferences to be researched. This is something far more serious in terms of data protection issues.
So if you can find different setups here, and you have certain big data searches and can say this is rather scenario one, and this is rather scenario two, it will really help you with the legal understanding and the determination of measures. If you can go for trends and correlations, this is far easier, and maybe you find a procedure for that. And then again, if you see "I'm searching for personal preferences here", you need to be a bit stricter in detail.
That means that for scenario one, businesses must guarantee the confidentiality and security of the data, as pointed out before. That is really a pretty basic request here: take all necessary technical and organizational measures. Then again, the important point is the functional separation that we need, because like this we can at least make sure that the people who work with the personal information when it comes to sourcing don't work with it when the analytical part starts.
This may be impractical from a practical perspective; we are very used to having the same people working in different parts of the process. So what they're saying here is: please segregate that, have a functional separation.
First, you have a group of people who source the data, and then you anonymize it in a way, or at least you sort of anonymize it and you have an alias, for example. And then you have another group of people who don't even know what this is all about. And this is an organizational measure which really helps you to work with trends and correlations.
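The aliasing step described above can be sketched as a small pseudonymization routine: the sourcing team keeps the key table, and the analytics team only ever sees aliased records. This is an illustrative sketch (the function and field names are mine); note that, as the talk stresses, the result is legally still personal data, because the key table can reverse it.

```python
import hashlib
import secrets

# The salt stays with the sourcing team; without it, aliases cannot be recomputed.
SALT = secrets.token_hex(16)

def pseudonymise(records, id_field):
    """Split records into an aliased analytics view plus a key table.
    The analytics team receives only `analytics_view`; the sourcing team
    keeps `key_table` (and the salt), preserving the functional separation."""
    key_table = {}        # alias -> real identifier, held by the sourcing team
    analytics_view = []   # records without the direct identifier
    for rec in records:
        alias = hashlib.sha256((SALT + rec[id_field]).encode()).hexdigest()[:12]
        key_table[alias] = rec[id_field]
        analytics_view.append(
            {**{k: v for k, v in rec.items() if k != id_field}, "alias": alias})
    return analytics_view, key_table
```

The design choice here is exactly the one the working paper suggests: an organizational and technical measure, not true anonymization, since re-identification remains possible for the team holding the key table.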
In strong words, I would like to say you can do almost everything with the information that you've sourced, if you have sourced it licitly in the first place and then take care of the functional separation. So that's pretty good news: you can work and play with the data pretty relaxed. I'm talking about the analytics part now, not what might be coming after the analysis. If you're using that data after this point of time, then you again need to find out whether that is still licit or not.
The working party also recognizes that a full anonymization is not always possible and therefore not always necessary. This is also what I meant by the functional separation: even though the people who analyze the information may not know who is behind the information they have in their hands, by definition this is still personal data and therefore should fall under privacy law. This is what makes it so important to mention here that it's okay to do it this way; it's not something we would expect, because the data is still not anonymized.
It's only partly anonymized for those people who are working with it, which by law is not the definition of anonymization. So this is pretty good news; I honestly didn't expect that, but we have a little surprise here. Personal preferences are something that is supposed to be treated a bit more strictly. The consent must be informed, and for the consent to be informed and to ensure transparency, data subjects should be given access to their profiles.
And I think that's very important to know. We were already discussing the question of ownership of information today, and it's a very limited concept, because ownership is very difficult for things I can't grab. So this makes clear that there is no clear ownership of information anymore, because I will have to share it anyhow with the data subject. So with big data, the ownership that you might feel you have is reduced, because you need to open your books and explain: this is the information that I have about you.
This is the outcome of the information. And I also need to share my algorithm, which is something that might surprise you as well, because at first the customer may not understand it; but then again, it's not the single customer but the public that should understand what's happening there. So that's really a new measure here: the EU Commission asks companies doing big data to really publish the algorithm. It's also crucial that the data subjects are able to correct or update their profiles. That's nothing new, but also under the regime of big data.
This is something we need to keep in mind. So the consent is very, very important, especially when we have the regulation in 2016. We will need to have a consent anyhow for any data processing, and that doesn't stop before big data. And we should already work on our consent strategy, because the consent doesn't simply need to be there; it needs to be informed.
So again, we have the transparency issue here. I need to really inform people beforehand, especially when it comes to tracking and profiling for direct marketing, behavioral advertisement, data brokering, location-based advertising, or tracking-based digital market research. This is actually a very clear statement of the Article 29 Working Party: for sure, in this environment you will have to find the consent, which asks a lot of us in practice. I believe we will really have to think about how we are going to deal with that, and not only about how we are finding the informed consent.
First of all, we have to give the information. Then we have to really receive the consent. And then we also need to store the fact that someone has, at a certain point of time, agreed to something, so that we are able to prove it later on in case there's any question about the consent, whether it's valid and so on. For the consent to be informed, there should be direct access to the data profiles, as mentioned, and to the algorithms. Also, the access must be realized in a user-friendly way.
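The three steps just listed (give the information, receive the consent, store the fact with a timestamp so it can be proven later) suggest a simple consent record. The following is a hypothetical sketch; the class and field names are mine, not from any regulation, and a production system would add audit logging and versioned privacy notices.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

@dataclass
class ConsentRecord:
    subject_id: str
    purpose: str                  # the specific purpose consented to
    information_shown: str        # what the subject was actually told (informed consent)
    granted_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))  # provable point in time
    withdrawn_at: Optional[datetime] = None

    def valid_for(self, purpose: str) -> bool:
        # Consent covers only its stated, specific purpose, and only while not withdrawn.
        return self.purpose == purpose and self.withdrawn_at is None
```

The narrow `purpose` match reflects the purpose-specification point from earlier in the talk: consent given for behavioral advertising does not cover data brokering.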
So this shows us: you will not get away with hiding the real core information somewhere. You must be very clear and very understandable for the customer. The information must be machine-readable as well, so you can't hide behind printed-out information like some companies do today. This is not possible under the regime of big data. The source of the data should also be disclosed. This is, of course, something that makes sense.
When you think about the fact that we have these two steps that we always have to keep in mind, first the sourcing and then the analysis, of course you still need to remember the source and tell the customer on request where you have the data from. This just makes sense, because the two steps belong together in the end. The current and future legal situation in the EU, for all member states, narrows the scope of big data. And maybe that's a very negative approach from my side, but at least it doesn't forbid it, as I might have predicted, because honestly I had expected something far more rigid. I had expected that maybe there would even have been a complete interdiction of big data in our future regime. This didn't happen. But as you saw, the transparency requests are very high. The purpose limitation is a very, very important factor; you really need to check on your purposes and possible changes of them. And the informed consent is something that is a legally binding solution.
And in many cases, this is not only a possible solution; in many cases it is even important and necessary to have the informed consent, otherwise you may, in single cases, not even work with that information. The informed consent needs technical solutions for access and transparency. This is something they close with: it's not only the law, it's the tech side as well. Thank you. So, lots of legal interpretations of upcoming law. I'm not a lawyer, right? So maybe I need to get it explained a second time.
Does it mean that if I collect data that, in my understanding, is not personally identifiable information, but then I do some analysis and can identify a person using it, this is not covered by the law because it was not PII in the first place?
Well, if it is personally identifiable information at any time, it falls under this regulation. So even if it's created during the analysis step, yes, then it becomes protected. Yes. So it doesn't matter if I have first the sourcing and then the analysis, or if it all comes together; the important point is that the link comes into existence at some point of time. And as soon as there is a link between the information and the person, you are already in. Okay.
Next question, which I think is very interesting to observe in how the market and the legislation are going to evolve: you said that in this upcoming proposal, I, as a subject of PII, must be informed about the analysis algorithms. Yes, on request only, but you need to be informed if you feel that way.
So basically Google must show how they compute their PageRank, if you think it through to the end, right? Yes, I believe so. Especially if you consider that this is maybe even a very clear example of big data, because in many cases there will be a discussion: is that big data, is that really the analysis we had in mind when we made this regulation? But from my understanding, your example clearly falls under the legal definition of big data.
So yes, the algorithm should be open then. Yeah. And this is something very astonishing, I believe.
I mean, this is the business asset of many companies working in that space, right? How to do the analysis, what the algorithms are, and how the prioritization works. Right. So what this really does is give a shift and a switch to our understanding of ownership of data. Yeah.
It's very, very difficult to predict what exactly this will all do to us, but there will be much more sharing. And in this part of the story, it will maybe be the other way around from what we all feel today, because the companies will have to share their information. But it's not their information, there we go, it's the information of the customer. It's just the information that is there in a new stage, plus how the new stage was achieved; this is the new meta-information that needs to be shared. This is European scope, right? Yes.
So, since we have, as I mentioned this morning, these different concepts of information ownership in the US-based world and the European-based world; even within Europe we're completely different. For example, in the UK you're pretty close, at least to my understanding, to having an ownership of data. In Germany and some other member states, this is a no-go: you cannot possess information, because you can't hold it in your hand, so you can't own information in a legal sense. Okay.
Nevertheless, in the US, if you collect data, the data is yours. Yes. If it's PII, right.
In Europe, it is basically mine. It's basically yours in a way, but you really need to understand what that means. If you say it's my data: may I do with it whatever I want, may I even keep it for myself, or will I have to share it? So it's just a glance that I take in a second; it's not something very global where you say it's my data or it's yours. We are going in a direction where we will all find that information is a bit more flexible, because in one second it might be still yours. And then again, just for a part of the story, it will be mine, but I will have to share with you again why it is mine now. Yeah.
But this is the interesting part. So, do we need technical solutions for being able to have this access, or an information-stewardship concept that now becomes dynamic over time?
So to say, whatever happens with the data. And the other thing is: do you see any trend that there is some kind of common understanding across the Atlantic? Not at all. I think we don't even have a common understanding in Europe yet. I think most people haven't realized that this paper, which is only a few days old, really is a dramatic change that we might face in the future.
And I think the important understanding here is that, from my view, it wasn't even the intention to regulate anything there; it was the intention to really explain what the purpose limitation is, which is something we've had for 30 years. But along the way, all parts of this paper explain to us how data ownership will have to be understood in the future. At least there are some outlines to it, and I think we'll need some time to fill that in. And it's not going to be legislation.
It's just an opinion of the Article 29 Working Party, and it's not formally binding, but in the past we've seen that such opinions might become legislation, or at least that the member states really adopted them for their actions. So from this point, we are not there yet that this is binding, but it is clearly almost a philosophical attempt to find this new approach to understanding ownership of data. Okay. Very good. Thank you very much. Thank you. Thanks.