Welcome to the KuppingerCole Analyst Chat. I'm your host. My name is Matthias Reinwarth, I'm a lead advisor and senior analyst at KuppingerCole. My guest today is Annie Bailey. She is working out of Stuttgart as an analyst for KuppingerCole in the area of emerging technologies, and we want to continue our discussion around privacy and the ad business. Hi Annie.
Hi Matthias. Thanks for having me back.
Great to have you back. And I think we are now really following up on the topic that we discussed a few weeks earlier.
We started talking about getting rid of third-party cookies and focusing on first-party cookies, which enable the business and, more or less, the user experience. And we're also getting rid of the third-party cookies that are a matter of concern when it comes to GDPR compliance, privacy protection in general, and many other regulations around the world. We've talked about Google planning a new approach towards fueling the ad business with the type of information required for targeting ads.
And we've talked about FLoC, this new concept that is currently being defined and tested, and that is presented by Google and open for discussion. If we continue our discussion here: FLoC is intended to be much more privacy protecting than third-party cookies. How do you measure privacy within this FLoC concept?
With FLoC, privacy would be measured as k-anonymity, or rather, the level of privacy would be measured by whether a cohort achieves k-anonymity.
This means that k is the number of users in a cohort that makes it unreasonably difficult to identify any one user. So k here still has to be defined. We don't know how many users will be in a cohort and what that threshold is where you have anonymity in the crowd, so to speak, because, as we talked about last time, the larger the cohort is, the more privacy you have as an individual user being placed into that cohort. Google actively chose not to use differential privacy, which is an up-and-coming privacy technique, as this measure.
They stated that it fails to quantify the hardness of tracking users across the web. And so they chose their own privacy measure, k-anonymity.
Right.
And with this measurement of privacy being directly connected to the number of users within a cohort, that of course requires these cohorts to be as large as possible. And then the value of such a cohort might deteriorate with a rising number, because it would get more general, I would assume.
As we are looking at the FLoC concept as it is presented by Google right now, and we know it's still early, although they are testing: can we identify some pros and cons to this Google approach towards privacy-protecting fueling of the ad business? Maybe to start with the pros: what would be some pros that we can see right now when it comes to replacing the cookie concept with these cohorts?
Hmm. Yeah.
So first it should be said, it's nice to be stepping in the direction of more first-party relationships rather than everything being collected and determined through third-party relationships. So maybe this is a healthier way to bring in the ad business here.
This has yet to be seen, but it could be a positive outcome. Another positive here is that the cohort ID can be calculated in the browser without any other user's information. So the cohort ID for an individual user can be calculated completely alone. And this means that the data of all the other users in that cohort doesn't need to travel to a centralized server in order for the cohort to be determined and for each user to be placed in the appropriate group.
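A minimal sketch of what "calculated completely alone" could look like, assuming a SimHash-style scheme of the kind mentioned in the public FLoC proposal. The function name, the 8-bit width, and the use of SHA-256 are assumptions for illustration, not Chrome's implementation. The point is only that the input is one user's own list of visited domains and nothing else.

```python
import hashlib

def simhash_cohort_id(domains, bits=8):
    """Derive a `bits`-bit cohort ID from one user's visited domains only.

    Hypothetical helper: similar domain sets tend to share bits, so similar
    browsing histories tend to land in the same cohort. `bits` must be <= 64.
    """
    counts = [0] * bits
    for domain in set(domains):  # duplicates and ordering do not matter
        digest = hashlib.sha256(domain.encode("utf-8")).digest()
        value = int.from_bytes(digest[:8], "big")
        for i in range(bits):
            # Each domain votes +1 or -1 on every bit position.
            counts[i] += 1 if (value >> i) & 1 else -1
    cohort_id = 0
    for i, vote in enumerate(counts):
        if vote > 0:
            cohort_id |= 1 << i
    return cohort_id

# Toy history; no other user's data is needed for the calculation.
my_cohort = simhash_cohort_id(["news.example", "music.example", "shop.example"])
```

With 8 bits there are at most 256 cohorts; a larger `bits` value means more, smaller, and more specific cohorts.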
So that's really nice: that's less data being sent around to different servers without the user's consent. Another positive is that, accompanying FLoC, Chrome will also release more user-facing controls, like an on/off switch to indicate whether a user wants personalized content at all. This is also in the testing phase. It's not being rolled out to the general public, but there should be some more information and another round of testing and rollout in controlled environments in April.
Right. And we need to understand that this FLoC concept is only part of a larger effort by Google to improve the privacy of users in general. So this Privacy Sandbox that is rolling out consists of more than just this one component, which is FLoC. Thinking of more pros is a bit difficult for me right now. So let's switch over to the cons.
As we mentioned before, this does raise some red flags for us concerning privacy.
And so it's good to talk about them and see where there can still be improvements, especially as this is not a final state yet and there could still be room for improvement. We hope that will be the case.
One thing to consider is that a cohort ID, although it's not describing exactly you as a user and your personal information, does assign a profile based on your recent browsing history. From that, there's a reasonable chance that another party could probably, or even definitely, confirm that you've visited certain sites, and very definitely determine general demographic information about you. Cohorts are probably going to end up as similar age groups, similar interest groups. That's definitely the goal. That could be beliefs, that could be political leanings, that really could be anything.
And that's another concern here: it's not yet determined how many cohorts there will be. The more cohorts there are, the more specifically they will describe the users within them. So we don't know to what level you would be grouped with other similar users.
Now, if we go back to what we defined earlier, the standard of privacy here with FLoC is the number of users within a cohort. So the more users in the cohort, the more it protects the privacy of the individual in there. Now, the algorithm used to generate these cohorts does not have a minimum cohort size, and there's no way to regulate how few or how many members there are.
So if whoever is implementing this is not careful, you could end up with cohorts which have far too few members, meaning it's very, very easy to identify each individual in there. Google has stated that they have solutions for this, which would institute what they call an anonymity server, which then blocks the FLoC API from returning a cohort ID if it is not k-anonymous,
so if it has not reached the number of users that need to be in the group to protect everyone's privacy in there. That also negates one of the pros that we just mentioned, because that information would have to flow through a server. That calculation would not be able to happen only in your browser.
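A minimal sketch of the gate described here, assuming the server simply counts members per cohort and withholds any cohort ID with fewer than k members. The function names and the in-memory `assignments` mapping are hypothetical; how Google's anonymity server would actually work is not specified in this conversation.

```python
from collections import Counter

def k_anonymous_cohorts(assignments, k):
    """Return the set of cohort IDs that have at least k members.

    `assignments` maps a user identifier to a cohort ID (hypothetical shape).
    """
    sizes = Counter(assignments.values())
    return {cohort_id for cohort_id, size in sizes.items() if size >= k}

def exposed_cohort_id(user, assignments, k):
    """Return the user's cohort ID only if that cohort is k-anonymous."""
    cohort_id = assignments[user]
    if cohort_id in k_anonymous_cohorts(assignments, k):
        return cohort_id
    return None  # cohort too small: the API would not reveal it

# Toy data: cohort 7 has three members, cohort 9 has one.
assignments = {"u1": 7, "u2": 7, "u3": 7, "u4": 9}
print(exposed_cohort_id("u1", assignments, k=3))  # 7
print(exposed_cohort_id("u4", assignments, k=3))  # None
```

Note that the size check needs a view across all users, which is exactly why it cannot happen in a single browser.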
The ad industry is interested in getting the most recent information about a user. So the cohort I was assigned to, say, four weeks ago or close to Christmas might vary dramatically from my interests and the cohort IDs I'm assigned to in the summer, or during a pandemic, or outside of a pandemic.
Is there a mechanism to make sure that the cohort ID that I'm assigned to does change over time, and how is this achieved?
Yeah, that's a great question. Cohorts are recalculated every week.
This is to give a very limited time window, partly for accuracy: in a sense, this limits your interests to just what your browsing history from the last week indicates. People are complicated human beings. If you were trying to place people in a group that incorporated the interests of their entire life, this would be hugely, hugely complicated. So it's a much simpler problem to solve just looking at a week, but it could give very accurate data on how the interests of a user, or of users in the larger aggregate sense, change over time.
Now, it's still unclear to me if an individual user's membership of a cohort can be traced over time. This is somewhere where I still need to continue to read and research, but certainly in the aggregate sense, looking at the types of cohorts, how many members are in those, and how that changes week to week, this is going to give a really accurate picture of the global market.
We always have this principle of pseudonymization, and every time you pseudonymize, and this is something that we see here right now, there are of course efforts to de-pseudonymize a person, to identify them again, and to learn who they are over time through their behavior, through their browsing history, or through their cohort IDs over time. And maybe this is something that would at least be a vector by which people could be de-anonymized and re-identified over time.
And we exist in a world right now where there's a lot of PII floating around the internet, in malicious actors' possession and in non-malicious actors' possession. Either way, it is out there.
And so the combination of these cohort IDs with a lot of personal data out there could make it fairly possible to identify individual users within cohorts. Because again, we don't know how many cohorts there will be, and we don't know how many users there will be in them. So with a given cohort, you may only be working with a few thousand users who have been placed in it. That's not a lot of people from which to differentiate one user from the rest. So it could be fairly possible to positively identify a user based on their cohort ID and all the other information which is out there.
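To make the narrowing effect concrete, here is a toy sketch, assuming an observer holds a few auxiliary attributes alongside cohort IDs. All records, attribute names, and the `candidates` helper are made up for illustration; they show how each extra known attribute shrinks the set of possible users within one cohort.

```python
# Hypothetical auxiliary data an observer might already hold (toy values).
records = [
    {"user": "u1", "cohort": 42, "city": "Stuttgart", "age_band": "30-39"},
    {"user": "u2", "cohort": 42, "city": "Stuttgart", "age_band": "40-49"},
    {"user": "u3", "cohort": 42, "city": "Berlin", "age_band": "30-39"},
    {"user": "u4", "cohort": 17, "city": "Stuttgart", "age_band": "30-39"},
]

def candidates(records, cohort_id, **known):
    """Return users in `cohort_id` whose attributes match everything in `known`."""
    return [
        r["user"]
        for r in records
        if r["cohort"] == cohort_id
        and all(r.get(key) == value for key, value in known.items())
    ]

print(candidates(records, 42))                                      # ['u1', 'u2', 'u3']
print(candidates(records, 42, city="Stuttgart"))                    # ['u1', 'u2']
print(candidates(records, 42, city="Stuttgart", age_band="30-39"))  # ['u1']
```

The smaller the cohort is to begin with, the fewer extra attributes are needed to reach a single candidate.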
Right.
Maybe another issue, one that I came across personally: in the early days of streaming services, when I had smaller kids, we all used the same streaming provider, and this really killed the recommendation mechanisms. I was presented with new children's programs, and my kids were offered new sci-fi programs.
So there was no room back then (that has changed in the meantime) for different users within the same client, within the same piece of software or piece of hardware. I think if one browser is used by more than one person within a family, or within a group of people living together, that might really screw up this cohort ID mechanism in general.
Yeah, that's true. And there may be other solutions out there that FLoC has considered.
Again, you mentioned this Privacy Sandbox. It's a whole host of privacy-improving products and solutions that Google is trying to launch. So maybe there's something there that we haven't seen yet which addresses that.
But it's a concern here for usability and for accuracy. Another concern, also relating to not having room for different people interacting on the same browser, such as a parent and a child, is your own personas: your professional self, your private self, and your physical self, which may need to correspond with medical providers. Your medical provider is not going to need to know all about your interests.
If your cohort ID is labeling you as being really interested in reggae music, that may not be totally relevant to your medical provider. That needs to be handled in a more controlled environment.
So data minimization is really an issue then.
Yes, yes, absolutely.
So we still need to find the balance between enabling privacy and maybe pleasing advertisers, pleasing the digital transformation, the new services that are coming up.
Is there a balance? Is it possible to please both the advertisers and us, really enjoying our privacy?
Yeah. This is a good question.
As you mentioned before, the ad market is a valid business model and an important part of many of the systems and processes that are present with CIM, things like that. So we need to find a solution which works for all sides. I don't have a perfect answer yet, but I do want to say that I think 2021 will be a great year for this, for experimenting and for putting forward new solutions which really do try to be meaningfully privacy protecting while also serving the ad market. One option could be life management platforms.
This is something which we define as a secure place for an individual to store their own personal data, and then to be able to share that data with service providers only with their consent, or to delegate access to that data.
For example, if someone is in declining health, they could delegate access to that data to a spouse or caretaker. And so this is a definite shift towards first-party relationships, rather than letting this data be collected, handled, and sent through third parties. At the moment, I've not seen strong, compelling business models for the ad market with life management platforms. So this is a downside here, something that we'll have to keep our eyes open for in the coming year. There are also a lot of innovations from the advertiser side.
So, things like synchronizing an ID across all user devices. If you consent to first- and third-party cookies on a particular website on your laptop, that choice would be remembered and reflected on your smart TV or your smartphone. Your identity could be applied across all devices. This is something which is really targeting the ad market, keeping it quite effective for them.
There are some privacy provisions here, but it's more like a step sideways rather than a step forward. It has some privacy wins and some privacy challenges mixed together.
And then another option that we're keeping a really close eye on is decentralized IDs. This is another format for securely storing an individual's data and being able to transact with that data: exchange it, share it, prove that it's correct. But again, it's not focusing on the ad market at all.
And there are very few solutions which address this need of the ad market to access data in order to continue to function. So there's no great solution out there.
But there are a lot of different ideas coming from different corners of different sectors which are thinking about it. And that's really positive.
I did not expect us to solve the question of whether there is a fine balance for the next five years between user privacy within their own machine, within their browser, within the way they are working with and in the internet, and how that can be aligned with the ad business. I did not expect us to solve that today.
But we really understand that this is an ongoing issue. This is something that we will have to look at over time. Nevertheless, this initiative by Google is surely an interesting one to look at. And as we've mentioned in our first episode around that topic, it's really getting rid of the third-party cookie, which is in itself a good thing. It shouldn't be replaced by something that is as intrusive into privacy as the third-party cookie was, but we will have a look at that and we will accompany this process over time.
And we of course encourage all our listeners to contribute to these efforts. As Google says, these are documents that are open for comment, that are published, that are readily available, and that can be easily found by Googling for FLoC. Before we close down, Annie, I know you have provided some research in the area of privacy. Is there something that you could recommend looking at when it comes to research on our site, KuppingerCole.com?
Yeah, absolutely. There's a Leadership Compass on privacy and consent management solutions. Given our chat today, which is about the removal, the possible disappearance, of third-party cookies,
it is focused on cookie management as part of the solutions, but privacy includes many, many other facets. And so the report covers some of those other features that may be interesting and relevant.
So thank you very much, Annie, for that recommendation. As we said, we cannot solve this issue today; rather, we will continue to look at that topic over time, and I'm sure we'll get back to it soon, when there are more tangible results available that we can then talk about.
And maybe some of the open questions that we raised today will then already be answered in a first version of this being implemented in the Chrome browser on our individual machines. So for today, thank you, Annie, for being my guest. I'm looking forward to having you back soon, around this topic and many other emerging topics.
Great. Matthias, thank you so much for having me. Thank you. Bye bye.