Welcome to the KuppingerCole Analyst Chat. I'm your host, my name is Matthias Reinwarth, I'm the Director of the Identity and Access Management practice here at KuppingerCole Analysts. My guest today is Lead Analyst Alexei Balaganski. He's covering the areas of cybersecurity, and there are many of those. Good to see you, welcome, Alexei Balaganski.
Well. Hello, Matthias. Thanks for having me again.
Great to have you. And we will continue a discussion, and give some color to it, around the topic of artificial intelligence. Or is it really that? Maybe it's only machine learning, and cybersecurity. I've talked about that topic with our colleagues Alejandro Leal and Marina Iantorno in earlier episodes, also in the run-up to our new event, cyberevolution. But you are working on that topic quite extensively as well, and we want to talk about the role of machine learning and artificial intelligence when it comes to supporting us on the front line against cybercriminals. So AI as a new hope for cybersecurity: what role can it play, and what does that actually mean from your perspective?
Well, Matthias, you are completely correct saying that. Yeah, all of us at KuppingerCole have done some research on AI itself, but for me it's definitely not the most important topic. For me, I am a cybersecurity guy, so to say; I've been following those developments for years. And yeah, I am definitely interested in knowing how AI and related technologies can help the so-called cyber defenders, so basically us and a lot of our customers, fight against this huge flood of malware, ransomware, phishing and other kinds of attacks, for the very simple reason that there are not enough of us humans capable of doing it properly. So there is a lot of talk now, and to be honest, for quite a few years already, that maybe someday AI will replace security researchers in fighting cybercrime. I have to confess I am a little bit skeptical in that regard. But let's discuss it and let's look at various applications and possibilities for AI.
Right. Because especially when you're talking to people who are maybe using a bit of ChatGPT, but that's it, the term AI is not well defined. AI is considered to be something monolithic, something large, something incomprehensible. So for them, AI is more or less some kind of mythical concept. But when we really look at it, at the different technologies: how can these individual technologies support cybersecurity, and which ones are there? What are the key components that we currently consider to be the AI that can help us?
And again, Matthias, you are absolutely correct, and I totally agree that there is no such thing as "the AI", if you will, because artificial intelligence is not a technology, it's not a product, it's not even a platform. It's a really vague and pretty old concept. I mean, it has existed for probably 80 years, since the first really small and slow computers emerged. And back then people basically thought: what if someday in the future, the computing power of these primitive computers reaches a level where they start thinking like humans? Well, there are two big problems here. First of all, nobody knows for sure how humans think. So we don't really know what human intelligence really is and how it works. And the second problem is: how do you actually measure that an artificially created intelligence really is intelligent? I mean, we've heard about the Turing test and other interesting stuff. And of course, with tools like ChatGPT, people are eager to believe that on the other end of the conversation there is really a person, a thing that can understand your questions and give some meaningful replies. But lumping all these developments under one banner is really a big mistake, and we really have to go through the various technologies and capabilities and understand what exactly they can achieve. Because one thing KuppingerCole has always communicated: stop looking at labels, stop trusting buzzwords, look at capabilities, understand what you need to solve your problems, and look for those capabilities specifically.
And if we look at these capabilities, starting with one, maybe anomaly detection: is this something where machine learning really adds something new, maybe in terms of volume, when it comes to identifying anomalies? Where's the strength? Why anomaly detection and machine learning?
Well, first of all, I can tell you that almost 30 years ago, when I was a student at university, I majored in mathematical statistics, among other things. And back then I already knew exactly how to find anomalies in a stream of data. That's what people could do 100 years ago, even before they had computers, even before they had AI or machine learning. And the irony here is that a lot of companies still do the same. They just use existing mathematical, numerical methods, applying statistics to find those anomalies, which is totally fine. If it works, why break it? The only issue I have with that is calling it AI, which it definitely isn't. It's not even machine learning. And we could go even deeper and discuss whether it's fair to equate machine learning and AI, because some people believe those are two completely separate things. But you are right, machine learning can be used for that as well. And of course nowadays, in the era of the cloud, it's possible to do things which were just impossible 10 or 15 years ago, simply in terms of the sheer volume of data we have to sift through. If you have millions of security events per hour, for example, you can only boil them down to a meaningful number of detected anomalies with machine learning, which is great. The only question is: is it really enough for a security researcher?
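The purely statistical, "pre-AI" approach Alexei mentions can be sketched in a few lines. This is a deliberately minimal illustration (the event counts and the three-sigma threshold are invented for the example): flag any value that deviates from the mean by more than three standard deviations.

```python
# Minimal statistical anomaly detection: no ML, just classic z-scores.
from statistics import mean, stdev

def zscore_anomalies(values, threshold=3.0):
    """Return indices of values more than `threshold` sigmas from the mean."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # constant stream, nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]

# 60 minutes of (made-up) login-failure counts; minute 42 is a sudden burst.
counts = [5, 6, 4, 7, 5, 6] * 10
counts[42] = 480
print(zscore_anomalies(counts))  # → [42]
```

This is exactly the kind of numerical method that predates machine learning: it works well for a single stream, but it is a judgment-free filter, and as the conversation goes on to note, it says nothing about what an anomaly actually means.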
Maybe we can change the perspective when it comes to the training data that we apply. If we use machine learning and apply it on training data that is actually resulting from previous security incidents, is this something where cybersecurity based on machine learning can really improve?
Well, let's turn this question around and look at it from a different perspective. What exactly does a security researcher need to do his job efficiently? Finding an anomaly is definitely not enough, because you can find dozens, if not hundreds, of anomalies in a stream of data. The question is: how do you understand which anomaly is actually a risky event? How do you rank your findings by severity, for example? It cannot be something static, because, you know, the quote-unquote old-school vulnerability scanners, which will just give you a list of findings, are boring and almost useless nowadays. You have to know where you have to respond first, right? So you need to understand what a vulnerability actually means, what it aligns to. For example, does it map to a specific technique from the MITRE ATT&CK framework? That is the next step which has to be taken. And this is exactly where what you just mentioned, training your models on existing curated datasets from real security incidents, comes into play. It's not enough to know that something suspicious just happened. If the machine learning model can tell you, yeah, it was actually a ransomware attack, and even better, if it can group, let's say, 500 suspicious API transactions and say, this is actually a specific known attack which we know how to respond to, that's the next step of quote-unquote "AI" in cybersecurity. And this is exactly what we are observing nowadays, well, maybe for a few years already. This is what most current-generation security tools are actually offering.
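The step Alexei describes, from "something is anomalous" to "this is a known attack class", can be sketched as a toy nearest-centroid classifier over labeled past incidents. Everything here is invented for illustration (the feature names, the numbers, the three incident classes); real products learn far richer representations from large curated datasets.

```python
# Toy sketch: label a new anomaly cluster by comparing it to curated,
# labeled incidents from the past (nearest-centroid classification).
import math

# Past incidents, reduced to tiny invented feature vectors:
# (file-encryption rate, outbound DNS rate, failed-login rate)
labeled_incidents = {
    "ransomware":    [(0.9, 0.1, 0.2), (0.8, 0.2, 0.1)],
    "dns_tunneling": [(0.0, 0.9, 0.1), (0.1, 0.8, 0.0)],
    "brute_force":   [(0.0, 0.1, 0.9), (0.1, 0.0, 0.8)],
}

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return tuple(sum(dim) / len(vectors) for dim in zip(*vectors))

def classify(sample):
    """Return the incident class whose centroid is closest to the sample."""
    best_label, best_dist = None, math.inf
    for label, vectors in labeled_incidents.items():
        d = math.dist(sample, centroid(vectors))
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label

print(classify((0.85, 0.15, 0.1)))  # → ransomware
```

The point of the sketch is the shape of the problem, not the algorithm: the value comes from the curated labels, which is exactly why training data from real incidents matters so much.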
Right. You are looking at the leading edge of these AI-supported cybersecurity technologies. What other technologies are really emerging that enable things which could not be done before, call it machine learning or call it something else, and that really support in dealing with and mitigating new types of attacks, and in applying new controls as well? What is new?
Well, one newer thing, for example, that unfortunately doesn't get nearly enough attention, I believe, is the application of so-called deep learning technologies. Without going too deep into the technical stuff: one very popular application of deep learning is recognizing images, like when you train a model which would say, yeah, that's a cat, or that's a dog. That's exactly how deep learning networks work. But you can also use this or similar approaches to find complex patterns in security data as well. For example, you could basically train a model to look at your network traffic without even decoding the TLS encryption layer, just observing the large-scale data flows. You could quote-unquote visually identify something suspicious going on. And that could be used as an input for another related security tool to go deeper. I've seen some really interesting applications; for example, there are vendors who offer the same approach for source code analysis. Traditionally, you would have to actually run your code in a sandbox, for example, to see whether it will do something malicious. Instead, they would claim: just give me a binary, I will run it through my, again, quote-unquote image analysis and identify suspicious patterns in that binary, without the need to run it, without the need to decompile it or whatever. That's really some interesting and really promising capability we can expect to develop and mature in the near future.
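The first step of the "binary as image" trick described here is trivially simple: reinterpret a file's raw bytes as a grayscale bitmap, which a vision-style deep learning model can then classify. A minimal sketch, with an invented stand-in payload (real systems would train a convolutional network on large labeled corpora of such bitmaps):

```python
# Reshape a byte string into rows of grayscale "pixels" (values 0-255),
# the input format a vision-style classifier could consume.
def bytes_to_bitmap(data: bytes, width: int = 16):
    """Pad with zero bytes and reshape into rows of `width` pixel values."""
    padded = data + b"\x00" * (-len(data) % width)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]

sample = bytes(range(40))            # stand-in for a binary's contents
bitmap = bytes_to_bitmap(sample)
print(len(bitmap), len(bitmap[0]))   # → 3 16  (three rows of 16 pixels)
```

Code sections, packed data, and embedded resources produce visually distinct textures in such bitmaps, which is what makes the image-recognition analogy work without running or decompiling anything.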
That sounds really interesting, because it reminds me a bit of those medical use cases where machine learning supports identifying malignant tissue when it comes to cancer, skin cancer, and so on. So it looks quite similar. But these are technologies that are applied in the SIEM, somewhere in the back office, where security analysts are dealing with these topics. Is there something that is closer to us as end users dealing with ransomware, dealing with email? Is there also a new hope for cybersecurity being augmented by machine learning there?
Well, of course, the notorious, or the famous, large language models caused a real revolution in this area for end users. I mean, everybody knows ChatGPT, even though probably still nobody has any idea what GPT actually stands for. But they know that this is the magic tool where you can just talk to a bot, basically using your natural language, and it will give you responses, and not just some random responses, but useful facts and findings and recommendations and whatnot. So this is really a huge revolution, at least in terms of public perception, and there is a lot of promise in those technologies as well. But I would argue we have to get a small reality check in that regard. So yes, absolutely, chatbots and natural language processing technologies are there, and they are already being used in tools. A typical example: if you're running a SOC and you have an incident, you would usually want to know what your colleagues have done with similar incidents in the past. You can run the entire history of your SOC through such a tool, and it will tell you, yeah, the recommended approach would be to do this instead of that, simply because it worked better: the average time to resolution was lower. This is the most primitive application of those technologies, some kind of actionable recommendation. Of course, the next-generation tools, again like ChatGPT, can do more, they can do much better. And we already see specialized developments of these large language models for cybersecurity, as well as for other industries like health care, for example. So I would absolutely not trust "the real ChatGPT" to make security decisions for me. But there are other vendors who are basically adapting the same technology for specific use cases. And of course, they need lots and lots of security data to train those models, and only huge vendors with huge telemetry networks can afford doing that.
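The "most primitive application" Alexei describes, recommending the response action with the lowest average time to resolution across similar past incidents, doesn't even need an LLM; the language model mainly adds the conversational interface on top. A minimal sketch with invented incident records and action names:

```python
# Recommend a response action for an incident type by mining (made-up)
# SOC history for the action with the lowest mean time to resolution.
from collections import defaultdict

history = [
    {"type": "phishing",   "action": "reset_credentials",  "hours": 2},
    {"type": "phishing",   "action": "reset_credentials",  "hours": 3},
    {"type": "phishing",   "action": "quarantine_mailbox", "hours": 8},
    {"type": "ransomware", "action": "restore_backup",     "hours": 12},
]

def recommend(incident_type):
    """Return the historically fastest action for this incident type."""
    durations = defaultdict(list)
    for rec in history:
        if rec["type"] == incident_type:
            durations[rec["action"]].append(rec["hours"])
    if not durations:
        return None  # no precedent in the history
    return min(durations, key=lambda a: sum(durations[a]) / len(durations[a]))

print(recommend("phishing"))  # → reset_credentials
```

What the specialized, security-trained language models add on top of this kind of lookup is generalization across incidents that don't match exactly, which is also where the training-data problem mentioned next comes in.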
Microsoft, for example, Cloudflare, Akamai and other companies who basically run their own clouds. The others have to deal with that problem, because finding quality training data is the biggest obstacle to training a machine learning model.
Right. We have talked about this topic in earlier episodes, and you are known to be a bit more cautious, a bit more tentative, especially when it comes to hyped topics, whatever they may be, be it machine learning or something else. But when it comes to monitoring that market and making some predictions, and sometimes we analysts do have to: what are your predictions for machine learning and cybersecurity? Looking into the future, say, just six months or so, where do you expect machine learning to go in cybersecurity? Again, cautious and tentative?
Oh, first of all, I have to confess I'm a little bit allergic to marketing, I guess, and nowadays it's just impossible to get around without seeing another buzzword; ChatGPT is obviously one of those buzzwords. So in that regard, I can only reiterate: don't look at labels, don't trust marketing, look for the actual capabilities and limitations of every tool. This is exactly what we as analysts are doing when we cover this market and do our research, identifying the strengths but also the challenges, because even the best and most powerful language model makes a lot of mistakes. If it only boils down to failing your student essay because somebody uses a plagiarism tool to detect it was actually written by ChatGPT, I guess that's okay, you can survive that. But if you are making a business-related mistake based on such a recommendation, or even worse, you're making a security-related mistake and you actually let hackers get away with your sensitive data, for example, that's not going to fly, especially with compliance auditors. So, yeah, absolutely, there is a lot coming very soon. But you have to understand that it won't be those buzzwords, it won't be ChatGPT. It will be specialized tools, maybe from the same vendors. I mean, obviously Microsoft is a major investor in OpenAI, so it will probably be one of the leading vendors offering those specialized tools. But again, you have to understand the difference between promises and reality.
Right. And maybe sometimes plagiarism is not too bad: if you plagiarize a successful mitigation strategy for a cyber attack, that is well done. That is reusing knowledge that is already there, and then it's no longer stolen intellectual property, but reusing successful strategies for cybersecurity.
Well, that's one of the greatest things, one of the biggest differences of cybersecurity as a field of business: we have totally different priorities compared to marketing or education or whatever. Nobody will punish me for, as you mentioned, plagiarism, if it actually led to a successful resolution of a data breach. So, yeah, we have to understand that ChatGPT or anything like that probably won't replace me or you or a cybersecurity researcher as quickly as it could replace, for example, a journalist. You've probably heard about the German tabloid BILD, which has recently replaced something like 95% of their writers with, literally, ChatGPT. Luckily, we get to keep our jobs a little bit longer than that, simply because we still have to bear a lot of responsibility for critical decisions. So this is why, I guess, we still have to focus on making the right decision in time, basically surviving to see the incident resolved positively. That's our goal. And whether it will be done by a shiny security tool with AI or by something else doesn't matter. The most important thing is that we can actually do it in time. And that's what I'm looking for in the future.
Right. And this is one approach to successful cybersecurity strategies. Before we close, I want to mention two things. First, cybersecurity and strategic approaches to cybersecurity are something that we will cover at an event in November in Frankfurt. This is cyberevolution; no more about that here, we have talked about it and we will talk about it in an upcoming episode with the makers of the event. So that's the first mention. And the second is that you have already published a blog post on exactly the topic that we covered today. It's much more elaborate, much longer, with much more content in it. So it's really a recommendation to have a look at it. It's called AI and Cybersecurity: A New Hope for Cyber Defenders? It's available for free on our website in the blog section. Just go there and search for Alexei's name, and that should lead you to the blog post. And maybe a third thing: if you have any questions, if you have comments, if you disagree, please leave your comments. We would love to learn from you, to hear from you. Leave your comment below the YouTube video that you're watching right now, or leave a comment on the platform that you are listening to or watching this podcast episode on. And if you are there, talk to us at cyberevolution. Some final words, Alexei, from your side?
Well, Matthias, thanks for mentioning the blog post. Absolutely, one can start there, but you should really follow the links and go deeper and look at our earlier research, because again, we do cover individual fields of cybersecurity, like SOCs and SIEMs, for example, and data protection and API security, and everywhere AI is a growing part of all solutions. So you should definitely dig deeper. I mean, we offer a lot of content on that. And yeah, definitely talk to us; we are always eager to hear your questions and hopefully provide some meaningful replies, and if not, at least leave you with some even more interesting questions, and at least help you to defend yourselves not just against cyber threats, but also against peddlers of snake oil and false marketing and stuff like that.
That's the reason why we talk regularly with each other: you also act as a sanity check for me, to deal properly with hypes, with buzzwords, and to really understand what they mean and what that means for successful strategies. Thank you again, Alexei, for being my guest today. Looking forward to having you again soon, and for the time being, have a great rest of the week. And for those who want to look up Alexei's blog post: you should. Thanks again, Alexei.
Thanks and bye.