Thank you. Good morning everybody, and thank you all for being here. I really hope you're enjoying the conference so far and you are keeping your energy levels up as we go into day three of the conference.
Today I'm going to talk about three topics that are top of mind for all of us as security leaders: insight, automation, and AI. A lot of the conversation out there at the moment about AI is very abstract and very forward-looking, and I'm going to try to bring that a bit closer to reality and talk about what we are already doing in production to prepare ourselves to take advantage of these technologies. My name is Anthony Scarff. I'm the Deputy CSO at Elastic, where I've spent four years building the security program.
We have been a distributed, remote-first, and cloud-first company since our founding in 2012.
I was here, or rather in Berlin, one year ago at the KuppingerCole Cybersecurity Leadership Summit, talking about the need for cybersecurity leaders to adapt to a pace of change that had never been so fast but would never again be so slow. This was the slide that had everybody taking out their phones, taking photographs, taking videos. And now it already seems old. I'm looking around the room and I don't see a single phone out. This is boring, this is old news. That was one year ago.
And so I'm using this slide really just to illustrate how fast this has moved in the last 12 months.
In that time, we've seen ChatGPT grow to be the fastest-adopted technology in history: a hundred million users within two months. Goldman Sachs estimates that 300 million jobs will be automated, and many more, as we just heard, will be augmented. I'm really an optimist. AI is going to create a lot of jobs. Just like social media brought us influencers, for good or bad, AI is going to bring us jobs that we didn't imagine would exist a few years ago.
Netflix, as an example, advertised a position that got a lot of attention: half a million dollars a year in total compensation for a prompt engineer. But for me, as an optimist, the real benefits of AI will come from augmenting our security analysts: addressing some of that cybersecurity skills gap, having our new analysts get up to speed and be trusted in production more quickly, able to have an impact, able to run investigations without escalating everything.
And we'll see orders-of-magnitude shifts in personalization, automation, and productivity.
So like the internet, social media, and the cloud before it, we are now at a new inflection point that's going to impact every industry in every region around the world. I actually believe that AI, and large language models in particular, will become the default interface for the majority of our computing applications within five to ten years. For good and for bad. We just heard about the malicious use of ChatGPT; as ChatGPT and the other legitimate services evolve and try to put measures in place to prevent abuse,
we are seeing malicious LLMs-as-a-service, FraudGPT and WormGPT, custom-made and provided to malicious actors with the sole intention of accelerating malicious code and phishing campaigns.
But I mentioned just now that I'm an optimist, even if it sometimes feels like there aren't many of us in this industry. Our first instinct as security leaders is often to ask: how do we protect, how do we manage, how do we control this new technology? That comes before we think about how to adopt it and benefit our own teams.
We saw that with the cloud, and we definitely saw it with DevOps. Our first instinct as security leaders is to put our arms around something and say, how do we keep our eye on this? And I think we can do more to embrace these technologies for our own security teams.
So how do we do that? How do we take advantage?
Well, data is really the common denominator in our security challenges. Sergey yesterday had a great slide at the end of his talk where AI was this small figure (was it an avatar, a person, an identity? I don't know) and the rocket ship was your data, your control framework, because AI isn't going to do it by itself. So getting the right data, or better yet information, knowledge, wisdom, in front of the people that need it is the biggest challenge that we have. And maybe your data is disconnected and requires humans to connect the dots.
Maybe it's spread across hybrid or multi-cloud with no real way to centralize.
So how are we tackling some of these challenges internally at Elastic? First a little bit about the company.
We are, as I mentioned, remote and distributed since 2012. We have 3,000 Elasticians in 44 countries, and my team is split about 50/50 between the Americas and Europe, everybody working from home. We do have offices in some of those 44 countries, but they're basically just somewhere we keep the snacks. There's no corporate network; if you show up there, you don't get any special privileges, you get good snacks. It's like a free coffee shop. As you can imagine in that context, we don't run any data centers. We are heavy users of SaaS. Okta is our identity provider;
Slack, Google Workspace, Salesforce, Office 365, and GitHub are our main collaboration tools. Looking at the numbers of our environment: we are a cloud service provider, with about 60,000 customer deployments of the Elastic Stack dispersed across 60 cloud regions in all three of the major cloud providers. Our customers have the choice to deploy in whichever cloud provider makes the most sense for them. That equates to about half a million endpoints and 150 terabytes of security data every single day that we ingest into our stack.
And about 600 gigabytes every day is from our workforce: laptops and workstations.
And security is a data problem. So how do we enable our team to get insight from all of that data, in real time and at scale? When talking about security data, it's very easy to constrain ourselves to the right-hand side of the NIST Cybersecurity Framework: we always talk about detection and response, we always think about logs. In our InfoSec program, we think it's much more than that. We think that data, automation, and AI should support the entire InfoSec program.
And yes, the detection and response part is sexy. Nobody's making movies about somebody saving the day because they had really good identity and access management standards. But we can do more; we can support all of our program with these technologies.
So, a step more practical: what does that mean? What does that look like? If you're working in the cloud, there is so much rich data available to you at the end of an API from your cloud providers, whether that's infrastructure or SaaS.
So we ingest logs, of course, a lot of them, but we also ingest data from our SaaS platforms. We ingest data from our vulnerability scanners, and we enrich it with public sources like CISA's Known Exploited Vulnerabilities (KEV) catalog so that we can do more risk-based prioritization of those vulnerabilities. We ingest data from threat intel providers, and from osquery on our workstations,
so we know exactly what's running on every one of them at any time. But probably most important in enabling all of this is the InfoSec data warehouse, where we ingest asset information. We treat identities as assets.
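To make that KEV enrichment concrete, here's a minimal sketch in Python. The finding records and sample CVE IDs are purely illustrative; in production the catalog would be downloaded from CISA's public JSON feed rather than hard-coded.

```python
# A hypothetical sketch: prioritizing scanner findings against the CISA KEV
# catalog. In production the catalog would be fetched from CISA's public
# JSON feed; here a tiny in-memory sample stands in for it.

kev_catalog = {"CVE-2021-44228", "CVE-2023-4863"}  # sample known-exploited CVEs

scanner_findings = [
    {"host": "db-02", "cve": "CVE-2020-1234", "cvss": 9.1},
    {"host": "web-01", "cve": "CVE-2021-44228", "cvss": 10.0},
]

def prioritize(findings, kev):
    """Flag findings whose CVE is known to be exploited, then sort those
    first (and by CVSS score within each group)."""
    for f in findings:
        f["known_exploited"] = f["cve"] in kev
    return sorted(findings, key=lambda f: (not f["known_exploited"], -f["cvss"]))

ordered = prioritize(scanner_findings, kev_catalog)
```

The point is simply that a known-exploited CVE jumps the queue ahead of a higher-severity-on-paper finding that nobody is actually exploiting.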
We have a 360 view of all of our users from various sources, whether that's our main identity provider Okta, or Google, Slack, et cetera, and we combine those into one view. We ingest all of our 2,000 GitHub repositories and we know how they're configured.
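A minimal sketch of what building that 360 view can look like: merging identity records from several SaaS sources into one document per user. The source names and fields below are hypothetical, not our actual schema.

```python
# A hypothetical sketch of a "360 view": merging identity records from
# several SaaS sources into one document per user. Source and field names
# are illustrative only.

sources = {
    "okta":   {"alice@example.com": {"mfa_factor": "webauthn"}},
    "google": {"alice@example.com": {"last_login": "2023-11-01"}},
    "slack":  {"alice@example.com": {"is_admin": False}},
}

def merge_identities(sources):
    view = {}
    for source, records in sources.items():
        for email, attrs in records.items():
            entry = view.setdefault(email, {"email": email})
            for key, value in attrs.items():
                entry[f"{source}.{key}"] = value  # namespace each field by source
    return view

view = merge_identities(sources)
```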
We can see all of our S3 buckets, servers, load balancers: do they have a sensible access control policy? Do they have encryption enabled? So we pull all of this into one place, and the logical view shown here is accurate from an analyst's point of view: that one Elastic cluster in the center is where our analysts spend their time and do their work. But we actually use a technology called cross-cluster search that allows us to leave the data where it is. So all of that logging, that 150 terabytes a day of data, we keep most of it in the region where it's generated.
And there are good data privacy reasons for doing that, and compliance reasons, but there's a really good economic incentive as well. By leaving the data where it is and using cross-cluster search, we save about $40,000 every single day just in cloud data transfer costs.
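To sketch what that looks like: in an Elasticsearch cross-cluster search, the `<cluster>:<index>` pattern lets a single query fan out to remote clusters while the underlying data stays in its home region. The cluster aliases and index names below are made up for illustration; only the request shape is the point.

```python
import json

# A hypothetical sketch of an Elasticsearch cross-cluster search request.
# The "<cluster>:<index>" pattern lets one query fan out to remote clusters,
# so log data can stay in the region where it was generated. The cluster
# aliases and index names are illustrative.

remote_clusters = ["logs-us-east", "logs-eu-west", "logs-ap-south"]
index_pattern = ",".join(f"{c}:logs-endpoint-*" for c in remote_clusters)

query = {
    "query": {"match": {"event.action": "authentication_failure"}},
    "size": 100,
}

# A single search endpoint queries every remote cluster at once:
path = f"/{index_pattern}/_search"
body = json.dumps(query)
```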
So I mentioned that this data is useful not just in the operational context and in incident response, but also kind of moving left in that NIST cybersecurity framework. We actually use the data to drive projects.
So this year we saw that there was an urgent need to deploy phishing-resistant MFA. Multi-factor authentication is probably the most important control that we have in our context as a remote company, and tools like Evilginx have made the older forms of MFA, like a push notification, a token code, or an SMS, absolutely redundant. If you're still using those and phishing-resistant MFA is not on your roadmap yet, please go and do that. And if anybody wants to talk to me about it, I feel very strongly about it and I'm happy to chat about why it's important.
Deploying phishing-resistant MFA touched every single one of our employees and every one of our contractors, and we did it in three months by using data to support our rollout. So here we are combining, in a single dashboard, configuration state data. I'm sure it's not easy to read out there, but on the left we have configuration state: how many of our users have actually registered a phishing-resistant MFA mechanism? Which departments are they in? Which countries are they in? Which managers do we need to follow up with to get things moving
if somebody's lagging behind? Then, as you move further right, it's more real-time, time-series data from logs. Maybe 99% of our people have registered phishing-resistant MFA, but only 60% are logging in with it every day. That's going to change our communication strategy; that's going to change how we go to our users and say: you're not there yet, you're not protecting us yet, there's more to do. On the far right, those pie charts are one-hour, 24-hour, and seven-day views of those authentication events.
So we could really see in real time as this was getting adopted and adjust our approach.
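The two headline numbers on that dashboard, registered versus actively using, are simple aggregations over the same user population. A toy sketch with made-up user records:

```python
# A toy sketch (with made-up user records) of the two headline dashboard
# metrics: how many users have *registered* a phishing-resistant factor
# versus how many are *actually logging in* with one.

users = [
    {"email": "a@example.com", "webauthn_registered": True,  "uses_webauthn_daily": True},
    {"email": "b@example.com", "webauthn_registered": True,  "uses_webauthn_daily": False},
    {"email": "c@example.com", "webauthn_registered": False, "uses_webauthn_daily": False},
]

def pct(rows, key):
    """Percentage of rows where the boolean field is true, rounded."""
    return round(100 * sum(r[key] for r in rows) / len(rows))

registered = pct(users, "webauthn_registered")   # config-state metric
active = pct(users, "uses_webauthn_daily")       # log-derived metric
```

The gap between the two numbers is exactly what drives the follow-up communication.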
This Drake meme was really popular: Drake throwing out his iPhone and his one-time token code and switching to Touch ID, moving to a phishing-resistant MFA factor. People got the message, people loved it, people contacted InfoSec saying this is great. But you can see further to the right that a reply-all from our CTO is what really got people moving.
So what you're basically seeing there in the chart is that we sent out an email, and one hour later we can already see the impact in the data: are people responding, are people doing what we're asking them to do?
Next up: automation. So we touched on insights and how data can give you insights.
How do we automate that, and what are we doing? We enrich our alerts as they come from the SIEM with additional data from those various data sources that I just talked about. Before a security analyst even sees an alert coming from the SIEM, if the first thing they were going to do is go and look at some other data and bring it in to support the investigation,
well, why don't we just do that in advance? So before the security folks see the alerts, we're already adding data about the workstation, or about the user.
We can just go and pull that automatically. We can run additional commands to receive context about the workstation: what software was running, how long has it been running, what OS is it on? But what really excites me here is that we actually distribute those alerts to people who can add context.
So again, maybe the first thing that the security analyst was going to do is contact somebody in IT, or contact the system owner, or contact an end user: we just saw that you changed from a token code to a YubiKey. Was that you? Do you need help? And the user can answer yes or no, they can add comments, they can say they need help, and maybe a security person never needs to see that alert at all.
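Sketched as code, the workflow looks roughly like this. The lookup and chat functions are stubs standing in for osquery, the identity provider, and a chat bot; none of this is our actual implementation, just the shape of the idea.

```python
# A hypothetical sketch of the enrich-then-ask workflow. The lookup and chat
# functions are stubs standing in for osquery, the identity provider, and a
# chat bot; this is not an actual implementation.

def lookup_workstation(hostname):
    # Stub for an osquery-backed inventory lookup.
    return {"os": "macOS 14.1", "uptime_days": 3}

def ask_user(email, question):
    # Stub for a chat-bot prompt ("Was that you?"); a real bot would wait
    # for the user's reply asynchronously.
    return {"answer": "yes", "comment": "Switched to my YubiKey"}

def enrich_alert(alert):
    alert["workstation"] = lookup_workstation(alert["host"])
    alert["user_response"] = ask_user(
        alert["user"], "Did you just change your MFA factor?")
    # If the user confirms it was them, the alert can close without
    # consuming any analyst time.
    alert["auto_closed"] = alert["user_response"]["answer"] == "yes"
    return alert

alert = enrich_alert({"host": "laptop-42", "user": "alice@example.com",
                      "rule": "mfa_factor_changed"})
```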
Before we were even talking about AI, just this kind of automation, these workflows, were saving us more person-hours every year than the entire headcount of our InfoSec team, and we have 40 people in the team. That's a lot. That's a big impact.
And again, it's not about replacing people; it's about augmenting people, about never having to hire those additional people in the first place.
Moving into AI: how are we using AI now, and how does that data help? The Elastic AI Assistant is already built into our SIEM, and because so much of our code, our documentation, our security detections are all free and open (we're an open source company),
the public AI models are already very well trained on what to do.
And a lot of that shared context about how to investigate a security event is already out there on the public internet. So our AI Assistant basically tells an analyst: here are some recommended steps, here's how you can proceed with this investigation. And this is how we're going to help with that cybersecurity skills gap. There's no shortage of people who really want to work in this industry, but that first part, the ramp-up from being a really excellent student who understands the domain to being somebody you can trust to be on the front line and respond,
that's the really hard part of addressing the cybersecurity skills gap, and where AI can help.
We can also select, and this is about data privacy: in this case this is public AI, going out at the moment to ChatGPT and others. So we can choose, in real time or by policy, which fields from that alert, which of our data (our IP addresses, context about the systems) gets sent to that public GPT. The analysts can turn things on and off, they can anonymize, but we'll very soon be switching to private instances. So what's next?
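A field-level policy like that can be sketched as follows: allow some fields through, pseudonymize others, and drop everything else by default. The policy table and field names are illustrative, not the actual Elastic AI Assistant configuration.

```python
import hashlib

# A hypothetical sketch of field-level control over what an alert shares
# with a public LLM: allow, pseudonymize, or drop each field, with unknown
# fields dropped by default. The policy table is illustrative only.

POLICY = {
    "rule.name": "allow",
    "host.name": "anonymize",
    "user.email": "anonymize",
    "source.ip": "drop",
}

def pseudonymize(value):
    """Stable, non-reversible stand-in so the LLM can still correlate fields."""
    return "anon-" + hashlib.sha256(value.encode()).hexdigest()[:8]

def redact(alert):
    out = {}
    for field, value in alert.items():
        action = POLICY.get(field, "drop")  # default-deny unknown fields
        if action == "allow":
            out[field] = value
        elif action == "anonymize":
            out[field] = pseudonymize(value)
    return out

safe = redact({"rule.name": "mfa_factor_changed",
               "host.name": "laptop-42",
               "user.email": "alice@example.com",
               "source.ip": "10.0.0.5"})
```

Hashing rather than deleting the host and user fields keeps the alert useful to the model (the same host stays the same token across alerts) without disclosing the real identifiers.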
There are just so many opportunities when you look across the whole stack. There was a slide yesterday, I think also from Sergey, highlighting that there are some use cases where you really need something like machine learning: you really need accuracy, you need real-time response. But there are so many fuzzy use cases around the edges where just semantic search and generative AI can really help.
Combining public and internal content, reducing that cognitive load, having guides that get in front of our people and tell them the next step so that they don't need to escalate to a senior person. There are so many opportunities for us there.
Here's a really dumb example that popped into my head the other day while I was preparing the slides. This technology only exists and was implemented 100% in Photoshop. But maybe a user just wants to know: can I install a browser extension? So a user wants to install the popular potato fax extension.
Imagine just having a bot that can help: this has a lot of positive reviews on Google, it's very widely used, there's nothing in our internal policies that says you can't use it.
In fact, from osquery on our workstations, we know that this is already installed on 850 workstations in our environment. All of your colleagues are using it. Go ahead, and thank you. This problem, I'm sure, is not top of mind or even on the radar for a lot of people; it's not what you're thinking about every day. But it's just a small, simple way that we can help make everybody's lives better and not have security people spending their time on busy work in these kinds of approval flows.
So there are so many opportunities. We don't need to be thinking about robots fighting robots and AI versus AI; we can just make things better for our users.
As I said, it doesn't need to be this hyper-futuristic AI-versus-AI thing that we think about.
Yes, that's exciting; yes, it will come. But there are a lot of opportunities right now. The power of AI resides in data, and it was Paul yesterday who was saying that to give yourself the foundation to be ready to take advantage of these technologies, relentless capturing and storing of data has to be part of your strategy. And you have to start thinking about that now: bringing that data in, which data sources are important. I hope I've helped you to see that it's more than just logs.
AI will provide the bridge between security data and actionable knowledge and wisdom at scale, helping our teams to be better and faster in their analysis and action, stronger and more effective, and hopefully less burnt out. Thank you, everybody.