So good evening, everyone. I found my passion for building security programs very early in my career. As described, I've built programs and organizations of different sizes across industries: I've worked in retail, consumer electronics, and tech, and most recently, before joining Elastic, I led the security team at a Fortune 100 global financial services organization. My textbook, Surviving Security, was selected by major universities as the text for their foundational InfoSec courses, which I was surprised to learn and was always amused by when students would reach out to have me help them with their homework.
Forty-five days. That is the current median dwell time for non-ransomware events, according to Mandiant's 2021 M-Trends report; it's 24 days when you include ransomware. Many of us look at this data point and strive to be better, but what we are doing is working, as median dwell time was over 400 days in 2011, 100 days in 2017, and over 50 days in 2020. I've been focusing for many years on how to make the security detection and response teams in my organizations more effective and efficient, working at the speed of security. We've all felt the pain: the number of threats increasing rapidly, the supply of skilled InfoSec professionals too small, consumerization of IT impacting enterprise controls, the move to the cloud commoditizing infrastructure, proliferation of data, name your buzzword. And then the pandemic had many of us quickly shifting to a remote work environment.
The impact for security is always the same: more data to protect, more data to analyze, and more potential vectors of attack. We will rarely be in an environment where we can completely prevent every security event. So how do we continue to find events faster and react even more quickly in today's world? Continuing to do things the same way we have been doing them isn't going to get us there. We need another inflection point. How do we approach this problem differently? I'm going to describe how we implemented a more complete approach at Elastic in three months than what I could implement
in two years at a previous organization with different technology. At that previous organization, we knew we needed to change our security monitoring approach. We were ingesting five terabytes a day, or 50,000 events per second (EPS), of data into a leading correlation platform and into a separate logging platform for analytics and threat hunting. We struggled to find the right balance between false positives and false negatives. We couldn't keep up with the alerts. Our logging costs kept increasing past seven figures annually in US dollars,
as we always had more data to bring in. The team was using the correlation platform less and less because correlation was no longer core to successful security detection. Level 1 analysts could not successfully triage alerts to understand what was happening in the environment. We automated where we could, and we were still ineffective, measured by how well, or in this case how poorly, we responded to an unannounced, independent pen test against our environment. The traditional approaches we were using were no longer working, and we went looking for a more effective way. We wanted to flip our mindset. We focused on finding a real-time, unsupervised, machine-learning-based solution, a solution that did not require us to tell it what was bad. We wanted it to tell us what was different from the norm: suspicious behavior across entities, users, or devices that warranted investigation.
There were a couple of technologies on the market, and after a six-month evaluation and proof-of-concept period, we selected the one that best met our criteria at the time. Nothing gave us everything we wanted, so we prioritized unsupervised learning models and avoiding data-volume pricing. It took six months and lots of professional services to get some initial models implemented.
After another 12 months, we had more models and were running SOC activities in parallel with our prior systems. Implementation challenges largely stemmed from the complexity of building the models and scaling data processing. The system struggled to run models against large data sets, even as batch jobs; smaller data sets could run in real time. We ended up needing to store more data in the machine learning system than originally planned to get the output we were looking for, so we had more data redundancy than we wanted, meaning increased costs as well.
We were seeing improvements in the effectiveness of our SOC. One of our initial models, implementing the "superman" use case of watching login locations for geographic anomalies, gave us insight into account takeover attempts that had previously been getting lost in all the failed-login-attempt alerts in our correlation system. We saw improvements in our test response, catching and stopping a potential intrusion in 45 minutes, where before we did not see any of the activities or take any action. Coming to Elastic, I knew we needed to anchor in this approach and mindset.
Traditional security monitoring would not be effective and would not scale in this dynamic environment. Defining this further, we wanted many of the same things: real-time, unsupervised machine learning, and avoiding data duplication as much as possible. New for Elastic, we wanted to avoid building out a traditional SOC. I had done it several times and it no longer worked well. I had been closely following the SOC-less monitoring approach that companies like Netflix and Slack were implementing, and we would do something similar.
I also now worked at a company with a product that should be able to do this, so we tried it. In three months at Elastic, we had a functioning monitoring environment that surpassed the capabilities I had running in my prior organization, which took two years to build.
We started with an approach of decentralized search with centralized alerting. The data stayed in its respective Elasticsearch cluster around the organization, and we sent all alerts to one location. We built rare-analysis models on the Elastic clusters where the data lived, all running real-time analysis with zero data redundancy. We could also do look-back analysis when we implemented a new model, and that ran in just a few seconds against 90-plus days of back data,
even on our busiest clusters. The models that are now distributed with the SIEM solution would help get this approach up and running even faster today. We also defined a few watches for specific activities we knew we never wanted to see in our environment, so any alert required immediate investigation. This overall approach did require manually configuring each Elasticsearch cluster with detection rules and watches, which could be time consuming and easily introduce errors.
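As a minimal sketch of that phase-one pattern, not our exact implementation, the snippet below assumes the elasticsearch Python client and hypothetical cluster URLs, index names, and rule details: a detection query runs locally on the cluster that holds the data, and only the resulting alert documents are written to a central alerting cluster.

```python
from elasticsearch import Elasticsearch

# Hypothetical endpoints: the data cluster holds the raw logs,
# the central cluster only receives alert documents.
data_cluster = Elasticsearch("https://logs-cluster.internal:9200", api_key="...")
central_cluster = Elasticsearch("https://alerting-cluster.internal:9200", api_key="...")

# A "never want to see this" style detection, e.g. interactive logins
# by a service account, evaluated against the cluster where the data lives.
detection = {
    "bool": {
        "filter": [
            {"term": {"event.category": "authentication"}},
            {"term": {"user.name": "svc-backup"}},
            {"range": {"@timestamp": {"gte": "now-5m"}}},
        ]
    }
}

hits = data_cluster.search(index="logs-*", query=detection, size=100)["hits"]["hits"]

# Forward only the alerts; the raw data never leaves its home cluster.
for hit in hits:
    central_cluster.index(
        index="security-alerts",
        document={
            "rule": "service-account-interactive-login",
            "source_cluster": "logs-cluster",
            "event": hit["_source"],
        },
    )
```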
Where we went next was to focus on decentralized collection, centralized search, and centralized alerting, utilizing a functionality in Elasticsearch called cross-cluster search. The data stayed where it lived, but we could now run models and search across multiple clusters from one central location. This allowed us to configure once and have everything propagate across clusters, removing the manual effort and the significant risk of human error. Making this transition took about an additional three months of effort.
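For context, cross-cluster search lets one query address remote clusters with a `cluster_alias:index` pattern. The sketch below is an illustration only, assuming the elasticsearch Python client and hypothetical cluster aliases and field filters, not our production configuration.

```python
from elasticsearch import Elasticsearch

# The central cluster has the remote clusters registered (via
# cluster.remote.* settings); queries reference them by alias.
central = Elasticsearch("https://central-cluster.internal:9200", api_key="...")

# One query fans out to the remote data clusters; results come back merged.
response = central.search(
    index="corp_logs:logs-*,cloud_logs:logs-*,eng_logs:logs-*",
    query={
        "bool": {
            "filter": [
                {"term": {"event.category": "authentication"}},
                {"term": {"event.outcome": "failure"}},
                {"range": {"@timestamp": {"gte": "now-15m"}}},
            ]
        }
    },
    size=50,
)
print(response["hits"]["total"])
```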
Elastic continued to grow, with our data expanding 400% and events per second growing 57%, but staff staying the same. When we switched to cross-cluster search, we did see some performance impacts, with our 90-day initial look-back searches going from a few seconds to a few minutes when starting a new machine learning job looking at data across multiple remote clusters. Deploying cross-cluster search, though, overall significantly reduced the operational burden on the team by centralizing where we manage detections, machine learning, and alert review.
The next step on our journey was to take all the good things we built in phase two and share them with others. This is the phase we are currently building and are very close to completing. Here we move from centralized alerting to distributed response. Just as Elastic as a company has always been distributed by design, our response processes will be as well. Generally, this means we rely on domain experts for help when responding to security events.
But this phase takes that approach even further by presenting individual Elasticians with the ability to respond to events in a guided manner, without incident response personnel needing to be involved. We now have the capability to send someone an alert about suspicious activity related to them, and they can verify whether the activity was legitimate or not. We can also require them to reauthenticate with multi-factor authentication to help prove their identity.
In the case when someone says it wasn't them, we immediately notify an incident response team member and engage our standard response workflow. This frees us up to build new detections, tune detections, build new workflows, run threat hunts, and more. Phase four feels like the destination, but it is just another stop on our journey: decentralized collection, centralized search, and distributed and automated response, while being almost entirely Elastic Stack native.
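As a rough sketch of that guided-response flow, with hypothetical stub functions standing in for the messaging, MFA, case, and paging integrations we actually use, the decision logic looks roughly like this:

```python
# Hypothetical integration stubs: in practice these would call chat,
# the identity provider, Kibana cases, and on-call paging tooling.
def notify_user(user: str, message: str) -> bool: ...
def require_mfa_reauth(user: str) -> bool: ...
def open_case(alert: dict, status: str, note: str) -> dict: ...
def page_incident_response(case: dict) -> None: ...


def handle_user_alert(alert: dict) -> None:
    """Route a user-related alert through a guided, distributed response."""
    user = alert["user.name"]

    # Ask the affected Elastician directly whether the activity was theirs.
    it_was_them = notify_user(
        user, message=f"Unusual activity on your account: {alert['rule']}. Was this you?"
    )

    # Require a fresh multi-factor challenge as additional proof of identity.
    mfa_ok = require_mfa_reauth(user)

    if it_was_them and mfa_ok:
        # Benign: record the outcome without involving incident response.
        open_case(alert, status="closed-benign", note="Confirmed by user with MFA")
    else:
        # Suspicious: notify an incident responder and run the standard workflow.
        case = open_case(alert, status="open", note="User denied activity or failed MFA")
        page_incident_response(case)
```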
We'll be starting this phase of work in early 2022, building on the scenario from phase three where we present an alert to an Elastician about account activity. If the Elastician were to indicate the activity wasn't them, we will take action immediately.
This action could be removing privileged groups from their Okta profile, changing the Elastic Agent policy on their laptop to collect more information, running osquery collections against their system, and pulling the logs into a Kibana case for the Elastician and the systems they have interacted with. This and more are all possible due to the available integrations from the tools we use and the API-first nature of the Elastic Stack.
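To make the API-first idea concrete, here is a hedged sketch of what one such automated action could look like, using Okta's group membership API; the base URL, token handling, and group IDs are illustrative assumptions, and the endpoint should be verified against current Okta documentation.

```python
import requests

# Illustrative values only; real group IDs, tokens, and user IDs would
# come from configuration and from the triggering alert.
OKTA_BASE = "https://example.okta.com"
OKTA_TOKEN = "<api-token>"
PRIVILEGED_GROUP_IDS = ["00g_admin_group_id"]


def remove_privileged_groups(okta_user_id: str) -> None:
    """Strip a user from privileged Okta groups as an automated response step.

    Uses Okta's group membership API (DELETE /api/v1/groups/{gid}/users/{uid});
    treat the endpoint and auth scheme here as assumptions to verify.
    """
    headers = {"Authorization": f"SSWS {OKTA_TOKEN}", "Accept": "application/json"}
    for group_id in PRIVILEGED_GROUP_IDS:
        resp = requests.delete(
            f"{OKTA_BASE}/api/v1/groups/{group_id}/users/{okta_user_id}",
            headers=headers,
            timeout=10,
        )
        resp.raise_for_status()
```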
As we scaled and evolved our security monitoring approach, we also needed to make some changes to how we managed the underlying infrastructure. With the original architecture approach, system upgrades took almost three months to complete, largely due to data migration timeframes. Decoupling data storage removed the data migration needs, allowing us to complete full system upgrades in just a few hours. Moving to Elastic Cloud on Kubernetes (ECK), implementing standard configuration files, and building CI/CD pipelines further improved our full system upgrade time to just a few minutes.
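As one illustration of that configuration-as-code idea, a minimal sketch assuming an ECK Elasticsearch manifest stored in a Git repository (the file path and version number are hypothetical): a pipeline can drive an upgrade simply by bumping `spec.version` and letting the operator do the rest when the manifest is applied.

```python
import yaml  # PyYAML


def bump_eck_version(manifest_path: str, new_version: str) -> None:
    """Update spec.version in an ECK Elasticsearch manifest.

    When CI applies the changed manifest (e.g. kubectl apply -f),
    the ECK operator performs the rolling upgrade.
    """
    with open(manifest_path) as f:
        manifest = yaml.safe_load(f)

    assert manifest.get("kind") == "Elasticsearch", "not an ECK Elasticsearch manifest"
    manifest["spec"]["version"] = new_version

    with open(manifest_path, "w") as f:
        yaml.safe_dump(manifest, f, sort_keys=False)


if __name__ == "__main__":
    bump_eck_version("deploy/security-cluster.yaml", "7.16.2")  # hypothetical values
```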
I have been amazed, watching this progression, at what the team has been able to accomplish here. While we started with a focus on security monitoring, we quickly looked to apply these approaches to other areas in our information security program to help them scale as well. We first started with vulnerability management. The InfoSec team runs vulnerability management as a service within Elastic, providing engineering teams with vulnerability data for the tech that they operate.
We work with the teams to deploy the agent to their systems and then pull vulnerability detection data into the stack for the teams to work with. We provide dashboards and monthly reporting, but since the teams already know the Elastic Stack, some of them build their own dashboards, or automation that creates tickets or alerts them in Slack when a new critical vulnerability is detected. They don't need to learn the UI or API of the specific scanning tools that we use; they just use the Elastic APIs that they already work with every day. Our data is normalized to the Elastic Common Schema,
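As an example of the kind of self-service automation a team could build on top of this, a sketch only, with hypothetical index patterns, field names, and a placeholder Slack webhook: a small job queries the stack for recent critical findings and posts them to a channel.

```python
import requests
from elasticsearch import Elasticsearch

# Hypothetical endpoint, index pattern, and webhook URL.
es = Elasticsearch("https://security-cluster.internal:9200", api_key="...")
SLACK_WEBHOOK = "https://hooks.slack.com/services/T000/B000/XXXX"

# Critical vulnerabilities seen in the last 24 hours, using ECS-style
# vulnerability fields (vulnerability.severity, vulnerability.id, host.name).
results = es.search(
    index="vuln-findings-*",
    query={
        "bool": {
            "filter": [
                {"term": {"vulnerability.severity": "Critical"}},
                {"range": {"@timestamp": {"gte": "now-24h"}}},
            ]
        }
    },
    size=20,
)

for hit in results["hits"]["hits"]:
    doc = hit["_source"]
    text = (
        f":rotating_light: Critical vuln {doc['vulnerability']['id']} "
        f"on {doc['host']['name']}"
    )
    requests.post(SLACK_WEBHOOK, json={"text": text}, timeout=10)
```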
so if we change the underlying scanning product or add additional ones, it's transparent to users of the service. We then focused on asset management. One big advantage of the shift to infrastructure as a service is that the cloud provider knows exactly what assets you have in the environment; otherwise they couldn't send you that huge bill every month. This means you have real-time inventory information available and no longer need to rely on manual documentation or ping sweeps to understand what is on your network.
Of course, the reality is more complex. We are multi-cloud and have hundreds of separate accounts across GCP, Azure, AWS, and IBM. Each provider does it a little differently, but all of them make inventory data available via an API, and we are pulling that into a consolidated asset inventory built on the Elastic Stack that is always up to date. In many cases, we can pull security- and compliance-relevant metadata along with the inventory, so we can use it to support continuous control monitoring use cases.
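As a hedged example of that pattern for a single provider (AWS here, via boto3; the index name and document shape are illustrative, and each provider would get its own collector), a collector pages through instances and indexes a normalized inventory document per asset:

```python
import boto3
from elasticsearch import Elasticsearch

# Hypothetical cluster endpoint and index name.
es = Elasticsearch("https://security-cluster.internal:9200", api_key="...")
ec2 = boto3.client("ec2", region_name="us-east-1")

# Page through EC2 instances and index one inventory document per asset;
# GCP, Azure, and IBM collectors would follow the same overall shape.
paginator = ec2.get_paginator("describe_instances")
for page in paginator.paginate():
    for reservation in page["Reservations"]:
        for instance in reservation["Instances"]:
            es.index(
                index="asset-inventory",
                id=instance["InstanceId"],
                document={
                    "cloud": {"provider": "aws", "instance": {"id": instance["InstanceId"]}},
                    "host": {"type": instance["InstanceType"]},
                    "state": instance["State"]["Name"],
                    "tags": instance.get("Tags", []),
                },
            )
```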
At the same time, the detection team also uses this information to understand the security posture and state of an asset when investigating an alert. Our most recent area of focus has been customer assurance. We want every Elastician to be confident in talking to our customers about the security controls we have in place to help keep their data safe. We use Elastic Enterprise Search to power a self-service engine that enables anybody at Elastic to search a library of 300-plus control statements based around the Cloud Security Alliance framework.
We can tune the search results based on synonyms, such as returning access control results when a user asks for RBAC, or using curated queries that always show a particular set of results first for a given keyword. Each control question and answer pair includes metadata so that we know which InfoSec service owns the answer and when it was last updated. And since it is all in Elasticsearch, we can visualize it in Kibana and see which teams need to update their content.
We built a prototype that uses the same search engine to automatically fill a spreadsheet of questions with three proposed answers and their corresponding scores. The continuous feedback and tuning of synonyms and curations makes the whole system better, both for the user who searches interactively for one specific question and for the account manager who uploads her customer's 300-question spreadsheet.
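A rough sketch of how such a questionnaire prototype could work, where `search_controls` is a hypothetical stand-in for a query against the control-statement search engine and the spreadsheet handling uses openpyxl with an assumed layout (questions in column A, header in row 1):

```python
from openpyxl import load_workbook


def search_controls(question: str) -> list[dict]:
    """Hypothetical stand-in for querying the control-statement engine;
    would return results like {"answer": ..., "score": ...}, best match first."""
    return []  # replace with a real Enterprise Search query


def fill_questionnaire(path: str) -> None:
    """For each question in column A, write the top three proposed answers
    and their scores into the adjacent columns."""
    wb = load_workbook(path)
    ws = wb.active
    for row in range(2, ws.max_row + 1):  # row 1 is assumed to be a header
        question = ws.cell(row=row, column=1).value
        if not question:
            continue
        for i, result in enumerate(search_controls(question)[:3]):
            ws.cell(row=row, column=2 + 2 * i, value=result["answer"])
            ws.cell(row=row, column=3 + 2 * i, value=result["score"])
    wb.save(path)
```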
Where do we go next? A key area of focus for us will be adopting Open Policy Agent to improve our ability to assess security configuration requirements in a free and open approach. We will start with existing infrastructure and will then shift left to apply the security assessments against infrastructure as code. There are always new and exciting challenges working at the speed of security. Thank you.