Some were startups, some were multi-thousand-person companies, and none of our answers were perfect for them. What you needed was to build a broad base, a coalition if you will, of individuals who could come together, bring their best practices and their learnings, merge them together, and create an ability to democratize access to security information for the ecosystem. Because everybody should have the information required to build secure AI systems.
How we all do it, and our success at it, is obviously based upon our own entities, but there's no way you should be in a position where you don't know what the best practice is, what the requirement is, what the risks are, and the way to do it. So we said we need something in that domain. And in July 2024, we created the Coalition for Secure AI (CoSAI), and we're primarily focused on sharing security information out to the ecosystem: starting with best practices, materializing that into open source solutions wherever possible, and making sure we collaborate on AI security research.
And as we go forward, providing the things that are required so that companies, individuals, and entities can secure their AI product development, AI integration, and AI use across the ecosystem. We were lucky that, when we were doing this, OASIS Open was there to help us build an open coalition without having to figure out how to become experts in open. OASIS provides all of that construct, all of that framework, so that we can come together and work in this domain.
So when we started in July, we were 11 sponsors, including companies like Anthropic, NVIDIA, Google, Microsoft, and Amazon. And we're lucky enough now to have 35-plus sponsors. These are the entities that are providing their time, their people, and their money to help this move forward. But the important thing is that this is an open ecosystem. That means anybody can participate, and we want folks to participate. So what we're going to talk more about now is what problems we're trying to approach and how you can join in if you're interested.
Because what we need is not just individuals who have answers in this space; we need lots of participation from people who have questions, who have problems they're trying to approach, who have issues that need resolution so they can do the best for their company, their entity, their organization. So where we start from is that there is a common set of problems and challenges around security for AI systems. You want to know where you got your system from, and that it really was from that location: provenance. You want to make sure you can securely integrate and use these systems.
You want to make sure you know how you can control your data. And you want to make sure that when you're using inference across your AI ecosystems, whether generative AI or classical AI, there's a way to do that where you have management of your risk. Not that there are no security issues, because no such ecosystem exists, but that you understand your risk, it fits your tolerance, and you have the ability to move forward. Our goal was to provide a set of work streams that could start to address the most pressing problems in those domains.
So what we found, through all of this conversation, was that three problems were the loudest across the ecosystem of challenges, with a lot of consistency. The first was software supply chain for AI systems. We've done a lot of great work as an ecosystem over the last five to seven years, building SLSA, OpenSSF, and the like, really trying to provide ways to prove that you know where your software comes from and that it hasn't been modified in production. Doing that for AI systems has a lot of interesting complexities.
Classically, for SLSA-type systems and the rest, you're doing proofs across things that were hundreds of megabytes or hundreds of gigabytes, executable files and the like. Now you're walking into a domain with AI systems where you're talking about petabytes of data and data files that you need to somehow provide provenance statements around. Additionally, it's not enough just to be able to say you know where it came from and that it hasn't changed.
There are also things that, as an AI practitioner, you are really keenly focused on understanding the background of. Where did the data come from? Are there geographic constraints on that data's usage? How was it trained? Who trained it? Where do you expect this to be used? What are the purposes, and what testing was expected around it? These are the kinds of content that, classically, a provenance group by themselves wouldn't think about, but that we knew CoSAI needed to bring to the table. And this brings together one of the key elements of CoSAI.
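To make that concrete, here is a minimal sketch, in Python, of what a provenance manifest for a model directory might look like: a streamed digest per artifact plus the practitioner-facing metadata just described. The layout, field names, and example values are illustrative assumptions, not a CoSAI or OpenSSF format.

```python
import hashlib
import json
from pathlib import Path


def hash_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream-hash a file so multi-gigabyte weight shards never load into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(model_dir: str) -> dict:
    """Pair a digest per artifact with the metadata an AI practitioner would ask about."""
    root = Path(model_dir)
    artifacts = {
        str(p.relative_to(root)): hash_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }
    return {
        "artifacts": artifacts,                     # what the bits are, and that they haven't changed
        "data_sources": ["internal-corpus-v3"],     # where the data came from (example value)
        "geographic_constraints": ["EU-only"],      # constraints on that data's usage (example value)
        "trained_by": "ml-platform-team",           # who trained it (example value)
        "intended_use": "internal code assistant",  # expected purpose and testing scope (example value)
    }


if __name__ == "__main__":
    print(json.dumps(build_manifest("./model"), indent=2))
```

Streaming the hashes in chunks is the point of the sketch: the artifacts here can run to gigabytes or more, so the proof has to work without ever holding a whole file in memory.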
CoSAI doesn't exist in isolation. We have large partnerships with a whole bunch of other forums. One of them is OpenSSF. We don't have any goal of duplicating the efforts of any of these other organizations; our goal is to extend and add to the color of their ecosystems. With OpenSSF, they're providing a lot of really wonderful underpinnings at the algorithmic and core file-signing level. They have an AI working group, and we have a lot of cross-partnership between that AI working group and CoSAI.
Our goal is really to focus all of that AI-specific expertise on top of it, so that practitioners can know they're actually solving their provenance problem while they're using AI systems. The second area we heard huge questions and problems around is: hey, my company's already using AI, so how do I ensure my defenders are prepared for that use of AI? So this is not the risk of AI to your company or your entity. This is your own use of AI, whether it's the use of an LLM in a logging infrastructure or the use of classic ML in a trading system.
It's thinking about and knowing the questions to ask from a defender standpoint: what changed in my threat surface? How do I have to defend this differently? Where do I have to spend my money this week versus next week to make sure it's more secure? What are the things that I need to invest in that I haven't thought of previously? This is a really big problem for defenders, and they generally don't have enough information on how to secure themselves in this space.
So in this work stream, we're starting with: how do we get the threat information, and the sharing of that information, to defenders at the base level? How do we give them a way to think about the change to their threat surface and where they need to invest to defend themselves through their use of AI? The third area is AI security risk governance. It's too large to fit on the screen, so we made it AI risk governance, but it's AI security risk governance.
In this domain, it's: how do we enable engineering directors and chief technology officers with the questions and the frameworks with which they can evaluate their own risk taking with regards to AI? Are they in a space they don't understand? Do they have shadow AI ecosystems? Do they have particular vulnerabilities and controls they need to focus on based on their data usage, based on the way they've done fine-tuning, based on where their production data is being used in the ecosystem? The best part about CoSAI is that it's an open ecosystem, right?
We have wonderful sponsors, and we are open to new sponsors at any time. But more important for us is that everybody can engage in our work streams. Our work streams are open work streams. We share all of the content on GitHub. We have Slacks. We have mailing lists. And it's super important for us to find individuals who bring both their expertise on how to secure and how to build secure systems, as well as the questions, the issues, the things they need to resolve as part of this problem domain. And it's a globally based effort.
Our goal is to bring in and address the problems everywhere, not just in one domicile, one area. So our work streams have launched; they started in October of this year. As an example, this is an RFC from our work stream one that's really focusing on how you expand the basic idea of provenance into AI systems: not just an AI model, but the system itself and all of the integrated components. And what does it mean to provide tamper proofing, so that at inference time you know you're still running the same system? All of our work streams are starting to progress.
They're all at different levels of maturity. And it's just a truly amazing time to get engaged with this type of effort.
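As an illustration of the tamper-proofing question that the work stream one RFC raises, here is a minimal sketch of checking a system against its provenance manifest before serving inference. It assumes the illustrative manifest format sketched earlier, not the RFC's actual mechanism.

```python
import hashlib
import json
from pathlib import Path


def hash_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream-hash one artifact, matching how the manifest digests were produced."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def verify_before_inference(model_dir: str, manifest_path: str) -> None:
    """Refuse to serve a system whose files have drifted from the recorded digests."""
    manifest = json.loads(Path(manifest_path).read_text())
    root = Path(model_dir)
    for rel_path, expected in manifest["artifacts"].items():
        actual = hash_file(root / rel_path)
        if actual != expected:
            raise RuntimeError(f"tamper check failed for {rel_path}")
    # Only after every artifact matches would the serving stack load the system.
    print("all artifacts match the manifest; safe to load and serve")


if __name__ == "__main__":
    verify_before_inference("./model", "./model-manifest.json")
```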
With that, I'd love to pause and see if you all have any questions, any areas you'd like to focus on, problems in AI that you'd love to see answers for. For me, hearing the areas that you all have questions about is more important than almost anything else we can do in these kinds of sessions.
Over to you, and you can speak into the microphone. I'll try to. Thank you. First of all, fascinating, and I'm always really keen on these kinds of things. But in my mind, there are four possibilities where things like AI might come from: there's industry, there's academia, there's government, and there's the military. And all of them have different goals, and you touched on the goals as well. So do you see any differentiation where you try to address those four sources differently? Because I think motivation is the one thing that drives it, right, and makes it adaptable.
And therefore, you have different security goals in mind as well. I think it's a great question. So to reflect it back: are we seeing, due to different originators of requests and needs around AI, different requirements for control and security? 100%. Right? There are definitely use cases, as an example, where we know that companies and entities say: this needs to be on your device, your mobile phone. And it needs to be done in a way that we can confirm for you, and ensure for you, that only your mobile phone can use it.
And that only the specific application you think has the AI model can share that content. Right? We absolutely see that as an example. We see examples where, hey, I'm a domicile, and the data shouldn't leave my boundary, right? So I want security controls that let me enforce that. And there's a large spectrum of different types of scenarios in between. One of the things we're looking at is a fourth work stream that's really focused on the delivery of secure applications and secure agentic infrastructures.
Because that's where a lot of the variability that I think you're hinting at starts to come to the table. The wonderful thing about provenance is that it's generally the same problem everywhere. How you do the final proof differs: are you willing to use something like Sigstore's open source PKI infrastructure, or do you need to use your own, bound into your company? Those things are different, but our solutions will allow for all that variability; you can plug and play across the space. Helping the defenders tends to help everybody; there's no real generalization problem there.
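To show what that plug-and-play proof step can look like, here is a minimal sketch, using Python's `cryptography` library, of signing and verifying a provenance manifest with a raw Ed25519 key. The key is generated locally purely for illustration; in a real deployment the signing identity would come from Sigstore, a company-internal PKI, or whatever trust root fits your constraints. None of this is a CoSAI-specified mechanism.

```python
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# The provenance manifest, serialized to bytes (contents follow the
# illustrative manifest sketched earlier).
manifest_bytes = b'{"artifacts": {"weights.bin": "..."}}'

# Producer side: in practice this key comes from whichever root of trust you
# choose, e.g. a Sigstore-issued identity or a company-internal PKI.
signing_key = Ed25519PrivateKey.generate()
signature = signing_key.sign(manifest_bytes)

# Consumer side: only the public key (or its certificate chain) is distributed.
verifier = signing_key.public_key()
try:
    verifier.verify(signature, manifest_bytes)
    print("provenance signature valid")
except InvalidSignature:
    print("provenance signature rejected")
```

The point is that the manifest bytes and the verification call stay the same; only where the key material comes from changes between deployments.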
And on the risk side, the problem with risk is always that we can only ever help with generalized risk frameworks up to a certain level. There's always the applicability to the way your business runs, and how much risk you want to take, that's left to the user to implement. One of the reasons we started with those three is that we not only sidestep that, but we get a little bit of progress on the big questions before we jump into the more specific domain problems. Secure applications and secure agentic infrastructures have a lot more of those questions.
One of the big things we're looking at is ensuring that we have a lot more representation from areas, for example, that have more constrained computation capability or requirements, whether it's mobile phones, old mobile phones, or smaller server infrastructures that can't use cloud, or whatever it happens to be. And then how do you provide secure inference ecosystems, or even better, secure fine-tuning ecosystems, for those organizations? Or, in the space that's obviously a little bit more interesting for folks right now, agentic infrastructures.
How do you tie an LLM, maybe a more distilled model, together with debugging tools or coding tools and the rest, and do it in safe ways? That's definitely where we see a lot more variability. The good news is that, so far, the controls and the technical underpinnings seem to be the same type everybody needs; it's how they compose them that varies. So we're a security program, right? We're fundamentally focused on security primitives, but we're going to provide you primitives that allow you to make all sorts of different claims around all sorts of different domains.
You could use our provenance primitives to help you build a privacy program, or something like that. We think that because we're focused at that level, beyond some additional work, including that additional work stream we may have to launch, we can allow all of those different entities, with their different drivers and different risk tolerances, to compose the outcomes they need. But I think it's a truly wonderful question. Thank you for that. So there are maybe three or four, probably more than I can count, groups that are trying to do similar things.
How do you want to differentiate what you're doing from, let's say, CSA, who is also going to cover AI risk governance, or OWASP and MITRE ATLAS, who are going to cover the middle one, and tons of people also trying to cover the first one? What's the specific niche that you want to carve out? I think that's a wonderful question. I spent a lot of time spinning up CoSAI.
One of the reasons it took us a year, plus or minus, to launch it was to make sure we knew where the space was, because there is a plethora of unbelievable other forums doing work in this domain, some we haven't mentioned: ML Commons, C2PA, Frontier Model Forum, really, really great entities. And a lot of the work was ensuring we knew where the space was and whom we were going to have really, really strong collaborations with.
So in our announcement in July, we made it clear that Frontier Model Forum is one of our key collaborators, because we're really starting at the integration layer after the building of foundation models, right, how you do fine-tuning and beyond. So there's our boundary with Frontier Model Forum: they're dealing with all the important things around how labs build frontier models and the rest. That's their side. OWASP, and I feel like I paid you to sit there and say this, OWASP and CSA are some of our stronger burgeoning partnerships, right? We all see the problem very similarly.
What CoSAI is bringing to the table is a lot more of the really focused AI security experience, expertise, and practitioners concentrated on that portion of the domain, so that we can expand the really great work going on in OWASP and CSA, as two examples, and bring more color and more engagement to those spaces. So with those, we see really strong partnerships.
With MITRE and the standards organizations and the like, we have a public sector steering committee that's really working on how we engage with the executive branch, how we engage with standards organizations, and how we engage with some of the other labs around providing research, requests for comments, feedback, and engagement in those domains. And there's a whole bunch of stuff we just don't overlap with, right? We're very formally security, right? So we're not doing cyber evaluations or ML model testing, so there's no ML Commons overlap.
We're not doing data watermarking or image watermarking or media watermarking, so there are no C2PA components to it. We have a tight focus on security, and that allows us to create a scope I call narrow, even though security is unbelievably broad, but narrow when you look at the broader ecosystem, right? We're not safety. We're not privacy. We're not tackling the broader problem of how you deploy a model in the cloud all by itself. This is really about how you do this stuff securely and provide the underpinnings. So we think there are two outcomes in these spaces.
One is we want people to be able to steal like artists from our content, right? We want to provide the best foundational components required for folks to build secure systems and to use them to build their claims, and if other ecosystems and other standards then take those things and integrate them into their own, we think that's perfect. That's the best possible outcome. The other place is where we're going to do a lot of collaboration and cooperation around driving towards a particular goal.
Likely that'll be something in the OWASP or the CSA space or the FMF space as we go forward, where there are very specific problems we can help with alongside larger communities that are more focused in that domain. But I think it's a really great question. We have a few more minutes before we need to wrap up for the next session. So maybe take a few words to leave us with, and we'll do a last check for questions at the end. Sure. More important to us than anything else is: give us feedback. Get engaged if you're willing to. Our work streams are open. Anybody can read. Anybody can engage.
If you want to participate, all that's required is to sign an electronic contributor license agreement. We love new sponsors. We love individuals who can bring new problems to the table, new investments to the table, new people to the table. It's always a wonderful thing for us. But most important for us in this AI space is that we want to help make sure the ecosystem is not going to speed-run the last 35 years of vulnerabilities in AI systems in the next year.
So we can do the work to make sure that, as we go to AI, we avoid the teething pains of enterprise adoption of cloud, middleware, and web systems, and some of those really key problems. That's our real goal. That's our real focus. We believe that the information required to secure systems should be in everybody's hands. We believe that information should be living, and that you should have the ability to leverage it to the best of your ability. And we think that, wherever possible, we should provide systems that just make that simpler and make it more of a straightforward implementation.
And that's our open source focus: delivering stuff in that domain. So with that, if there are any other questions, I'd love to hear them.
If not, you can always reach out to us. If you have any of those things, get them to us. I'll be around after this session. But I also just want to thank everybody for participating. Thank you very much. Thank you.