We need a new way to measure AI security
TL;DR: Trail of Bits has launched a practice focused on machine learning and artificial intelligence, bringing together safety and security methodologies to create a new risk assessment and assurance program. This program evaluates the bespoke risks of AI-based systems and determines the safety and security measures they require.
If you’ve read any news over the past six months, you’re aware of the unbridled enthusiasm for artificial intelligence. The public has flocked to tools built on systems like GPT-3 and Stable Diffusion, captivated by how they alter our capacity to create and interact with each other. While these systems have amassed headlines, they constitute a small fraction of the AI-based systems currently in use, powering technology that influences outcomes in all aspects of life: finance, healthcare, transportation, and more. People are also attempting to shoehorn models like GPT-3 into their own applications, even though these models may introduce unintended risks or prove inadequate for their intended use. Those risks will only compound as the industry moves to multimodal models.
With people in many fields trying to hop on the AI bandwagon, we are confronting the same security and safety issues that have plagued previous waves of innovation over the last 50 years: proper risk identification and quantification, responsible and coordinated vulnerability disclosure, and safe deployment strategies. In the rush to embrace AI, the public has little sense of the full scope of its impact, or of whether these systems are truly safe. Furthermore, the work seeking to map, measure, and mitigate newfound risks has fallen short, because traditional measures carry limitations and nuances when applied to AI-based systems.
The new ML/AI assurance practice at Trail of Bits aims to address these issues. With our forthcoming work, we not only want to ensure that AI systems have been accurately evaluated for potential risk and safety concerns; we also want to establish a framework that auditors, developers, and other stakeholders can use to better assess the potential risks of, and required safety mitigations for, AI-based systems. Further work will build evaluation benchmarks, particularly focused on cybersecurity, for future machine-learning models. We will approach the AI ecosystem with the same rigor we are known to apply to other technological areas, and we hope these services transform how practitioners in this field work on a daily basis.
In a paper released by Heidy Khlaaf, our engineering director of ML/AI assurance, we propose a novel, end-to-end AI risk framework that incorporates the concept of an Operational Design Domain (ODD): the set of operating conditions under which a system is designed to function. Explicitly defining that domain makes it far easier to outline the hazards and harms a system can pose. ODDs originated in the autonomous vehicle space, but we want to take the concept further: with a framework that can be applied to all AI-based systems, we can better assess potential risks and required safety mitigations, no matter the application.
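To make the idea concrete, here is a minimal sketch of what encoding an ODD for a deployed ML component might look like. The class, field names, and thresholds are hypothetical illustrations, not part of the framework described in the paper; the point is only that an explicit, machine-checkable operational boundary lets a deployment refuse requests the system was never designed or validated to handle.

```python
from dataclasses import dataclass

# Hypothetical sketch: an explicit ODD for an ML component.
# Fields and thresholds are illustrative, not drawn from the paper.
@dataclass(frozen=True)
class OperationalDesignDomain:
    supported_languages: frozenset[str]  # inputs the model was validated on
    max_input_tokens: int                # beyond this, behavior is untested
    approved_tasks: frozenset[str]       # e.g., "summarize", not "medical advice"

    def permits(self, language: str, num_tokens: int, task: str) -> bool:
        """Return True only if the request falls inside the validated domain."""
        return (
            language in self.supported_languages
            and num_tokens <= self.max_input_tokens
            and task in self.approved_tasks
        )

# A deployment gate: requests outside the ODD are rejected rather than
# handed to the model, turning an implicit assumption into an enforced one.
odd = OperationalDesignDomain(
    supported_languages=frozenset({"en"}),
    max_input_tokens=4096,
    approved_tasks=frozenset({"summarize", "classify"}),
)

def handle_request(language: str, num_tokens: int, task: str) -> str:
    if not odd.permits(language, num_tokens, task):
        return "rejected: request is outside the operational design domain"
    return "accepted: forward to model"

print(handle_request("en", 1200, "summarize"))       # accepted
print(handle_request("fr", 1200, "medical advice"))  # rejected
```

The payoff of writing the boundary down is that risk assessment gains a fixed reference point: hazards can be enumerated against a declared domain rather than against whatever the model happens to do.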
We also discuss in the paper:
- When “safety” doesn’t mean safety: The AI community has conflated “requirements engineering” with “safety measures,” but the two are not the same thing. In fact, they’re often contradictory!
- The need for new measures: Risk assessment practices borrowed from other fields, e.g., hardware safety, don’t translate well to AI. More must be done to uncover the design issues that directly lead to systematic failures.
- When “safety” doesn’t mean “security”: The two terms are not interchangeable, and need to be assessed differently when applied to AI and ML systems.
- It hasn’t been all bad: The absence of well-defined operational boundaries for general AI and ML models has made it difficult to accurately assess the associated risks and safety, given the vast number of applications and potential hazards. We discuss which existing models can be adapted, specifically those that can ensure security and reliability.
The AI community, and the general public, will suffer the same consequences we’ve seen in the past, or worse, if we cannot safeguard the systems the world is rushing to adopt. To do so, it’s essential to get on the same page about terminology and techniques for safety objectives and risk assessments. However, we don’t need to reinvent the wheel: applicable techniques already exist; they just need to be adapted to the AI and machine-learning space. With both this paper and our practice’s forthcoming work, we hope to bring clarity and cohesion to AI assurance and safety, and to counter the marketing hype and exaggerated commercial messaging that currently deemphasizes the security of this burgeoning technology.
This approach builds on our previous machine-learning work, and is just the beginning of our efforts in this domain. Any organizations interested in working with this team can contact Trail of Bits to inquire about future projects.