While most CISOs and CIOs have created AI policies, it is no surprise that more extensive due diligence, oversight and governance are required for the use of AI in a cybersecurity context. According to Deloitte’s annual cyberthreat report, 66% of organizations suffered ransomware attacks. There was also a 400% increase in IoT malware attacks. And in 2023 91% of organizations had to remediate a supply chain attack affecting the code or systems they used. That’s because the long-standing cybersecurity practices that worked in the past, haven’t caught up to the capabilities and threats presented by large language models (LLMs). LLMs trained on vast quantities of data can make both security operations teams, and the threats they’re trying to mitigate, smarter.
Because LLMs are different from other security tools, a different set of approaches is required to mitigate their risks. Some involve new security technologies. Others are tried-and-true tactics modified for LLMs. These include:
Adversarial Training: As part of the fine-tuning or testing process, cybersecurity users should expose LLMs to inputs designed to test their boundaries and induce the LLM to break the rules or behave maliciously. It works best at the training or tuning stage before the system is fully implemented. This can involve generating adversarial examples using techniques such as adding noise, crafting specific misleading prompts, or using known attack patterns to simulate potential threats. That said, CISOs should have their teams (or the vendors) perform adversarial attacks on an ongoing basis to ensure compliance and identify risks or failures.
Build in Explainability: In LLMs, explainability is the ability to explain why a specific output was offered. This requires that cybersecurity LLM vendors add a layer or explainability to their LLM-powered tools; deep neural networks used to build LLM models are in the early stages of developing full explainability. Tellingly, few security LLMs today promise explainability. That’s because it is very difficult to build reliable explainability and even the largest, best-resourced LLM makers struggle to do it. The lack of explainability leads logically to the next few mitigation steps.
Continuous Monitoring: Putting in place systems to monitor security controls is not novel. Asset inventories and security posture management tools attempt this. However, LLMs are a different instance and continuous monitoring must detect anomalous or unexpected LLM outputs in real-world use. This is particularly challenging when the outputs are unpredictable and potentially infinite. Large AI providers like OpenAI and Anthropic are deploying specific LLMs to monitor their LLMs — a spy to catch a spy, so to speak. In the future, most LLM deployments will be in pairs — one for output and use, the other for monitoring.
Human-in-the-Loop: Because LLMs are so novel and potentially risky, organizations should combine LLM suggestions with human expertise for critical decision-making. However, keeping a human in the loop does not completely solve the problem. Research on human decision-making when they are paired with AIs has demonstrated that LLMs which appear more authoritative induce the human operators to “take their hands off the wheel” and overly trust the AIs. To address this issue, CISOs and their teams need to create a security process where LLMs are not overly trusted or assigned too much responsibility so that human operators become overly dependent and unable to distinguish LLM errors and hallucinations. One mode might be to have LLMs initially introduced in “Suggestion Only” mode, where they provide advice and guidance but are not permitted to enact changes, share information or otherwise interact with systems and others without explicit permission from their human operator.
Sandboxing and Gradual Deployment: It is crucial to thoroughly test LLMs in isolated environments before live deployment. This is related to adversarial training but is different because the LLM should be test-driven in circumstances that are nearly identical to real cybersecurity processes and workflows. This training should even constitute real attacks based on real-world vulnerabilities and TTPs in play in the field. Most security controls and tools are put through a similar process of sandbox deployment, with good reason. Cybersecurity is so multifaceted and complex, with organizations deploying dozens of tools, that unexpected interactions and behaviors can emerge.
Slowly Introduce LLMs
LLMs have introduced a greater risk of the unexpected, so, their integration, usage and maintenance protocols should be extensive and closely monitored. Once the CISO is satisfied that the LLM will be safe enough and effective, then deployment should be gradual and methodical. A good approach is to deploy the LLM initially for less critical and complex tasks and slowly introduce it into the most cognitively challenging workflows and processes, where good judgment is essential.