OpenAI shares AI model cybersecurity warning
Navigating the double-edged sword of AI innovation and cyberthreats
Takeaways
- OpenAI warns that future AI models, especially those based on large language models (LLMs), will have advanced cybersecurity capabilities that could be exploited by cybercriminals.
- The company is monitoring model usage for unsafe activity and collaborating with red-teaming organizations to bolster safety measures.
- Aardvark, an AI agent trained for security research, is in private beta to help developers and security teams identify vulnerabilities, with free coverage for certain open-source projects.
- Initiatives like the OWASP GenAI Security Project and NIST’s taxonomy for AI security are helping organizations understand and mitigate emerging threats as AI technology evolves.
OpenAI is warning that the next generation of artificial intelligence (AI) models built on its large language models (LLMs) will have cybersecurity capabilities advanced enough to be abused by cybercriminals.
Specifically, cybersecurity teams should expect that cybercriminals will use LLMs to develop working zero-day remote exploits against well-defended systems or to assist with complex, stealthy intrusions, according to a blog post shared by OpenAI.
OpenAI is working to minimize potential misuse of its models by, among other things, training them to respond safely to requests that would enable clear cyber abuse while remaining helpful for educational and defensive use cases.
Additionally, OpenAI is pledging to continue to monitor usage of its LLMs for unsafe activity, which it will then block or route to less capable models. At the same time, the company is working with red-teaming organizations to evaluate and improve its safety mitigations.
OpenAI also plans to add a trusted access program through which it will explore giving qualifying users and customers working on cyberdefense tiered access to enhanced AI model capabilities for defensive use cases. In addition, the company is establishing the Frontier Risk Council, an advisory group of experienced security practitioners who will work in close collaboration with its teams, and is participating in the Frontier Model Forum, a nonprofit consortium dedicated to building a shared understanding of threat models and best practices.
Finally, OpenAI noted it is making Aardvark, an AI agent trained to act as a security researcher, available in a private beta to help developers and security teams find and fix vulnerabilities. The company also plans to offer free coverage to select non-commercial open-source repositories to strengthen software supply chains.
Balancing AI innovation with emerging cybersecurity risks
Of course, the AI models developed by OpenAI are only one of many options, and competing models will likewise grow more capable of being misused by nefarious actors. In fact, Anthropic last month disclosed that a nation-state threat group based in China abused its Claude AI model to help launch a series of cyberespionage attacks that could serve as a blueprint for how AI and associated agents can be used by bad actors in the future.
The unnamed group used Claude Code to target more than two dozen organizations in a campaign that automated 80% to 90% of the work, with human intervention needed at only four to six critical decision points for each hack.
Fortunately, other organizations are working to help improve AI security. Most recently, the OWASP GenAI Security Project published a top 10 list of the potential security threats that organizations are likely to encounter as they build and deploy AI agents, and the U.S. National Institute of Standards and Technology (NIST) is building a taxonomy of attacks and mitigations for securing AI agents.
At this point, there is little doubt that from a cybersecurity perspective AI is a double-edged sword. The hope is that cybersecurity teams will ultimately benefit more from AI than adversaries. However, given the early evidence, it might be prudent for cybersecurity teams to start preparing now for what might turn out to be the worst of all possible outcomes.