Researchers at Penn Engineering are working to improve robot safety by identifying vulnerabilities and developing solutions for securely integrating language models into the physical world.
The team has identified previously unrecognized security vulnerabilities and weaknesses in AI-controlled robots. The study addresses these emerging risks to ensure the safe implementation of large language models (LLMs) in robotics, and it demonstrates that, at present, LLMs are not yet secure enough to be trusted when integrated with the physical world.
RoboPAIR, an algorithm developed by the researchers, achieved a 100% “jailbreak” rate in just a few days, bypassing the safety guardrails of three distinct robotic systems: the Unitree Go2, a quadruped robot used across various applications; the Clearpath Robotics Jackal, a wheeled vehicle commonly used in academic research; and the Dolphin LLM, a self-driving simulator created by NVIDIA. For example, such a breach of safety protocols could cause the self-driving system to speed dangerously through crosswalks.
Identifying weaknesses is what makes systems safer, a principle that applies to both cybersecurity and AI safety. AI red teaming, the practice of probing AI systems for potential threats and vulnerabilities, is therefore essential for protecting generative AI systems: once weaknesses are found, the systems can be tested and retrained to avoid them.
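As a rough illustration of what such red teaming can look like in practice, the sketch below shows an automated probing loop against an LLM-controlled robot. It is not the RoboPAIR algorithm; the `query_robot_llm` interface, the safety-policy check, and the prompt-rewriting step are hypothetical placeholders.

```python
# Hypothetical sketch of an automated red-teaming loop for an LLM-controlled robot.
# The interfaces and checks below are illustrative placeholders, not the
# RoboPAIR algorithm or the API of any real robotic system.

from dataclasses import dataclass


@dataclass
class RedTeamFinding:
    prompt: str
    response: str


def query_robot_llm(prompt: str) -> str:
    """Placeholder for sending a prompt to the robot's language-model planner."""
    raise NotImplementedError("Connect to the system under test here.")


def violates_safety_policy(response: str) -> bool:
    """Placeholder check: does the planned action break a hard safety rule?"""
    forbidden = ("ignore pedestrians", "disable emergency stop")
    return any(phrase in response.lower() for phrase in forbidden)


def red_team(seed_prompts: list[str], max_rounds: int = 5) -> list[RedTeamFinding]:
    """Probe the system repeatedly and record any successful guardrail bypasses."""
    findings: list[RedTeamFinding] = []
    for prompt in seed_prompts:
        for _ in range(max_rounds):
            response = query_robot_llm(prompt)
            if violates_safety_policy(response):
                findings.append(RedTeamFinding(prompt, response))
                break
            # A real attacker loop would rewrite the prompt here (for example,
            # with another LLM) based on the refusal it received.
            prompt = f"{prompt} (rephrased)"
    return findings
```

In practice, the recorded findings would feed back into testing and retraining, which is the point of red teaming: every successful bypass becomes a concrete case the system can later be hardened against.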
Addressing the problem, researchers argue, involves more than just a software patch; it necessitates a comprehensive reevaluation of how the integration of AI into physical systems is regulated.
Intrinsic vulnerabilities must be addressed before AI-enabled robots are deployed in the real world. To that end, the researchers are developing a verification and validation framework that ensures robotic systems take only actions that conform to social norms.
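To make the idea of such a validation layer concrete, here is a minimal sketch of a check that sits between a language-model planner and a robot's actuators. The rule, threshold, and `PlannedAction` type are assumptions chosen for illustration; they do not represent the researchers' framework.

```python
# Illustrative sketch of a safety-validation layer between an LLM planner and
# robot actuators. The rule set and data types are hypothetical stand-ins for a
# formal verification-and-validation framework, not the one described in the study.

from dataclasses import dataclass


@dataclass
class PlannedAction:
    name: str              # e.g., "drive_forward"
    speed_mps: float       # requested speed in meters per second
    near_pedestrians: bool


MAX_SPEED_NEAR_PEDESTRIANS = 1.0  # assumed hard limit, in m/s


def is_permitted(action: PlannedAction) -> bool:
    """Reject any plan that violates a hard safety rule, regardless of what the LLM produced."""
    if action.near_pedestrians and action.speed_mps > MAX_SPEED_NEAR_PEDESTRIANS:
        return False
    return True


def execute_if_safe(action: PlannedAction) -> None:
    if not is_permitted(action):
        print(f"Blocked unsafe action: {action.name}")
        return
    print(f"Executing: {action.name} at {action.speed_mps} m/s")


# Example: a jailbroken planner requests speeding through a crosswalk; the check blocks it.
execute_if_safe(PlannedAction("drive_forward", speed_mps=8.0, near_pedestrians=True))
```

The design point is that the final gate on physical action is enforced outside the language model, so a successful jailbreak of the planner alone is not enough to produce unsafe behavior.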