
UST Develops Physical AI Safety Verification Model

UST–ETRI doctoral student Kim Hyung-min (left) and professor Kim Do-hyung / Courtesy of UST

The University of Science and Technology (UST) said a research team from the Electronics and Telecommunications Research Institute (ETRI) at UST has developed the Safety-Focused Planning and Operation Compliance (SPOC) model, a new benchmark for evaluating AI robot performance and verifying the safety of physical artificial intelligence (AI) systems.

Recently, research into embodied AI robots has accelerated, with large language models (LLMs) such as ChatGPT being applied directly to robotic systems. These robots can understand everyday language commands from users and independently plan and perform complex tasks.

However, existing evaluation methods typically assess only whether a robot achieves its final goal, rather than determining whether it performs tasks safely while accounting for various risks that may arise in real-world environments.

The SPOC model places safety at the core of its evaluation criteria. In addition to measuring whether a robot successfully completes its task, the system simultaneously evaluates compliance with safety rules related to five major household risks: fire, water overflow, object damage, human injury and food contamination.
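To make that scoring idea concrete, here is a minimal Python sketch of how a SPOC-style evaluator might combine goal success with per-category safety compliance. The five risk categories come from the article; the `EpisodeResult` structure and `evaluate_episode` function are illustrative assumptions, not the published benchmark's API.

```python
from dataclasses import dataclass, field

# Five household risk categories named in the article.
RISK_CATEGORIES = ("fire", "water_overflow", "object_damage",
                   "human_injury", "food_contamination")

@dataclass
class EpisodeResult:
    goal_achieved: bool
    # Safety violations observed during execution, keyed by risk category.
    violations: dict[str, list[str]] = field(default_factory=dict)

def evaluate_episode(result: EpisodeResult) -> dict:
    """Score one attempt on goal success and safety compliance together."""
    violated = sorted(c for c, v in result.violations.items() if v)
    return {
        "goal_achieved": result.goal_achieved,
        "safety_compliant": not violated,
        # An attempt passes only if the goal is met AND no rule in any
        # category was broken.
        "passed": result.goal_achieved and not violated,
        "violated_categories": violated,
    }

# Example: the task goal was reached, but a burner was left on.
print(evaluate_episode(EpisodeResult(
    goal_achieved=True,
    violations={"fire": ["left stove burner on"]},
)))  # -> passed: False
```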

In particular, the model introduces strict evaluation standards for factors that previous benchmarks have struggled to verify, including a robot's realistic perception limits (partial observability) and its physical constraints.

For example, when an AI robot receives a command such as “bring me a bottle of wine,” previous evaluation systems might allow the robot to skip intermediate steps — such as opening a cabinet door — and proceed directly to the target object. Under the SPOC evaluation model, such unrealistic action planning is treated as a failure.

Example of a task procedure that achieved an object manipulation goal but violated safety rules / Courtesy of UST

Instead, the model verifies whether the robot can make realistic judgments on its own, such as recognizing that the object is not visible and deciding to open the cabinet door first to search for it.
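A rough Python illustration of this partial-observability check: the validator rejects any plan that grasps the target before an action that could have revealed it. The action vocabulary ("open", "grasp") and the simplified visibility model are assumptions made for the sketch, not the benchmark's actual schema.

```python
def plan_is_realistic(plan: list[str], visible: set[str], target: str) -> bool:
    """Reject plans that grasp an object the robot has not yet observed.

    Simplification: opening any container is assumed to reveal the target.
    A real checker would model which container actually holds it.
    """
    seen = set(visible)
    for action in plan:
        verb, _, obj = action.partition(" ")
        if verb == "open":
            seen.add(target)          # container opened, contents revealed
        elif verb == "grasp" and obj == target and target not in seen:
            return False              # grasped an object never observed
    return True

# Opening the cabinet first is accepted; proceeding straight to the
# grasp, as some earlier benchmarks allowed, is flagged as unrealistic.
assert plan_is_realistic(["open cabinet", "grasp wine"], set(), "wine")
assert not plan_is_realistic(["grasp wine"], set(), "wine")
```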

It also strictly evaluates whether a robot follows proper physical reasoning. For example, if a single-arm robot holding an object must open a drawer, it must first put the object down and then open the drawer with an empty hand.
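The physical-constraint check can be pictured the same way. The sketch below tracks what a single-arm robot's gripper is holding and flags actions, such as opening a drawer with a full hand, that ignore that limit; the action names and the one-gripper state model are illustrative assumptions.

```python
def check_physical_preconditions(plan: list[str]) -> list[str]:
    """Flag actions that ignore a single-arm robot's physical limits.

    Minimal sketch under assumed action names; a real checker would track
    the full world state, not just what the gripper is holding.
    """
    holding = None
    errors = []
    for action in plan:
        verb, _, obj = action.partition(" ")
        if verb == "grasp":
            if holding:
                errors.append(f"cannot grasp {obj}: hand holds {holding}")
            else:
                holding = obj
        elif verb == "put_down":
            holding = None
        elif verb == "open" and holding:
            # Opening a drawer requires an empty hand on a single-arm robot.
            errors.append(f"cannot open {obj} while holding {holding}")
    return errors

# Putting the object down first yields no errors; opening while holding does.
assert not check_physical_preconditions(
    ["grasp cup", "put_down cup", "open drawer"])
assert check_physical_preconditions(["grasp cup", "open drawer"])
```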

The model applies a stringent safety verification standard: if a robot violates a safety rule even once during task execution, the attempt is immediately classified as a failure under a zero-tolerance policy.
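In evaluation-loop terms, zero tolerance means failing fast. The hypothetical loop below aborts an attempt at the first rule violation, regardless of whether the goal would later be reached; `world.apply` and the rule callables are assumed interfaces, not part of the released model.

```python
def execute_with_zero_tolerance(plan, world, rules):
    """Run a plan step by step, aborting on the first safety violation.

    Assumed interfaces: world.apply(action) advances the simulated state,
    and each rule maps a state to a violation message or None.
    """
    for step, action in enumerate(plan):
        state = world.apply(action)
        for rule in rules:
            violation = rule(state)
            if violation:
                # One violation anywhere fails the whole attempt, even if
                # the task goal would still have been reached afterwards.
                return {"passed": False, "step": step, "violation": violation}
    return {"passed": True}
```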

Experiments conducted using the SPOC model revealed significant limitations in the safety awareness of current AI models. In particular, small language models (SLMs) — which have attracted attention because they can be embedded directly into robots without requiring large external servers — showed extremely low safety compliance rates even when given explicit safety instructions.

These findings highlight the urgent need for further research to strengthen safety awareness in real-world AI robots.

Lead author Kim Hyung-min, a doctoral student at UST, said the SPOC model represents a serious attempt to evaluate whether robots can perform tasks while complying with strict physical constraints and safety conditions across diverse environments.

“We hope this evaluation model will help accelerate research into reliable AI robots that can be deployed in real-world settings,” Kim said.

Corresponding author Professor Kim Do-hyung said the research provides an important reference for developing safe physical AI systems as robots increasingly coexist with humans.

“In the future, we plan to expand the model so that it can evaluate whether robots can infer safety rules on their own and ask humans for clarification or adjust their behavior in dangerous situations,” he said.

The SPOC performance evaluation model and experimental data will be released to the global research community and are expected to serve as a shared verification platform for safety-focused autonomous AI research.

The research findings were presented at ICASSP 2026, the IEEE International Conference on Acoustics, Speech and Signal Processing, where the work was recognized for its excellence.
