Research has shown that adversarial training of robotic systems yields much better success rates than collaborative training. This could fundamentally change how robots learn from their surroundings.
A team of computer scientists at the University of Southern California (USC) has devised a unique training technique that pits a robot against a human adversary to help it succeed in carrying out basic tasks.
Stefanos Nikolaidis, assistant professor of computer science at USC, said, “This is the first robot learning effort using adversarial human users. If we want them to learn a manipulation task, such as grasping, so they can help people, we need to challenge them.”
Nikolaidis and his team used reinforcement learning, a technique in which artificial intelligence programs “learn” from repeated experimentation. As easy and mundane as it may sound, this repeated learning is quite challenging because of the sheer amount of training required. A robotic system must work through a huge number of examples and learn from each of them how to manipulate an object, just as a human does.
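The team's actual system learns grasping in simulation, but the repeated-experimentation idea behind reinforcement learning can be shown with a minimal, hypothetical sketch: tabular Q-learning on a toy five-state chain. None of the states, rewards, or hyperparameters below come from the USC work; they are illustrative only.

```python
import random

# Toy 1D chain: states 0..4, start at state 0, reward only at state 4.
# Actions: 0 = step left, 1 = step right. Purely illustrative numbers.
N_STATES, GOAL = 5, 4
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2

Q = [[0.0, 0.0] for _ in range(N_STATES)]

def step(state, action):
    nxt = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL

def choose(state):
    # Explore randomly with probability EPSILON (and on ties), else exploit.
    if random.random() < EPSILON or Q[state][0] == Q[state][1]:
        return random.randrange(2)
    return 0 if Q[state][0] > Q[state][1] else 1

random.seed(0)
for episode in range(500):
    s, done = 0, False
    while not done:
        a = choose(s)
        s2, r, done = step(s, a)
        # Standard Q-learning update toward the bootstrapped target.
        Q[s][a] += ALPHA * (r + GAMMA * max(Q[s2]) - Q[s][a])
        s = s2

# After many repeated trials, "step right" wins in every non-goal state.
policy = ["right" if Q[s][1] > Q[s][0] else "left" for s in range(GOAL)]
print(policy)
```

The point is the sample inefficiency the article describes: even this trivial problem takes hundreds of episodes of trial and error, which is why real manipulation tasks need vastly more training.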
A case in point is OpenAI’s robotic system that solved a Rubik’s cube. To do so, however, the robot had to undergo the equivalent of 10,000 years of simulated training just to learn how to manipulate the cube.
It is also important to consider how slowly a robot’s proficiency at a task progresses. Without extensive training, it cannot pick up an object, manipulate it, or effectively handle a different object or task.
“As a human, even if I know the object’s location, I don’t know exactly how much it weighs or how it will move or behave when I pick it up, yet we do this successfully almost all of the time. That’s because people are very intuitive about how the world behaves, but the robot is like a newborn baby,” said Nikolaidis.
The reason could be that a robotic system finds it hard to generalize and to distinguish between objects and tasks, something humans take for granted. While this may seem a minor limitation, it can have serious consequences: if assistive robotic devices, such as grasping robots, are to help people with disabilities, they must be able to operate reliably in real-world environments.
Challenge is necessary to succeed
The experiment went something like this: in a computer simulation, the robot attempted to grasp an object while a human observed. When the robot succeeded, the observer tried to snatch the object from the robot’s grasp. This helped the robot learn the difference between a weak grasp and a firm one. Over repeated training sessions, the robot eventually learned to make the snatching harder for the human.
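One way to picture why adversarial tugs push a learner toward firmer grasps is a deliberately simplified bandit-style sketch. Everything here — the firmness levels, the success threshold of 3, the tug strengths — is hypothetical and not drawn from the USC experiment.

```python
import random

random.seed(1)

# Hypothetical model: the robot picks a grasp "firmness" level 0..9.
# A grasp succeeds at firmness >= 3; an adversarial tug of strength
# `pull` then breaks any grasp whose margin over 3 is smaller than it.
LEVELS = 10

def grasp_holds(firmness, pull):
    return firmness >= 3 and (firmness - 3) >= pull

def train(adversarial, episodes=2000, epsilon=0.2):
    value = [0.0] * LEVELS   # running success-rate estimate per level
    counts = [0] * LEVELS
    for _ in range(episodes):
        if random.random() < epsilon:
            a = random.randrange(LEVELS)                     # explore
        else:
            a = max(range(LEVELS), key=lambda i: value[i])   # exploit
        pull = random.randint(0, 5) if adversarial else 0
        reward = 1.0 if grasp_holds(a, pull) else 0.0
        counts[a] += 1
        value[a] += (reward - value[a]) / counts[a]          # running mean
    return max(range(LEVELS), key=lambda i: value[i])

# Without an adversary, the weakest grasp that merely succeeds is enough;
# against an adversary, only a firm grasp reliably survives the tug.
weak, firm = train(adversarial=False), train(adversarial=True)
print(weak, firm)
```

The design point mirrors the article: the collaborative learner has no incentive to grip harder than the bare minimum, while the adversarial learner is rewarded only for grasps that withstand perturbation.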
Through this experiment, the researchers found that the robotic system achieved a 52 percent success rate when trained with a human adversary, compared with 26.5 percent when trained with a human collaborator.
“The robot learned not only how to grasp objects more robustly, but also to succeed more often with new objects in a different orientation, because it has learned a more stable grasp,” said Nikolaidis.
They also found that the robotic system trained with a human adversary outperformed one trained with a simulated adversary, suggesting that robotic systems learn more from actual human adversaries than from virtual ones.
“That’s because humans can understand stability and robustness better than learned adversaries,” explained Nikolaidis.
Hoping to implement it further
Though it presents new real-world challenges, the hope is that such adversarial learning will be widely used to improve the training of future robotic systems.
“We are excited to explore human-in-the-loop adversarial learning in other tasks as well, such as obstacle avoidance for robotic arms and mobile robots, such as self-driving cars,” said Nikolaidis.
The question remains: will adversarial learning have adverse effects? Will we go as far as beating robots into submission?
The answer, according to Nikolaidis, lies in finding the right balance of tough love and encouragement with our robotic counterparts.
“I feel that tough love, in the context of an algorithm, is like a sport: it falls within specific rules and constraints. The robot needs to be challenged but still be allowed to succeed in order to learn,” said Nikolaidis.