Researchers at MIT, Carnegie Mellon University, and the University of California, San Diego have developed DiffSkill, a framework for robotic manipulation that uses a two-stage learning process. It enables a robot to perform complex dough-manipulation tasks over a long time horizon. The technique could be useful wherever a robot has to manipulate deformable objects, for example a caregiving robot that feeds, bathes, or dresses someone who is elderly or has motor impairments.
A “teacher” algorithm first solves each step the robot must take to complete the task. It then trains a “student” machine-learning model, which learns abstract ideas about when and how to execute each skill it needs during the task, such as using a rolling pin. With this knowledge, the system works out how to sequence the skills in order to fulfil the task.
“This method is closer to how we, as humans, plan our actions. When a human does a long-horizon task, we are not writing down all the details. We have a higher-level planner that roughly tells us what the stages are and some of the intermediate goals we need to achieve along the way, and then we execute them,” says Yunzhu Li, a graduate student in the Computer Science and Artificial Intelligence Laboratory (CSAIL), and author of a paper presenting DiffSkill.
The “Teacher”
The DiffSkill framework’s “teacher” is a trajectory optimization algorithm that can solve short-horizon tasks, in which an object’s initial state and target location are close together. The trajectory optimizer operates in a simulator that models real-world physics. It is a differentiable physics simulator, meaning it can report how a change in the robot’s actions would change the outcome, and it is the source of the “Diff” in “DiffSkill”. Using information from the simulator, the “teacher” algorithm learns how the dough must move at each stage, and then outputs those trajectories.
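To make the idea of optimizing a trajectory inside a differentiable simulator concrete, here is a minimal sketch in plain Python. It is not the researchers' code: the one-dimensional "simulator", the function names, the effort penalty, and all parameter values are assumptions chosen for illustration, and the gradient is written out by hand rather than produced by a real physics engine.

```python
def simulate(x0, actions, dt=0.1):
    """Toy stand-in for a physics simulator: a 1-D point moved by velocity commands."""
    xs = [x0]
    for u in actions:
        xs.append(xs[-1] + dt * u)
    return xs


def loss_and_grad(x0, actions, goal, dt=0.1, effort_weight=1e-3):
    """Return the loss and its gradient with respect to every action.

    Loss = squared distance of the final position from the goal,
    plus a small (assumed) penalty on the size of the actions.
    """
    xs = simulate(x0, actions, dt)
    err = xs[-1] - goal
    loss = err ** 2 + effort_weight * sum(u * u for u in actions)
    # Each action shifts the final position by dt, so d(final)/d(u_t) = dt.
    grad = [2.0 * err * dt + 2.0 * effort_weight * u for u in actions]
    return loss, grad


def optimize_trajectory(x0, goal, horizon=10, steps=200, lr=0.5):
    """Plain gradient descent on the action sequence."""
    actions = [0.0] * horizon
    for _ in range(steps):
        _, grad = loss_and_grad(x0, actions, goal)
        actions = [u - lr * g for u, g in zip(actions, grad)]
    return actions, simulate(x0, actions)


if __name__ == "__main__":
    actions, xs = optimize_trajectory(x0=0.0, goal=1.0)
    print("final position:", xs[-1])
```

The optimizer never guesses at random: at every iteration the simulator's gradient says in which direction to adjust each action. That is the practical difference from trial-and-error methods, and a real differentiable simulator would supply the same kind of gradient for far more complex dynamics such as dough deformation.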
The “Student”
The “student” neural network then learns to imitate the teacher’s behaviours. It takes two camera images as inputs: one showing the dough in its current state and the other showing the dough at the completion of the task. From these, the neural network generates a high-level plan that chains different skills together to reach the goal. It then generates a short-horizon trajectory for each skill and sends commands directly to the tools.
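The high-level planning step can be pictured with a small sketch. Everything here is a simplified assumption rather than the DiffSkill implementation: the state is a pair of numbers instead of camera images, the two skills ("roll" and "push") and their limits are hypothetical, and a hand-written rule stands in for the learned network that predicts what each skill can achieve.

```python
from itertools import product

# Hypothetical skills: each changes one state dimension by a limited amount
# per use, mimicking a skill that only works over a short horizon.
SKILLS = {
    "roll": {"dim": 0, "max_step": 1.0},  # changes the dough's width
    "push": {"dim": 1, "max_step": 1.0},  # changes the dough's position
}


def predict_outcome(state, goal, skill):
    """Stand-in for a learned model: move one dimension toward the goal."""
    spec = SKILLS[skill]
    d = spec["dim"]
    delta = goal[d] - state[d]
    step = max(-spec["max_step"], min(spec["max_step"], delta))
    new_state = list(state)
    new_state[d] += step
    return tuple(new_state)


def distance(a, b):
    """Euclidean distance between two states."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def plan(state, goal, max_len=3):
    """Try every skill sequence up to max_len; keep the one ending nearest the goal.

    Shorter sequences are tried first, and a longer one replaces the current
    best only if it is strictly closer to the goal.
    """
    best = None
    for length in range(1, max_len + 1):
        for seq in product(SKILLS, repeat=length):
            s = state
            subgoals = []  # predicted intermediate states, one per skill
            for skill in seq:
                s = predict_outcome(s, goal, skill)
                subgoals.append(s)
            cost = distance(s, goal)
            if best is None or cost < best[0] - 1e-9:
                best = (cost, list(seq), subgoals)
    return best


if __name__ == "__main__":
    cost, skills, subgoals = plan(state=(1.0, 0.0), goal=(2.0, 1.0))
    print(skills, subgoals, cost)
```

In this toy case neither skill alone reaches the goal, so the planner has to chain the two, and the predicted intermediate states play the role of the intermediate goals described above. Each of those subgoals would then be handed to a short-horizon controller for the corresponding skill.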
DiffSkill outperformed popular reinforcement learning approaches, in which a robot learns a task by trial and error. According to Xingyu Lin, the study’s lead author, the researchers found that the “student” neural network could even outperform the “teacher” algorithm.
This work is supported, in part, by the National Science Foundation, LG Electronics, the MIT-IBM Watson AI Lab, the Office of Naval Research, and the Defense Advanced Research Projects Agency Machine Common Sense program. The research will be presented at the International Conference on Learning Representations.