OPTTASK: LEARNED PRIORS FOR TAMP

For my graduate qualification exam in Computer Science at Rice University, I produced an original piece of research, wrote a thesis-quality report, and defended it before a faculty committee.

At the time, I was very interested in task and motion planning (TAMP), a paradigm in which a STRIPS-style task planner is integrated with motion planning algorithms. TAMP methods are typically deployed when we want a robot to accomplish a long-horizon task, such as washing the dishes or cleaning up a dinner table. These tasks require the robot to plan and execute several motions successfully: when asked to wash the dishes, a robot might first have to open the dishwasher, remove any clean dishes, load the dirty ones, and so on.

The basic idea is that, when provided a priori with a task domain and a description of the available actions, a task planner can propose a sequence of actions, which is then grounded by planning robot motions for each step. When no motion can be found for a step, that failure is fed back to the task planner, which blocks the attempted sequence of actions. Going back and forth between the two planners allows TAMP solvers to explore the space of actions the robot can actually execute.
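To make the loop concrete, here is a minimal sketch of that plan-then-ground interaction in Python. The `task_plan` and `plan_motion` callables are hypothetical stand-ins for a STRIPS-style task planner and a motion planner; they do not correspond to any specific library or to the exact interfaces used in my implementation.

```python
# Minimal sketch of the TAMP plan-then-ground loop, assuming hypothetical
# task_plan and plan_motion callables supplied by the caller.

def solve_tamp(task_plan, plan_motion, domain, start, goal, max_iterations=100):
    blocked = set()  # action-sequence prefixes known to have no motion grounding
    for _ in range(max_iterations):
        plan = task_plan(domain, start, goal, blocked)  # symbolic action sequence
        if plan is None:
            return None  # the task planner has exhausted its options
        trajectories = []
        for i, action in enumerate(plan):
            traj = plan_motion(action)  # try to ground this step with a motion
            if traj is None:
                # Feed the failure back: forbid every plan sharing this prefix.
                blocked.add(tuple(plan[: i + 1]))
                break
            trajectories.append(traj)
        else:
            return list(zip(plan, trajectories))  # fully grounded plan
    return None
```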

My work focused on refining the task planning step by incorporating knowledge gained from experience. When deploying a robot with a TAMP solver, it is very reasonable to envision a situation where the robot must solve similar tasks repeatedly. Many motion planning queries are run while solving these tasks, but state-of-the-art approaches disregard any information about how difficult those queries were. I proposed using failed planning queries and previous experience to learn priors over actions. These priors model the quality of the paths produced for a given action as more compute time is spent planning its motions.
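As a rough illustration of what such a prior could look like, the sketch below aggregates logged motion planning queries per action into time-budget bins and estimates a success rate and mean path cost for each bin. The record format, bin edges, and aggregation are my own illustrative assumptions, not the exact model from the report.

```python
# A hedged sketch of one way action priors could be learned from logged
# planning queries: per-action, per-time-budget success rates and costs.

from collections import defaultdict
from dataclasses import dataclass

@dataclass
class QueryRecord:
    action: str           # e.g. "enter_room_3" (hypothetical label)
    planning_time: float  # seconds spent on the motion query
    succeeded: bool
    path_cost: float      # e.g. path length; ignored for failed queries

def learn_priors(records, time_bins=(0.1, 0.5, 1.0, 5.0)):
    """Return {action: [(time_budget, success_rate, mean_cost), ...]}."""
    grouped = defaultdict(lambda: defaultdict(list))
    for r in records:
        # Assign each record to the smallest budget that covers it;
        # records slower than the largest bin are dropped in this sketch.
        for budget in time_bins:
            if r.planning_time <= budget:
                grouped[r.action][budget].append(r)
                break
    priors = {}
    for action, bins in grouped.items():
        rows = []
        for budget in time_bins:
            rs = bins.get(budget, [])
            if not rs:
                continue
            success_rate = sum(r.succeeded for r in rs) / len(rs)
            successes = [r.path_cost for r in rs if r.succeeded]
            mean_cost = sum(successes) / len(successes) if successes else float("inf")
            rows.append((budget, success_rate, mean_cost))
        priors[action] = rows
    return priors
```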

Armed with a deep library of these priors, I introduced an optimization step between task and motion planning. Instead of proposing a single task plan, I would collect the top N solutions from the task planner. With these candidates in hand, I ran a non-linear optimization over the learned action priors to select a single task plan, and then attempted to plan motions for it. Because of this optimization step, I called the method OptTask.
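The sketch below is a simplified stand-in for that selection step: it scores each of the top-N candidate plans under priors in the format of the previous sketch (splitting a total time budget evenly across actions) and keeps the best-scoring plan. The scoring function is an illustrative assumption; the actual work formulated the selection as a non-linear program and handed it to an NLP solver.

```python
# Simplified stand-in for the optimization step: score candidate task
# plans under the learned priors and keep the highest-scoring one.

def lookup(prior_rows, budget):
    """Return (success_rate, mean_cost) for the largest bin <= budget."""
    best = (0.0, float("inf"))
    for bin_budget, success_rate, mean_cost in prior_rows:
        if bin_budget <= budget:
            best = (success_rate, mean_cost)
    return best

def select_plan(candidate_plans, priors, total_time_budget):
    best_plan, best_score = None, float("-inf")
    for plan in candidate_plans:
        per_action_budget = total_time_budget / max(len(plan), 1)
        p_success, total_cost = 1.0, 0.0
        for action in plan:
            rate, cost = lookup(priors.get(action, []), per_action_budget)
            p_success *= rate
            total_cost += cost
        # Favor plans that are likely to ground and cheap to execute;
        # the weighting below is an arbitrary illustrative choice.
        score = p_success - 1e-3 * total_cost
        if score > best_score:
            best_plan, best_score = plan, score
    return best_plan
```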

Depending on the available time to plan motions, OptTask selects different plans.

Examples of experience data used to learn priors.

To demonstrate that OptTask is competitive with state-of-the-art methods, I created an experiment pipeline that used the Gurobi NLP solver to perform the optimization. Both task planning and motion planning needed high-performance implementations, for which I relied on several well-supported open-source C++ libraries, namely Fast Downward, Z3, and the Open Motion Planning Library. I ran experiments in a simulated 2D setting with a free-flying SE(2) robot, a Dubins car robot, and a differential drive robot. I tasked these robots with point-to-point delivery tasks in which actions represented entering and exiting rooms and corridors. While not the same as “wash the dishes,” this kind of task captured the essence of the problem and served well as a test bed.
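As a toy illustration of that delivery domain, the snippet below encodes a floorplan as a connectivity graph over rooms and corridors, derives symbolic move actions from adjacency, and samples random delivery tasks. The layout and naming are invented for illustration and are far smaller than the environments used in the experiments.

```python
# Toy encoding of a delivery domain: regions as graph nodes, symbolic
# actions for moving between adjacent regions, and random task sampling.

import random

FLOORPLAN = {
    "room_a": ["corridor_1"],
    "room_b": ["corridor_1", "corridor_2"],
    "room_c": ["corridor_2"],
    "corridor_1": ["room_a", "room_b"],
    "corridor_2": ["room_b", "room_c"],
}

def available_actions(region):
    """Symbolic actions: move from the current region to a neighbor."""
    return [f"move_{region}_to_{nbr}" for nbr in FLOORPLAN[region]]

def random_delivery_task(rng=random):
    """Pick distinct pickup and drop-off rooms for a delivery task."""
    rooms = [r for r in FLOORPLAN if r.startswith("room")]
    pickup, dropoff = rng.sample(rooms, 2)
    return {"pickup": pickup, "dropoff": dropoff}
```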

For a given floorplan, each robot was allotted a time limit and instructed to find a high-quality solution to random delivery tasks as quickly as possible. Not only did OptTask find solutions faster than state-of-the-art methods, but it also exploited the experience embedded in the priors to keep improving as planning continued. Without prior information, state-of-the-art TAMP methods would often commit to solutions that required careful navigation on the part of the robot and were barely feasible. OptTask would usually opt for longer but easier-to-compute routes, resulting in more tasks solved successfully. This work earned me an honors grade from my defense committee.

Example of motion planning in a small environment.

Example of task planning in a large environment (Rice University’s Duncan Hall).