-
Notifications
You must be signed in to change notification settings - Fork 0
Blog Post #1 (Planning)
Our second goal in the project, after navigating to a set of goals in an unknown environment, involved using a path planning algorithm to optimally return home after a mission. Based on our past experience, the answer seemed obvious: A* of course! But while A* works great in a controlled environment like a simple video game, the uncertainty of an imprecise robot interacting with the real world means that the plan A* produces becomes irrelevant very quickly. On a more personal note, we also wanted to learn about a different, more complex algorithm. So, after some research, we learned about Markov Decision Processes, aka MDPs. The beauty of the algorithm is that once the policy is computed, the robot can trivially determine what the optimal move is reach the goal, all while encoding the uncertainty of the system.
Our envisioned approach to the problem involved finding a fully implemented MDP package which we would learn how to interface with and have a working prototype very fast. Later on, we would begin to take the package apart and implement its functionality on our own, or throw some personal flare into it. This process would give us a good understanding of the concepts, while maximizing our time spent developing more interesting parts of the algorithm. We found a promising package, markov_decision_making, which executed the set of states, actions, and policy you define for your robot. While this was a higher abstraction than we expected, we decided to try it...and ran into a few problems. Firstly, we found that we did not have a strong enough grasp on MDPs to understand the abstraction made. Secondly, and most importantly, we found that the package was not tested on ROS Melodic, the distribution we were running.
And so, we decided that we would more likely succeed if we made the algorithm ourselves. We went through a few resources to understand how MDP works, but settled on replicating the algorithm described in "The Stochastic Motion Roadmap: A Sampling Framework for Planning with Markov Motion Uncertainty" by Ron Alterovitz, Thierry Simeon, and Ken Goldberg. The key appeal of this paper over other sources is that it goes through the specific implementation of a similar problem, rather than describing MDPs at a higher level. It also explains how they formulated their system as a Markov Model by sampling points (states) on the map and making weighted edges between states based on the likelihood of transitioning from one to the other (effectively computed via simulation). They then solve the MDP problem with dynamic programming, producing an optimal planning policy.
As of this blog post, we focused on building our Markov Model, which we will formulate into an MDP and solve in the next post. A Markov Model encodes how the robot transitions between states. In our case, the states of the robot are the potential locations of the robot on a map, represented by an (x, y) coordinate pair and the robot's orientation theta (which direction it is facing). The robot can perform three different actions at any location: turn left by a certain number of degrees, move forward for a given distance, or turn right. While in a perfect world we would know exactly how the robot would transition from one state given an action, we do not know for certain how the the action will be executed and whether we will end up in a slightly different position than expected. For this reason, we build a roadmap of all of the potential transitions of the robot by simulating where the robot will end up given a Gaussian distribution of uncertainty in how accurately it completes the action. In the image below, the green spheres represents starting states with green arrows pointing to all of the states that the robot could potentially move to if told to move forwards. The red and blue spheres represent these end states, with bluer states being hit with higher likelihood.
One challenge with this approach is the sheer number of states that we need to generate to accurately represent our system. To determine which state the robot has approximately moved to given an action, we first mathematically find the location it would end up in and then find a close enough state in our master list of states. If our states are not close enough, the approximation will return the state we started in, leading to a model where our robot never moves anywhere! We plan on optimizing the efficiency with which we generate the states.
Now that we have a set of states, actions, and a definition of how we transition between states, we can move on to setting up an MDP!
See you in the next post!