Gait planning
A fundamental requirement of legged robots is gait planning: the way each foot moves in a closed loop consisting of the stance phase (when the foot touches the ground) and the swing phase (when the same foot leaves the ground). Quadrupeds use different gaits in different situations to optimise their energy, mainly: walk, trot, pace, and gallop. The difference between these gaits lies in the phase between the legs. Check this link to see them in action.
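To make "the phase between the legs" concrete, here is a minimal sketch of how a gait can be encoded as a set of per-leg phase offsets. The leg labels and cycle time are our own illustrative choices, and the offsets are the commonly cited textbook values rather than numbers tuned on our bot:

```python
# Each gait is just a set of phase offsets (fractions of one gait cycle)
# assigned to the four legs: FL/FR = front left/right, RL/RR = rear left/right.
# Values are commonly cited textbook offsets, not measurements from our bot.
GAIT_PHASE_OFFSETS = {
    "walk":   {"FL": 0.0, "RR": 0.25, "FR": 0.5, "RL": 0.75},  # four-beat, lateral sequence
    "trot":   {"FL": 0.0, "RR": 0.0,  "FR": 0.5, "RL": 0.5},   # diagonal pairs move together
    "pace":   {"FL": 0.0, "RL": 0.0,  "FR": 0.5, "RR": 0.5},   # lateral pairs move together
    "gallop": {"FL": 0.0, "FR": 0.1,  "RL": 0.5, "RR": 0.6},   # rough rotary-gallop-like offsets
}

def leg_phase(gait: str, leg: str, t: float, cycle_time: float = 0.5) -> float:
    """Phase of one leg in [0, 1) at time t; the swing/stance split is applied on top of this."""
    return (t / cycle_time + GAIT_PHASE_OFFSETS[gait][leg]) % 1.0
```

Switching gaits then only means switching the offset table; every leg replays the same swing/stance loop, just shifted in time.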
Gait algorithms and the need for RL
There are various methods we could use to implement the gait (the swing and stance phases), such as the spring-loaded inverted pendulum (SLIP), but we have finally decided to implement Bezier curves. However, this alone won't be very stable on rough and uneven terrain, since the legs strictly follow a fixed loop. To counter this problem, we need the bot to adapt the trajectories of its feet so that it stays stable. This is done using reinforcement learning (RL), in which machines learn through trial and error from their own experience. In a bird's-eye view of how RL works, we have an environment (a simulation, or a kind of game) in which the agent (the player: a computer in this case) can take one of a fixed set of actions. The agent is rewarded when it makes progress towards the goal. RL tries to maximise this reward by playing in the environment for many iterations called episodes. When an episode ends, or the player loses the game, the reset function is called, which restores the environment to its default state for a fresh episode.
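As a rough idea of what a Bezier-based swing trajectory looks like, here is a sketch of a single cubic Bezier segment; the start/end points and control points below are placeholder values for illustration, not our tuned trajectory:

```python
import numpy as np

def bezier_swing_point(p0, p1, p2, p3, s):
    """Point on a cubic Bezier curve for s in [0, 1].
    p0 is the lift-off position, p3 the touch-down position, and p1/p2 are
    control points raised above the ground to give the swing its arc."""
    p0, p1, p2, p3 = map(np.asarray, (p0, p1, p2, p3))
    return ((1 - s) ** 3 * p0
            + 3 * (1 - s) ** 2 * s * p1
            + 3 * (1 - s) * s ** 2 * p2
            + s ** 3 * p3)

# Example: swing the foot 0.10 m forward with roughly 0.05 m of ground clearance
# (x = forward, z = height; all numbers are placeholders).
start, end   = [0.00, 0.0], [0.10, 0.0]
ctrl1, ctrl2 = [0.02, 0.05], [0.08, 0.05]
swing = [bezier_swing_point(start, ctrl1, ctrl2, end, s) for s in np.linspace(0, 1, 20)]
# The stance phase then moves the foot back along the ground from end to start,
# closing the fixed loop that each leg repeats.
```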
A word of caution: don't confuse this with deep learning. The two might sound similar, but they are different: reinforcement learning learns from rewards and punishments by taking exploratory actions in an environment, whereas deep learning learns by finding patterns in past data to make predictions on new data.
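To make the environment / agent / action / reward / reset loop above concrete, here is a bare-bones sketch of a custom environment following the OpenAI gym interface discussed in the next section. The observation and action sizes, the reward, and the termination condition are placeholders, not our actual design:

```python
import gym
import numpy as np
from gym import spaces

class QuadrupedGaitEnv(gym.Env):
    """Skeleton of a custom gym environment (all sizes and values are placeholders)."""

    def __init__(self):
        # Actions: e.g. small corrections added to the nominal Bezier foot trajectories.
        self.action_space = spaces.Box(low=-1.0, high=1.0, shape=(8,), dtype=np.float32)
        # Observations: e.g. body orientation, velocities, joint angles from the simulator.
        self.observation_space = spaces.Box(low=-np.inf, high=np.inf, shape=(24,), dtype=np.float32)
        self.steps = 0

    def reset(self):
        # Put the bot back in its default pose for a fresh episode.
        self.steps = 0
        return np.zeros(24, dtype=np.float32)

    def step(self, action):
        self.steps += 1
        obs = np.zeros(24, dtype=np.float32)  # would come from the PyBullet simulation
        reward = 1.0                          # e.g. forward progress minus a penalty for body tilt
        done = self.steps >= 1000             # episode ends on a time limit or when the bot falls
        return obs, reward, done, {}
```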
Libraries and resources
OpenAI's gym library comes in handy for creating custom environments in which to train our agent. However, our environment is simulated in PyBullet, so the team has decided to first train the model there and then export the result to Gazebo and integrate it with the other systems. As for the RL algorithm, we think PPO2 or ARS might be a good choice, since our environment has a continuous action space and many other algorithms do not handle that well. According to the plan, we are currently training with both of these, and whichever turns out more efficient will be used in the end. Building the environment from scratch would have been a big task, so we are using the one created by Morisbots. We found this a useful guide to learn RL; another useful course was UAlberta's specialization. Besides these, to build your agents you need to learn either TensorFlow or PyTorch. It is better to have a basic idea of both, since online resources are limited and both are widely used.
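For reference, the training loop we have in mind looks roughly like the sketch below, assuming the stable-baselines implementation of PPO2; the environment id "QuadrupedGaitEnv-v0" and the timestep count are placeholders, not settings from the actual project:

```python
import gym
from stable_baselines import PPO2
from stable_baselines.common.policies import MlpPolicy
from stable_baselines.common.vec_env import DummyVecEnv

# Hypothetical registered id for the custom PyBullet environment.
env = DummyVecEnv([lambda: gym.make("QuadrupedGaitEnv-v0")])

model = PPO2(MlpPolicy, env, verbose=1)
model.learn(total_timesteps=1_000_000)  # placeholder budget
model.save("quadruped_ppo2")

# Quick sanity check: roll out the trained policy in the same environment.
obs = env.reset()
for _ in range(1000):
    action, _states = model.predict(obs)
    obs, rewards, dones, infos = env.step(action)
```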