My implementations of popular deep reinforcement learning algorithms. Each algorithm is implemented in a single file for readability and ease of understanding.

Algorithm | Action Space | Implementation |
---|---|---|
Deep Q-learning (DQN) | Discrete | dqn.py |
REINFORCE (with baseline) | Discrete | reinforce.py |
Deep Deterministic Policy Gradient (DDPG) | Continuous | ddpg.py |
Twin Delayed Deep Deterministic Policy Gradient (TD3) | Continuous | td3.py |

5K timesteps into training (0.5% completed)
td3-halfcheetah-timestep-30k.mp4
205K timesteps into training (20.5% completed)
td3-halfcheetah-timestep-230k.mp4
955K timesteps into training (95.5% completed)