surge

Asynchronous server for collecting offline rollouts in a reinforcement learning setting

  1. Externally, PyTorch models of agent policy functions are trained using PPO
  2. Model weights are sent by clients to be cached in the server
  3. Each model version plays multiple matches against all other models
  4. Rollouts of these matches are collected and returned to the clients
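
The steps above can be sketched in a few lines of Python. This is only an illustrative, synchronous, in-memory stand-in and not the actual surge implementation: the `RolloutServer` class, its method names, the `matches_per_pair` parameter, and the `play_match` callback are all hypothetical, and the real server's networking, asynchrony, and PyTorch weight handling are left out.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class RolloutServer:
    """Hypothetical in-memory stand-in for the server described above."""

    matches_per_pair: int = 2                     # assumed: matches per model pairing
    models: dict = field(default_factory=dict)    # version -> cached weights
    rollouts: dict = field(default_factory=dict)  # version -> list of match records

    def submit(self, version, weights):
        """Cache the weights a client sends for a model version (step 2)."""
        self.models[version] = weights
        self.rollouts.setdefault(version, [])

    def schedule(self):
        """Pair every cached version with every other version (step 3)."""
        pairs = itertools.combinations(sorted(self.models), 2)
        return [(a, b) for a, b in pairs for _ in range(self.matches_per_pair)]

    def play_all(self, play_match):
        """Run each scheduled match and store its rollout under both players."""
        for a, b in self.schedule():
            record = play_match(self.models[a], self.models[b])
            self.rollouts[a].append(record)
            self.rollouts[b].append(record)

    def collect(self, version):
        """Return the rollouts gathered for one version and clear them (step 4)."""
        collected, self.rollouts[version] = self.rollouts[version], []
        return collected
```

A client would call `submit` with each new version, and later call `collect` to fetch that version's rollouts for further training. Here `play_match` stands in for one match in the game environment and may return any record type.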

A fruitbots clone is used as the game environment in this engine.