Football ML

2D self-play reinforcement learning · PPO · live training

Agent A (left) 0 : 0 Agent B (right)
waiting for trainer…
Episodes: 0 · Step: 0 · Avg reward A: · Avg reward B:

Reward (rolling average over 50 episodes)

Episode length

● disconnected