Flappy Bird Game using Q Learning
http://140.117.164.207/

Reinforcement Learning

State Space
• Vertical distance from the lower pipe (Y)
• Horizontal distance from the next pair of pipes (X)
• Life: Dead or Living
The distance in the X direction is bounded below by 0 and above by 300. The Y distance ranges from −200 to 200.

Actions
• Click
• Do Nothing

Rewards
• +1 if Flappy Bird is still alive
• −1000 if Flappy Bird is dead

Algorithm

The Learning Loop (1/2)
• The Q table is initialized with zeros.
• Step 1: Observe what state, s, Flappy Bird is in and perform the action, a, that maximizes expected reward. Let the game engine perform its "tick". Flappy Bird is then in a next state, s'.

The Learning Loop (2/2)
• Step 2: Observe the new state, s', and the reward associated with it: +1 if the bird is still alive, −1000 otherwise.
• Step 3: Update the Q table according to the Q Learning rule:
  Q[s,a] ← Q[s,a] + α (r + γ max_a' Q[s',a'] − Q[s,a])
  The learning rate α: 0.7
  The discount factor γ: 1
• Step 4: Set the current state to s' and start over.

Example
Q[s,a] ← Q[s,a] + α (r + γ max_a' Q[s',a'] − Q[s,a])
α: 0.7    γ: 1
Rewards: alive: +1; dead: −1000
Actions: Click → (x−1, y+1); Do Nothing → (x−1, y−1)

Q table                          R table
S/A      Click  Do Nothing       S/A      Click  Do Nothing
(10,1)   0      0                (10,1)   0      0
         0      0                         0      0
         0      0                         0      0
         0      0                         0      0
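The learning loop above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation behind the demo: the state encoding (x, y, alive) and the function names are assumptions, while the zero-initialized Q table, the two actions, the rewards, α = 0.7, and γ = 1 come from the slides.

```python
from collections import defaultdict

ALPHA = 0.7      # learning rate alpha from the slides
GAMMA = 1.0      # discount factor gamma from the slides
ACTIONS = ["click", "do_nothing"]

# Q[(x, y, alive)][action] -> expected return; starts at zero as described
Q = defaultdict(lambda: {a: 0.0 for a in ACTIONS})

def best_action(state):
    """Step 1: pick the action with the highest Q value in this state."""
    return max(ACTIONS, key=lambda a: Q[state][a])

def update(state, action, reward, next_state):
    """Step 3: Q[s,a] <- Q[s,a] + alpha * (r + gamma * max_a' Q[s',a'] - Q[s,a])."""
    target = reward + GAMMA * max(Q[next_state].values())
    Q[state][action] += ALPHA * (target - Q[state][action])

# Worked example (hypothetical states): the bird clicks in state (10, 1)
# and dies, so r = -1000 and all Q[s'] entries are still zero.
s, a = (10, 1, True), "click"
update(s, a, -1000, (9, 2, False))
print(Q[s][a])  # 0 + 0.7 * (-1000 + 1*0 - 0) = -700.0
```

With γ = 1 the bird values future survival as much as immediate survival, so the large death penalty propagates backward through the states that led to the crash over repeated episodes.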