Flappy Bird played by an MPC controller.
- At least two implementations:
- By Ben Wiener and Philip Zucker (FlapPyBird-MPC on GitHub)
- By Matthew Piper, Pranav Bhounsule and Krystel K. Castillo-Villar (paper, FlappyBirdController on GitHub)
1. Ben and Philip’s code:
- Ben’s blog and Philip’s blog
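- The core code is in mip.py
- Runtime setup: Python 3.7, cvxpy, and Gurobi (apply for a free license, then run pysetup.bat from its installation directory to set up the Python API). Without Gurobi, remove the solver="GUROBI" argument from prob.solve; cvxpy then needs another installed solver that supports boolean variables.
In pygame, the origin is at the top-left corner; the positive x-axis points horizontally to the right and the positive y-axis points vertically down.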
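Because every sign in the code that follows depends on this convention, here is a minimal sketch of it. The helper names and the ground_y value used in the example are hypothetical and do not come from mip.py.

```python
def height_above_ground(y, ground_y):
    """Convert a screen y-coordinate (growing downward) to a height above the ground."""
    return ground_y - y

def is_higher_on_screen(y_a, y_b):
    """In screen coordinates, a smaller y means a higher position."""
    return y_a < y_b

# Hypothetical ground line at y = 400: a bird at y = 300 is 100 px above it.
print(height_above_ground(300, ground_y=400))
print(is_higher_on_screen(100, 300))
```

This is why the flap term in the code below is negative (it pushes the bird toward smaller y) while gravity is positive.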
path = cvx.Variable((N, 2)) # initialize the y pos and y velocity
flap = cvx.Variable(N-1, boolean=True) # initialize the inputs, whether or not the bird should flap in each step
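path holds the states that the controller actually affects: the bird's vertical position y and vertical velocity vy at each of the N sample points.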
PIPEGAPSIZE = 100 # gap between upper and lower pipe
PIPEWIDTH = 52
BIRDWIDTH = 34
BIRDHEIGHT = 24
BIRDDIAMETER = np.sqrt(BIRDHEIGHT**2 + BIRDWIDTH**2) # the bird rotates in the game, so we use its maximum extent
The gap between the upper and lower pipes and the pipe width are both constants.
def solve(playery, playerVelY, lowerPipes):
    pipeVelX = -4 # speed in x
    playerAccY = 1 # player's downward acceleration
    playerFlapAcc = -14 # velocity change applied on a flap (negative = upward)
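Because the bird and the pipes move relative to each other, the magnitude of pipeVelX is also the bird's forward speed (a constant).
lowerPipes: the top edge of each lower pipe (the lower, red cross). Because the distance between the upper and lower pipes is constant, the position of the upper pipe (the blue cross) is known as well.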
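As a small illustration of how the upper pipe follows from the lower one, the sketch below computes the vertical extent of the opening. The function name gap_bounds and the sample pipe coordinates are made up for the example; only PIPEGAPSIZE and the {'x', 'y'} dictionary layout come from the source.

```python
PIPEGAPSIZE = 100  # gap between upper and lower pipe, from the source

def gap_bounds(lower_pipe, gap=PIPEGAPSIZE):
    """Return (top, bottom) screen y-coordinates of the opening for one pipe pair."""
    bottom = lower_pipe['y']       # top edge of the lower pipe
    top = lower_pipe['y'] - gap    # bottom edge of the upper pipe (y grows downward)
    return top, bottom

# Hypothetical pipe positions, for illustration only.
lower_pipes = [{'x': 300, 'y': 320}, {'x': 450, 'y': 280}]
print([gap_bounds(p) for p in lower_pipes])
```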
y = path[:,0]
vy = path[:,1]
c = [] # init constraint list
c += [y <= GROUND, y >= SKY] # constraints for sky and ground
c += [y[0] == playery, vy[0] == playerVelY] # initial conditions
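The general constraints are that the bird must stay between the upper and lower boundaries (SKY and GROUND), and that the trajectory starts from the bird's current position and velocity.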
for t in range(N-1): # look ahead
    dt = t//15 + 1 # let time get coarser further in the look ahead
    x -= dt * pipeVelX # update x
    xs += [x] # add to list
    c += [vy[t + 1] == vy[t] + playerAccY * dt + playerFlapAcc * flap[t] ] # add y velocity constraint, f=ma
    c += [y[t + 1] == y[t] + vy[t + 1]*dt ] # add y constraint, dy/dt = vy
    pipe_c, dist = getPipeConstraintsDistance(x, y[t+1], lowerPipes) # add pipe constraints
    c += pipe_c
    obj += dist
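dt is small for the early steps and larger for the later ones. Sampling densely at the start keeps the prediction accurate, while lowering the sampling rate later extends the overall length of the prediction. The authors explain this in their blog: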
This technique works pretty well. It doesn’t quite run in real time with the lookahead set to a distance that allows it to succeed. We used a neat trick to improve the speed and look ahead distance. The model’s time step increases with look ahead time. In other words, the model is precise for its first few time steps, and gets less careful later in its prediction. The thinking is that this allows it to make approximate long term plans about jump timing without over-taxing the solver.
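To see what the coarsening buys, the sketch below reproduces only the dt schedule from the loop above and reports how far ahead the plan reaches. The function name lookahead_grid and the values x0=0 and n_steps=31 are assumptions for illustration; the excerpt does not show the value of N.

```python
def lookahead_grid(x0, n_steps, pipe_vel_x=-4):
    """Return the per-step dt values and the x positions visited by the lookahead."""
    x = x0
    dts, xs = [], [x0]
    for t in range(n_steps - 1):
        dt = t // 15 + 1       # same schedule as the source: 1 for t < 15, 2 for t < 30, ...
        x -= dt * pipe_vel_x   # pipes move left, so the bird advances right
        dts.append(dt)
        xs.append(x)
    return dts, xs

dts, xs = lookahead_grid(x0=0, n_steps=31)
print(sum(dts), xs[-1])  # frames covered and horizontal distance covered
```

With these assumed values, 30 decision steps cover 45 frames (180 px at 4 px per frame), whereas a uniform dt of 1 would cover only 30 frames (120 px) for the same number of boolean variables.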
def getPipeConstraintsDistance(x, y, lowerPipes):
    constraints = [] # init pipe constraint list
    pipe_dist = 0 # init dist from pipe center
    for pipe in lowerPipes:
        dist_from_front = pipe['x'] - x - BIRDDIAMETER
        dist_from_back = pipe['x'] - x + PIPEWIDTH
        if (dist_from_front < 0) and (dist_from_back > 0):
            constraints += [y <= (pipe['y'] - BIRDDIAMETER)] # y above lower pipe
            constraints += [y >= (pipe['y'] - PIPEGAPSIZE)] # y below upper pipe
            pipe_dist += cvx.abs(pipe['y'] - (PIPEGAPSIZE//2) - (BIRDDIAMETER//2) - y) # add distance from center
    return constraints, pipe_dist
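The additional constraint is that the bird must not hit the upper or lower pipe while it is between them. The function also computes the distance from the bird's center to the center of the pipe opening.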
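The same overlap test can be tried with plain numbers instead of cvxpy expressions. The sketch below is a numeric restatement for experimentation, not the authors' code: the function name y_bounds_at and the sample pipe are hypothetical, while the constants and the two inequalities are copied from the excerpt above.

```python
import math

PIPEGAPSIZE = 100
PIPEWIDTH = 52
BIRDWIDTH = 34
BIRDHEIGHT = 24
BIRDDIAMETER = math.sqrt(BIRDHEIGHT**2 + BIRDWIDTH**2)

def y_bounds_at(x, lower_pipes):
    """Return (y_min, y_max) allowed at horizontal position x, or None if no pipe overlaps."""
    for pipe in lower_pipes:
        dist_from_front = pipe['x'] - x - BIRDDIAMETER
        dist_from_back = pipe['x'] - x + PIPEWIDTH
        if dist_from_front < 0 and dist_from_back > 0:
            # Same inequalities as the source: above the lower pipe, below the upper pipe.
            return pipe['y'] - PIPEGAPSIZE, pipe['y'] - BIRDDIAMETER
    return None

pipes = [{'x': 100, 'y': 300}]  # hypothetical pipe for illustration
for x in (50, 60, 151, 152):
    print(x, y_bounds_at(x, pipes))
```

Since BIRDDIAMETER is about 41.6, the allowed band for y inside a pipe is roughly 58 px tall, which is why the objective rewards staying near the center of the opening.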
objective = cvx.Minimize(cvx.sum(cvx.abs(vy)) + 100* obj)
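The objective is the sum of the absolute y-velocities over all sample points plus the sum of the distances from the bird's center to the center of the pipe opening, with the latter weighted more heavily (by a factor of 100).
The objective is minimized subject to the constraints above, which yields a sequence of actions. Only the action at the first sample point is applied, and that completes one MPC iteration.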
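The receding-horizon idea can be tried without a solver. The sketch below is not the authors' method: it swaps the mixed-integer program for exhaustive enumeration over a short horizon, fixes dt = 1, replaces the pipe constraints with fixed bounds y_min and y_max, and charges the distance to a target height at every step. The function names, the horizon of 8, and the fallback action are all assumptions; only the update equations and the objective weights mirror the source.

```python
from itertools import product

PLAYER_ACC_Y = 1       # playerAccY in the source
PLAYER_FLAP_ACC = -14  # playerFlapAcc in the source

def rollout(y0, vy0, flaps):
    """Apply the source's update equations with dt = 1; return the y and vy trajectories."""
    ys, vys = [y0], [vy0]
    for flap in flaps:
        vy = vys[-1] + PLAYER_ACC_Y + PLAYER_FLAP_ACC * flap
        vys.append(vy)
        ys.append(ys[-1] + vy)
    return ys, vys

def plan(y0, vy0, target_y, y_min, y_max, horizon=8):
    """Enumerate every flap sequence and return the cheapest feasible one, or None."""
    best_cost, best_flaps = None, None
    for flaps in product((0, 1), repeat=horizon):
        ys, vys = rollout(y0, vy0, flaps)
        if any(y < y_min or y > y_max for y in ys):
            continue  # violates the bounds, so it is not a candidate
        cost = sum(abs(v) for v in vys) + 100 * sum(abs(target_y - y) for y in ys[1:])
        if best_cost is None or cost < best_cost:
            best_cost, best_flaps = cost, flaps
    return best_flaps

def run_mpc(y, vy, target_y, y_min, y_max, n_frames=30):
    """Replan every frame and apply only the first action of each plan."""
    trajectory = [y]
    for _ in range(n_frames):
        flaps = plan(y, vy, target_y, y_min, y_max)
        flap = flaps[0] if flaps is not None else 0  # hypothetical fallback: do not flap
        ys, vys = rollout(y, vy, (flap,))
        y, vy = ys[1], vys[1]
        trajectory.append(y)
    return trajectory

trajectory = run_mpc(y=200, vy=0, target_y=200, y_min=0, y_max=400)
print(min(trajectory), max(trajectory))
```

Enumeration costs 2^horizon rollouts per frame, so it only works for very short horizons. That limit is what motivates handing the boolean flap variables to a mixed-integer solver and coarsening dt further out.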