Neighboring-Optimal Control via Linear-Quadratic (LQ) Feedback
Robert Stengel
Optimal Control and Estimation, MAE 546, Princeton University, 2017

• Linearization of nonlinear dynamic models
  – Nominal trajectory
  – Perturbations about the nominal trajectory
  – Linear, time-invariant dynamic models
  – Examples
• Continuous-time, time-varying optimal LQ feedback control
• Discrete-time and sampled-data linear dynamic models
• Dynamic programming approach to optimal LQ sampled-data control

Copyright 2017 by Robert Stengel. All rights reserved. For educational use only.
http://www.princeton.edu/~stengel/MAE546.html
http://www.princeton.edu/~stengel/OptConEst.html

Neighboring Trajectories Satisfy the Same Dynamic Equations

  ẋ_N(t) = f[x_N(t), u_N(t), w_N(t)], x_N(t_o) given
  ẋ(t) = f[x(t), u(t), w(t)], x(t_o) given

• The neighboring-trajectory dynamic model is the same as the nominal dynamic model:

  ẋ_N(t) = f[x_N(t), u_N(t), w_N(t), t]
  ẋ(t) = ẋ_N(t) + Δẋ(t) = f[x_N(t) + Δx(t), u_N(t) + Δu(t), w_N(t) + Δw(t), t]

  Δx(t_o) = x(t_o) − x_N(t_o)
  Δx(t) = x(t) − x_N(t);  Δẋ(t) = ẋ(t) − ẋ_N(t);  Δu(t) = u(t) − u_N(t);  Δw(t) = w(t) − w_N(t)

Linear Perturbation to the Nominal Optimal Trajectory

• Nominal nonlinear dynamic equation plus linear perturbation equation:

  ẋ(t) = ẋ_N(t) + Δẋ(t) ≈ f[x_N(t), u_N(t), w_N(t), t] + (∂f/∂x) Δx(t) + (∂f/∂u) Δu(t) + (∂f/∂w) Δw(t),
  x(t_o) = x_N(t_o) + Δx(t_o) given

Solve for the Nominal and Perturbation Trajectories Separately

  ẋ_N(t) = f[x_N(t), u_N(t), w_N(t), t], x_N(t_o) given
  Δẋ(t) ≈ (∂f/∂x)(t) Δx(t) + (∂f/∂u)(t) Δu(t) + (∂f/∂w)(t) Δw(t), Δx(t_o) given

The Jacobian matrices of the linear model are evaluated along the nominal trajectory:

  ∂f/∂x |_(x = x_N(t), u = u_N(t), w = w_N(t)) ≜ F(t);  ∂f/∂u |_(…) ≜ G(t);  ∂f/∂w |_(…) ≜ L(t)

  Δẋ(t) = F(t) Δx(t) + G(t) Δu(t) + L(t) Δw(t), Δx(t_o) given

Linearization Example

Cubic Springs

Force is a nonlinear function of deflection:

  f(x) = −k_1 x ± k_2 x³

Example: expanding g sin θ in a power series gives a restoring moment of the same form:

  moment(θ) ≈ −k_1 θ + k_2 θ³

Stiffening Cubic Spring Example

2nd-order nonlinear dynamic model:

  ẋ_1(t) = f_1[x(t)] = x_2(t)
  ẋ_2(t) = f_2[x(t)] = −10 x_1(t) − 10 x_1³(t) − x_2(t)

Integrate the equations to produce the nominal path [x_1N(t); x_2N(t)] in [0, t_f]:

  [x_1N(t); x_2N(t)] = [x_1(0); x_2(0)] + ∫_0^t [f_1N[x(τ)]; f_2N[x(τ)]] dτ

Evaluate the partial derivatives of the Jacobian matrices:

  ∂f_1/∂x_1 = 0;  ∂f_1/∂x_2 = 1
  ∂f_2/∂x_1 = −10 − 30 x_1N²(t);  ∂f_2/∂x_2 = −1
  ∂f_1/∂u = 0;  ∂f_2/∂u = 0;  ∂f_1/∂w = 0;  ∂f_2/∂w = 0

Nominal and Perturbation Dynamic Equations

Nonlinear, time-invariant (NTI) equation:

  ẋ_1N(t) = x_2N(t)
  ẋ_2N(t) = −10 x_1N(t) − 10 x_1N³(t) − x_2N(t)

Linear, time-varying (LTV) equation:

  [Δẋ_1(t); Δẋ_2(t)] = [0, 1; −(10 + 30 x_1N²(t)), −1] [Δx_1(t); Δx_2(t)]

Initial conditions for the nonlinear and linear models:

  [x_1N(0); x_2N(0)] = [0; 9];  [Δx_1(0); Δx_2(0)] = [0; 1]

Comparison of Approximate and Exact Solutions

(Plots compare x_N(t), Δx(t), x_N(t) + Δx(t), and the exact x(t), as well as ẋ_N(t), Δẋ(t), ẋ_N(t) + Δẋ(t), and ẋ(t), with x_2N(0) = 9, Δx_2(0) = 1, x_2N(0) + Δx_2(0) = 10, x_2(0) = 10.)
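A minimal MATLAB sketch of this comparison, using the same stiffening-spring model and initial conditions as above (x_N(0) = [0; 9], Δx(0) = [0; 1]); the integration span t_f = 10 s is an assumption, not stated on the slides:

  % Nominal nonlinear model and linearized perturbation model (cubic spring)
  f  = @(t,x) [x(2); -10*x(1) - 10*x(1)^3 - x(2)];   % nonlinear dynamics
  tf = 10;  xN0 = [0; 9];  dx0 = [0; 1];             % tf is an assumed span

  [tN, xN] = ode45(f, [0 tf], xN0);                  % nominal trajectory
  [tE, xE] = ode45(f, [0 tf], xN0 + dx0);            % exact perturbed trajectory

  % LTV perturbation dynamics: F(t) evaluated along the nominal trajectory
  Fdx = @(t,dx) [0, 1; -(10 + 30*interp1(tN,xN(:,1),t)^2), -1] * dx;
  [tL, dx] = ode45(Fdx, [0 tf], dx0);                % linear perturbation solution

  % Compare exact solution with nominal-plus-linear-perturbation approximation
  xApprox = interp1(tN, xN, tL) + dx;
  plot(tE, xE(:,2), 'b', tL, xApprox(:,2), 'r--'), grid
  xlabel('Time, s'), ylabel('x_2'), legend('Exact x_2', 'x_{2N} + \Deltax_2')

As on the slide, the approximation tracks the exact solution closely while the perturbation remains small relative to the nominal motion.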
Euler-Lagrange Equations for Minimizing the Variational Cost Function

Optimize the Nominal and Neighboring Dynamic Models Separately

Nominal, nonlinear, open-loop optimal control:

  u(t) = u_opt(t) ≜ u*(t)

Linear, neighboring-optimal control added to the nominal control:

  Δu(t) = −C(t) Δx(t) = −C(t) [x(t) − x*(t)]
  u(t) = u*(t) + Δu(t)

Expand the Optimized Cost Function

Expand the optimized cost function to second degree:

  J{[x*(t_o) + Δx(t_o)], [x*(t_f) + Δx(t_f)]} ≈ J*[x*(t_o), x*(t_f)] + ΔJ[Δx(t_o), Δx(t_f)] + Δ²J[Δx(t_o), Δx(t_f)]
    = J*[x*(t_o), x*(t_f)] + Δ²J[Δx(t_o), Δx(t_f)]

because the first variation, ΔJ[Δx(t_o), Δx(t_f)], equals zero.

Nominal optimized cost, plus the nonlinear dynamic constraint:

  J*[x*(t_o), x*(t_f)] = φ[x*(t_f)] + ∫_{t_o}^{t_f} L[x*(t), u*(t)] dt

subject to the nonlinear dynamic equation

  ẋ*(t) = f[x*(t), u*(t)], x(t_o) = x_o

2nd Variation of the Cost Function

Objective: given the optimal nominal solution, minimize the 2nd-variational cost subject to a linear dynamic constraint:

  min_Δu Δ²J = ½ Δxᵀ(t_f) φ_xx(t_f) Δx(t_f) + ½ ∫_{t_o}^{t_f} [Δxᵀ(t) Δuᵀ(t)] [L_xx(t), L_xu(t); L_ux(t), L_uu(t)] [Δx(t); Δu(t)] dt

subject to the perturbation dynamics

  Δẋ(t) = F(t) Δx(t) + G(t) Δu(t), Δx(t_o) = Δx_o

The cost weighting matrices are expressed as

  φ_xx(t_f) = ∂²φ/∂x²(t_f) ≜ P(t_f)
  [L_xx(t), L_xu(t); L_ux(t), L_uu(t)] ≜ [Q(t), M(t); Mᵀ(t), R(t)]
  dim[P(t_f)] = dim[Q(t)] = n × n;  dim[R(t)] = m × m;  dim[M(t)] = n × m

2nd Variational Hamiltonian

Variational cost function:

  Δ²J = ½ Δxᵀ(t_f) P(t_f) Δx(t_f) + ½ ∫_{t_o}^{t_f} [Δxᵀ(t) Δuᵀ(t)] [Q(t), M(t); Mᵀ(t), R(t)] [Δx(t); Δu(t)] dt
      = ½ Δxᵀ(t_f) P(t_f) Δx(t_f) + ½ ∫_{t_o}^{t_f} [Δxᵀ(t) Q(t) Δx(t) + 2 Δxᵀ(t) M(t) Δu(t) + Δuᵀ(t) R(t) Δu(t)] dt

Variational Lagrangian plus the adjoined dynamic constraint:

  H[Δx(t), Δu(t), Δλ(t)] = L[Δx(t), Δu(t)] + Δλᵀ(t) f[Δx(t), Δu(t)]
    = ½ [Δxᵀ(t) Q(t) Δx(t) + 2 Δxᵀ(t) M(t) Δu(t) + Δuᵀ(t) R(t) Δu(t)] + Δλᵀ(t) [F(t) Δx(t) + G(t) Δu(t)]

2nd Variational Euler-Lagrange Equations

Terminal condition, solution for the adjoint vector, and optimality condition:

  Δλ(t_f) = φ_xx(t_f) Δx(t_f) = P(t_f) Δx(t_f)
  Δλ̇(t) = −{∂H[Δx(t), Δu(t), Δλ(t)]/∂Δx}ᵀ = −Q(t) Δx(t) − M(t) Δu(t) − Fᵀ(t) Δλ(t)
  {∂H[Δx(t), Δu(t), Δλ(t)]/∂Δu}ᵀ = 0 = Mᵀ(t) Δx(t) + R(t) Δu(t) + Gᵀ(t) Δλ(t)

Two-Point Boundary-Value Problem

State equation:

  Δẋ(t) = F(t) Δx(t) + G(t) Δu(t), Δx(t_o) = Δx_o

Adjoint vector equation:

  Δλ̇(t) = −Q(t) Δx(t) − M(t) Δu(t) − Fᵀ(t) Δλ(t), Δλ(t_f) = P(t_f) Δx(t_f)

Use the Control Law to Solve the Two-Point Boundary-Value Problem

From H_Δu = 0:

  Δu(t) = −R⁻¹(t) [Mᵀ(t) Δx(t) + Gᵀ(t) Δλ(t)]

This control law feeds back the state and adjoint vectors. Substitute for the control in the system and adjoint equations:

  [Δẋ(t); Δλ̇(t)] = [F(t) − G(t)R⁻¹(t)Mᵀ(t), −G(t)R⁻¹(t)Gᵀ(t); −Q(t) + M(t)R⁻¹(t)Mᵀ(t), −[F(t) − G(t)R⁻¹(t)Mᵀ(t)]ᵀ] [Δx(t); Δλ(t)]

Boundary conditions: the perturbation state vector is specified at the initial time, and the perturbation adjoint vector at the end point:

  [Δx(t_o); Δλ(t_f)] = [Δx_o; P_f Δx_f]

Use the Control Law to Solve the Two-Point Boundary-Value Problem

Assume that the terminal relationship between the adjoint and state vectors applies over the entire interval:

  Δλ(t) = P(t) Δx(t)

Then the control law feeds back the state alone:

  Δu(t) = −R⁻¹(t) [Mᵀ(t) Δx(t) + Gᵀ(t) P(t) Δx(t)] = −R⁻¹(t) [Mᵀ(t) + Gᵀ(t) P(t)] Δx(t) ≜ −C(t) Δx(t),  dim(C) = m × n

  Δu(t) ≜ arg min_{Δu(t) in (t_o, t_f)} Δ²J[Δx(Δu)]
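As a sketch of how this gain is formed at a single time instant once P(t) is available (all numerical values below are placeholders, not taken from the slides):

  % Form the neighboring-optimal gain C = R^{-1}(M' + G'P) at one instant
  R  = 2;                    % m x m control weighting (placeholder value)
  M  = [0.1; 0];             % n x m state-control weighting (placeholder)
  G  = [0; 1];               % n x m control-effect matrix (placeholder)
  P  = [3 1; 1 2];           % n x n Riccati matrix at this instant (placeholder)

  C  = R \ (M' + G'*P);      % 1 x 2 gain: dim(C) = m x n
  dx = [0.2; -0.1];          % current state perturbation, Delta-x = x - x*
  du = -C*dx;                % perturbation control, Delta-u = -C*Delta-x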
Linear-Quadratic (LQ) Optimal Control Gain Matrix

  Δu(t) = −C(t) Δx(t)

Optimal feedback gain matrix:

  C(t) = R⁻¹(t) [Gᵀ(t) P(t) + Mᵀ(t)]

• Properties of the feedback gain matrix
  – Full state feedback (m × n)
  – Time-varying matrix
    • R, G, and M given: control weighting matrix, R; state-control weighting matrix, M; control effect matrix, G
    • P(t) remains to be determined

Solution for the Adjoining Matrix, P(t)

Time derivative of the adjoint vector:

  Δλ̇(t) = Ṗ(t) Δx(t) + P(t) Δẋ(t)

Rearrange:

  Ṗ(t) Δx(t) = Δλ̇(t) − P(t) Δẋ(t)

Recall the coupled state/adjoint equation:

  [Δẋ(t); Δλ̇(t)] = [F(t) − G(t)R⁻¹(t)Mᵀ(t), −G(t)R⁻¹(t)Gᵀ(t); −Q(t) + M(t)R⁻¹(t)Mᵀ(t), −[F(t) − G(t)R⁻¹(t)Mᵀ(t)]ᵀ] [Δx(t); Δλ(t)]

Substitute in the adjoint matrix equation:

  Ṗ(t) Δx(t) = [−Q(t) + M(t)R⁻¹(t)Mᵀ(t)] Δx(t) − [F(t) − G(t)R⁻¹(t)Mᵀ(t)]ᵀ Δλ(t)
               − P(t) {[F(t) − G(t)R⁻¹(t)Mᵀ(t)] Δx(t) − G(t)R⁻¹(t)Gᵀ(t) Δλ(t)}

Solution for the Adjoining Matrix, P(t)

Substitute for the adjoint vector, Δλ(t) = P(t) Δx(t):

  Ṗ(t) Δx(t) = [−Q(t) + M(t)R⁻¹(t)Mᵀ(t)] Δx(t) − [F(t) − G(t)R⁻¹(t)Mᵀ(t)]ᵀ P(t) Δx(t)
               − P(t) {[F(t) − G(t)R⁻¹(t)Mᵀ(t)] Δx(t) − G(t)R⁻¹(t)Gᵀ(t) P(t) Δx(t)}

... and eliminate the state vector.

Matrix Riccati Equation for P(t)

The result is a nonlinear, ordinary differential equation for the Riccati matrix, P(t), with a terminal boundary condition:

  Ṗ(t) = [−Q(t) + M(t)R⁻¹(t)Mᵀ(t)] − [F(t) − G(t)R⁻¹(t)Mᵀ(t)]ᵀ P(t) − P(t) [F(t) − G(t)R⁻¹(t)Mᵀ(t)] + P(t) G(t)R⁻¹(t)Gᵀ(t) P(t)
  P(t_f) = φ_xx(t_f)

Time-varying or time-invariant?

Characteristics of the Adjoining (Riccati) Matrix, P(t)

• P(t_f) is symmetric, n × n, and typically positive semi-definite
• The matrix Riccati equation is symmetric
• Therefore, P(t) is symmetric and positive semi-definite throughout
• Once P(t) has been determined, the optimal feedback control gain matrix, C(t), can be calculated

Example of Neighboring-Optimal Control: Improved Infection Treatment via Feedback

Natural Response to Pathogen Assault (No Therapy)

(Plot of the untreated response.)

Tradeoff Between the Rate of Killing the Pathogen, Preservation of Organ Health, and Drug Use

  J = ½ (s_11 x_1f² + s_44 x_4f²) + ½ ∫_{t_o}^{t_f} (q_11 x_1² + q_44 x_4² + r u²) dt

(Plot of the optimal open-loop control.)

50% Increased Infection: Nominal and Neighboring-Optimal Control (u_1)

  u(t) = u*(t) − C(t) Δx(t) = u*(t) − C(t) [x(t) − x*(t)]

Linear-Quadratic Control of Linear, Time-Invariant (LTI) Systems

Linear Time-Varying (LTV) System with Linear-Quadratic (LQ) Feedback Control

Continuous-time linear dynamic system:

  Δẋ(t) = F(t) Δx(t) + G(t) Δu(t)

LQ optimal control law:

  Δu(t) = −R⁻¹(t) [Mᵀ(t) + Gᵀ(t) P(t)] Δx(t) ≜ −C(t) Δx(t)

Linear dynamic system with LQ feedback control:

  Δẋ(t) = F(t) Δx(t) + G(t) Δu(t) = F(t) Δx(t) + G(t) [−C(t) Δx(t)] = [F(t) − G(t) C(t)] Δx(t)

Linear Time-Invariant (LTI) System with Linear-Quadratic (LQ) Feedback Control

LTI dynamic system:

  Δẋ(t) = F Δx(t) + G Δu(t)

Time-invariant cost function:

  Δ²J = ½ Δxᵀ(t_f) P(t_f) Δx(t_f) + ½ ∫_{t_o}^{t_f} [Δxᵀ(t) Δuᵀ(t)] [Q, M; Mᵀ, R] [Δx(t); Δu(t)] dt

Riccati ordinary differential equation:

  Ṗ(t) = [−Q + MR⁻¹Mᵀ] − [F − GR⁻¹Mᵀ]ᵀ P(t) − P(t) [F − GR⁻¹Mᵀ] + P(t) GR⁻¹Gᵀ P(t),  P(t_f) = φ_xx(t_f)
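A minimal MATLAB sketch of solving this Riccati equation by backward integration (with P stored column-wise so an ODE solver can handle it) and then propagating the closed-loop perturbation dynamics; the system matrices, cost weights, final time, and initial perturbation are placeholders:

  % Integrate the matrix Riccati equation backward from t_f to t_o (LTI case)
  F = [0 1; -2 -1];  G = [0; 1];                 % placeholder system matrices
  Q = eye(2);  R = 1;  M = zeros(2,1);           % placeholder cost weights
  Pf = zeros(2);  tf = 5;  n = 2;                % terminal condition and span

  ric = @(t,p) reshape( ...
        (-Q + M*(R\M')) ...
        - (F - G*(R\M'))'*reshape(p,n,n) ...
        - reshape(p,n,n)*(F - G*(R\M')) ...
        + reshape(p,n,n)*G*(R\(G'*reshape(p,n,n))), n*n, 1);

  [tP, Pcol] = ode45(ric, [tf 0], Pf(:));        % integrate backward in time
  tP = flipud(tP);  Pcol = flipud(Pcol);         % re-sort for interpolation

  P  = @(t) reshape(interp1(tP, Pcol, t), n, n); % P(t) by interpolation
  C  = @(t) R \ (G'*P(t) + M');                  % time-varying LQ gain

  % Closed-loop perturbation response from an assumed initial condition
  [tx, dx] = ode45(@(t,x) (F - G*C(t))*x, [0 tf], [1; 0]);
  plot(tx, dx), grid, xlabel('Time, s'), ylabel('\Deltax')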
LTI System with Time-Varying LQ Feedback Control

The control gain matrix varies over time:

  Δu(t) = −R⁻¹ [Mᵀ + Gᵀ P(t)] Δx(t) ≜ −C(t) Δx(t)

Linear dynamic system with time-varying LQ feedback control:

  Δẋ(t) = F Δx(t) + G [−C(t) Δx(t)] = [F − G C(t)] Δx(t) = F_closed-loop(t) Δx(t)

Example: LQ Optimal Control of a First-Order System

  Δ²J = ½ p_f Δx²(t_f) + ½ ∫_{t_o}^{t_f} [q Δx² + 2(0) Δx Δu + r Δu²] dt

  Δẋ = f Δx + g Δu

  ṗ(t) = −q − 2 f p(t) + (g²/r) p²(t),  p(t_f) = p_f

  Δu = −r⁻¹ [g p(t)] Δx(t) = −[g p(t)/r] Δx

Example: LQ Optimal Control of a Stable First-Order System

  f = −1;  g = 1;  q = r = 1
  Δẋ = −Δx + Δu;  x(0) = 1
  ṗ(t) = −1 + 2 p(t) + p²(t),  p(t_f) = 1
  Control gain = p(t)
  Δu = −p(t) Δx
  Δẋ = −[1 + p(t)] Δx

(Plots: open-loop response Δx_open-loop(t), Riccati solution p(t), control Δu(t), and closed-loop response Δx_closed-loop(t).)

Example: LQ Optimal Control of an Unstable First-Order System

  f = 1;  g = 1
  Δẋ = Δx + Δu;  x(0) = 1
  ṗ(t) = −1 − 2 p(t) + p²(t),  p(t_f) = 1
  Control gain = p(t)
  Δu = −p(t) Δx
  Δẋ = [1 − p(t)] Δx

(Plots: Δx_open-loop(t), p(t), Δu(t), and Δx_closed-loop(t).)

Example: LQ Optimal Control of a Stable First-Order System with a White-Noise Disturbance

  Δẋ = −Δx + Δu + Δw;  x(0) = 0
  Δu = −p(t) Δx
  ṗ(t) = −q + 2 p(t) + p²(t);  q = 1 or 100;  p(t_f) = 1
  Δẋ = −[1 + p(t)] Δx + Δw

(Plots for q = 1 and q = 100: open-loop response Δx_open-loop(t), p(t), Δu(t), and the open- and closed-loop responses.)

Discrete-Time and Sampled-Data Linear, Time-Invariant (LTI) Systems

Continuous- and Discrete-Time LTI System Models

The continuous-time (analog) model is based on an ordinary differential equation:

  Δẋ(t) = F Δx(t) + G Δu(t) + L Δw(t), Δx(t_o) given   (dynamic process)
  Δy(t) = H_x Δx(t) + H_u Δu(t) + H_w Δw(t)            (observation process)

The discrete-time (digital) model is based on an ordinary difference equation:

  Δx(t_{k+1}) = Φ Δx(t_k) + Γ Δu(t_k) + Λ Δw(t_k), Δx(t_o) given   (dynamic process)
  Δy(t_k) = H_x Δx(t_k) + H_u Δu(t_k) + H_w Δw(t_k)                (observation process)

Digital Control Systems Use Sampled Data

• Periodic sequence: Δx_k = Δx(t_k) = Δx(kΔt)
• The sampler is an analog-to-digital (A/D) converter
• The reconstructor is a digital-to-analog (D/A) converter

LTI System Response to Inputs and Initial Conditions

The solution of an LTI dynamic model,

  Δẋ(t) = F Δx(t) + G Δu(t) + L Δw(t), Δx(t_o) given
  Δx(t) = Δx(t_o) + ∫_{t_o}^{t} [F Δx(τ) + G Δu(τ) + L Δw(τ)] dτ,

has two parts:
  – the unforced (homogeneous) response to initial conditions
  – the forced response to control and disturbance inputs

(Plots: step-input response and initial-condition response.)

Unforced Response to Initial Conditions

Neglecting the forcing functions,

  Δx(t) = Δx(t_o) + ∫_{t_o}^{t} F Δx(τ) dτ = Φ(t, t_o) Δx(t_o)

The state transition matrix propagates the state from t_o to t by a single multiplication. The state transition matrix is the matrix exponential:

  Δx(t) = e^{F(t − t_o)} Δx(t_o) = Φ(t − t_o) Δx(t_o)

The State Transition Matrix is the Matrix Exponential

  e^{F(t − t_o)} ≜ e^{FΔt} = I + FΔt + (1/2!)[FΔt]² + (1/3!)[FΔt]³ + ... = Φ(Δt) = state transition matrix

See pages 79-84 of Optimal Control and Estimation for a description of how the state transition matrix is calculated for LTV and LTI systems.

Response to Inputs

Solution of the LTI model with piecewise-constant forcing functions, with t_k − t_{k−1} = Δt = constant:

  Δx(t_k) = Δx(t_{k−1}) + ∫_{t_{k−1}}^{t_k} [F Δx(τ) + G Δu(τ) + L Δw(τ)] dτ

  Δx(t_k) = e^{FΔt} Δx(t_{k−1}) + e^{FΔt} [∫_{t_{k−1}}^{t_k} e^{−F(τ − t_{k−1})} dτ] [G Δu(t_{k−1}) + L Δw(t_{k−1})]
          = Φ Δx(t_{k−1}) + Γ Δu(t_{k−1}) + Λ Δw(t_{k−1})

  Φ ≜ e^{FΔt}
  Γ ≜ e^{FΔt} (I − e^{−FΔt}) F⁻¹ G = (e^{FΔt} − I) F⁻¹ G
  Λ ≜ e^{FΔt} (I − e^{−FΔt}) F⁻¹ L = (e^{FΔt} − I) F⁻¹ L
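A minimal MATLAB sketch of these conversions, using the continuous-time example matrices listed in the Supplemental Material; with Δt = 0.1 s it should reproduce the Φ, Γ, and Λ values tabulated there (this closed form assumes F is invertible, as it is here):

  % Continuous-to-discrete conversion via the matrix exponential
  F  = [-1.2794 -7.9856; 1 -1.2709];
  G  = [-9.069; 0];
  L  = [-7.9856; -1.2709];
  dt = 0.1;                                  % sampling interval, s

  Phi    = expm(F*dt);                       % state transition matrix
  Gamma  = (Phi - eye(2)) * (F\G);           % control-effect matrix
  Lambda = (Phi - eye(2)) * (F\L);           % disturbance-effect matrix

  disp(Phi), disp(Gamma), disp(Lambda)       % compare with the tabulated values

Rerunning with dt = 0.5 illustrates the strong effect of the sampling interval noted in that example.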
Discrete-Time LTI System Response to a Step Input

Propagation of Δx(t_k) with constant Δt, Φ, Γ, and Λ:

  Δx(t_1) = Φ Δx(t_o) + Γ Δu(t_o) + Λ Δw(t_o)
  Δx(t_2) = Φ Δx(t_1) + Γ Δu(t_1) + Λ Δw(t_1)
  Δx(t_3) = Φ Δx(t_2) + Γ Δu(t_2) + Λ Δw(t_2)
  ...

Step input: the discrete-time solution is exact at the sampling instants.

Series Expressions for Discrete-Time LTI System Matrices

  Φ = I + F(Δt) + (1/2!)[F(Δt)]² + (1/3!)[F(Δt)]³ + ...
  Γ = (e^{FΔt} − I) F⁻¹ G = {I + (1/2!)[F(Δt)] + (1/3!)[F(Δt)]² + ...} GΔt
  Λ = (e^{FΔt} − I) F⁻¹ L = {I + (1/2!)[F(Δt)] + (1/3!)[F(Δt)]² + ...} LΔt

As the time interval becomes very small, the discrete-time model approaches the continuous-time model:

  Φ → (I + FΔt),  Γ → GΔt,  Λ → LΔt  as Δt → 0

Sampled-Data Cost Function

A sampled-data cost function is a discrete-time cost function that accounts for the system response between sampling instants:

  min_{Δu(t)} Δ²J = min_{Δu(t)} { ½ Δxᵀ(t_f) P(t_f) Δx(t_f) + ½ ∫_{t_o}^{t_f} [Δxᵀ(t) Δuᵀ(t)] [Q, M; Mᵀ, R] [Δx(t); Δu(t)] dt }

Sum the integrals over the short time intervals (t_k, t_{k+1}):

  min Δ²J = min { ½ Δx_{k_f}ᵀ P_{k_f} Δx_{k_f} + ½ Σ_{k=0}^{k_f−1} ∫_{t_k}^{t_{k+1}} [Δxᵀ(t) Δuᵀ(t)] [Q, M; Mᵀ, R] [Δx(t); Δu(t)] dt }

Minimize subject to the sampled-data dynamic constraint:

  Δx(t_{k+1}) = Φ(Δt) Δx(t_k) + Γ(Δt) Δu(t_k)

Integrand of the Sampled-Data Cost Function

Use the dynamic equation

  Δx(t) = Φ(t, t_k) Δx(t_k) + Γ(t, t_k) Δu(t_k) ≜ Φ(t, t_k) Δx_k + Γ(t, t_k) Δu_k

... to express the integrand in the sampling interval (t_k, t_{k+1}):

  ½ Σ_{k=0}^{k_f−1} ∫_{t_k}^{t_{k+1}} [Δxᵀ(t) Δuᵀ(t)] [Q, M; Mᵀ, R] [Δx(t); Δu(t)] dt

Integrand of the Sampled-Data Cost Function

Bring the state and control out of the integral, assuming the control is constant in the sampling interval:

  = ½ Σ_{k=0}^{k_f−1} [Δx_kᵀ Δu_kᵀ] { ∫_{t_k}^{t_{k+1}} [ Φᵀ(t,t_k) Q Φ(t,t_k),  Φᵀ(t,t_k) [Q Γ(t,t_k) + M];
        [Q Γ(t,t_k) + M]ᵀ Φ(t,t_k),  R + Γᵀ(t,t_k) M + Mᵀ Γ(t,t_k) + Γᵀ(t,t_k) Q Γ(t,t_k) ] dt } [Δx_k; Δu_k]

Integration has been replaced by summation:

  = ½ Σ_{k=0}^{k_f−1} [Δx_kᵀ Δu_kᵀ] [Q̂, M̂; M̂ᵀ, R̂]_k [Δx_k; Δu_k]

(Berman and Gran, J. Aircraft, 1974)

Sampled-Data Cost Function Weighting Matrices

Assume Q, M, and R are constant in the integration interval; Φ(t, t_k) and Γ(t, t_k) vary in the integration interval:

  Q̂ = ∫_{t_k}^{t_{k+1}} Φᵀ(t, t_k) Q Φ(t, t_k) dt
  M̂ = ∫_{t_k}^{t_{k+1}} Φᵀ(t, t_k) [Q Γ(t, t_k) + M] dt
  R̂ = ∫_{t_k}^{t_{k+1}} [R + Γᵀ(t, t_k) M + Mᵀ Γ(t, t_k) + Γᵀ(t, t_k) Q Γ(t, t_k)] dt

The integrands account for the continuous-time variation (inter-sample ripple) of the LTI system between sampling instants.
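A minimal MATLAB sketch of this evaluation over one sampling interval, using the rectangular-sum approximation with 100 steps described in the Supplemental Material; the system and weighting values are placeholders:

  % Numerical evaluation of sampled-data weights Qhat, Mhat, Rhat over one interval
  F = [0 1; -2 -1];  G = [0; 1];             % placeholder LTI system
  Q = eye(2);  M = zeros(2,1);  R = 1;       % placeholder continuous-time weights
  dt = 0.1;  N = 100;  ddt = dt/N;           % 100 rectangular steps per interval

  Qhat = zeros(2);  Mhat = zeros(2,1);  Rhat = 0;
  for k = 1:N
      tk   = (k-1)*ddt;                      % left edge of each rectangle
      Phi  = expm(F*tk);                     % Phi(t_k, 0)
      Gam  = (Phi - eye(2)) * (F\G);         % Gamma(t_k, 0), F assumed invertible
      Qhat = Qhat + Phi'*Q*Phi*ddt;
      Mhat = Mhat + Phi'*(Q*Gam + M)*ddt;
      Rhat = Rhat + (R + Gam'*M + M'*Gam + Gam'*Q*Gam)*ddt;
  end
  % Note: Mhat is generally nonzero even though M = 0 here

The last comment illustrates the point made later in the Supplemental Material: the sampled-data weighting always includes state-control coupling, even when M = 0.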
Dynamic Programming Approach to Sampled-Data Optimal Control

Discrete-Time Hamilton-Jacobi-Bellman Equation

Value function at t_o:

  V(t_o) = φ_{k_f} + Σ_{k=0}^{k_f−1} L_k

Discrete HJB equation:

  V_k* = min_{Δu_k} {L_k + V_{k+1}*} = min_{Δu_k} H_k,  V*[Δx_{k_f}*] given
  subject to Δx_{k+1} = Φ Δx_k + Γ Δu_k

• Begin at the terminal point with the optimal value function
• Working backward, add the minimum value-function increment (Bellman's Principle of Optimality): "... optimal policy ... whatever the initial state and initial decision ... remaining decisions must constitute an optimal policy with regard to the current state"

Sampled-Data Cost Function Contains Terminal and Summation Costs

The integral cost has been replaced by a summation cost; the terminal cost is the same:

  min_{Δu_k} J_sampled = min_{Δu_k} { ½ Δx_{k_f}ᵀ P_{k_f} Δx_{k_f} + ½ Σ_{k=0}^{k_f−1} [Δx_kᵀ Δu_kᵀ] [Q̂, M̂; M̂ᵀ, R̂]_k [Δx_k; Δu_k] }

subject to Δx_{k+1} = Φ Δx_k + Γ Δu_k

Dynamic Programming Approach to Sampled-Data LQ Control

Quadratic value function at t_o:

  V(t_o) = ½ Δx_{k_f}ᵀ P_{k_f} Δx_{k_f} + ½ Σ_{k=0}^{k_f−1} [Δx_kᵀ Δu_kᵀ] [Q̂, M̂; M̂ᵀ, R̂] [Δx_k; Δu_k]

Discrete HJB equation:

  V_k* = min_{Δu_k} { ½ [Δx_k*ᵀ Q̂ Δx_k* + 2 Δx_k*ᵀ M̂ Δu_k + Δu_kᵀ R̂ Δu_k] + V_{k+1}* } = min_{Δu_k} H_k
  V*[Δx_{k_f}*] = ½ Δx_{k_f}*ᵀ P_{k_f} Δx_{k_f}*
  subject to Δx_{k+1} = Φ Δx_k + Γ Δu_k

Optimality Condition

Assume the value function takes a quadratic form:

  V_k = ½ Δx_kᵀ P_k Δx_k;  V_{k+1} = ½ Δx_{k+1}ᵀ P_{k+1} Δx_{k+1}

Optimality condition:

  ∂H_k/∂Δu_k = [Δx_kᵀ M̂ + Δu_kᵀ R̂] + ∂V_{k+1}/∂Δu_k = 0

where

  ∂V_{k+1}/∂Δu_k = ∂[½ Δx_{k+1}ᵀ P_{k+1} Δx_{k+1}]/∂Δu_k = [Φ Δx_k + Γ Δu_k]ᵀ P_{k+1} Γ

hence

  [Δx_kᵀ M̂ + Δu_kᵀ R̂] + [Δx_kᵀ Φᵀ + Δu_kᵀ Γᵀ] P_{k+1} Γ = 0

Minimizing Value of the Control

  ∂H_k/∂Δu_k = Δx_kᵀ [M̂ + Φᵀ P_{k+1} Γ] + Δu_kᵀ [R̂ + Γᵀ P_{k+1} Γ] = 0

  Δu_k = −[R̂ + Γᵀ P_{k+1} Γ]⁻¹ [M̂ᵀ + Γᵀ P_{k+1} Φ] Δx_k ≜ −C_k Δx_k

P_k must be found in (0, k_f); use the definitions of V* and Δu in the HJB equation.

Solution for P_k

  V_k = ½ Δx_kᵀ P_k Δx_k;  V_{k+1} = ½ Δx_{k+1}ᵀ P_{k+1} Δx_{k+1}
  Δu_k = −[R̂ + Γᵀ P_{k+1} Γ]⁻¹ [M̂ᵀ + Γᵀ P_{k+1} Φ] Δx_k ≜ −C_k Δx_k

Substitute for V_k, V_{k+1}, and Δu_k in the discrete-time HJB equation:

  ½ Δx_kᵀ P_k Δx_k = min_{Δu_k} ½ [Δx_kᵀ Q̂ Δx_k + 2 Δx_kᵀ M̂ (−C_k Δx_k) + (−C_k Δx_k)ᵀ R̂ (−C_k Δx_k) + Δx_{k+1}ᵀ P_{k+1} Δx_{k+1}]

Rearrange and cancel Δx_k on both sides of the equation to yield the discrete-time Riccati equation:

  P_k = Q̂ + Φᵀ P_{k+1} Φ − [M̂ᵀ + Γᵀ P_{k+1} Φ]ᵀ [R̂ + Γᵀ P_{k+1} Γ]⁻¹ [M̂ᵀ + Γᵀ P_{k+1} Φ],  P_{k_f} given

Discrete-Time System with Linear-Quadratic Feedback Control

Dynamic system:

  Δx_{k+1} = Φ Δx_k + Γ Δu_k

Control law:

  Δu_k = −[R̂ + Γᵀ P_{k+1} Γ]⁻¹ [M̂ᵀ + Γᵀ P_{k+1} Φ] Δx_k ≜ −C_k Δx_k

Dynamic system with LQ feedback control:

  Δx_{k+1} = Φ Δx_k + Γ Δu_k = Φ Δx_k + Γ (−C_k Δx_k) = (Φ − Γ C_k) Δx_k
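A minimal MATLAB sketch of the backward gain/Riccati recursion and the closed-loop propagation above, assuming Φ, Γ, Q̂, M̂, and R̂ have already been computed; the discrete-time model, weights, horizon, terminal P, and initial perturbation are placeholders:

  % Backward discrete-time Riccati recursion and LQ gains
  Phi  = [1 0.1; -0.2 0.9];  Gam = [0.005; 0.1];     % placeholder discrete model
  Qhat = 0.1*eye(2);  Mhat = zeros(2,1);  Rhat = 1;  % placeholder sampled-data weights
  kf   = 50;  Pkf = eye(2);                          % horizon and terminal condition

  P = zeros(2,2,kf+1);  C = zeros(1,2,kf);
  P(:,:,kf+1) = Pkf;
  for k = kf:-1:1
      Pn = P(:,:,k+1);
      C(:,:,k) = (Rhat + Gam'*Pn*Gam) \ (Mhat' + Gam'*Pn*Phi);   % gain C_k
      P(:,:,k) = Qhat + Phi'*Pn*Phi ...
               - (Mhat' + Gam'*Pn*Phi)' * ((Rhat + Gam'*Pn*Gam) \ (Mhat' + Gam'*Pn*Phi));
  end

  % Closed-loop propagation from an assumed initial perturbation
  dx = zeros(2, kf+1);  dx(:,1) = [1; 0];
  for k = 1:kf
      du        = -C(:,:,k) * dx(:,k);
      dx(:,k+1) = Phi*dx(:,k) + Gam*du;
  end
  plot(0:kf, dx'), grid, xlabel('Sample index k'), ylabel('\Deltax_k')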
Next Time: Dynamic System Stability
Reading: OCE, Section 2.5

Supplemental Material

Neighboring Trajectories

• Nominal (or reference) trajectory and control history:

  {x_N(t), u_N(t), w_N(t)} for t in [t_o, t_f]
  x: dynamic state;  u: control input;  w: disturbance input

• Trajectory perturbed by
  – a small initial-condition variation
  – a small control variation
  – a small disturbance variation

  {x(t), u(t), w(t)} for t in [t_o, t_f] = {x_N(t) + Δx(t), u_N(t) + Δu(t), w_N(t) + Δw(t)}

Example: Separate Solutions for the Nominal and Perturbation Trajectories

The original nonlinear equation describes the nominal dynamics:

  ẋ_N = [ẋ_1N; ẋ_2N; ẋ_3N]
      = [ x_2N + d w_1N;
          a_2 (x_3N − x_2N) + a_1 (x_3N − x_1N)² + b_1 u_1N + b_2 u_2N;
          c_2 x_3N³ + c_1 (x_1N + x_2N) + b_3 x_1N u_1N ],
  [x_1N; x_2N; x_3N](t_o) given

A linear, time-varying equation describes the perturbation dynamics:

  [Δẋ_1; Δẋ_2; Δẋ_3] = [ 0, 1, 0;
                         −2 a_1 (x_3N − x_1N), −a_2, a_2 + 2 a_1 (x_3N − x_1N);
                         c_1 + b_3 u_1N, c_1, 3 c_2 x_3N² ] [Δx_1; Δx_2; Δx_3]
                      + [ 0, 0; b_1, b_2; b_3 x_1N, 0 ] [Δu_1; Δu_2]
                      + [ d; 0; 0 ] Δw_1,
  [Δx_1(t_o); Δx_2(t_o); Δx_3(t_o)] given

Example: LQ Optimal Control of a Stable First-Order System with a White-Noise Disturbance

  Δẋ = −Δx + Δu + Δw;  x(0) = 0
  Δu = −p(t) Δx
  ṗ(t) = −1 + 2 p(t) + p²(t),  p(t_f) = 1 or 1000
  Δẋ = −[1 + p(t)] Δx + Δw

(Plots for p_f = 1 and p_f = 1000: open-loop response Δx_open-loop(t), p(t), Δu(t), and the open- and closed-loop responses.)

Example: LQ Optimal Control of a Stable First-Order System with a White-Noise Disturbance

  Δẋ = −Δx + Δu + Δw;  x(0) = 0
  Δu = −p(t) Δx
  ṗ(t) = −100 + 2 p(t) + p²(t),  p(t_f) = 1 or 0
  Δẋ = −[1 + p(t)] Δx + Δw

(Plots for p_f = 1 and p_f = 0: Δx_open-loop(t), p(t), Δu(t), and the open- and closed-loop responses.)

Multivariable LQ Optimal Control with Cross Weighting M = 0

No state/control coupling in the cost function:

  Δ²J = ½ Δxᵀ(t_f) P(t_f) Δx(t_f) + ½ ∫_{t_o}^{t_f} [Δxᵀ(t) Δuᵀ(t)] [Q, 0; 0, R] [Δx(t); Δu(t)] dt

  Δẋ(t) = F Δx(t) + G Δu(t)

  Ṗ(t) = −Q − Fᵀ P(t) − P(t) F + P(t) GR⁻¹Gᵀ P(t),  P(t_f) = P_f

  Δu(t) = −R⁻¹ Gᵀ P(t) Δx(t)

Nonlinear System with Neighboring-Optimal Feedback Control

Nonlinear dynamic system:

  ẋ(t) = f[x(t), u(t)]

Neighboring-optimal control law:

  u(t) = u*(t) − C(t) Δx(t) = u*(t) − C(t) [x(t) − x*(t)]

Nonlinear dynamic system with neighboring-optimal feedback control:

  ẋ(t) = f{x(t), u*(t) − C(t) [x(t) − x*(t)]}
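A minimal MATLAB sketch of this closed-loop structure, reusing the stiffening cubic-spring model from the linearization example; the added control input on the second state equation, the zero nominal control, and the constant feedback gain are illustrative assumptions standing in for the time-varying gain C(t) that would come from the Riccati solution:

  % Nonlinear system with neighboring-optimal feedback about a stored nominal
  f  = @(x,u) [x(2); -10*x(1) - 10*x(1)^3 - x(2) + u];  % cubic spring + assumed control input
  tf = 10;
  [tS, xS] = ode45(@(t,x) f(x,0), [0 tf], [0; 9]);      % nominal trajectory, u* = 0 (assumed)
  C  = [2 1];                                           % placeholder constant gain for C(t)

  xStar = @(t) interp1(tS, xS, t)';                     % interpolated nominal state x*(t)
  u     = @(t,x) 0 - C*(x - xStar(t));                  % u = u* - C (x - x*)
  [tc, xc] = ode45(@(t,x) f(x, u(t,x)), [0 tf], [0; 10]); % perturbed initial condition

  plot(tc, xc(:,1), 'r', tS, xS(:,1), 'b--'), grid
  xlabel('Time, s'), ylabel('x_1'), legend('Neighboring-optimal', 'Nominal')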
Development of Neighboring-Optimal Therapy

• Expand the dynamic equation to first degree:

  x(t) = x*(t) + Δx(t)
  u(t) = u*(t) + Δu(t)

• Compute the nominal optimal control history, u(t) = u_opt(t), using the original nonlinear dynamic model
• Compute the optimal perturbation control using the locally linearized dynamic model:

  Δu(t) = −C(t) [x(t) − x_opt(t)]

• Sum the two for neighboring-optimal control of the dynamic system:

  u(t) = u_opt(t) + Δu(t)

First-Order LQ Example Code

  % First-Order LQ Example
  % Rob Stengel   2/23/2011
  clear
  global tp p q tw w

  xo = 0;
  to = 0;
  tf = 10;
  tspanx = [to tf];

  tw = [0:0.01:10];                               % white-noise disturbance sequence
  for k = 1:length(tw)
      w(k) = randn(1);
  end

  [tx,x] = ode15s('First',tspanx,xo);             % open-loop response
  pf = 0;
  q  = 100;
  tspanp = [tf to];
  [tp,p] = ode15s('FirstRiccati',tspanp,pf);      % Riccati solution, integrated backward
  [tc,xc] = ode15s('FirstCL',tspanx,xo);          % closed-loop response
  u = -interp1(tp,p,tc).*xc;                      % feedback control; sign matches Delta-u = -p(t)*Delta-x

  figure
  subplot(4,1,1)
  plot(tx,x),grid,xlabel('Time'),ylabel('x-open-loop')
  subplot(4,1,2)
  plot(tp,p,'r'),grid,xlabel('Time'),ylabel('p')
  subplot(4,1,3)
  plot(tc,u),grid,xlabel('Time'),ylabel('u')
  subplot(4,1,4)
  plot(tc,xc,'r',tx,x,'b'),grid,xlabel('Time'),ylabel('x-closed-loop')

  function xdot = First(t,x)                      % open-loop dynamics (file First.m)
      global tw w
      wt   = interp1(tw,w,t,'nearest');
      xdot = -x + wt;

  function pdot = FirstRiccati(t,p)               % scalar Riccati equation (file FirstRiccati.m)
      global q
      pdot = -q + 2*p + p^2;

  function xdot = FirstCL(tc,xc)                  % closed-loop dynamics (file FirstCL.m)
      global tp p tw w
      wt   = interp1(tw,w,tc,'nearest');
      pt   = interp1(tp,p,tc);
      xdot = -(1 + pt)*xc + wt;

Initial-Condition Response via State Transition

Propagation of Δx(t_k) in an LTI system:

  Δx(t_1) = Φ(t_1 − t_o) Δx(t_o)
  Δx(t_2) = Φ(t_2 − t_1) Δx(t_1)
  Δx(t_3) = Φ(t_3 − t_2) Δx(t_2)
  ...

The state transition matrix is constant if (t_k − t_{k−1}) = Δt = constant:

  Δx(t_1) = Φ(Δt) Δx(t_o) = Φ Δx(t_o)
  Δx(t_2) = Φ Δx(t_1) = Φ² Δx(t_o)
  Δx(t_3) = Φ Δx(t_2) = Φ³ Δx(t_o)
  ...

  Φ = I + F(Δt) + (1/2!)[F(Δt)]² + (1/3!)[F(Δt)]³ + ...

Example: Equivalent Continuous-Time and Discrete-Time System Matrices

Continuous-time (analog) system:

  F = [−1.2794, −7.9856; 1, −1.2709],  G = [−9.069; 0],  L = [−7.9856; −1.2709]

Discrete-time (digital) system, Δt = 0.1 s:

  Φ = [0.845, −0.6936; 0.0869, 0.8457],  Γ = [−0.8404; −0.0414],  Λ = [−0.6936; −0.1543]

Δt = 0.5 s:

  Φ = [0.0823, −1.4751; 0.1847, 0.0839],  Γ = [−2.4923; −0.6429],  Λ = [−1.4751; −0.9161]

The time interval has a large effect on the discrete-time matrices.

Evaluating Sampled-Data Weighting Matrices

Φ(t, t_k) and Γ(t, t_k) vary in the integration interval. Break the interval into smaller intervals, and approximate the integral as a sum of short rectangular integration steps:

  Q̂ = ∫_0^{Δt} Φᵀ(t, 0) Q Φ(t, 0) dt
     ≈ Σ_{k=1}^{100} [Φᵀ(t_{k−1}, 0) Q Φ(t_{k−1}, 0) δt],  δt = Δt/100,  t_k = k δt
     ≈ Σ_{k=1}^{100} [e^{Fᵀ t_{k−1}} Q e^{F t_{k−1}} δt]

Evaluating Sampled-Data Weighting Matrices

  Γ(t) = (e^{Ft} − I) F⁻¹ G = {I + (1/2!)[Ft] + (1/3!)[Ft]² + ...} Gt

Q̂, M̂, and R̂ are evaluated just once for an LTI system:

  M̂ = ∫_0^{Δt} Φᵀ(t, 0) [Q Γ(t, 0) + M] dt
     ≈ Σ_{k=1}^{100} { e^{Fᵀ t_{k−1}} Q [I + (1/2!)(F t_{k−1}) + (1/3!)(F t_{k−1})² + ...] G t_{k−1} + M } δt

  R̂ ≈ Σ_{k=1}^{100} [R + Γᵀ(t_{k−1}) M + Mᵀ Γ(t_{k−1}) + Γᵀ(t_{k−1}) Q Γ(t_{k−1})] δt

The Sampled-Data Cost Function Weighting Always Includes State-Control Weighting

  M̂ = ∫_{t_k}^{t_{k+1}} Φᵀ(t, t_k) [Q Γ(t, t_k) + M] dt = ∫_{t_k}^{t_{k+1}} Φᵀ(t, t_k) Q Γ(t, t_k) dt even if M = 0

Sampled-data Lagrangian:

  L_k = ½ [Δx_kᵀ Q̂ Δx_k + 2 Δx_kᵀ M̂ Δu_k + Δu_kᵀ R̂ Δu_k]

Continuous-Time LQ Optimization via Dynamic Programming
Dynamic Programming Approach to Continuous-Time LQ Control

Value function at t_o:

  V(t_o) = ½ Δxᵀ(t_f) P(t_f) Δx(t_f) + ½ ∫_{t_o}^{t_f} [Δxᵀ(t) Δuᵀ(t)] [Q(t), M(t); Mᵀ(t), R(t)] [Δx(t); Δu(t)] dt

Value function at t_1:

  V(t_1) = ½ Δxᵀ(t_f) P(t_f) Δx(t_f) + ½ ∫_{t_1}^{t_f} [Δxᵀ(t) Δuᵀ(t)] [Q(t), M(t); Mᵀ(t), R(t)] [Δx(t); Δu(t)] dt

Dynamic Programming Approach to Continuous-Time LQ Control

Time derivative of the value function:

  ∂V*[Δx*(t_1)]/∂t = −min_Δu { ½ [Δx*ᵀ(t_1) Q(t_1) Δx*(t_1) + 2 Δx*ᵀ(t_1) M(t_1) Δu(t_1) + Δuᵀ(t_1) R(t_1) Δu(t_1)]
                               + (∂V*[Δx*(t_1)]/∂Δx) [F(t_1) Δx*(t_1) + G(t_1) Δu(t_1)] }

Dynamic Programming Approach to Continuous-Time LQ Control

Hamiltonian:

  H[Δx*, Δu, ∂V*/∂Δx] ≜ ½ [Δx*ᵀ Q Δx* + 2 Δx*ᵀ M Δu + Δuᵀ R Δu] + (∂V*[Δx*]/∂Δx) [F Δx* + G Δu]

HJB equation:

  ∂V*[Δx*]/∂t = −min_Δu H[Δx*, Δu, ∂V*/∂Δx],  V*[Δx(t_f)] = ½ Δx*ᵀ(t_f) P(t_f) Δx*(t_f)

Plausible Form for the Value Function

Quadratic function of the state perturbation:

  V*[Δx*(t)] = ½ Δx*ᵀ(t) P(t) Δx*(t)

Time derivative of the value function:

  ∂V*/∂t = ½ Δx*ᵀ(t_1) Ṗ(t_1) Δx*(t_1)

Gradient of the value function with respect to the state:

  ∂V*/∂Δx = Δx*ᵀ(t) P(t)

Optimal Control Law and HJB Equation

Optimal control law:

  ∂H/∂Δu = Δxᵀ M + Δuᵀ R + Δxᵀ P G = 0
  Δu(t) = −R⁻¹ (Gᵀ P + Mᵀ) Δx(t)

Incorporate the value-function model in the HJB equation:

  Δxᵀ Ṗ Δx = Δxᵀ { [−Q + MR⁻¹Mᵀ] − [F − GR⁻¹Mᵀ]ᵀ P − P [F − GR⁻¹Mᵀ] + P GR⁻¹Gᵀ P } Δx

Δx(t) can be cancelled on the left and right.

Matrix Riccati Equation

  Ṗ(t) = [−Q(t) + M(t)R⁻¹(t)Mᵀ(t)] − [F(t) − G(t)R⁻¹(t)Mᵀ(t)]ᵀ P(t) − P(t) [F(t) − G(t)R⁻¹(t)Mᵀ(t)] + P(t) G(t)R⁻¹(t)Gᵀ(t) P(t)
  P(t_f) = φ_xx(t_f)

Example: 1st-Order System with LQ Feedback Control

1st-order discrete-time dynamic system:

  Δx_{k+1} = Φ Δx_k + Γ Δu_k

LQ optimal control law:

  Δu_k = −[(m + Φ Γ p_{k+1}) / (r + Γ² p_{k+1})] Δx_k ≜ −c_k Δx_k

  p_k = q + Φ² p_{k+1} − (m + Φ Γ p_{k+1})² / (r + Γ² p_{k+1}),  p_{k_f} given

Dynamic system with LQ feedback control:

  Δx_{k+1} = Φ Δx_k + Γ Δu_k = Φ Δx_k + Γ (−c_k Δx_k) = (Φ − Γ c_k) Δx_k
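A minimal MATLAB sketch of this scalar recursion; the numerical values of Φ, Γ, q, r, m, the terminal p, the horizon, and the initial perturbation are placeholders:

  % Scalar discrete-time LQ example: backward gain recursion, forward simulation
  Phi = 0.95;  Gam = 0.1;                 % placeholder scalar system
  q = 1;  r = 1;  m = 0;  kf = 50;        % placeholder weights and horizon

  p = zeros(1, kf+1);  c = zeros(1, kf);  p(kf+1) = 1;   % assumed terminal p
  for k = kf:-1:1
      c(k) = (m + Phi*Gam*p(k+1)) / (r + Gam^2*p(k+1));            % gain c_k
      p(k) = q + Phi^2*p(k+1) - (m + Phi*Gam*p(k+1))^2 / (r + Gam^2*p(k+1));
  end

  dx = zeros(1, kf+1);  dx(1) = 1;        % assumed initial perturbation
  for k = 1:kf
      dx(k+1) = (Phi - Gam*c(k)) * dx(k); % closed-loop propagation
  end
  plot(0:kf, dx), grid, xlabel('Sample index k'), ylabel('\Deltax_k')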