
Neighboring-Optimal Control via
Linear-Quadratic (LQ) Feedback
Robert Stengel
Optimal Control and Estimation, MAE 546
Princeton University, 2017
•  Linearization of nonlinear dynamic models
   –  Nominal trajectory
   –  Perturbations about the nominal trajectory
•  Continuous-time, time-varying optimal LQ feedback control
•  Discrete-time and sampled-data linear dynamic models
   –  Linear, time-invariant dynamic models
•  Dynamic programming approach to optimal LQ sampled-data control
   –  Examples
Copyright 2017 by Robert Stengel. All rights reserved. For educational use only.
http://www.princeton.edu/~stengel/MAE546.html
http://www.princeton.edu/~stengel/OptConEst.html
1
Neighboring Trajectories Satisfy
Same Dynamic Equations
\dot{x}_N(t) = f[x_N(t), u_N(t), w_N(t)], \quad x_N(t_o) given
\dot{x}(t) = f[x(t), u(t), w(t)], \quad x(t_o) given

•  Neighboring-trajectory dynamic model is the same as the
   nominal dynamic model

\dot{x}_N(t) = f[x_N(t), u_N(t), w_N(t), t]
\dot{x}(t) = \dot{x}_N(t) + \Delta\dot{x}(t) = f[x_N(t) + \Delta x(t),\ u_N(t) + \Delta u(t),\ w_N(t) + \Delta w(t),\ t]

\Delta x(t_o) = x(t_o) - x_N(t_o)
\Delta x(t) = x(t) - x_N(t)
\Delta\dot{x}(t) = \dot{x}(t) - \dot{x}_N(t)
\Delta u(t) = u(t) - u_N(t)
\Delta w(t) = w(t) - w_N(t)
2
Linear Perturbation to the
Nominal Optimal Trajectory
•  Nominal nonlinear dynamic equation plus
   linear perturbation equation

\dot{x}(t) = \dot{x}_N(t) + \Delta\dot{x}(t) \approx f[x_N(t), u_N(t), w_N(t), t]
   + \frac{\partial f}{\partial x}\Delta x(t) + \frac{\partial f}{\partial u}\Delta u(t) + \frac{\partial f}{\partial w}\Delta w(t),

x(t_o) = x_N(t_o) + \Delta x(t_o) given
3
Solve for the Nominal and
Perturbation Trajectories Separately
\dot{x}_N(t) = f[x_N(t), u_N(t), w_N(t), t], \quad x_N(t_o) given

\Delta\dot{x}(t) \approx \frac{\partial f}{\partial x}(t)\Delta x(t) + \frac{\partial f}{\partial u}(t)\Delta u(t) + \frac{\partial f}{\partial w}(t)\Delta w(t), \quad \Delta x(t_o) given

Jacobian matrices of the linear model are evaluated along the
nominal trajectory:

\left.\frac{\partial f}{\partial x}\right|_{\substack{x = x_N(t) \\ u = u_N(t) \\ w = w_N(t)}} \triangleq F(t); \quad
\left.\frac{\partial f}{\partial u}\right|_{\substack{x = x_N(t) \\ u = u_N(t) \\ w = w_N(t)}} \triangleq G(t); \quad
\left.\frac{\partial f}{\partial w}\right|_{\substack{x = x_N(t) \\ u = u_N(t) \\ w = w_N(t)}} \triangleq L(t)

\Delta\dot{x}(t) = F(t)\Delta x(t) + G(t)\Delta u(t) + L(t)\Delta w(t), \quad \Delta x(t_o) given
4
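The Jacobian evaluation above can be checked numerically. The following Python sketch (not part of the original slides; the finite-difference step and nominal point are illustrative assumptions) approximates F(t) = ∂f/∂x by central differences, using the stiffening cubic-spring model that appears in the example below:

```python
def f(x, u, w):
    """Stiffening cubic spring (example below): f1 = x2, f2 = -10 x1 - 10 x1^3 - x2."""
    return [x[1] + 0.0 * w[0],
            -10.0 * x[0] - 10.0 * x[0] ** 3 - x[1] + 0.0 * u[0]]

def jacobian(fun, x, u, w, arg, eps=1e-6):
    """Central-difference Jacobian of fun with respect to 'x', 'u', or 'w',
    evaluated at the nominal point (x, u, w)."""
    base = {'x': x, 'u': u, 'w': w}[arg]
    n = len(fun(x, u, w))
    J = [[0.0] * len(base) for _ in range(n)]
    for j in range(len(base)):
        hi, lo = list(base), list(base)
        hi[j] += eps
        lo[j] -= eps
        kw_hi = {'x': x, 'u': u, 'w': w}
        kw_lo = {'x': x, 'u': u, 'w': w}
        kw_hi[arg], kw_lo[arg] = hi, lo
        f_hi, f_lo = fun(**kw_hi), fun(**kw_lo)
        for i in range(n):
            J[i][j] = (f_hi[i] - f_lo[i]) / (2.0 * eps)
    return J

# F(t) = df/dx evaluated at the nominal point x_N = [1, 0]:
F = jacobian(f, [1.0, 0.0], [0.0], [0.0], 'x')   # expect [[0, 1], [-40, -1]]
```

For this model, ∂f2/∂x1 = −10 − 30 x1², so F(1,0) should be −40 at x1 = 1, matching the analytical Jacobian on the later slide.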
Linearization Example
5
Cubic Springs
Force is a nonlinear function of deflection
f(x) = -k_1 x \pm k_2 x^3

Example: Expand g \sin\theta in a power series:

moment(\theta) \approx -k_1\theta + k_2\theta^3
6
Stiffening Cubic Spring Example
2nd-order nonlinear dynamic model
\dot{x}_1(t) = f_1[x(t)] = x_2(t)
\dot{x}_2(t) = f_2[x(t)] = -10 x_1(t) - 10 x_1^3(t) - x_2(t)

Integrate equations to produce nominal path:

\begin{bmatrix} x_{1_N}(0) \\ x_{2_N}(0) \end{bmatrix}:\quad \int_0^{t_f} \begin{bmatrix} f_{1_N}[x(t)] \\ f_{2_N}[x(t)] \end{bmatrix} dt \Rightarrow \begin{bmatrix} x_{1_N}(t) \\ x_{2_N}(t) \end{bmatrix} in [0, t_f]

Evaluate partial derivatives of the Jacobian matrices:

\partial f_1/\partial x_1 = 0; \quad \partial f_1/\partial x_2 = 1
\partial f_2/\partial x_1 = -10 - 30 x_{1_N}^2(t); \quad \partial f_2/\partial x_2 = -1
\partial f_1/\partial u = 0; \quad \partial f_2/\partial u = 0
\partial f_1/\partial w = 0; \quad \partial f_2/\partial w = 0
7
Nominal and Perturbation
Dynamic Equations
Nonlinear, Time-Invariant (NTI) equation:

\dot{x}_{1_N}(t) = x_{2_N}(t)
\dot{x}_{2_N}(t) = -10 x_{1_N}(t) - 10 x_{1_N}^3(t) - x_{2_N}(t)

Linear, Time-Varying (LTV) equation:

\begin{bmatrix} \Delta\dot{x}_1(t) \\ \Delta\dot{x}_2(t) \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -\left( 10 + 30 x_{1_N}^2(t) \right) & -1 \end{bmatrix} \begin{bmatrix} \Delta x_1(t) \\ \Delta x_2(t) \end{bmatrix}

Initial conditions for nonlinear and linear models:

\begin{bmatrix} x_{1_N}(0) \\ x_{2_N}(0) \end{bmatrix} = \begin{bmatrix} 0 \\ 9 \end{bmatrix}, \quad \begin{bmatrix} \Delta x_1(0) \\ \Delta x_2(0) \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}
8
Comparison of Approximate and Exact Solutions
[Plots compare x_N(t), \Delta x(t), x_N(t) + \Delta x(t), and the exact x(t), as well as the corresponding rates; with x_{2_N}(0) = 9 and \Delta x_2(0) = 1, the sum x_{2_N}(t) + \Delta x_2(t) closely tracks the exact solution with x_2(0) = 10.]
9
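The comparison on this slide can be reproduced numerically. The Python sketch below (not from the slides; the RK4 integrator, step size, and horizon are illustrative choices) propagates the nonlinear nominal trajectory, the LTV perturbation, and the exact perturbed trajectory, then checks that adding the linear correction reduces the accumulated error relative to the nominal alone:

```python
def rk4(f, x, dt):
    """One Runge-Kutta 4 step for xdot = f(x)."""
    k1 = f(x)
    k2 = f([xi + 0.5 * dt * ki for xi, ki in zip(x, k1)])
    k3 = f([xi + 0.5 * dt * ki for xi, ki in zip(x, k2)])
    k4 = f([xi + dt * ki for xi, ki in zip(x, k3)])
    return [xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]

def f_nl(x):
    """Stiffening cubic spring: x1dot = x2, x2dot = -10 x1 - 10 x1^3 - x2."""
    return [x[1], -10.0 * x[0] - 10.0 * x[0] ** 3 - x[1]]

dt, T = 0.001, 1.0
xN = [0.0, 9.0]      # nominal initial condition
dx = [0.0, 1.0]      # perturbation initial condition
x = [0.0, 10.0]      # exact perturbed initial condition
err_nom = err_lin = 0.0
for _ in range(int(round(T / dt))):
    def f_lin(d, xN=xN):
        # LTV perturbation model, Jacobian frozen at the interval start
        return [d[1], -(10.0 + 30.0 * xN[0] ** 2) * d[0] - d[1]]
    xN, dx, x = rk4(f_nl, xN, dt), rk4(f_lin, dx, dt), rk4(f_nl, x, dt)
    err_nom += (x[0] - xN[0]) ** 2 + (x[1] - xN[1]) ** 2
    err_lin += (x[0] - xN[0] - dx[0]) ** 2 + (x[1] - xN[1] - dx[1]) ** 2
```

The linearization error is second order in the perturbation, so err_lin should be much smaller than err_nom over this horizon.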
Euler-Lagrange Equations
for Minimizing Variational
Cost Function
10
Optimize Nominal and Neighboring
Dynamic Models Separately
Nominal, Nonlinear, Open-Loop Optimal Control
u(t) = u_{opt}(t) \triangleq u^*(t)

Linear, Neighboring-Optimal Control Added
to Nominal Control

\Delta u(t) = -C(t)\Delta x(t) = -C(t)\left[ x(t) - x^*(t) \right]

u(t) = u^*(t) + \Delta u(t)
11
Expand Optimal Control Function
Expand optimized cost function to second degree
J\left[ x^*(t_o) + \Delta x(t_o),\ x^*(t_f) + \Delta x(t_f) \right] \approx
   J^*\left[ x^*(t_o), x^*(t_f) \right] + \Delta J\left[ \Delta x(t_o), \Delta x(t_f) \right] + \Delta^2 J\left[ \Delta x(t_o), \Delta x(t_f) \right]
 = J^*\left[ x^*(t_o), x^*(t_f) \right] + \Delta^2 J\left[ \Delta x(t_o), \Delta x(t_f) \right]

because the First Variation, \Delta J\left[ \Delta x(t_o), \Delta x(t_f) \right] = 0

Nominal optimized cost, plus nonlinear dynamic constraint:

J^*\left[ x^*(t_o), x^*(t_f) \right] = \phi\left[ x^*(t_f) \right] + \int_{t_o}^{t_f} L\left[ x^*(t), u^*(t) \right] dt

subject to the nonlinear dynamic equation

\dot{x}^*(t) = f\left[ x^*(t), u^*(t) \right], \quad x(t_o) = x_o
12
2nd Variation of the Cost Function
Objective: Given the optimal nominal solution, minimize the second-variational cost subject to the linear dynamic constraint

\min_{\Delta u} \Delta^2 J = \frac{1}{2}\Delta x^T(t_f)\phi_{xx}(t_f)\Delta x(t_f)
   + \frac{1}{2}\int_{t_o}^{t_f} \begin{bmatrix} \Delta x^T(t) & \Delta u^T(t) \end{bmatrix}
     \begin{bmatrix} L_{xx}(t) & L_{xu}(t) \\ L_{ux}(t) & L_{uu}(t) \end{bmatrix}
     \begin{bmatrix} \Delta x(t) \\ \Delta u(t) \end{bmatrix} dt

subject to perturbation dynamics

\Delta\dot{x}(t) = F(t)\Delta x(t) + G(t)\Delta u(t), \quad \Delta x(t_o) = \Delta x_o

Cost weighting matrices expressed as

P(t_f) \triangleq \phi_{xx}(t_f) = \frac{\partial^2\phi}{\partial x^2}(t_f)

\begin{bmatrix} Q(t) & M(t) \\ M^T(t) & R(t) \end{bmatrix} \triangleq \begin{bmatrix} L_{xx}(t) & L_{xu}(t) \\ L_{ux}(t) & L_{uu}(t) \end{bmatrix}

dim\left[ P(t_f) \right] = dim\left[ Q(t) \right] = n \times n; \quad dim\left[ R(t) \right] = m \times m; \quad dim\left[ M(t) \right] = n \times m
13
2nd Variational Hamiltonian
Variational cost function

\Delta^2 J = \frac{1}{2}\Delta x^T(t_f)P(t_f)\Delta x(t_f)
   + \frac{1}{2}\int_{t_o}^{t_f} \begin{bmatrix} \Delta x^T(t) & \Delta u^T(t) \end{bmatrix}
     \begin{bmatrix} Q(t) & M(t) \\ M^T(t) & R(t) \end{bmatrix}
     \begin{bmatrix} \Delta x(t) \\ \Delta u(t) \end{bmatrix} dt

 = \frac{1}{2}\Delta x^T(t_f)P(t_f)\Delta x(t_f)
   + \frac{1}{2}\int_{t_o}^{t_f} \left[ \Delta x^T(t)Q(t)\Delta x(t) + 2\Delta x^T(t)M(t)\Delta u(t) + \Delta u^T(t)R(t)\Delta u(t) \right] dt

Variational Lagrangian plus adjoined dynamic constraint:

H\left[ \Delta x(t), \Delta u(t), \Delta\lambda(t) \right] = L\left[ \Delta x(t), \Delta u(t) \right] + \Delta\lambda^T(t)\, f\left[ \Delta x(t), \Delta u(t) \right]
 = \frac{1}{2}\left[ \Delta x^T(t)Q(t)\Delta x(t) + 2\Delta x^T(t)M(t)\Delta u(t) + \Delta u^T(t)R(t)\Delta u(t) \right]
   + \Delta\lambda^T(t)\left[ F(t)\Delta x(t) + G(t)\Delta u(t) \right]
14
2nd Variational Euler-Lagrange
Equations
Terminal condition, solution for adjoint vector, and
optimality condition:

\Delta\lambda(t_f) = \phi_{xx}(t_f)\Delta x(t_f) = P(t_f)\Delta x(t_f)

\Delta\dot{\lambda}(t) = -\left\{ \frac{\partial H\left[ \Delta x(t), \Delta u(t), \Delta\lambda(t) \right]}{\partial\Delta x} \right\}^T = -Q(t)\Delta x(t) - M(t)\Delta u(t) - F^T(t)\Delta\lambda(t)

\left\{ \frac{\partial H\left[ \Delta x(t), \Delta u(t), \Delta\lambda(t) \right]}{\partial\Delta u} \right\}^T = M^T(t)\Delta x(t) + R(t)\Delta u(t) + G^T(t)\Delta\lambda(t) = 0
15
Two-Point Boundary-Value Problem
State Equation

\Delta\dot{x}(t) = F(t)\Delta x(t) + G(t)\Delta u(t), \quad \Delta x(t_o) = \Delta x_o

Adjoint Vector Equation

\Delta\dot{\lambda}(t) = -Q(t)\Delta x(t) - M(t)\Delta u(t) - F^T(t)\Delta\lambda(t), \quad \Delta\lambda(t_f) = P(t_f)\Delta x(t_f)
16
Use Control Law to Solve the Two-Point Boundary-Value Problem
From H_u = 0:

\Delta u(t) = -R^{-1}(t)\left[ M^T(t)\Delta x(t) + G^T(t)\Delta\lambda(t) \right]

Control law that feeds back state and adjoint vectors.

Substitute for control in system and adjoint equations:

\begin{bmatrix} \Delta\dot{x}(t) \\ \Delta\dot{\lambda}(t) \end{bmatrix} =
\begin{bmatrix} F(t) - G(t)R^{-1}(t)M^T(t) & -G(t)R^{-1}(t)G^T(t) \\ -\left[ Q(t) - M(t)R^{-1}(t)M^T(t) \right] & -\left[ F(t) - G(t)R^{-1}(t)M^T(t) \right]^T \end{bmatrix}
\begin{bmatrix} \Delta x(t) \\ \Delta\lambda(t) \end{bmatrix}

Boundary conditions at the end points:

\begin{bmatrix} \Delta x(t_o) \\ \Delta\lambda(t_f) \end{bmatrix} = \begin{bmatrix} \Delta x_o \\ P_f\,\Delta x_f \end{bmatrix}

(\Delta x: perturbation state vector; \Delta\lambda: perturbation adjoint vector)
17
Use Control Law to Solve the Two-Point Boundary-Value Problem
Assume the linear relationship between the adjoint and state vectors applies over the entire interval:

\Delta\lambda(t) = P(t)\,\Delta x(t)

The control law then feeds back the state alone:

\Delta u(t) = -R^{-1}(t)\left[ M^T(t)\Delta x(t) + G^T(t)P(t)\Delta x(t) \right]
 = -R^{-1}(t)\left[ M^T(t) + G^T(t)P(t) \right]\Delta x(t) \triangleq -C(t)\Delta x(t), \quad dim(C) = m \times n

\Delta u(t) \approx \arg\min_{\Delta u(t)\ in\ (t_o, t_f)} \Delta^2 J\left[ \Delta x(\Delta u) \right]
18
Linear-Quadratic (LQ) Optimal
Control Gain Matrix
\Delta u(t) = -C(t)\Delta x(t)

Optimal feedback gain matrix:

C(t) = R^{-1}(t)\left[ G^T(t)P(t) + M^T(t) \right]

•  Properties of the feedback gain matrix
   –  Full state feedback (m x n)
   –  Time-varying matrix
•  R, G, and M are given
   –  Control weighting matrix, R
   –  State-control weighting matrix, M
   –  Control-effect matrix, G
•  P(t) remains to be determined
19
Solution for the Adjoining
Matrix, P(t)
Time derivative of the adjoint vector:

\Delta\dot{\lambda}(t) = \dot{P}(t)\Delta x(t) + P(t)\Delta\dot{x}(t)

Rearrange:

\dot{P}(t)\Delta x(t) = \Delta\dot{\lambda}(t) - P(t)\Delta\dot{x}(t)

Recall the coupled state/adjoint equation:

\begin{bmatrix} \Delta\dot{x}(t) \\ \Delta\dot{\lambda}(t) \end{bmatrix} =
\begin{bmatrix} F(t) - G(t)R^{-1}(t)M^T(t) & -G(t)R^{-1}(t)G^T(t) \\ -\left[ Q(t) - M(t)R^{-1}(t)M^T(t) \right] & -\left[ F(t) - G(t)R^{-1}(t)M^T(t) \right]^T \end{bmatrix}
\begin{bmatrix} \Delta x(t) \\ \Delta\lambda(t) \end{bmatrix}

Substitute in the adjoint matrix equation:

\dot{P}(t)\Delta x(t) = -\left[ Q(t) - M(t)R^{-1}(t)M^T(t) \right]\Delta x(t) - \left[ F(t) - G(t)R^{-1}(t)M^T(t) \right]^T \Delta\lambda(t)
   - P(t)\left\{ \left[ F(t) - G(t)R^{-1}(t)M^T(t) \right]\Delta x(t) - G(t)R^{-1}(t)G^T(t)\Delta\lambda(t) \right\}
20
Solution for the Adjoining
Matrix, P(t)
Substitute for the adjoint vector,

\Delta\lambda(t) = P(t)\,\Delta x(t):

\dot{P}(t)\Delta x(t) = -\left[ Q(t) - M(t)R^{-1}(t)M^T(t) \right]\Delta x(t)
   - \left[ F(t) - G(t)R^{-1}(t)M^T(t) \right]^T P(t)\Delta x(t)
   - P(t)\left\{ \left[ F(t) - G(t)R^{-1}(t)M^T(t) \right]\Delta x(t) - G(t)R^{-1}(t)G^T(t)P(t)\Delta x(t) \right\}

... and eliminate the state vector.
21
Matrix Riccati Equation for P(t)
The result is a nonlinear ordinary differential equation for
the Riccati matrix, P(t), with a terminal boundary condition:

\dot{P}(t) = -\left[ Q(t) - M(t)R^{-1}(t)M^T(t) \right] - \left[ F(t) - G(t)R^{-1}(t)M^T(t) \right]^T P(t)
   - P(t)\left[ F(t) - G(t)R^{-1}(t)M^T(t) \right] + P(t)G(t)R^{-1}(t)G^T(t)P(t)

P(t_f) = \phi_{xx}(t_f)

Time-varying or time-invariant?
22
Characteristics of the Adjoining
(Riccati) Matrix, P(t)
•  P(t_f) is symmetric, n x n, and typically positive semidefinite
•  The matrix Riccati equation is symmetric
•  Therefore, P(t) is symmetric and positive semidefinite throughout
•  Once P(t) has been determined, the optimal feedback control gain matrix, C(t), can be calculated
23
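The backward sweep for P(t) and the resulting gain C(t) can be sketched as follows. This Python fragment is illustrative only: the double-integrator F and G, identity Q, scalar R, M = 0, the Euler step, and the horizon are all assumptions chosen so the converged answer has a known closed form (P_ss = [[√3, 1], [1, √3]] for this system):

```python
def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mat_add(*Ms):
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*Ms)]

def mat_scale(A, s):
    return [[s * x for x in row] for row in A]

def transpose(A):
    return [list(c) for c in zip(*A)]

F = [[0.0, 1.0], [0.0, 0.0]]       # double integrator (assumed example)
G = [[0.0], [1.0]]
Q = [[1.0, 0.0], [0.0, 1.0]]
Rinv = [[1.0]]                      # R = 1 (scalar control), M = 0

P = [[0.0, 0.0], [0.0, 0.0]]        # terminal condition P(t_f) = 0
dt, T = 0.001, 20.0
for _ in range(int(round(T / dt))):
    PG = mat_mul(P, G)
    # Pdot = -Q - F'P - PF + P G R^-1 G' P  (matrix Riccati equation, M = 0)
    Pdot = mat_add(mat_scale(Q, -1.0),
                   mat_scale(mat_mul(transpose(F), P), -1.0),
                   mat_scale(mat_mul(P, F), -1.0),
                   mat_mul(mat_mul(PG, Rinv), transpose(PG)))
    P = mat_add(P, mat_scale(Pdot, -dt))    # Euler step backward in time

C = mat_mul(Rinv, mat_mul(transpose(G), P))  # C = R^-1 G' P  (M = 0)
```

Far from t_f the solution settles to the steady-state (algebraic) Riccati solution, and the gain for this example approaches C = [1, √3].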
Example of Neighboring-Optimal Control: Improved
Infection Treatment via
Feedback
24
Natural Response to Pathogen
Assault (No Therapy)
25
Tradeoff Between Rate of Killing
Pathogen, Preservation of Organ
Health, and Drug Use
J=
1
1
2
2
s11 x1 f + s44 x 4 f +
2
2
(
)
tf
! (q
to
2
2
)
x + q44 x 4 + ru 2 dt
11 1
Optimal
Open-Loop
Control
26
50% Increased Infection: Nominal and
Neighboring-Optimal Control (u1)
u(t) = u^*(t) - C(t)\Delta x(t) = u^*(t) - C(t)\left[ x(t) - x^*(t) \right]
27
Linear-Quadratic Control
of Linear, Time-Invariant
(LTI) Systems
28
Linear Time-Varying (LTV) System
with Linear-Quadratic (LQ) Feedback
Control
Continuous-time linear dynamic system:

\Delta\dot{x}(t) = F(t)\Delta x(t) + G(t)\Delta u(t)

LQ optimal control law:

\Delta u(t) = -R^{-1}(t)\left[ M^T(t) + G^T(t)P(t) \right]\Delta x(t) \triangleq -C(t)\Delta x(t)

Linear dynamic system with LQ feedback control:

\Delta\dot{x}(t) = F(t)\Delta x(t) + G(t)\left[ -C(t)\Delta x(t) \right] = \left[ F(t) - G(t)C(t) \right]\Delta x(t)
29
Linear Time-Invariant (LTI) System
with Linear-Quadratic (LQ) Feedback
Control
LTI dynamic system:

\Delta\dot{x}(t) = F\Delta x(t) + G\Delta u(t)

Time-invariant cost function:

\Delta^2 J = \frac{1}{2}\Delta x^T(t_f)P(t_f)\Delta x(t_f)
   + \frac{1}{2}\int_{t_o}^{t_f} \begin{bmatrix} \Delta x^T(t) & \Delta u^T(t) \end{bmatrix}
     \begin{bmatrix} Q & M \\ M^T & R \end{bmatrix}
     \begin{bmatrix} \Delta x(t) \\ \Delta u(t) \end{bmatrix} dt

Riccati ordinary differential equation:

\dot{P}(t) = -\left[ Q - M R^{-1} M^T \right] - \left[ F - G R^{-1} M^T \right]^T P(t) - P(t)\left[ F - G R^{-1} M^T \right]
   + P(t) G R^{-1} G^T P(t), \quad P(t_f) = \phi_{xx}(t_f)
30
LTI System with Time-Varying LQ Feedback Control

Control gain matrix varies over time:

\Delta u(t) = -R^{-1}\left[ M^T + G^T P(t) \right]\Delta x(t) \triangleq -C(t)\Delta x(t)

Linear dynamic system with time-varying
LQ feedback control:

\Delta\dot{x}(t) = F\Delta x(t) + G\left[ -C(t)\Delta x(t) \right] = \left[ F - G C(t) \right]\Delta x(t) = F_{closed-loop}(t)\Delta x(t)
31
Example: LQ Optimal Control
of a First-Order System
\Delta^2 J = \frac{1}{2} p_f\,\Delta x^2(t_f) + \frac{1}{2}\int_{t_o}^{t_f} \left[ q\,\Delta x^2 + (0)\,\Delta x\,\Delta u + r\,\Delta u^2 \right] dt

\Delta\dot{x} = f\,\Delta x + g\,\Delta u

\dot{p}(t) = -q - 2 f\, p(t) + \frac{g^2 p^2(t)}{r}, \quad p(t_f) = p_f

\Delta u = -r^{-1}\left[ g\, p(t) \right]\Delta x(t) = -\frac{g\, p(t)}{r}\,\Delta x
32
Example: LQ Optimal Control of a
Stable First-Order System
f = -1; \quad g = 1; \quad q = r = 1

\Delta\dot{x} = -\Delta x + \Delta u; \quad x(0) = 1

\dot{p}(t) = -1 + 2 p(t) + p^2(t), \quad p(t_f) = 1

Control gain = p(t):

\Delta u = -p(t)\,\Delta x
\Delta\dot{x} = -\left[ 1 + p(t) \right]\Delta x

[Plots show \Delta x_{open-loop}(t), p(t), \Delta u(t), and \Delta x_{closed-loop}(t).]
33
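The scalar Riccati solution for this example is easy to sketch numerically. A minimal Python fragment (the Euler step size and horizon are illustrative choices) integrates backward from the terminal condition; far from t_f, p(t) settles to the positive root of p² + 2p − 1 = 0, i.e., p_ss = √2 − 1 ≈ 0.414:

```python
# Backward integration of the scalar Riccati equation from the example:
# pdot = -q - 2 f p + g^2 p^2 / r with f = -1, g = q = r = 1 -> pdot = -1 + 2p + p^2
dt, T = 1.0e-4, 10.0
p = 1.0                       # terminal condition p(t_f) = 1
history = [p]
for _ in range(int(round(T / dt))):
    pdot = -1.0 + 2.0 * p + p * p
    p -= dt * pdot            # Euler step backward from t_f toward t_o
    history.append(p)
# p now approximates the steady-state root of p^2 + 2p - 1 = 0
```

The corresponding steady-state feedback gain is simply c = p_ss, so the closed-loop system is Δẋ = −(1 + p_ss)Δx, faster than the open-loop pole at −1.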
Example: LQ Optimal Control of
an Unstable First-Order System
f = 1; \quad g = 1

\Delta\dot{x} = \Delta x + \Delta u; \quad x(0) = 1

\dot{p}(t) = -1 - 2 p(t) + p^2(t), \quad p(t_f) = 1

Control gain = p(t):

\Delta u = -p(t)\,\Delta x
\Delta\dot{x} = \left[ 1 - p(t) \right]\Delta x

[Plots show \Delta x_{open-loop}(t), p(t), \Delta u(t), and \Delta x_{closed-loop}(t).]
34
Example: LQ Optimal Control, Stable First-Order
System, White-Noise Disturbance

\Delta\dot{x} = -\Delta x + \Delta u + \Delta w; \quad x(0) = 0

\Delta u = -p(t)\,\Delta x
\Delta\dot{x} = -\left[ 1 + p(t) \right]\Delta x

\dot{p}(t) = -q + 2 p(t) + p^2(t); \quad q = 1 or 100, \quad p(t_f) = 1

[Plots compare \Delta x_{open-loop}(t), p(t), \Delta u(t), and the open- and closed-loop responses for q = 1 and q = 100.]
35
Discrete-Time and Sampled-Data Linear, Time-Invariant
(LTI) Systems
36
Continuous- and Discrete-Time LTI
System Models

Continuous-time (analog) model is based
on an ordinary differential equation:

\Delta\dot{x}(t) = F\Delta x(t) + G\Delta u(t) + L\Delta w(t), \quad \Delta x(t_o) given    (Dynamic Process)
\Delta y(t) = H_x\Delta x(t) + H_u\Delta u(t) + H_w\Delta w(t)    (Observation Process)

Discrete-time (digital) model is based
on an ordinary difference equation:

\Delta x(t_{k+1}) = \Phi\Delta x(t_k) + \Gamma\Delta u(t_k) + \Lambda\Delta w(t_k), \quad \Delta x(t_o) given    (Dynamic Process)
\Delta y(t_k) = H_x\Delta x(t_k) + H_u\Delta u(t_k) + H_w\Delta w(t_k)    (Observation Process)
37
Digital Control Systems Use
Sampled Data
•  Periodic sequence: \Delta x_k = \Delta x(t_k) = \Delta x(k\Delta t)
•  Sampler is an analog-to-digital (A/D) converter
•  Reconstructor is a digital-to-analog (D/A) converter
38
LTI System Response to
Inputs and Initial Conditions
Solution of an LTI dynamic model,

\Delta\dot{x}(t) = F\Delta x(t) + G\Delta u(t) + L\Delta w(t), \quad \Delta x(t_o) given

\Delta x(t) = \Delta x(t_o) + \int_{t_o}^{t} \left[ F\Delta x(\tau) + G\Delta u(\tau) + L\Delta w(\tau) \right] d\tau

•  ... has two parts
   –  Unforced (homogeneous) response to initial conditions
   –  Forced response to control and disturbance inputs

[Plots show the step-input response and the initial-condition response.]
39
Unforced Response
to Initial Conditions
Neglecting forcing functions,

\Delta x(t) = \Delta x(t_o) + \int_{t_o}^{t} F\Delta x(\tau)\, d\tau = \Phi(t, t_o)\Delta x(t_o)

The state transition matrix propagates the state
from t_o to t by a single multiplication.

The state transition matrix is the matrix exponential:

\Delta x(t) = e^{F(t - t_o)}\Delta x(t_o) = \Phi(t - t_o)\Delta x(t_o)
40
State Transition Matrix is the
Matrix Exponential
e^{F(t - t_o)} \triangleq e^{F\Delta t} = Matrix Exponential
 = I + F\Delta t + \frac{1}{2!}\left[ F\Delta t \right]^2 + \frac{1}{3!}\left[ F\Delta t \right]^3 + ...
 = \Phi(\Delta t) = State Transition Matrix

See pages 79-84 of Optimal Control and Estimation for a
description of how the state transition matrix is calculated for LTV
and LTI systems.
41
Response to Inputs

Solution of the LTI model with piecewise-constant forcing functions:

\Delta x(t_k) = \Delta x(t_{k-1}) + \int_{t_{k-1}}^{t_k} \left[ F\Delta x(\tau) + G\Delta u(\tau) + L\Delta w(\tau) \right] d\tau, \quad \Delta t \triangleq t_k - t_{k-1} = constant

\Delta x(t_k) = e^{F\Delta t}\Delta x(t_{k-1}) + e^{F\Delta t}\left[ \int_{t_{k-1}}^{t_k} e^{-F(\tau - t_{k-1})} d\tau \right]\left[ G\Delta u(t_{k-1}) + L\Delta w(t_{k-1}) \right]
 = \Phi\,\Delta x(t_{k-1}) + \Gamma\,\Delta u(t_{k-1}) + \Lambda\,\Delta w(t_{k-1})

\Phi \triangleq e^{F\Delta t}
\Gamma \triangleq e^{F\Delta t}\left( I - e^{-F\Delta t} \right) F^{-1} G = \left( e^{F\Delta t} - I \right) F^{-1} G
\Lambda \triangleq e^{F\Delta t}\left( I - e^{-F\Delta t} \right) F^{-1} L = \left( e^{F\Delta t} - I \right) F^{-1} L
42
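The Φ and Γ definitions above can be evaluated from their power series without forming F⁻¹. A Python sketch (not from the slides; the truncation length and the diagonal test matrix are assumptions, chosen so the exact answer is known in closed form):

```python
import math

def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mat_add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_scale(A, s):
    return [[s * x for x in row] for row in A]

def eye(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def phi_gamma(F, G, dt, terms=20):
    """Phi = e^{F dt} and Gamma = (e^{F dt} - I) F^-1 G, both computed from
    the power series, so no matrix inverse is required."""
    n = len(F)
    Phi, S, term = eye(n), eye(n), eye(n)
    for k in range(1, terms):
        term = mat_scale(mat_mul(term, F), dt / k)       # (F dt)^k / k!
        Phi = mat_add(Phi, term)
        S = mat_add(S, mat_scale(term, 1.0 / (k + 1)))   # (F dt)^k / (k+1)!
    return Phi, mat_scale(mat_mul(S, G), dt)             # Gamma = S G dt

# Diagonal F makes the exact answer easy to check: Phi = diag(e^-0.1, e^-0.2)
F = [[-1.0, 0.0], [0.0, -2.0]]
G = [[1.0], [1.0]]
Phi, Gam = phi_gamma(F, G, 0.1)
```

For the diagonal case, Γ reduces element-by-element to (e^{fΔt} − 1)/f, which the series reproduces to machine precision.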
Discrete-Time LTI System
Response to Step Input
Propagation of \Delta x(t_k) with constant \Delta t, \Phi, \Gamma, and \Lambda:

\Delta x(t_1) = \Phi\Delta x(t_o) + \Gamma\Delta u(t_o) + \Lambda\Delta w(t_o)
\Delta x(t_2) = \Phi\Delta x(t_1) + \Gamma\Delta u(t_1) + \Lambda\Delta w(t_1)
\Delta x(t_3) = \Phi\Delta x(t_2) + \Gamma\Delta u(t_2) + \Lambda\Delta w(t_2)
...

Step input: the discrete-time solution is exact
at the sampling instants.
43
Series Expressions for DiscreteTime LTI System Matrices
! = I + F (" t ) +
1
1
2
3
#$ F (" t ) %& + #$ F (" t ) %& + ...
2!
3!
1
1
2
(
+
! = eF" t # I F #1G = * I + $% F (" t ) &' + $% F (" t ) &' + ...- G" t
)
,
2!
3!
(
)
1
1
2
(
+
. = eF" t # I F #1L = * I + $% F (" t ) &' + $% F (" t ) &' + ...- L" t
)
,
2!
3!
(
)
As time interval becomes very small,
discrete-time model approaches
continuous-time model
! $"$$
# ( I + F" t )
t #0
% $"$$
# G" t
t #0
# L" t
& $"$$
t #0
44
Sampled-Data Cost Function
Sampled-Data Cost Function: a discrete-time cost function
that accounts for system response between sampling instants.

\min_{\Delta u(t)} \Delta^2 J = \min_{\Delta u(t)} \frac{1}{2}\Delta x^T(t_f)P(t_f)\Delta x(t_f)
   + \frac{1}{2}\int_{t_o}^{t_f} \begin{bmatrix} \Delta x^T(t) & \Delta u^T(t) \end{bmatrix}
     \begin{bmatrix} Q & M \\ M^T & R \end{bmatrix}
     \begin{bmatrix} \Delta x(t) \\ \Delta u(t) \end{bmatrix} dt

Sum integrals over short time intervals, (t_k, t_{k+1}):

\min_{\Delta u(t)} \Delta^2 J = \min_{\Delta u(t)} \frac{1}{2}\Delta x_{k_f}^T P_{k_f}\Delta x_{k_f}
   + \frac{1}{2}\sum_{k=0}^{k_f - 1}\int_{t_k}^{t_{k+1}} \begin{bmatrix} \Delta x^T(t) & \Delta u^T(t) \end{bmatrix}
     \begin{bmatrix} Q & M \\ M^T & R \end{bmatrix}
     \begin{bmatrix} \Delta x(t) \\ \Delta u(t) \end{bmatrix} dt

Minimize subject to the sampled-data dynamic constraint

\Delta x(t_{k+1}) = \Phi(\Delta t)\Delta x(t_k) + \Gamma(\Delta t)\Delta u(t_k)
45
Integrand of Sampled-Data Cost Function
Use the dynamic equation ...

\Delta x(t) = \Phi(t, t_k)\Delta x(t_k) + \Gamma(t, t_k)\Delta u(t_k) \triangleq \Phi(t, t_k)\Delta x_k + \Gamma(t, t_k)\Delta u_k

... to express the integrand in the sampling interval, (t_k, t_{k+1}):

\frac{1}{2}\sum_{k=0}^{k_f - 1}\int_{t_k}^{t_{k+1}} \begin{bmatrix} \Delta x^T(t) & \Delta u^T(t) \end{bmatrix}
   \begin{bmatrix} Q & M \\ M^T & R \end{bmatrix}
   \begin{bmatrix} \Delta x(t) \\ \Delta u(t) \end{bmatrix} dt
46
Integrand of Sampled-Data Cost Function

\frac{1}{2}\sum_{k=0}^{k_f - 1}\int_{t_k}^{t_{k+1}} \begin{bmatrix} \Delta x^T(t) & \Delta u^T(t) \end{bmatrix}
   \begin{bmatrix} Q & M \\ M^T & R \end{bmatrix}
   \begin{bmatrix} \Delta x(t) \\ \Delta u(t) \end{bmatrix} dt

Bring the state and control out of the integral;
assume the control is constant in the sampling interval:

 = \frac{1}{2}\sum_{k=0}^{k_f - 1} \begin{bmatrix} \Delta x_k^T & \Delta u_k^T \end{bmatrix}
   \left\{ \int_{t_k}^{t_{k+1}} \begin{bmatrix} \Phi^T(t, t_k)\, Q\, \Phi(t, t_k) & \Phi^T(t, t_k)\left[ Q\,\Gamma(t, t_k) + M \right] \\ \left[ Q\,\Gamma(t, t_k) + M \right]^T \Phi(t, t_k) & R + \Gamma^T(t, t_k) M + M^T\Gamma(t, t_k) + \Gamma^T(t, t_k)\, Q\,\Gamma(t, t_k) \end{bmatrix} dt \right\}
   \begin{bmatrix} \Delta x_k \\ \Delta u_k \end{bmatrix}

Integration has been replaced by summation:

 = \frac{1}{2}\sum_{k=0}^{k_f - 1} \begin{bmatrix} \Delta x_k^T & \Delta u_k^T \end{bmatrix}
   \begin{bmatrix} \hat{Q} & \hat{M} \\ \hat{M}^T & \hat{R} \end{bmatrix}_k
   \begin{bmatrix} \Delta x_k \\ \Delta u_k \end{bmatrix}
47
Berman, Gran, J. Aircraft, 1974
Sampled-Data Cost Function
Weighting Matrices

Assume Q, M, and R are constant in the integration interval;
\Phi(t, t_k) and \Gamma(t, t_k) vary in the integration interval.

\hat{Q} = \int_{t_k}^{t_{k+1}} \Phi^T(t, t_k)\, Q\, \Phi(t, t_k)\, dt

\hat{M} = \int_{t_k}^{t_{k+1}} \Phi^T(t, t_k)\left[ Q\,\Gamma(t, t_k) + M \right] dt

\hat{R} = \int_{t_k}^{t_{k+1}} \left[ R + \Gamma^T(t, t_k) M + M^T\Gamma(t, t_k) + \Gamma^T(t, t_k)\, Q\,\Gamma(t, t_k) \right] dt

The integrand accounts for continuous-time variations of the
LTI system between sampling instants (inter-sample ripple).
48
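For a scalar LTI example, the weighting integrals above can be approximated by short rectangular integration steps. A Python sketch (the scalar f, g, q, r, m values and the substep count are assumptions for illustration; with m = 0 the sum still produces a nonzero M̂, as a later slide notes):

```python
import math

# Scalar LTI example (assumed for illustration): f = -1, g = 1, q = 1, r = 1, m = 0
f_, g_, q_, r_, m_ = -1.0, 1.0, 1.0, 1.0, 0.0
DT, N = 0.1, 1000                  # sampling interval, rectangular substeps
d = DT / N

Qhat = Mhat = Rhat = 0.0
for i in range(N):
    t = i * d                                   # left endpoint of each substep
    Phi = math.exp(f_ * t)                      # scalar Phi(t, 0)
    Gam = (math.exp(f_ * t) - 1.0) / f_ * g_    # scalar Gamma(t, 0)
    Qhat += Phi * q_ * Phi * d
    Mhat += Phi * (q_ * Gam + m_) * d
    Rhat += (r_ + 2.0 * Gam * m_ + Gam * q_ * Gam) * d
```

For this scalar case the integrals have closed forms, e.g. Q̂ = q(e^{2fΔt} − 1)/(2f), which the rectangular sum approaches as the substep count grows.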
Dynamic Programming
Approach to Sampled-Data
Optimal Control
49
Discrete-Time Hamilton-JacobiBellman Equation
Discrete HJB equation:

V_k^* = \min_{\Delta u_k}\left\{ L_k + V_{k+1}^* \right\} = \min_{\Delta u_k} H_k, \quad V^*\left[ \Delta x_{k_f}^* \right] given

subject to \Delta x_{k+1} = \Phi\Delta x_k + \Gamma\Delta u_k

Value Function at t_o:

V(t_o) = \phi_{k_f} + \sum_{k=0}^{k_f - 1} L_k

•  Begin at the terminal point with the optimal value function
•  Working backward, add the minimum value-function increment (Bellman's Principle of Optimality):

"... optimal policy ... whatever the initial state and initial decision ... remaining decisions must constitute an optimal policy with regard to the current state"
50
Sampled-Data Cost Function Contains
Terminal and Summation Costs
The integral cost has been replaced by a summation cost;
the terminal cost is the same:

\min_{\Delta u_k} J_{sampled} = \min_{\Delta u_k}\left\{ \frac{1}{2}\Delta x_{k_f}^T P_{k_f}\Delta x_{k_f}
   + \frac{1}{2}\sum_{k=0}^{k_f - 1} \begin{bmatrix} \Delta x_k^T & \Delta u_k^T \end{bmatrix}
     \begin{bmatrix} \hat{Q} & \hat{M} \\ \hat{M}^T & \hat{R} \end{bmatrix}_k
     \begin{bmatrix} \Delta x_k \\ \Delta u_k \end{bmatrix} \right\}

subject to

\Delta x_{k+1} = \Phi\Delta x_k + \Gamma\Delta u_k
51
Dynamic Programming Approach to
Sampled-Data LQ Control
Quadratic Value Function at t_o:

V(t_o) = \frac{1}{2}\Delta x_{k_f}^T P_{k_f}\Delta x_{k_f}
   + \frac{1}{2}\sum_{k=0}^{k_f - 1} \begin{bmatrix} \Delta x_k^T & \Delta u_k^T \end{bmatrix}
     \begin{bmatrix} \hat{Q} & \hat{M} \\ \hat{M}^T & \hat{R} \end{bmatrix}
     \begin{bmatrix} \Delta x_k \\ \Delta u_k \end{bmatrix}

Discrete HJB equation:

V_k^* = \min_{\Delta u_k}\left\{ \frac{1}{2}\left[ \Delta x_k^{*T}\hat{Q}\,\Delta x_k^* + 2\Delta x_k^{*T}\hat{M}\,\Delta u_k + \Delta u_k^T\hat{R}\,\Delta u_k \right] + V_{k+1}^* \right\} = \min_{\Delta u_k} H_k

V^*\left[ \Delta x_{k_f}^* \right] = \frac{1}{2}\Delta x_{k_f}^{*T} P_{k_f}\,\Delta x_{k_f}^*

subject to \Delta x_{k+1} = \Phi\Delta x_k + \Gamma\Delta u_k
52
Optimality Condition
Assume the value function takes a quadratic form:

V_k = \frac{1}{2}\Delta x_k^T P_k\Delta x_k; \quad V_{k+1} = \frac{1}{2}\Delta x_{k+1}^T P_{k+1}\Delta x_{k+1}

Optimality condition:

\frac{\partial H_k}{\partial\Delta u_k} = \left[ \Delta x_k^T\hat{M} + \Delta u_k^T\hat{R} \right] + \frac{\partial V_{k+1}}{\partial\Delta u_k} = 0

where

\frac{\partial V_{k+1}}{\partial\Delta u_k} = \frac{\partial}{\partial\Delta u_k}\left[ \frac{1}{2}\Delta x_{k+1}^T P_{k+1}\Delta x_{k+1} \right] = \left[ \Phi\Delta x_k + \Gamma\Delta u_k \right]^T P_{k+1}\Gamma

hence

\left[ \Delta x_k^T\hat{M} + \Delta u_k^T\hat{R} \right] + \left[ \Delta x_k^T\Phi^T + \Delta u_k^T\Gamma^T \right] P_{k+1}\Gamma = 0
53
Minimizing Value of Control
\frac{\partial H_k}{\partial\Delta u_k} = \Delta x_k^T\left[ \hat{M} + \Phi^T P_{k+1}\Gamma \right] + \Delta u_k^T\left[ \hat{R} + \Gamma^T P_{k+1}\Gamma \right] = 0

\Delta u_k = -\left[ \hat{R} + \Gamma^T P_{k+1}\Gamma \right]^{-1}\left[ \hat{M}^T + \Gamma^T P_{k+1}\Phi \right]\Delta x_k \triangleq -C_k\Delta x_k

Must find P_k in (0, k_f).

Use the definitions of V^* and \Delta u in the HJB equation.
54
Solution for Pk
V_k = \frac{1}{2}\Delta x_k^T P_k\Delta x_k; \quad V_{k+1} = \frac{1}{2}\Delta x_{k+1}^T P_{k+1}\Delta x_{k+1}

\Delta u_k = -\left[ \hat{R} + \Gamma^T P_{k+1}\Gamma \right]^{-1}\left[ \hat{M}^T + \Gamma^T P_{k+1}\Phi \right]\Delta x_k \triangleq -C_k\Delta x_k

Substitute for V_k, V_{k+1}, and \Delta u_k in the discrete-time HJB equation:

\frac{1}{2}\Delta x_k^T P_k\Delta x_k =
 \min_{\Delta u_k}\left\{ \frac{1}{2}\left[ \Delta x_k^T\hat{Q}\,\Delta x_k + 2\Delta x_k^T\hat{M}\left( -C_k\Delta x_k \right) + \left( -C_k\Delta x_k \right)^T\hat{R}\left( -C_k\Delta x_k \right) + \Delta x_{k+1}^T P_{k+1}\Delta x_{k+1} \right] \right\}

Rearrange and cancel \Delta x_k on both sides of the equation to yield the
discrete-time Riccati equation:

P_k = \hat{Q} + \Phi^T P_{k+1}\Phi - \left[ \hat{M} + \Phi^T P_{k+1}\Gamma \right]\left[ \hat{R} + \Gamma^T P_{k+1}\Gamma \right]^{-1}\left[ \hat{M}^T + \Gamma^T P_{k+1}\Phi \right], \quad P_{k_f} given
55
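The backward recursion for P_k and the gain C_k can be sketched directly. In this Python fragment (illustrative only; the discretized double-integrator Φ, Γ and the Q̂, R̂, M̂ values are assumptions), the recursion is iterated until it settles, and the closed-loop transition matrix Φ − ΓC is formed:

```python
def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mat_add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_sub(A, B):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def transpose(A):
    return [list(c) for c in zip(*A)]

Phi = [[1.0, 0.1], [0.0, 1.0]]    # discretized double integrator (assumed example)
Gam = [[0.005], [0.1]]
Qh = [[0.1, 0.0], [0.0, 0.1]]     # Q-hat
Rh = [[0.1]]                       # R-hat (scalar control)
Mh = [[0.0], [0.0]]                # M-hat

P = [[0.0, 0.0], [0.0, 0.0]]       # terminal P_{k_f}
for _ in range(2000):              # backward recursion: P_k from P_{k+1}
    S = Rh[0][0] + mat_mul(mat_mul(transpose(Gam), P), Gam)[0][0]   # R-hat + G'PG
    K = mat_add(transpose(Mh), mat_mul(mat_mul(transpose(Gam), P), Phi))  # 1x2
    C = [[k / S for k in K[0]]]                                     # gain C_k
    L = mat_add(Mh, mat_mul(mat_mul(transpose(Phi), P), Gam))       # 2x1
    P = mat_sub(mat_add(Qh, mat_mul(mat_mul(transpose(Phi), P), Phi)),
                mat_mul(L, C))

A_cl = mat_sub(Phi, mat_mul(Gam, C))   # closed-loop Phi - Gam C
```

With a controllable pair and Q̂ > 0, P stays symmetric and the steady-state gain places both closed-loop roots inside the unit circle (checked below via the Jury conditions on trace and determinant).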
Discrete-Time System with LinearQuadratic Feedback Control
Dynamic system:

\Delta x_{k+1} = \Phi\Delta x_k + \Gamma\Delta u_k

Control law:

\Delta u_k = -\left[ \hat{R} + \Gamma^T P_{k+1}\Gamma \right]^{-1}\left[ \hat{M}^T + \Gamma^T P_{k+1}\Phi \right]\Delta x_k \triangleq -C_k\Delta x_k

Dynamic system with LQ feedback control:

\Delta x_{k+1} = \Phi\Delta x_k + \Gamma\left( -C_k\Delta x_k \right) = \left( \Phi - \Gamma C_k \right)\Delta x_k
56
Next Time:
Dynamic System Stability

Reading
OCE: Section 2.5
57
Supplemental
Material
58
Neighboring Trajectories
•  Nominal (or reference) trajectory and control history

   {x_N(t), u_N(t), w_N(t)} for t in [t_o, t_f]

   x: dynamic state; u: control input; w: disturbance input

•  Trajectory perturbed by
   –  Small initial-condition variation
   –  Small control variation
   –  Small disturbance variation

   {x(t), u(t), w(t)} = {x_N(t) + \Delta x(t),\ u_N(t) + \Delta u(t),\ w_N(t) + \Delta w(t)} for t in [t_o, t_f]
59
Example: Separate
Solutions for Nominal and
Perturbation Trajectories
Original nonlinear equation describes nominal dynamics:

\dot{x}_N = \begin{bmatrix} \dot{x}_{1_N} \\ \dot{x}_{2_N} \\ \dot{x}_{3_N} \end{bmatrix}
 = \begin{bmatrix} x_{2_N} + d\, w_{1_N} \\ a_2\left( x_{3_N} - x_{2_N} \right) + a_1\left( x_{3_N} - x_{1_N} \right)^2 + b_1 u_{1_N} + b_2 u_{2_N} \\ c_2 x_{3_N}^3 + c_1\left( x_{1_N} + x_{2_N} \right) + b_3 x_{1_N} u_{1_N} \end{bmatrix}, \quad \begin{bmatrix} x_{1_N} \\ x_{2_N} \\ x_{3_N} \end{bmatrix} given

Linear, time-varying equation describes perturbation dynamics:

\begin{bmatrix} \Delta\dot{x}_1 \\ \Delta\dot{x}_2 \\ \Delta\dot{x}_3 \end{bmatrix}
 = \begin{bmatrix} 0 & 1 & 0 \\ -2a_1\left( x_{3_N} - x_{1_N} \right) & -a_2 & a_2 + 2a_1\left( x_{3_N} - x_{1_N} \right) \\ c_1 + b_3 u_{1_N} & c_1 & 3 c_2 x_{3_N}^2 \end{bmatrix}
   \begin{bmatrix} \Delta x_1 \\ \Delta x_2 \\ \Delta x_3 \end{bmatrix}
 + \begin{bmatrix} 0 & 0 \\ b_1 & b_2 \\ b_3 x_{1_N} & 0 \end{bmatrix}\begin{bmatrix} \Delta u_1 \\ \Delta u_2 \end{bmatrix}
 + \begin{bmatrix} d \\ 0 \\ 0 \end{bmatrix}\Delta w_1; \quad \begin{bmatrix} \Delta x_1(t_o) \\ \Delta x_2(t_o) \\ \Delta x_3(t_o) \end{bmatrix} given
60
Example: LQ Optimal Control, Stable First-Order
System, White-Noise Disturbance

\Delta\dot{x} = -\Delta x + \Delta u + \Delta w; \quad x(0) = 0

\Delta u = -p(t)\,\Delta x
\Delta\dot{x} = -\left[ 1 + p(t) \right]\Delta x

\dot{p}(t) = -1 + 2 p(t) + p^2(t), \quad p(t_f) = 1 or 1000

[Plots compare \Delta x_{open-loop}(t), p(t), \Delta u(t), and the open- and closed-loop responses for p_f = 1 and p_f = 1000.]
61
Example: LQ Optimal Control, Stable First-Order
System, White-Noise Disturbance

\Delta\dot{x} = -\Delta x + \Delta u + \Delta w; \quad x(0) = 0

\Delta u = -p(t)\,\Delta x
\Delta\dot{x} = -\left[ 1 + p(t) \right]\Delta x

\dot{p}(t) = -100 + 2 p(t) + p^2(t), \quad p(t_f) = 1 or 0

[Plots compare \Delta x_{open-loop}(t), p(t), \Delta u(t), and \Delta x_{closed-loop}(t) for p_f = 1 and p_f = 0.]
62
Multivariable LQ Optimal Control
with Cross Weighting, M = 0

No state/control coupling in the cost function:

\Delta\dot{x}(t) = F\Delta x(t) + G\Delta u(t)

\Delta^2 J = \frac{1}{2}\Delta x^T(t_f)P(t_f)\Delta x(t_f)
   + \frac{1}{2}\int_{t_o}^{t_f} \begin{bmatrix} \Delta x^T(t) & \Delta u^T(t) \end{bmatrix}
     \begin{bmatrix} Q & 0 \\ 0 & R \end{bmatrix}
     \begin{bmatrix} \Delta x(t) \\ \Delta u(t) \end{bmatrix} dt

\dot{P}(t) = -Q - F^T P(t) - P(t) F + P(t) G R^{-1} G^T P(t), \quad P(t_f) = P_f

\Delta u(t) = -R^{-1} G^T P(t)\,\Delta x(t)
63
Nonlinear System with Neighboring-Optimal Feedback Control

Nonlinear dynamic system:

\dot{x}(t) = f\left[ x(t), u(t) \right]

Neighboring-optimal control law:

u(t) = u^*(t) - C(t)\Delta x(t) = u^*(t) - C(t)\left[ x(t) - x^*(t) \right]

Nonlinear dynamic system with neighboring-optimal feedback control:

\dot{x}(t) = f\left\{ x(t),\ u^*(t) - C(t)\left[ x(t) - x^*(t) \right] \right\}
64
Development of
Neighboring-Optimal Therapy

•  Expand the dynamic equation to first degree:

   x(t) = x^*(t) + \Delta x(t)
   u(t) = u^*(t) + \Delta u(t)

•  Compute the nominal optimal control history, u(t) = u_{opt}(t),
   using the original nonlinear dynamic model
•  Compute the optimal perturbation control
   using the locally linearized dynamic model:

   \Delta u(t) = -C(t)\left[ x(t) - x_{opt}(t) \right]

•  Sum the two for neighboring-optimal
   control of the dynamic system:

   u(t) = u_{opt}(t) + \Delta u(t)
65
First-Order LQ Example Code
%   First-Order LQ Example
%   Rob Stengel
%   2/23/2011
clear
global tp p q tw w

xo = 0; to = 0; tf = 10;
tspanx = [to tf];

%   White-noise disturbance sequence
tw = [0:0.01:10];
for k = 1:length(tw)
    w(k) = randn(1);
end

%   Open-loop response
[tx,x] = ode15s('First',tspanx,xo);

%   Riccati solution, integrated backward from p(tf) = pf
pf = 0;
q = 100;
tspanp = [tf to];
[tp,p] = ode15s('FirstRiccati',tspanp,pf);

%   Closed-loop response and control history
[tc,xc] = ode15s('FirstCL',tspanx,xo);
u = -interp1(tp,p,tc).*xc;   % u = -p*x, per the control law

figure
subplot(4,1,1)
plot(tx,x),grid,xlabel('Time'),ylabel('x-open-loop')
subplot(4,1,2)
plot(tp,p,'r'),grid,xlabel('Time'),ylabel('p')
subplot(4,1,3)
plot(tc,u),grid,xlabel('Time'),ylabel('u')
subplot(4,1,4)
plot(tc,xc,'r',tx,x,'b'),grid,xlabel('Time'),ylabel('x-closed-loop')

%   File: First.m -- open-loop dynamics, xdot = -x + w
function xdot = First(t,x)
global tw w
wt = interp1(tw,w,t,'nearest');
xdot = -x + wt;

%   File: FirstRiccati.m -- scalar Riccati equation, pdot = -q + 2p + p^2
function pdot = FirstRiccati(t,p)
global q
pdot = -q + 2*p + p^2;

%   File: FirstCL.m -- closed-loop dynamics, xdot = -(1 + p)x + w
function xdot = FirstCL(tc,xc)
global tp p tw w
wt = interp1(tw,w,tc,'nearest');
pt = interp1(tp,p,tc);
xdot = -(1 + pt)*xc + wt;
66
Initial-Condition Response
via State Transition

Propagation of \Delta x(t_k) in an LTI system:

\Delta x(t_1) = \Phi(t_1 - t_o)\Delta x(t_o)
\Delta x(t_2) = \Phi(t_2 - t_1)\Delta x(t_1)
\Delta x(t_3) = \Phi(t_3 - t_2)\Delta x(t_2)
...

The state transition matrix is constant if (t_k - t_{k-1}) = \Delta t = constant:

\Delta x(t_1) = \Phi(\Delta t)\Delta x(t_o) = \Phi\Delta x(t_o)
\Delta x(t_2) = \Phi\Delta x(t_1) = \Phi^2\Delta x(t_o)
\Delta x(t_3) = \Phi\Delta x(t_2) = \Phi^3\Delta x(t_o)
...

\Phi = I + F\Delta t + \frac{1}{2!}\left[ F\Delta t \right]^2 + \frac{1}{3!}\left[ F\Delta t \right]^3 + ...
67
Example: Equivalent Continuous-Time
and Discrete-Time System Matrices

Continuous-time (analog) system:

F = \begin{bmatrix} -1.2794 & -7.9856 \\ 1 & -1.2709 \end{bmatrix}, \quad
G = \begin{bmatrix} -9.069 \\ 0 \end{bmatrix}, \quad
L = \begin{bmatrix} -7.9856 \\ -1.2709 \end{bmatrix}

Discrete-time (digital) system:

\Delta t = 0.1 s:
\Phi = \begin{bmatrix} 0.845 & -0.6936 \\ 0.0869 & 0.8457 \end{bmatrix}, \quad
\Gamma = \begin{bmatrix} -0.8404 \\ -0.0414 \end{bmatrix}, \quad
\Lambda = \begin{bmatrix} -0.6936 \\ -0.1543 \end{bmatrix}

\Delta t = 0.5 s:
\Phi = \begin{bmatrix} 0.0823 & -1.4751 \\ 0.1847 & 0.0839 \end{bmatrix}, \quad
\Gamma = \begin{bmatrix} -2.4923 \\ -0.6429 \end{bmatrix}, \quad
\Lambda = \begin{bmatrix} -1.4751 \\ -0.9161 \end{bmatrix}

The time interval has a large effect on the discrete-time matrices.
68
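The Δt = 0.1 s matrices on this slide can be reproduced from the series expressions for Φ, Γ = (e^{FΔt} − I)F⁻¹G, and Λ. A Python sketch (the series truncation length is an implementation choice; the helper functions are plain list-based matrix operations):

```python
def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mat_add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_scale(A, s):
    return [[s * x for x in row] for row in A]

F = [[-1.2794, -7.9856], [1.0, -1.2709]]
G = [[-9.069], [0.0]]
L = [[-7.9856], [-1.2709]]
dt = 0.1

Phi = [[1.0, 0.0], [0.0, 1.0]]
S = [[1.0, 0.0], [0.0, 1.0]]       # S = I + F dt/2! + (F dt)^2/3! + ...
term = [[1.0, 0.0], [0.0, 1.0]]
for k in range(1, 25):
    term = mat_scale(mat_mul(term, F), dt / k)      # (F dt)^k / k!
    Phi = mat_add(Phi, term)
    S = mat_add(S, mat_scale(term, 1.0 / (k + 1)))

Gam = mat_scale(mat_mul(S, G), dt)   # Gamma = (e^{F dt} - I) F^-1 G
Lam = mat_scale(mat_mul(S, L), dt)   # Lambda = (e^{F dt} - I) F^-1 L
```

The results should agree with the tabulated Δt = 0.1 s matrices to the precision shown on the slide.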
Evaluating Sampled-Data
Weighting Matrices

\Phi(t, t_k) and \Gamma(t, t_k) vary in the integration interval.
Break the interval into smaller intervals, and approximate the integral as a
sum of short rectangular integration steps:

\hat{Q} = \int_0^{\Delta t} \Phi^T(t, 0)\, Q\, \Phi(t, 0)\, dt
 \approx \sum_{k=1}^{100} \left[ \Phi^T(t_{k-1}, 0)\, Q\, \Phi(t_{k-1}, 0)\,\delta t \right], \quad \delta t = \Delta t/100, \quad t_k = k\,\delta t
 = \sum_{k=1}^{100} \left[ e^{F^T t_{k-1}}\, Q\, e^{F t_{k-1}}\,\delta t \right]
69
Evaluating Sampled-Data
Weighting Matrices

\Gamma = \left( e^{F\Delta t} - I \right) F^{-1} G = \left[ I + \frac{1}{2!} F\Delta t + \frac{1}{3!}\left( F\Delta t \right)^2 + ... \right] G\,\Delta t

\hat{Q}, \hat{M}, and \hat{R} are evaluated just once for an LTI system:

\hat{M} = \int_0^{\Delta t} \Phi^T(t, 0)\left[ Q\,\Gamma(t, 0) + M \right] dt
 \approx \sum_{k=1}^{100} e^{F^T t_{k-1}} \left\{ Q\left[ I + \frac{1}{2!} F t_{k-1} + \frac{1}{3!}\left( F t_{k-1} \right)^2 + ... \right] G\, t_{k-1} + M \right\}\delta t

\hat{R} \approx \sum_{k=1}^{100} \left[ R + \Gamma^T(t_{k-1}) M + M^T \Gamma(t_{k-1}) + \Gamma^T(t_{k-1})\, Q\, \Gamma(t_{k-1}) \right]\delta t
70
Sampled-Data Cost Function
Weighting Always Includes
State-Control Weighting

\hat{M} = \int_{t_k}^{t_{k+1}} \Phi^T(t, t_k)\left[ Q\,\Gamma(t, t_k) + M \right] dt
 = \int_{t_k}^{t_{k+1}} \Phi^T(t, t_k)\, Q\,\Gamma(t, t_k)\, dt \quad even\ if\ M = 0

Sampled-Data Lagrangian:

L_k = \frac{1}{2}\left[ \Delta x_k^T\hat{Q}\,\Delta x_k + 2\Delta x_k^T\hat{M}\,\Delta u_k + \Delta u_k^T\hat{R}\,\Delta u_k \right]
71
Continuous-Time LQ
Optimization via Dynamic
Programming
72
Dynamic Programming
Approach to Continuous-Time LQ Control

Value Function at t_o:

V(t_o) = \frac{1}{2}\Delta x^T(t_f)P(t_f)\Delta x(t_f)
   + \frac{1}{2}\int_{t_o}^{t_f} \begin{bmatrix} \Delta x^T(t) & \Delta u^T(t) \end{bmatrix}
     \begin{bmatrix} Q(t) & M(t) \\ M^T(t) & R(t) \end{bmatrix}
     \begin{bmatrix} \Delta x(t) \\ \Delta u(t) \end{bmatrix} dt

Value Function at t_1:

V(t_1) = \frac{1}{2}\Delta x^T(t_f)P(t_f)\Delta x(t_f)
   + \frac{1}{2}\int_{t_1}^{t_f} \begin{bmatrix} \Delta x^T(t) & \Delta u^T(t) \end{bmatrix}
     \begin{bmatrix} Q(t) & M(t) \\ M^T(t) & R(t) \end{bmatrix}
     \begin{bmatrix} \Delta x(t) \\ \Delta u(t) \end{bmatrix} dt
73
Dynamic Programming
Approach to Continuous-Time LQ Control

Time Derivative of the Value Function:

\frac{\partial V^*\left[ \Delta x^*(t_1) \right]}{\partial t} =
 -\min_{\Delta u}\left\{ \frac{1}{2}\left[ \Delta x^{*T}(t_1)Q(t_1)\Delta x^*(t_1) + 2\Delta x^{*T}(t_1)M(t_1)\Delta u(t_1) + \Delta u^T(t_1)R(t_1)\Delta u(t_1) \right] \right.
 \left. + \frac{\partial V^*\left[ \Delta x^*(t_1) \right]}{\partial\Delta x}\left[ F(t_1)\Delta x^*(t_1) + G(t_1)\Delta u(t_1) \right] \right\}
74
Dynamic Programming
Approach to Continuous-Time LQ Control

Hamiltonian:

H\left( \Delta x^*, \Delta u, \frac{\partial V^*}{\partial\Delta x} \right) \triangleq
 \frac{1}{2}\left[ \Delta x^{*T} Q\,\Delta x^* + 2\Delta x^{*T} M\,\Delta u + \Delta u^T R\,\Delta u \right]
 + \frac{\partial V^*\left[ \Delta x^* \right]}{\partial\Delta x}\left[ F\,\Delta x^* + G\,\Delta u \right]

HJB Equation:

\frac{\partial V^*\left[ \Delta x^* \right]}{\partial t} = -\min_{\Delta u} H\left( \Delta x^*, \Delta u, \frac{\partial V^*}{\partial\Delta x} \right), \quad
V^*\left[ \Delta x(t_f) \right] = \frac{1}{2}\Delta x^{*T}(t_f)P(t_f)\Delta x^*(t_f)
75
Plausible Form for
the Value Function

Quadratic Function of State Perturbation:

V^*\left[ \Delta x^*(t) \right] = \frac{1}{2}\Delta x^{*T}(t)P(t)\Delta x^*(t)

Time Derivative of the Value Function:

\frac{\partial V^*}{\partial t} = \frac{1}{2}\Delta x^{*T}(t)\dot{P}(t)\Delta x^*(t)

Gradient of the Value Function with respect to the state:

\frac{\partial V^*}{\partial\Delta x} = \Delta x^{*T}(t)P(t)
76
Optimal Control Law
and HJB Equation

Optimal control law:

\frac{\partial H}{\partial\Delta u} = \Delta x^T M + \Delta u^T R + \Delta x^T P\, G = 0

\Delta u(t) = -R^{-1}\left( G^T P + M^T \right)\Delta x(t)

Incorporate the value-function model in the HJB equation:

\Delta x^T\dot{P}\,\Delta x = \Delta x^T\left\{ -\left[ Q - M R^{-1} M^T \right] - \left[ F - G R^{-1} M^T \right]^T P - P\left[ F - G R^{-1} M^T \right] + P\, G R^{-1} G^T P \right\}\Delta x

\Delta x(t) can be cancelled on left and right.
77
Matrix Riccati Equation
\dot{P}(t) = -\left[ Q(t) - M(t)R^{-1}(t)M^T(t) \right]
   - \left[ F(t) - G(t)R^{-1}(t)M^T(t) \right]^T P(t)
   - P(t)\left[ F(t) - G(t)R^{-1}(t)M^T(t) \right]
   + P(t)G(t)R^{-1}(t)G^T(t)P(t)

P(t_f) = \phi_{xx}(t_f)
78
Example: 1st-Order System with
LQ Feedback Control

1st-order discrete-time dynamic system:

\Delta x_{k+1} = \phi\,\Delta x_k + \gamma\,\Delta u_k

LQ optimal control law:

\Delta u_k = -\frac{m + \phi\gamma\, p_{k+1}}{r + \gamma^2 p_{k+1}}\,\Delta x_k \triangleq -c_k\,\Delta x_k

p_k = q + \phi^2 p_{k+1} - \frac{\left( m + \phi\gamma\, p_{k+1} \right)^2}{r + \gamma^2 p_{k+1}}, \quad p_{k_f} given

Dynamic system with LQ feedback control:

\Delta x_{k+1} = \phi\,\Delta x_k + \gamma\,\Delta u_k = \phi\,\Delta x_k + \gamma\left( -c_k\,\Delta x_k \right) = \left( \phi - \gamma\, c_k \right)\Delta x_k
79
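The scalar recursion above can be run in a few lines. A Python sketch (the φ, γ, q, r, m values are illustrative assumptions; φ = 1.1 makes the open-loop system unstable so the stabilizing effect of the gain is visible):

```python
# Scalar discrete-time Riccati recursion and LQ gain (illustrative numbers)
phi, gam = 1.1, 0.1      # open-loop unstable: |phi| > 1
q, r, m = 1.0, 1.0, 0.0
p = 0.0                  # terminal condition p_{k_f}
for _ in range(500):     # backward recursion: p_k from p_{k+1}
    num = m + phi * gam * p
    p = q + phi ** 2 * p - num ** 2 / (r + gam ** 2 * p)
c = (m + phi * gam * p) / (r + gam ** 2 * p)   # steady-state gain c_k
a_cl = phi - gam * c                            # closed-loop root, phi - gam c
```

After enough iterations p settles to the fixed point of the recursion (the discrete algebraic Riccati solution), and the closed-loop root φ − γc lies inside the unit circle even though φ itself does not.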