Chapter 3
Power and Energy Basics
Jan M. Rabaey
Slide 3.1
Chapter Outline
Metrics
Dynamic power
Static power
Energy–delay trade-offs
Slide 3.2
Metrics
Delay (s):
– Performance metric
Energy (Joule)
– Efficiency metric: effort to perform a task
Power (Watt)
– Energy consumed per unit time
Power*Delay (Joule)
– Mostly a technology parameter – measures the efficiency of
performing an operation in a given technology
Energy*Delay = Power*Delay^2 (Joule·s)
– Combined performance and energy metric – figure of merit of
design style
Other Metrics: Energy*Delay^n (Joule·s^n)
– Increased weight on performance over energy
Slide 3.3
Where Is Power Dissipated in CMOS?
Active (Dynamic) power
– (Dis)charging capacitors
– Short-circuit power
Both pull-up and pull-down on during transition
Static (leakage) power
– Transistors are imperfect switches
Static currents
– Biasing currents
Slide 3.4
Active (or Dynamic) Power
Key property of active power:
Pdyn ∝ f
where f is the switching frequency
Sources:
Charging and discharging capacitors
Temporary glitches (dynamic hazards)
Short-circuit currents
Slide 3.5
Charging Capacitors
Applying a voltage step
[Figure: voltage step of amplitude V charging capacitor C through resistor R]
$$E_{0\to1} = CV^2$$
$$E_R = \tfrac{1}{2}CV^2$$
$$E_C = \tfrac{1}{2}CV^2$$
Value of R does not impact energy!
Slide 3.6
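A short derivation may help show why the resistance drops out. It is a sketch that assumes an ideal step of amplitude $V$ applied through $R$ to an initially discharged $C$, so that the charging current is $i(t) = (V/R)\,e^{-t/RC}$ (this waveform is an assumption added here, not stated on the slide):

```latex
\begin{aligned}
E_{0\to1} &= \int_0^\infty V\, i(t)\, dt
           = \frac{V^2}{R}\int_0^\infty e^{-t/RC}\, dt = CV^2 \\
E_R       &= \int_0^\infty R\, i(t)^2\, dt
           = \frac{V^2}{R}\int_0^\infty e^{-2t/RC}\, dt = \tfrac{1}{2}CV^2 \\
E_C       &= E_{0\to1} - E_R = \tfrac{1}{2}CV^2
\end{aligned}
```

The factor $R$ in the time constant cancels the $1/R$ in the current amplitude, which is why only $C$ and $V$ remain.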
Applied to Complementary CMOS Gate
[Figure: complementary CMOS gate with inputs A1 ... AN, a PMOS network connected to VDD and an NMOS network; load current iL charges CL at Vout]
$$E_{0\to1} = C_L V_{DD}^2$$
$$E_R = \tfrac{1}{2} C_L V_{DD}^2$$
$$E_C = \tfrac{1}{2} C_L V_{DD}^2$$
One half of the energy drawn from the supply is dissipated in the
pull-up network and one half is stored on CL
Charge from CL is dumped during the 1→0 transition
Independent of resistance of charging/discharging network
Slide 3.7
Circuits with Reduced Swing
$$E_{0\to1} = \int_0^\infty V\,C\,\frac{dV_C}{dt}\,dt = CV\int_0^{V-V_{TH}} dV_C = CV\,(V - V_{TH})$$
Energy consumed is proportional to output swing
Slide 3.8
Charging Capacitors – Revisited
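Driving from a constant current source
[Figure: constant current source I charging capacitor C through resistor R]
$$E_{0\to1} = E_C + E_R$$
$$E_C = \tfrac{1}{2}CV^2$$
$$T = \frac{CV}{I}$$
$$E_R = \int_0^T I\,(RI)\,dt = RI^2T = \left(\frac{RC}{T}\right)CV^2$$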
Energy dissipated in resistor can be reduced
by increasing charging time T (i.e., decreasing I)
Slide 3.9
Charging Capacitors
Using constant voltage or current driver?
$E_{constant\_current} < E_{constant\_voltage}$ if $T > 2RC$
Energy dissipated using constant-current charging
can be made arbitrarily small at the expense of delay:
Adiabatic charging
Note: $t_p(RC) = 0.69\,RC$; $t_{0\to90\%}(RC) = 2.3\,RC$
Slide 3.10
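The crossover at T = 2RC can be checked numerically. The following is a minimal sketch that uses the two resistor-energy expressions from the slides; the function names and the component values (1 kΩ, 1 pF, 1 V) are hypothetical and chosen only for illustration.

```python
def resistor_energy_step(c, v):
    """Energy dissipated in R when C is charged by an ideal voltage step."""
    return 0.5 * c * v**2


def resistor_energy_constant_current(r, c, v, t_charge):
    """Energy dissipated in R when C is charged to v in t_charge by a constant current."""
    return (r * c / t_charge) * c * v**2


# Hypothetical component values, for illustration only.
R, C, V = 1e3, 1e-12, 1.0
rc = R * C

for multiple in (1, 2, 4, 10):
    e_cc = resistor_energy_constant_current(R, C, V, multiple * rc)
    e_step = resistor_energy_step(C, V)
    print(f"T = {multiple:>2} RC: E_R(current) / E_R(step) = {e_cc / e_step:.2f}")
```

The ratio equals 2RC/T, so it is above 1 for T < 2RC and falls toward zero as the charging time grows, which is the adiabatic-charging trade-off against delay.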
Charging Capacitors
Driving using a sine wave (e.g., from resonant circuit)
[Figure: sinusoidal source v(t) charging capacitor C through resistor R]
$$E_C = \tfrac{1}{2}CV^2$$
Energy dissipated in resistor can be made arbitrarily small
if frequency ω << 1/RC
(output signal in phase with input sinusoid)
Slide 3.11
Dynamic Power Consumption
Power = Energy per transition × Transition rate
$$P = C_L V_{DD}^2\, f_{0\to1} = C_L V_{DD}^2\, f\, p_{0\to1} = C_{switched} V_{DD}^2\, f$$
Power dissipation is data dependent – depends
on the switching probability, p0→1
Switched capacitance Cswitched = p0→1CL= αCL
(α is called the switching activity factor)
Slide 3.12
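As a small worked example of the dynamic power expression, the sketch below evaluates P = p0→1 · CL · VDD² · f. The function name and the numbers (10 fF load, 1 V supply, 1 GHz clock, p0→1 = 0.25) are hypothetical and not taken from the slides.

```python
def dynamic_power(c_load, v_dd, f_clk, p_0_to_1):
    """Dynamic power: switched capacitance times VDD^2 times clock frequency."""
    c_switched = p_0_to_1 * c_load  # alpha * C_L
    return c_switched * v_dd**2 * f_clk


# Hypothetical values, for illustration only.
p = dynamic_power(c_load=10e-15, v_dd=1.0, f_clk=1e9, p_0_to_1=0.25)
print(f"P_dyn = {p * 1e6:.2f} uW")  # 2.50 uW
```

Halving the supply voltage in this model cuts the result by a factor of four, while halving the activity only halves it.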
Impact of Logic Function
Example: Static two-input NOR gate
A  B  Out
0  0  1
0  1  0
1  0  0
1  1  0
Assume signal probabilities
$p_{A=1} = 1/2$
$p_{B=1} = 1/2$
Then transition probability
$p_{0\to1} = p_{out=0} \times p_{out=1} = 3/4 \times 1/4 = 3/16$
If inputs switch every cycle
$\alpha_{NOR} = 3/16$
NAND gate yields similar result
Slide 3.13
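The 3/16 figure can be checked by brute force over the truth table. This sketch assumes independent, equiprobable inputs, as on the slide; the helper names (`static_activity`, `nor`, `nand`) are illustrative and not from the original.

```python
from fractions import Fraction
from itertools import product


def static_activity(gate, n_inputs):
    """p(0->1) = p(out=0) * p(out=1) for independent inputs with p(1) = 1/2."""
    outputs = [gate(bits) for bits in product((0, 1), repeat=n_inputs)]
    p1 = Fraction(sum(outputs), len(outputs))
    return (1 - p1) * p1


def nor(bits):
    return int(not any(bits))


def nand(bits):
    return int(not all(bits))


print(static_activity(nor, 2))   # 3/16
print(static_activity(nand, 2))  # 3/16
```

Using `Fraction` keeps the results exact, so they can be compared directly with the values on the slide.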
Impact of Logic Function
Example: Static two-input XOR gate
A  B  Out
0  0  0
0  1  1
1  0  1
1  1  0
Assume signal probabilities
$p_{A=1} = 1/2$
$p_{B=1} = 1/2$
Then transition probability
$p_{0\to1} = p_{out=0} \times p_{out=1} = 1/2 \times 1/2 = 1/4$
If inputs switch every cycle
$p_{0\to1} = 1/4$
Slide 3.14
Transition Probabilities for Basic Gates
As a function of the input probabilities
Gate   $p_{0\to1}$
AND    $(1 - p_A p_B)\,p_A p_B$
OR     $(1 - p_A)(1 - p_B)\,(1 - (1 - p_A)(1 - p_B))$
XOR    $(1 - (p_A + p_B - 2 p_A p_B))\,(p_A + p_B - 2 p_A p_B)$
Activity for static CMOS gates
$\alpha = p_0\,p_1$
Slide 3.15
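The closed-form expressions for AND, OR, and XOR can be evaluated for arbitrary input probabilities. The sketch below is one way to code them, assuming independent inputs; the function names are hypothetical.

```python
def p_out_one(gate, p_a, p_b):
    """Probability that a two-input gate output is 1, for independent inputs."""
    if gate == "AND":
        return p_a * p_b
    if gate == "OR":
        return 1 - (1 - p_a) * (1 - p_b)
    if gate == "XOR":
        return p_a + p_b - 2 * p_a * p_b
    raise ValueError(f"unsupported gate: {gate}")


def transition_probability(gate, p_a, p_b):
    """Static CMOS activity: alpha = p0 * p1."""
    p1 = p_out_one(gate, p_a, p_b)
    return (1 - p1) * p1


for gate in ("AND", "OR", "XOR"):
    uniform = transition_probability(gate, 0.5, 0.5)
    skewed = transition_probability(gate, 0.1, 0.5)
    print(f"{gate}: uniform inputs {uniform:.4f}, skewed inputs {skewed:.4f}")
```

With uniform inputs the AND and OR gates give 3/16 and the XOR gives 1/4; skewing one input probability changes the AND and OR results but leaves the XOR at 1/4, because its output probability stays at 1/2 whenever one input is equiprobable.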
Activity as a Function of Topology
XOR versus NAND/NOR
[Figure: transition probability P of XOR and NAND/NOR gates]
$\alpha_{NOR,NAND} = (2^N - 1)/2^{2N}$
$\alpha_{XOR} = 1/4$
Slide 3.16
How About Dynamic Logic?
[Figure: dynamic gate supplied from VDD, with pre-charge and eval phases]
Energy dissipated when effective output is zero!
or $p_{0\to1} = p_0$
Always larger than $p_0 p_1$!
E.g., $p_{0\to1}(\mathrm{NAND}) = 1/2^N$; $p_{0\to1}(\mathrm{NOR}) = (2^N - 1)/2^N$
Activity in dynamic circuits hence always higher than in static.
But ... capacitance most often smaller.
Slide 3.17
Differential Logic?
[Figure: differential gate supplied from VDD with complementary outputs Out and $\overline{Out}$]
Static: Activity is doubled
Dynamic: Transition probability is 1!
Hence power always increases.
Slide 3.18
Evaluating Power Dissipation of Complex Logic
Simple idea: start from inputs and propagate signal
probabilities to outputs
[Figure: example logic network annotated with the signal probability p1 at each node]
But:
– Reconvergent fan-out
– Feedback and temporal/spatial correlations
Slide 3.19
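The propagation idea can be sketched in a few lines. The example below assumes every gate sees statistically independent inputs, which is exactly the assumption that reconvergent fan-out and correlations break; the netlist, node names, and input probabilities are hypothetical and do not reproduce the circuit in the figure.

```python
from math import prod

# Output-is-1 probability of each gate type, assuming independent inputs.
GATES = {
    "AND": lambda ps: prod(ps),
    "OR": lambda ps: 1 - prod(1 - p for p in ps),
    "NOT": lambda ps: 1 - ps[0],
}


def propagate(input_probs, netlist):
    """Propagate signal probabilities through a netlist given in topological order."""
    probs = dict(input_probs)
    for node, (gate, fanin) in netlist.items():
        probs[node] = GATES[gate]([probs[name] for name in fanin])
    return probs


# Hypothetical example network.
inputs = {"a": 0.1, "b": 0.5, "c": 0.5, "d": 0.5}
netlist = {
    "x": ("AND", ["a", "b"]),
    "y": ("OR", ["c", "d"]),
    "z": ("AND", ["x", "y"]),
}

probs = propagate(inputs, netlist)
for node in netlist:
    p1 = probs[node]
    print(f"{node}: p1 = {p1:.4f}, static activity p0*p1 = {(1 - p1) * p1:.4f}")
```

The dictionary is walked in insertion order, so each gate must be listed after the nodes that drive it.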
Reconvergent Fan-out (Spatial Correlation)
Inputs to gates can be interdependent (correlated)
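[Figure: two circuits with nodes A, X, and Z: one with an independent second input B (no reconvergence), and one in which A fans out and reconverges at the gate driving Z]
No reconvergence: $p_Z = 1 - (1 - p_A)\,p_B$ ($p_Z$: probability that Z = 1)
Reconvergent: $p_Z = 1 - (1 - p_A)\,p_A$? NO! $p_Z = 1$
Must use conditional probabilities
$p_Z = 1 - p_A \cdot p(X|A) = 1$, where $p(X|A)$ is the probability that X = 1 given that A = 1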
Becomes complex and intractable real fast
Slide 3.20
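The gap between the naive formula and the true probability can be shown by enumerating the primary inputs. The sketch assumes, consistent with the formula $p_Z = 1 - (1 - p_A)p_B$, that X is the inverse of A and that Z is the NAND of X and the second input; these gate types and the function names are assumptions made for illustration.

```python
from itertools import product


def exact_p_z(p_a, p_b, reconvergent):
    """Exact P(Z = 1) by enumerating primary inputs A and B.

    Assumed circuit: X = NOT A, Z = NAND(X, second input), where the
    second input is B normally and A itself in the reconvergent case.
    """
    total = 0.0
    for a, b in product((0, 1), repeat=2):
        weight = (p_a if a else 1 - p_a) * (p_b if b else 1 - p_b)
        x = 1 - a
        second = a if reconvergent else b
        z = 1 - (x & second)
        total += weight * z
    return total


def naive_p_z(p_a, p_b):
    """Formula that treats the two gate inputs as independent."""
    return 1 - (1 - p_a) * p_b


p_a = 0.5
print(naive_p_z(p_a, p_a))                      # 0.75 (wrong when inputs are correlated)
print(exact_p_z(p_a, p_a, reconvergent=True))   # 1.0
print(exact_p_z(p_a, 0.5, reconvergent=False))  # 0.75
```

Exhaustive enumeration scales as 2^n in the number of primary inputs, which is one way to see why exact treatment becomes intractable quickly.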
Temporal Correlations
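Feedback
[Figure: logic block whose output X is fed back to its input through register R]
X is a function of itself → correlated in time
Temporal correlation in input streams
01010101010101...
00000001111111...
Both streams have the same $P_{=1}$ (probability of being 1) but different switching statistics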
Activity estimation the hardest part of power analysis
Typically done through simulation with actual input
vectors (see later slides)
Slide 3.21
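The two example streams can be compared directly. This sketch counts the fraction of ones and the number of 0→1 transitions in the 14-bit prefixes shown on the slide; the function names are illustrative.

```python
def signal_probability(stream):
    """Fraction of samples that are 1."""
    return stream.count("1") / len(stream)


def rising_transitions(stream):
    """Number of 0 -> 1 transitions between consecutive samples."""
    return sum(1 for prev, cur in zip(stream, stream[1:]) if prev == "0" and cur == "1")


for stream in ("01010101010101", "00000001111111"):
    print(stream, signal_probability(stream), rising_transitions(stream))
```

Both prefixes have a signal probability of 0.5, yet the first makes seven rising transitions and the second only one, so a probability-only model cannot distinguish them.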
Glitching in Static CMOS
Analysis so far did not include timing effects
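[Figure: gate network with inputs A, B, C, internal node X, and output Z; timing diagram for the input transition ABC: 101 → 000, showing X, the gate delay, and a glitch on Z]
The result is correct,
but extra power is dissipated
Also known as dynamic hazards:
“A single input change causing
multiple changes in the output”
Slide 3.22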
Example: Chain of NAND Gates
[Figure: chain of NAND gates (Out1, Out2, Out3, Out4, Out5, ...); plot of Voltage (V) versus Time (ps) for outputs Out 1 through Out 8]
Slide 3.23
What Causes Glitches?
[Figure: circuits with inputs A, B, C, D, internal nodes X and Y, and output Z, shown with their timing diagrams]
Uneven arrival times of input signals of gates due to
unbalanced delay paths
Solution: balancing delay paths!
Slide 3.24
Short-Circuit Currents
(also called crowbar currents)
[Figure: inverter driving CL; input waveform Vin crossing VTH and VDD–VTH, and the resulting short-circuit current Isc with peak Ipeak, versus time t]
PMOS and NMOS simultaneously ON during transition
Psc ~ f
Slide 3.25
Short-Circuit Currents
[Figure: inverter with large load (Isc ≈ 0) and with small load (Isc = IMAX); plot of Isc (A) versus time (s) for CL = 20 fF, 100 fF, and 500 fF]
Equalizing rise/fall times of input and output signals limits Psc to 10–15%
of the dynamic dissipation
[Ref: H. Veendrick, JSSC’84]
Slide 3.26
Modeling Short-Circuit Power
Can be modeled as capacitor
$$C_{SC} = k\left(a\,\frac{\tau_{in}}{\tau_{out}} + b\right)$$
a, b: technology parameters
k: function of supply and threshold voltages, and transistor sizes
$$E_{SC} = C_{SC} V_{DD}^2$$
Easily included in timing and power models
Slide 3.27
Transistors Leak
Drain leakage
– Diffusion currents
– Drain-induced barrier lowering (DIBL)
Junction leakage
– Gate-induced drain leakage (GIDL)
Gate leakage
– Tunneling currents through thin oxide
Slide 3.28
Sub-threshold Leakage
Off-current increases exponentially when reducing VTH
$$I_{leak} = I_0\,\frac{W}{W_0}\,10^{-V_{TH}/S}$$
$$P_{leak} = V_{DD} \cdot I_{leak}$$
Slide 3.29
Sub-Threshold Leakage
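Leakage current increases with drain voltage
(mostly due to DIBL)
$$I_{leak} = I_0\,\frac{W}{W_0}\,10^{(-V_{TH} + \lambda_d V_{DS})/S} \quad (\text{for } V_{DS} > 3kT/q)$$
Hence
$$P_{leak} = \left(I_0\,\frac{W}{W_0}\,10^{-V_{TH}/S}\right)\left(V_{DD}\,10^{\lambda_d V_{DD}/S}\right)$$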
Leakage power is a strong function of supply voltage
Slide 3.30
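The sub-threshold leakage model with DIBL can be evaluated numerically to see how strongly the supply and threshold voltages matter. This is a minimal sketch of the slide's formula; the default parameter values (I0 = 1 µA, S = 100 mV/decade, λd = 0.1, W/W0 = 1) and the function names are hypothetical placeholders, not data for any real process.

```python
def subthreshold_leakage(v_th, v_ds, i0=1e-6, w_ratio=1.0, s=0.1, lambda_d=0.1):
    """I_leak = I0 * (W/W0) * 10^((-V_TH + lambda_d * V_DS) / S), for V_DS > 3kT/q."""
    return i0 * w_ratio * 10 ** ((-v_th + lambda_d * v_ds) / s)


def leakage_power(v_th, v_dd, **params):
    """P_leak = V_DD * I_leak with the off transistor seeing V_DS = V_DD."""
    return v_dd * subthreshold_leakage(v_th, v_ds=v_dd, **params)


# Hypothetical operating points, for illustration only.
for v_dd in (1.0, 0.8, 0.6):
    print(f"VDD = {v_dd:.1f} V: P_leak = {leakage_power(0.3, v_dd):.3e} W")

# Lowering V_TH by one sub-threshold slope S raises the current tenfold.
print(subthreshold_leakage(0.2, 1.0) / subthreshold_leakage(0.3, 1.0))
```

In this model the supply voltage enters twice, once linearly and once through the DIBL exponent, which is why leakage power falls faster than linearly when VDD is reduced.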
Stacking Effect
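Assume that body-biasing effect in short-channel
transistor is small
NAND gate:
[Figure: NAND gate supplied from VDD with stacked transistors M1 and M2 and intermediate node voltage VM]
$$I_{leak,M1} = I_0'\,10^{(-V_M - V_{TH} + \lambda_d (V_{DD} - V_M))/S}$$
$$I_{leak,M2} = I_0'\,10^{(-V_{TH} + \lambda_d V_M)/S}$$
$$V_M \approx \frac{\lambda_d}{1 + 2\lambda_d}\,V_{DD}$$
$$\frac{I_{stack}}{I_{inv}} \approx 10^{-\frac{\lambda_d V_{DD}}{S}\left(\frac{1 + \lambda_d}{1 + 2\lambda_d}\right)} \quad \text{(instead of the expected factor of 2)}$$
Slide 3.31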
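The stacking expressions can be sanity-checked numerically: at the intermediate voltage VM the two transistor currents must be equal, and the resulting stack-to-inverter ratio should be well below one half. The sketch below uses hypothetical values (VDD = 1 V, λd = 0.1, S = 100 mV/decade) and hypothetical function names.

```python
def intermediate_node_voltage(v_dd, lambda_d):
    """V_M ~ lambda_d / (1 + 2 * lambda_d) * V_DD."""
    return lambda_d * v_dd / (1 + 2 * lambda_d)


def stack_to_inverter_leakage_ratio(v_dd, lambda_d, s):
    """I_stack / I_inv ~ 10^(-(lambda_d * V_DD / S) * (1 + lambda_d) / (1 + 2 * lambda_d))."""
    exponent = -(lambda_d * v_dd / s) * (1 + lambda_d) / (1 + 2 * lambda_d)
    return 10 ** exponent


# Hypothetical parameters, for illustration only.
V_DD, LAMBDA_D, S = 1.0, 0.1, 0.1

v_m = intermediate_node_voltage(V_DD, LAMBDA_D)
ratio = stack_to_inverter_leakage_ratio(V_DD, LAMBDA_D, S)

# Exponent numerators of the two stacked transistors (the common -V_TH term is dropped).
m1_exponent = -v_m + LAMBDA_D * (V_DD - v_m)
m2_exponent = LAMBDA_D * v_m

print(f"V_M = {v_m:.4f} V")
print(f"M1 and M2 exponents: {m1_exponent:.6f}, {m2_exponent:.6f}")
print(f"I_stack / I_inv = {ratio:.3f} (reduction of about {1 / ratio:.1f}x)")
```

The reduction exceeds a factor of two because the raised intermediate node both reverse-biases the upper transistor's gate-source junction and lowers the drain-source voltage across each device.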
Stacking Effect
[Figure: Ileak (A) versus VM (V) for 90 nm NMOS, showing IM1 and IM2, annotated "factor 9"]
Stack     Leakage Reduction
2 NMOS    9
3 NMOS    17
4 NMOS    24
2 PMOS    8
3 PMOS    12
4 PMOS    16
Slide 3.32
Gate Tunneling
Exponential function of supply voltage
$I_{GD} \sim e^{-T_{ox}}\,e^{V_{GD}}$, $I_{GS} \sim e^{-T_{ox}}\,e^{V_{GS}}$
Independent of the sub-threshold leakage
Modeled in BSIM4
Also in BSIM3v3 (but not always included in foundry models)
NMOS gate leakage usually worse than PMOS
[Figure: transistor schematic showing ISUB, IGD, IGS, and ILeak; plot of Igate (A) versus VDD (V) for 90 nm CMOS]
Slide 3.33
Other Sources of Static Power Dissipation
Diode (drain–substrate) reverse-bias currents
[Figure: cross-section showing n+ and p+ diffusions in the n well and p substrate]
• Electron-hole pair generation in depletion region of reverse-biased diodes
• Diffusion of minority carriers through junction
• For sub-50 nm technologies with highly doped pn junctions,
tunneling through narrow depletion region becomes an issue
Strong function of temperature
Much smaller than other leakage components in general
Slide 3.34
Other Sources of Static Power Dissipation
Circuits with DC bias currents:
sense amplifiers, voltage converters and regulators, sensors, mixed-signal components, etc.
Should be turned off when not in use, or standby current should be minimized
Slide 3.35
Summary of Power Dissipation Sources
$$P \sim \alpha \cdot (C_L + C_{SC}) \cdot V_{swing} \cdot V_{DD} \cdot f + (I_{DC} + I_{leak}) \cdot V_{DD}$$
α – switching activity
CL – load capacitance
CSC – short-circuit capacitance
Vswing – voltage swing
f – frequency
IDC – static current
Ileak – leakage current
$$P = \frac{\text{energy}}{\text{operation}} \times \text{rate} + \text{static power}$$
Slide 3.36
The Traditional Design Philosophy
Maximum performance is primary goal
– Minimum delay at circuit level
Architecture implements the required function
with target throughput, latency
Performance achieved through optimum sizing,
logic mapping, architectural transformations
Supplies, thresholds set to achieve maximum
performance, subject to reliability constraints
Slide 3.37
CMOS Performance Optimization
Sizing: Optimal performance with equal fan-out per stage
[Figure: chain of gates driving load CL]
Extendable to general logic cone through “logical effort”
Equal effective fan-outs ($g_i C_{i+1}/C_i$) per stage
Example: memory decoder
[Figure: memory decoder path from addr input through pre-decoder and word driver to the word line; labels 3, 15, CW, CL]
[Ref: I. Sutherland, Morgan Kaufmann ’99]
Slide 3.38
Model Not Appropriate Any Longer
Traditional scaling model
If $V_{DD} = 0.7$ and $Freq = 1/0.7$,
$$Power = C V_{DD}^2 f = \left(\frac{1}{0.7} \times 1.14^2\right) \times (0.7^2) \times \left(\frac{1}{0.7}\right) = 1.3$$
Maintaining the frequency scaling model
If $V_{DD} = 0.7$ and $Freq = 2$,
$$Power = C V_{DD}^2 f = \left(\frac{1}{0.7} \times 1.14^2\right) \times (0.7^2) \times (2) = 1.8$$
While slowing down voltage scaling
If $V_{DD} = 0.85$ and $Freq = 2$,
$$Power = C V_{DD}^2 f = \left(\frac{1}{0.7} \times 1.14^2\right) \times (0.85^2) \times (2) = 2.7$$
Slide 3.39
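The three scaling scenarios are simple products and can be reproduced directly. The sketch below reuses the capacitance factor (1/0.7 × 1.14²) and the voltage and frequency factors from the slide; the function and variable names are illustrative.

```python
def relative_power(cap_scale, vdd_scale, freq_scale):
    """Relative power per generation: P ~ C * VDD^2 * f, all as scaling factors."""
    return cap_scale * vdd_scale**2 * freq_scale


# Capacitance factor used in all three scenarios on the slide.
CAP_SCALE = (1 / 0.7) * 1.14**2

scenarios = {
    "traditional scaling (VDD x0.7, f x1/0.7)": (0.7, 1 / 0.7),
    "frequency doubling (VDD x0.7, f x2)": (0.7, 2.0),
    "slower voltage scaling (VDD x0.85, f x2)": (0.85, 2.0),
}

results = {}
for name, (vdd_scale, freq_scale) in scenarios.items():
    results[name] = relative_power(CAP_SCALE, vdd_scale, freq_scale)
    print(f"{name}: power x{results[name]:.1f}")
```

Rounded to one decimal place, the three cases give 1.3, 1.8, and 2.7, matching the values on the slide.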
The New Design Philosophy
Maximum performance (in terms of
propagation delay) is too power-hungry,
and/or not even practically achievable
Many (if not most) applications either can
tolerate larger latency or can live with
lower-than-maximum clock speeds
Excess performance (as offered by
technology) to be used for energy/power
reduction
Trading off speed for power
Slide 3.40
Relationship Between Power and Delay
[Figure: surface plots of Power (W) and Delay (s) as functions of VDD (V) and VTH (V), with operating points A and B marked]
For a given activity level, power is reduced while delay is unchanged if both
VDD and VTH are lowered, such as from A to B
[Ref: T. Sakurai and T. Kuroda, numerous references]
Slide 3.41
The Energy–Delay Space
[Figure: equal-performance curves and equal-energy curves in the VTH–VDD plane, with the energy minimum marked]
Slide 3.42
Energy–Delay Product As a Metric
[Figure: delay, energy, and energy–delay product versus VDD (0.6 to 1.2); 90 nm technology, VTH approx. 0.35 V]
Energy–delay product exhibits minimum at approximately $2V_{TH}$
(typical unless leakage dominates)
Slide 3.43
Exploring the Energy–Delay Space
[Figure: Energy versus Delay, showing an unoptimized design and the curve of Pareto-optimal designs, with Emin, Emax, Dmin, and Dmax marked on the axes]
In an energy-constrained world, design is a trade-off process
♦ Minimize energy for a given performance requirement
♦ Maximize performance for given energy budget
[Ref: D. Markovic, JSSC’04]
Slide 3.44
Summary
Power and energy are now primary design
constraints
Active power still dominating for most
applications
–Supply voltage, activity and capacitance the key
parameters
Leakage becomes major factor in sub-100 nm
technology nodes
–Mostly impacted by supply and threshold voltages
Design has become energy–delay trade-off
exercise!
Slide 3.45
References
D. Markovic, V. Stojanovic, B. Nikolic, M.A. Horowitz and R.W.
Brodersen, “Methods for true energy–performance optimization,”
IEEE Journal of Solid-State Circuits, 39(8), pp. 1282–1293,
Aug. 2004.
J. Rabaey, A. Chandrakasan and B. Nikolic, “Digital Integrated Circuits:
A Design Perspective,” 2nd ed., Prentice Hall, 2003.
T. Sakurai, “Perspectives on power-aware electronics,”
Digest of Technical Papers ISSCC, pp. 26–29, Feb. 2003.
I. Sutherland, B. Sproull and D. Harris, “Logical Effort,” Morgan
Kaufmann, 1999.
H. Veendrick, “Short-circuit dissipation of static CMOS circuitry
and its impact on the design of buffer circuits,” IEEE Journal of
Solid-State Circuits, SC-19(4), pp. 468–473, 1984.
Slide 3.46