OneDay860Rev1_4.ppt

Goals
• Provide an overview of the 860 device
• Allow a quick start of an 860 design cycle
• Gain familiarity with debug issues particular to the 860
• Create the basis to build further experience
[Rev 1.4]
Outline
• 860 Architecture
• Debug considerations
Outline
• 860 Architecture
– Device overview
– Core CPU
– SIU
– CPM
[Figure: 860 block diagram]
• PowerPC core: 4 KB I-Cache with IMMU, 4 KB D-Cache with DMMU
• System Interface Unit: memory controller, bus interface unit, system functions, real time clock, PCMCIA
• Communications Processor Module: 32-bit RISC and program ROM, MAC, internal memory space, interrupt controller, four timers, parallel I/O, baud rate generators, serial DMAs, virtual IDMAs, FEC (860T)
• Serial interface with time slot assigner: SCC1, SCC2, SCC3, SCC4, SMC1, SMC2, SPI, I2C
CPU
• Embedded version of the PowerPC core
• One instruction fetched per clock
• One instruction issued and retired per clock
• Up to three instructions in execution per clock
• Most instructions execute in one clock
• Branches can execute in zero clocks
Programming Model
[Figure: register set, each register 32 bits wide]
• GPR0 through GPR31
• CR, XER, FPSCR, MSR, PVR
• CTR, LR
• TBU, TBL, DEC
• SRR0, SRR1
• SPRn, SPRx
MSR
Bit 0 is MSB, bit 31 is LSB
Bits 0-12: 0
Bit 13: POW, power management enabled
Bit 14: 0
Bit 15: ILE, interrupt little endian mode
Bit 16: EE, external interrupt enable
Bit 17: PR, privilege level
Bit 18: FP, floating point available
Bit 19: ME, machine check enable
Bit 20: 0 (floating point exception mode 0)
Bit 21: SE, single step trace enabled
Bit 22: BE, branch trace enabled
Bit 23: 0 (floating point exception mode 1)
Bit 24: 0
Bit 25: IP, exception [interrupt] prefix
Bit 26: IR, instruction address translation enabled
Bit 27: DR, data address translation enabled
Bits 28-29: 0
Bit 30: RI, recoverable exception
Bit 31: LE, little endian mode
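Because bit 0 is the MSB, a mask for MSR bit n is 1 shifted left by (31 - n), which is easy to get backwards. The following is a minimal host-side sketch in Python, not target code; the table and the helper names `msr_mask` and `decode_msr` are illustrative and simply restate the bit positions from the MSR slide.

```python
# Field names and bit positions restate the MSR slide.
# PowerPC numbers bits from the MSB, so bit 0 has the value 0x80000000.
MSR_BITS = {
    "POW": 13, "ILE": 15, "EE": 16, "PR": 17, "FP": 18, "ME": 19,
    "SE": 21, "BE": 22, "IP": 25, "IR": 26, "DR": 27, "RI": 30, "LE": 31,
}


def msr_mask(bit):
    """Convert a PowerPC bit number (0 = MSB) into a 32-bit mask."""
    return 1 << (31 - bit)


def decode_msr(value):
    """Return the names of the MSR fields set in value, in bit order."""
    return [name for name, bit in MSR_BITS.items() if value & msr_mask(bit)]
```

For example, `msr_mask(MSR_BITS["EE"])` is 0x8000, so an MSR value of 0x00008002 has EE and RI set.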
CPU Overview
[Figure: CPU block diagram]
• Instruction unit: sequential fetcher, instruction queue, dispatch, branch processing (CTR, CR, LR)
• Integer unit (/, +, *, XER)
• GPR file (R0-R31) and GP rename registers
• Load/store unit
• Completion unit
• Instruction MMU and instruction cache, data MMU and data cache, main memory
Execution Units
• Execution units operate in parallel
– Fetch / Branch
– Integer
– Load / Store
– Completion
Fetch / Dispatch
• Instructions are fetched individually
• Non-branch instructions enter the instruction queue
• Branch instructions are redirected to the branch unit
• One instruction can be sent to the execution units and one
to the branch unit for a total of two issued instructions per
clock
• All instructions “appear” to execute sequentially
[Figure: instruction queue between the instruction cache, the execution unit, and branch processing (CTR, CR, LR)]
On each CPU clock:
• A 32 bit wide transfer is made from the instruction cache
• Instructions fall through to the first open location in the queue
• The branch instruction closest to the bottom of the queue is issued to the branch unit
• The bottom non-branch instruction is dispatched to an available execution unit
Branch
• Branches are pre-executed, giving an effective execution
time of zero clocks
• Instruction queue provides look ahead to determine data
dependencies
• Unresolved conditional branches are statically predicted
under control of the compiler
Subroutine Control Flow
[Figure: subroutine control flow, using the LR and a software maintained stack addressed by GPR1]
• Branch to sub: the return address is placed into the Link Register by the branch function
• Instructions save the LR to the stack to allow nested function calls
• Branch to sub (nested): the LR is reused for another call
• The LR is recalled from the stack to allow a return from subroutine
• Branch to LR: branching to the contents of the LR is a return instruction
Integer
• Integer unit directly accesses the GPR file
• Rename registers prevent stalls and allow instructions to be
un-executed
• Most instructions execute in one clock
Load/Store
• Responsible for all transfers between the GPR file and main
memory
• Speculative loads are placed in the rename registers
• Speculative stores remain in the store queue
Completion
• Holds instructions executed in parallel or out of order until
they can be retired in order
• Retiring an instruction commits its results to the processor
state
• Simply discarding an instruction from the completion
queue effectively un-executes it
• One instruction can be retired per clock
Instruction Set
• 68K instructions were based on an accumulator, direct memory model
add (0x00035300).L, D4
[Figure: data registers D0-D7. The value at memory address 0x00035300 is added directly into D4]
Instruction Set
• PowerPC instructions are based on a triadic, load/store model
lwz r2,0x00035300
add r6,r2,r4
[Figure: registers GPR0-GPR31. The value at memory address 0x00035300 is loaded into GPR2, then GPR2 and GPR4 are added into GPR6]
Exceptions
• All exceptions cause processing to vector to a
predetermined memory location
• The base address of the vector table is controlled by the [IP] bit in
the MSR
• Each vector is placed at a 256-byte (0x100) boundary
– Reset = 0xnnn00100
– Machine Check = 0xnnn00200
– External Interrupt = 0xnnn00500
– Decrementer = 0xnnn00900
– Etc.
• 64 instructions can be placed at a vector before hitting the next
vector
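The vector arithmetic can be sketched as follows. This is a host-side Python illustration, not target code; `vector_address` is a hypothetical helper, and the offsets are the ones listed on the Exceptions slides.

```python
# Offsets are the ones shown on the Exceptions slides.
VECTOR_OFFSETS = {
    "reset": 0x100,
    "machine_check": 0x200,
    "dsi": 0x300,
    "isi": 0x400,
    "external": 0x500,
    "decrementer": 0x900,
}

# Vectors are 0x100 bytes apart and each instruction is 4 bytes.
INSTRUCTIONS_PER_VECTOR = 0x100 // 4


def vector_address(name, msr_ip):
    """Return the vector address for an exception given the MSR[IP] bit."""
    base = 0xFFF00000 if msr_ip else 0x00000000
    return base + VECTOR_OFFSETS[name]
```

With MSR[IP] = 1 the reset vector is 0xFFF00100; with MSR[IP] = 0 the external interrupt vector is 0x00000500.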
Exceptions
[Figure: exception vector table placement]
• MSR[IP] = 1: the table is in Flash, with the reset vector at FFF00100
• MSR[IP] = 0: the table is in RAM, with the reset vector at 00000100
• Vector offsets: Reset 100, Machine Check 200, DSI 300, ISI 400, External 500
• Each vector has room for 64 instructions
Exceptions
• Only the Decrementer and the External Interrupt can be
masked by the [EE] bit in the MSR
• Machine Check exceptions can vector to a routine or force
Checkstop state
• All other exceptions are synchronous (caused by
instruction execution) and are unmaskable
Nesting Exceptions
• When an exception occurs, return state is stored in the
processor
• There is no automated stacking of critical registers
• The address of the return instruction is stored in SRR0
• The MSR prior to the exception is in SRR1
• The [EE] bit of the MSR is cleared
• The processor must save these registers and any other
GPR’s to a software maintained stack
• The EABI specifies GPR1 to be the stack pointer
• The [RI] bit in the MSR is set by software when enough
information is saved to allow recovery from a nested
exception
Exception Control Flow
[Figure: exception control flow, using SRR0, SRR1 and a software maintained stack addressed by GPR1]
• An exception after the completion of an instruction causes flow to be directed to the ISR
• The address of the return instruction is placed into SRR0 by the hardware
• The MSR[RI] bit is cleared by the exception hardware and set by software after the SRR’s have been saved
• Instructions save the SRR’s to the stack to allow nested exceptions
• An exception while MSR[RI] is cleared causes a machine check event
• It is safe for exceptions to occur in the section of code where MSR[RI] is set
• The MSR[RI] bit is cleared by the software just before the SRR’s are restored by the software
• The SRR’s are recalled from the stack to allow a return from the exception with rfi
• Breakpoints are exceptions!
Cache
• Independent instruction and data caches implement an
internal Harvard Architecture
• Each cache is 4 Kbytes, two way set associative
– The 860P has an 8K, four way set associative instruction cache
• Caching of separate memory areas is controlled by the
MMU
Cache Organization
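[Figure: cache organization]
• Address bits 0-31: address tag (20 bits, stored in the cache), set select (7 bits), word, byte
• 128 sets, each with way 0 and way 1
• Each block holds an address tag, state, and words 0-7
• Blocks are numbered 0 through 255: blocks 0 and 1 in the first set, blocks 254 and 255 in the last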
Cache Operation
• Each cache block (or line) can be in one of three states (MEI
protocol)
– M = modified (or dirty)
• Resides in cache and is different than memory
– E = exclusive (resident and clean)
• Resides in cache and is identical to memory
– I = invalid (not resident)
• The “shared” state of the full MESI protocol is not supported
– Would allow synchronization of multiply cached blocks
• There is no cache snooping to monitor external masters
Cache control
• Hardware implementation dependent registers (HIDn)
control cache function
– Enabling
– Invalidate
– Locking
• Supervisor instructions provide block level control
– Allocate, flush, invalidate, store, touch, zero
• Ability to store a given block of memory into the cache is
controlled by the MMU
– Each block or page in the MMU has WIG bits
• (Write-through, Inhibited, Guarded)
MMU
• The MMU provides for both memory translation and
access control
• The system boots in Real (un-translated) mode
• To effectively use the caches, the MMU must be used in
page mode
– Effectively, a null translation is performed
Protection
• The primary use of the MMU in embedded applications is
for cache control and access protection
• The WIG bits are set for each page
– W = write-through (applicable only to data cache)
– I = inhibited
– G = guarded (indicates that memory is ill-behaved)
• I/O spaces
• No speculative reads or pre-fetches
Translation
• Page translation provides a virtual memory space of 2^52
bytes
• System must be debugged with RTOS tools
– Emulators and hardware debuggers don’t support it
Real mode
[Figure: real mode. The 32 bit logical address is used unchanged as the 32 bit physical address]
WIG:
W = 0: write-back
I = 0: cache enable
G = 1: memory is guarded
Page mode
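[Figure: page mode translation]
• The 32 bit logical address is split into a 10 bit level one index, a 10 bit level two index, and a 12 bit page offset
• The Level One Descriptor carries the W and G bits and locates the level two table
• The Level Two Descriptor carries the I bit and the 20 bit physical page number
• The physical address is the 20 bit page number followed by the 12 bit offset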
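The 10/10/12 split of the logical address can be sketched as below. This is a Python illustration of the arithmetic only; `split_logical` and `physical` are hypothetical names, and the table walk itself (fetching the descriptors) is not modeled.

```python
def split_logical(addr):
    """Split a 32-bit logical address into (level one index, level two index, offset)."""
    level_one = (addr >> 22) & 0x3FF   # top 10 bits
    level_two = (addr >> 12) & 0x3FF   # next 10 bits
    offset = addr & 0xFFF              # low 12 bits, passed through unchanged
    return level_one, level_two, offset


def physical(page_number, offset):
    """Combine a 20-bit physical page number with the 12-bit page offset."""
    return ((page_number & 0xFFFFF) << 12) | (offset & 0xFFF)
```

A null translation, as used when the MMU is enabled only for cache control, is the case where the page number equals the top 20 bits of the logical address, so `physical(addr >> 12, addr & 0xFFF)` returns `addr`.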
Reset operation
Reset source | Reset PLL | System configuration sampled | Clock module reset | HRESET driven | Other internal logic reset | SRESET driven | Core reset
Power-on reset | yes | yes | yes | yes | yes | yes | yes
External hard reset, debug hard reset, loss of lock, software watchdog, bus monitor, checkstop | no | yes | yes | yes | yes | yes | yes
External soft reset, debug soft reset | no | no | no | no | yes | yes | yes
Reset Types
• Power-on reset is used to align all logic from a chaotic
state after Vcc stabilizes
– The PLL then begins to lock
• Hard reset is analogous to the normal reset on other
processors
– The PLL is not affected
• Soft reset can be used to initiate a warm start
– Not commonly used
– Not driven or monitored by the emulator
– Basically, a non-returnable exception to the reset vector
Reset Sequence
[Figure: reset sequences]
• Power-on reset: POR asserted, HRESET & SRESET asserted, PLL locks, RSTCONF sampled, internal logic reset, HRESET & SRESET negated
• Hard reset: HRESET asserted, HRESET & SRESET asserted, RSTCONF sampled, internal logic reset, HRESET & SRESET negated
• Soft reset: SRESET asserted, internal logic reset, SRESET negated
Memory Map Startup
[Figure: memory map at boot and in the application]
• Boot map, before software execution: at boot, CS0 is active for the entire address space, so the Flash image repeats throughout the map. All other chip selects are invalid. The IMMR is present.
• Application target map: Flash, IMMR, I/O, and RAM, selected by CSi and CSx,y,z
Configuration Word
• Configuration word is latched from upper 16 bits of the
data bus during reset cycle
Field layout, MSB first: EARB, IIP, 00, BPS, 0, ISB, DBGC, DBPC, EBDF, 0
Remaining 16 bits: 0000_0000_0000_0000
• EARB – External arbitration
• IIP – Initial core prefix
• BPS – Boot port size
• DBGC – Debug pin configuration
• DBPC – Debug port pins configuration
• EBDF – External bus division factor
Memory Map Implications
• Since the Flash memory accessed by CS0 occupies the entire address space, boot code can
be linked to execute in a number of different locations
• Any branch will change the NIA from the boot location to the linked location
• All other chip selects are off
• IMMR RAM is still available
• CS0 must be reduced in scope before activating other chip selects
• Be careful not to pull the rug out from under the boot code when reducing CS0
• BSP re-entry issues:
– Altering chip select option registers while assuming the value in the Valid bit
– Can the chip selects to the RAM and Flash be altered while running out of either?
Memory Map Init Issues
• Three different factors can enhance (confuse) the boot process:
– The MSR[IP]
• The reset vector can be 0x0000_0100 or 0xfff0_0100
• Determined by the Reset Configuration Word
• Not changed by an SRESET
– CS0 scope
• CS0 responds to the entire memory map
• It must be changed while it is being used
• It may have already been reduced by a previous pass through the BSP
– Code link results
• Execution can start in code that is linked to a different address than the boot vector
• Only the address lines within the memory device are significant
• PC relative addressing will solve this, right? WRONG!
• The first branch will set the NIA MSB’s to the current execution value
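The point that only the address lines within the memory device are significant can be illustrated with a small Python sketch. The 4 MB flash size and the name `device_offset` are assumptions chosen for illustration; they are not values from this presentation.

```python
FLASH_SIZE = 4 * 1024 * 1024  # hypothetical 4 MB device, for illustration only


def device_offset(cpu_address, device_size=FLASH_SIZE):
    """Offset seen by a memory device that decodes only its own address lines.

    While CS0 covers the whole map, every CPU address selects the flash and
    the upper address bits are ignored by the device (size must be a power of two).
    """
    return cpu_address & (device_size - 1)
```

While CS0 responds everywhere, 0xFFF00100 and 0x02300100 both reach flash offset 0x300100, so code fetched at the boot vector keeps running after a branch to its linked address as long as the two addresses alias to the same offset.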
RTOS Boot Sequences
[Figure: RTOS boot sequences]
• Flash holds the boot code, the BSP, and either a compressed application image or an external application image
• Boot code decompresses and relocates the application from flash, or boot code loads the application over a communication channel or backplane
• RAM (Chip Select x) holds the uncompressed application image, the BSP, and data, stack, heap, etc.
• The IMMR and I/O also appear in the map
• Each chip select is defined by a Base Register (Base Address, V) and an Option Register (Mask, Options)
Endian Bus Connections
[Figure: byte lane numbering for an 8 bit device (bits 7-0)]
• 68K: MS byte lane is bits 31-24, LS byte lane is bits 7-0
• X86: MS byte lane is bits 31-24, LS byte lane is bits 7-0
• PPC: MS byte lane is bits 0-7, LS byte lane is bits 24-31
Big Endian Bus
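[Figure: big endian bus connections on the 860]
• 860 data bus byte lanes: bits 0-7 (MS byte lane), 8-15, 16-23, 24-31 (LS byte lane)
• 8 bit device: device bits 7-0 connect to bus bits 0-7
• 16 bit device: device bits 15-8 connect to bus bits 0-7, device bits 7-0 to bus bits 8-15
• 32 bit device: device bits 31-24, 23-16, 15-8, 7-0 connect to bus bits 0-7, 8-15, 16-23, 24-31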
SIU
• The SIU contains the logic to interface the external system
components to the 860
• Contains all of the glue logic needed for a typical
embedded application
SIU Overview
[Figure: System Interface Unit: memory controller, bus interface unit, system functions, real time clock, PCMCIA]
860 bus cycle
[Figure: 860 bus cycle, showing Address and Data]
Memory Control
• 8 banks of memory
– Each can be configured for any type of device
• Glueless support of SRAM, EPROM, Flash
– Using general purpose chip select machine
• Two user programmable machines
System control
• Clock synthesis
• Reset control
• Interrupt control
• Real time clock
• Periodic interrupt timer
• Bus monitor
• Bus arbiter
• Watchdog timer
Interrupt Control
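[Figure: interrupt control]
• CPM Interrupt Controller inputs: SCC [1:4], SMC [1:2], SPI, I2C, PIP, IDMA [1:2], SDMA, RISC Timers, CPM Timer [1:4], Port C [4:15]
• SIU Interrupt Controller inputs: IRQ[0-7] (edge / level), Timebase, PIT, Realtime clock, PCMCIA, and the CPM Interrupt Controller
• The Software Watchdog Timer and IRQ0 are ORed to Reset
• The SIU Interrupt Controller drives INT to the CPU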
SIU Interrupt Vectors
• All external interrupts cause processing at 0xnnn00500
– There is space for 64 instructions to save processor state and resolve the SIU
vector
• Vectors are six bits
– Indirect addressing is used to decommutate to service routines
– A 16 bit load from the long word address of the SIVEC register will point to a 64
entry array of 1K byte (256 instruction) service routines.
– An 8 bit load will allow a 64 entry jump table of branch instructions
• A shifting operation can alter the size between these two choices
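The two read sizes work because the six bit code sits in the most significant bits of the register. A Python sketch of the resulting offsets follows; the function names are illustrative, and the layout assumed is the one described on these slides (code in bits 0-5, zeros below it).

```python
def sivec_byte(code):
    """Value of an 8-bit read: the six-bit code followed by two zero bits.

    The result steps by 4, the size of one branch instruction.
    """
    return (code & 0x3F) << 2


def sivec_halfword(code):
    """Value of a 16-bit read: the six-bit code followed by ten zero bits.

    The result steps by 0x400, a 1K byte (256 instruction) block.
    """
    return (code & 0x3F) << 10


def sivec_scaled(code, shift):
    """Shift the 16-bit value right to pick a block size between the two."""
    return sivec_halfword(code) >> shift
```

For example, code 6 gives a byte offset of 0x18 into the branch table and a halfword offset of 0x1800 into the array of 1K routines; `sivec_scaled(6, 4)` gives 0x180, a 64 byte slot per vector.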
SIU Interrupt Vector Register
[Figure: SIVEC register layout]
• Bits 0-5: six bit interrupt code
• Bits 6-31: 0
• 8 bit read from address 0xnnnn001C returns bits 0-7
• 16 bit read from address 0xnnnn001C returns bits 0-15
• 32 bit read from address 0xnnnn001C returns bits 0-31
SIU Interrupt Vectors 8 bit Read
[Figure: the 8 bit value is the six bit interrupt code followed by two zero bits]
Table of branch instructions to ISRs
Each vector value points to a different branch instruction in the table
_00 ba routine_a
_04 ba routine_b
_08 ba routine_c
_0c ba routine_d
_10 ba routine_e
_14 ba routine_f
_18 ba routine_g
SIU Interrupt Vectors 16 bit Read
[Figure: the 16 bit value is the six bit interrupt code followed by ten zero bits]
Each vector value points to a block of 256 32-bit instructions
nnnn0000 - nnnn03ff
nnnn0400 - nnnn07ff
nnnn0800 - nnnn0bff
nnnn0c00 - nnnn0fff
CPM Interrupt Vector Register
[Figure: CPM interrupt vector register layout]
• Bits 0-4: 5 bit interrupt code
• Bits 5-14: 0
• Bit 15: IACK
• Bits 16-31: 0
• 8 bit, 16 bit, and 32 bit reads are made from address 0xnnnn0930
CPM
• Communications processor module
• Direct hardware support for all protocol and application
interfaces
– Ethernet, HDLC, Async HDLC, T1/E1, T3/E3, UART, ISDN,
Infrared
– Parallel I/O
– Full serial and virtual DMA support
IMMR Format
• All on-chip peripherals are accessed through a single 64K
byte area of memory
• The first 8K of address space contains the control registers
of the on-chip peripherals and the SI RAM
• Within the second 8K of address space, there are three
blocks of dual ported RAM
IMMR Area
May reside on any 64K boundary [16K]
0xnnnn_0000  CPM/SIU Control Registers [3K]
0xnnnn_0C00  SI RAM [512]
0xnnnn_0E00  FEC Control Registers [512]
0xnnnn_1000  Reserved [4K]
0xnnnn_2000  Dual ported RAM [4K]
0xnnnn_3000  Reserved for Dual ported RAM expansion [3K]
0xnnnn_3C00  Parameter RAM [1K]
0xnnnn_4000  End of area
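The map above can be checked with a short lookup table. This is a Python sketch for sanity-checking offsets on a host; the names `IMMR_REGIONS`, `immr_region`, and `region_sizes` are illustrative, and the start offsets are the ones shown in the IMMR Area map.

```python
# (start offset, region name), in ascending order, as shown on the IMMR Area slide.
IMMR_REGIONS = [
    (0x0000, "CPM/SIU control registers"),
    (0x0C00, "SI RAM"),
    (0x0E00, "FEC control registers"),
    (0x1000, "Reserved"),
    (0x2000, "Dual ported RAM"),
    (0x3000, "Reserved for dual ported RAM expansion"),
    (0x3C00, "Parameter RAM"),
]
IMMR_SIZE = 0x4000  # 16K


def immr_region(offset):
    """Return the name of the region containing an offset from the IMMR base."""
    if not 0 <= offset < IMMR_SIZE:
        raise ValueError("offset is outside the 16K IMMR area")
    name = None
    for start, region in IMMR_REGIONS:
        if offset >= start:
            name = region
    return name


def region_sizes():
    """Return each region's size in bytes, derived from the start offsets."""
    starts = [start for start, _ in IMMR_REGIONS] + [IMMR_SIZE]
    return {name: starts[i + 1] - starts[i]
            for i, (_, name) in enumerate(IMMR_REGIONS)}
```

The derived sizes match the bracketed sizes in the map, for example 1K for the parameter RAM and 512 bytes for the SI RAM.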
[Figure: dual ported RAM layout, 0 KB to 8 KB]
• 0 KB to 4 KB: dual ported RAM. The lower part holds Buffer Descriptors / uCode / Data, the upper part holds Buffer Descriptors / Data
• 7 KB to 8 KB: Parameter RAM
Dual Ported RAM usage
• The layout of the Dual Ported RAM is determined by the
uCode in the CPM
• When the CPM is not in operation, it is nothing more than
internal memory
– During the boot sequence, stack, global data, and heap can reside
in this memory
– Initialization code can be written in C++!
– A multi-layered boot process can be used
• First code resides in flash, uses internal RAM to setup chip selects
• Second code resides in another section of flash and uses external
RAM to load main application over a CPM channel
• Third level is the main application
– Each level has its own crt0.s function and initializes the EABI
from scratch
CPM Overview
[Figure: Communications Processor Module: 32-bit RISC and program ROM, MAC, internal memory space, interrupt controller, four timers, parallel I/O, baud rate generators, serial DMAs, virtual IDMAs, FEC (860T), and the serial interface with time slot assigner serving SCC1, SCC2, SCC3, SCC4, SMC1, SMC2, SPI, and I2C]
DMA’s
• Serial DMA’s
– Full bi-directional support of all serial channels
• Virtual DMA
– 4 channels
– Uses the serial DMA hardware to generate transfers
– Memory to memory or memory to/from I/O
CPM Buffer Structure
[Figure: buffer descriptors BD1, BD2, BD3 through BD128 in the IMMR point to data buffers in RAM]
Buffer Descriptor Format
Each buffer descriptor is four words, each 16 bits wide:
• Status and Control
• Data Length
• High Order Pointer
• Low Order Pointer
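The four 16 bit words can be modeled on a host with Python's `struct` module. This is only a sketch of the layout described above: big endian, with the high and low order pointer words treated as one 32 bit pointer. The names `pack_bd` and `unpack_bd` are hypothetical.

```python
import struct

# Big endian: 16-bit status and control, 16-bit data length, then the
# high order and low order pointer words read together as one 32-bit pointer.
BD_FORMAT = ">HHI"
BD_SIZE = struct.calcsize(BD_FORMAT)  # 8 bytes per buffer descriptor


def pack_bd(status, length, pointer):
    """Build the 8-byte image of one buffer descriptor."""
    return struct.pack(BD_FORMAT, status, length, pointer)


def unpack_bd(raw):
    """Split an 8-byte buffer descriptor into (status, length, pointer)."""
    return struct.unpack(BD_FORMAT, raw)
```

A descriptor with status 0x8000, a 64 byte length, and a buffer at 0x00035300 packs to the bytes 80 00 00 40 00 03 53 00.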
From Channel to Buffer
[Figure: from channel to buffer]
• Communication channel hardware
• Parameter RAM: location fixed by the hardware channel, format fixed by the protocol
• Buffer Descriptors in dual ported RAM: location determined by a Parameter RAM value, format of control and status determined by the protocol
• Data Buffers: location determined by the value in the Buffer Descriptor and by the memory controller mapping, format determined by the protocol
SCC’s
• The SCC’s implement the following protocols:
– SDLC/HDLC
– AppleTalk
– UART
– 10-Mbps Ethernet
Ethernet Frame
Field | Size
Preamble | 7 bytes
Start Frame | 1 byte
Destination Address | 6 bytes
Source Address | 6 bytes
Type / Length | 2 bytes
Data | 46 - 1500 bytes
Frame Check | 4 bytes
[Figure: brackets mark the fields stored by the CPM in the receive buffer and stored by the CPU in the transmit buffer]
Ethernet Buffer Descriptor
Receive Control & Status, bits 0-15:
E, -, W, I, L, F, -, M, - (2 bits), LG, NO, SH, CR, OV, CL
Transmit Control & Status, bits 0-15:
R, PAD, W, I, L, TC, DEF, HB, LC, RL, RC (4 bits), UN, CSL
Common for Transmit and Receive:
Data Length
High Order Pointer
Low Order Pointer
Status and Control Definitions
Receive Control & Status: E, -, W, I, L, F, -, M, -, LG, NO, SH, CR, OV, CL
Transmit Control & Status: R, PAD, W, I, L, TC, DEF, HB, LC, RL, RC, UN, CSL
• Empty / Ready:
0 = This buffer is owned by the CPU
1 = This buffer is owned by the CPM
• Wrap: This is the last BD in this set of BD’s.
• Interrupt: Generate an interrupt after this buffer is used by the CPM.
• Last in Frame: Set by the CPM or the CPU to inform the other that this is the last buffer of a frame.
• First in Frame: Set by the CPM to inform the CPU that this is the start of a new frame.
• Transmit CRC: Transmit the CRC after this buffer
Transmit Frames
BD | R | W | I | L | TC | Notes
1 | 0 | 0 | 0 | 0 | 1 | Parameter RAM points to this BD
2 | 0 | 0 | 0 | 0 | 1 |
3 | 0 | 0 | 0 | 0 | 1 |
4 | 0 | 0 | 0 | 0 | 1 |
5 | 0 | 0 | 1 | 1 | 1 | Last buffer of the first frame
6 | 0 | 0 | 0 | 0 | 1 |
7 | 0 | 0 | 0 | 0 | 1 |
8 | 0 | 0 | 1 | 1 | 1 | Last buffer of the next frame
9 | 0 | 1 | 1 | 1 | 1 | This BD is for a single buffer frame
• BD’s 1-5: after all buffers are filled, “R” is set to “1” in all BD’s in this list
• BD’s 6-8: these BD’s are for the next frame for this channel
Receive Frames
BD | E | W | I | L | F | Notes
1 | 1 | 0 | 0 | 0 | 1 | Parameter RAM points to this BD
2 | 1 | 0 | 0 | 0 | 0 |
3 | 1 | 0 | 0 | 0 | 0 |
4 | 1 | 0 | 0 | 0 | 0 |
5 | 1 | 0 | 0 | 1 | 0 | Last buffer of the first frame
6 | 1 | 0 | 0 | 0 | 1 |
7 | 1 | 0 | 0 | 0 | 0 |
8 | 1 | 0 | 0 | 1 | 0 | Last buffer of the next frame
9 | 1 | 1 | 0 | 1 | 1 | This BD is for a single buffer frame
• BD’s 1-5: after all buffers are filled, “E” is set to “1” in all BD’s in this list
• BD’s 6-8: these BD’s are for the next frame for this channel
The [E/R] bits
Direction | Initial value | Operation | Changed by | Changed to | Operation | Changed by | Changed to
Transmit [Ready] | 0 | Fill with data by CPU | CPU | 1 | CPM transmits buffer | CPM | 0
Receive [Empty] | 1 | Fill with data by CPM | CPM | 0 | CPU reads buffer | CPU | 1
Polarity can be confusing because the sense is reversed for
complementary operations. However, the same level always
indicates who [CPU vs. CPM] owns the buffer. This bit is the
same for all protocols on all channels.
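The ownership rule reduces to a single bit test. The sketch below is a Python model, not driver code; it assumes the [E/R] bit is the first (most significant) bit of the 16 bit status and control word, as drawn on the buffer descriptor slides, and the helper names are illustrative.

```python
ER_MASK = 0x8000  # bit 0 (MSB) of the 16-bit status and control word


def bd_owner(status):
    """Return which side owns the buffer: 1 means CPM, 0 means CPU."""
    return "CPM" if status & ER_MASK else "CPU"


def cpu_may_touch(status):
    """The CPU may modify a BD or its buffer only while it owns it."""
    return bd_owner(status) == "CPU"
```

The same test applies to transmit [Ready] and receive [Empty] descriptors: only the meaning of the hand-off differs, not the level that marks the owner.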
The [W] bits
• The Wrap bit is always set to indicate the last buffer
descriptor for the channel
• It does not delineate frames!
• The value of the first buffer descriptor is stored in the
channel’s parameter RAM
– The list of BD’s is bounded by the parameter RAM and the [W] bit
• Any BD past a BD with the [W] bit set, that’s not pointed
to by parameter RAM is inaccessible by the CPM
• This bit is the same for all protocols on all channels.
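A BD list bounded by the parameter RAM pointer and the [W] bit behaves like a ring. The Python sketch below models that traversal over a list of status words; it assumes [W] is the third bit of the status and control word (mask 0x2000), as drawn on the buffer descriptor slides, and `ring_length` and `next_index` are hypothetical helpers.

```python
W_MASK = 0x2000  # bit 2 of the 16-bit status and control word


def ring_length(status_words, first=0):
    """Count the BD's from the first one up to and including the one with W set."""
    for count, status in enumerate(status_words[first:], start=1):
        if status & W_MASK:
            return count
    raise ValueError("no BD in the list has the W bit set")


def next_index(index, status_words, first=0):
    """Advance to the next BD, wrapping to the first BD after the W bit."""
    return first if status_words[index] & W_MASK else index + 1
```

With status words `[0, 0, 0x2000, 0]` the ring has three entries, and the fourth BD is never visited.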
The [I] Bits
• The Interrupt bits generate an interrupt to the CPU when the CPM
hands the BD to the CPU
– Whenever the CPM flips the [E/R] bit to “0”
• A redundant phrase, the CPM can only flip that bit to “0”, right?
• For transmit, it’s common to only receive an interrupt at the end of
transmission of the last buffer
• For receive, the last buffer is not known, so it’s more common to
receive an interrupt for most buffers on non-frame oriented protocols
– If a buffer is small enough that it can’t contain an entire frame, then this
bit might be cleared
• The CPU has to stay ahead of the CPM to know when a wrap occurred
– On Ethernet, the end of frame interrupt is more efficient
• This bit is the same for all protocols on all channels.
The [L] Bits
• The Last bits indicate the end of a frame within the list of
buffer descriptors
• Set and cleared by the CPU on transmit frames
– The CPM only reads this bit for transmit
• Set by the CPM on receive frames
– Should be cleared by the CPU before the [E] is used to hand the
buffer to the CPM
• This bit is not the same for all protocols on all channels.
The [F] Bits
• The First bit is only present in receive frames
• Set by the CPM to tell the CPU that this buffer starts a
frame
– An underrun, late collision, or aborted frame can cause a new
frame in the next buffer without the [L] bit being set in the
previous BD
• Not needed for transmit
– The CPU will control the state of the CPM with the [L] bit
– An [L] bit set or an underrun will cause the next buffer to be
considered the first buffer of a frame
• This bit is not the same for all protocols on all channels.
The [TC] Bits
• The Transmit CRC bits work in conjunction with the [L]
bit
• The [TC] bit is ignored if the [L] bit is cleared
• Initializing all [TC] bits to “1” is a good precaution
• Only custom protocols that don’t use hardware generated
CRC’s should have this bit cleared
• This bit is not the same for all protocols on all channels.
Subtle points on BD’s
• Frames can span buffers
• Buffers never span frames
– Unless you have all hardware support turned off and are running
transparent
• Be careful with small receive buffers that have the [I] bit set
– You’ll get hammered with interrupts
• Turn buffers over to the CPM from last to first
– If an interrupt interferes with the handoff, an underrun / overflow can
occur
• Keep your hands off a BD that has the [E/R] bit set
– Unless you like working weekends
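Turning buffers over from last to first can be sketched as follows. This is a Python model of the ordering only, using a list of transmit status words; the [R] mask of 0x8000 follows the buffer descriptor slides, and `hand_off_frame` is a hypothetical helper.

```python
R_MASK = 0x8000  # [E/R] bit, the MSB of the status and control word


def hand_off_frame(status_words, first, last):
    """Give BD's first..last to the CPM, setting the ready bit from last to first.

    The CPM starts at the first BD of the frame, so releasing that BD last
    means every later BD is already ready when the CPM reaches it, even if
    an interrupt delays the loop part way through.
    """
    order = []
    for index in range(last, first - 1, -1):
        status_words[index] |= R_MASK
        order.append(index)
    return order
```

For a three buffer frame the BD's are released in the order 2, 1, 0.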
FEC (860T)
• The FEC supports:
– 10/100-Mbps Ethernet through an MII
– Designed completely in discrete logic, independent of the CPM
– Operation is similar to SCCs
• Large internal FIFO’s (224 bytes in each direction) reduce the impact
of bus latency and keep discarded frames from appearing on the
processor bus.
FEC Buffer Descriptors
• Identical in format to the SCC’s buffer descriptors
• Except:
– Buffer descriptors, as well as buffers are in main memory
– Pointers to buffer descriptors in the parameter RAM are 32 bits
• Buffer descriptors must still be in consecutive memory locations
• To prevent the FEC from hammering the processor bus, it doesn’t
continually poll the BD’s for the ready bit
– The software must write to a control register to initiate a poll sequence
– The BD’s will continue to be polled in sequence until one is found to be
not ready
SMC’s
• The SMC’s perform basic UART as well as transparent
mode transmission
• Buffer descriptor operation is identical to the SCC’s
– The status and control word has different bit fields pertaining to the
protocols
– Bit fields controlling protocol independent operation are
unchanged
Status and Control Definitions
[SMC in UART mode]
Receive Control & Status: E, -, W, I, -, -, CM, ID, -, BR, FR, PR, -, OV, -
Transmit Control & Status: R, -, W, I, -, -, CM, P, with the remaining bits unused (-)
• Empty / Ready:
0 = This buffer is owned by the CPU
1 = This buffer is owned by the CPM
• Wrap: This is the last BD in this set of BD’s.
• Interrupt: Generate an interrupt after this buffer is used by the CPM.
• Continuous mode: [E] bit isn’t cleared on buffer reception
• Idle: Close buffer on reception of idles
MII
[Figure: MII signals between the controller (drawn as a PQ II MPC8260 FCCn) and a Fast Ethernet PHY]
• Transmit Error (Tx_ER)
• Transmit Nibble Data (TxD[3:0])
• Transmit Enable (Tx_EN)
• Transmit Clock (Tx_clk)
• Collision Detect (COL)
• Receive Nibble Data (RxD[3:0])
• Receive Error (Rx_ER)
• Receive Clock (Rx_clk)
• Receive Data Valid (Rx_DV)
• Carrier Sense output (CRS)
• Management Data Clock (MDC)
• Management Data I/O (MDIO)
Debug Considerations
• What is BDM
• Getting out of reset
• The cache is on
• CPM Realities
• Exception Routines
• Tracing at the Bus Cycle Level
What is BDM?
• BDM is a serial command and control port into the
CPU
– Commands are sent into the device through the
BDM port instead of the instruction fetch path
• BDM accesses to memory are in REAL mode
– They don’t go through the MMU or the cache
BDM connection
• BDM connector allows for full run control of the processor
Pin assignments with VFLS0 / VFLS1:
VFLS0   1   2  SRESET
GND     3   4  DSCK
GND     5   6  VFLS1
HRESET  7   8  DSDI
VDD     9  10  DSDO
Pin assignments with FRZ:
FRZ     1   2  SRESET
GND     3   4  DSCK
GND     5   6  FRZ
HRESET  7   8  DSDI
VDD     9  10  DSDO
Getting out of Reset
• Reset Configuration word is of vital importance
– What is your Interrupt Prefix?
– Watch for the clock multiplier
• If the multiplier is too large, power must be
removed from the device to recover
• Reset will not recover from too high of a
clock
The Core and Bus
• Four hardware code breakpoints available; two hardware data
breakpoints
• Predictive fetching and speculative loading mean that what you
see on the bus may not be executed or used.
• VFLS pins used for back trace don’t work in split bus mode.
The Caches are On
• Bus Cycles now appear as bursts
• Fetches are determined by the BIU, not related to instruction
execution
• Cache visibility pins don’t work in split bus mode
• Instrumentation required for accurate debug above a 50MHz
core
• Caution must be exercised when the boot process performs a
code relocation
– Contents are cached as data during the move
– Contents are fetched as instructions after the move
– The instruction queue doesn’t snoop the data cache
– The load/store unit doesn’t snoop the instruction cache
– There is no cache coherency
CPM Realities
• The CPM operates independently of the CPU
• The CPM is not debugged yet. Expect the unexpected
• “Last Buffer Interrupt” occurs at the beginning of transmission
• The SPI can overwhelm the CPM and crash the other channels
Exception Routines
• Exception Routines are difficult to debug
– The Recoverability of exceptions is an issue
– Breakpoints and single stepping can’t be used
until the RI bit is set
Tracing at the Bus Cycle Level
• The 860 comes in a BGA package
– Connecting to an emulator
– Connecting to an analyzer
Connecting to an Emulator (1)
[Figure: the original 860 sits on a buffer board with the connection to the emulator. A target adaptor (BGA site to pin socket, pin header) connects the buffer board to the target board]
Connecting to an Emulator (2)
[Figure: the original 860 sits on a buffer board with the connection to the emulator. Customer provided connectors attach the buffer board to the target board]
Connecting to an Analyzer
[Figure: Mictor connectors on the target board give the analyzer access to the 860]
Summary of Undocumented Issues
• Init MMU before turning on caches
• The CPM doesn’t use the MMU’s or the caches
• Don’t single step through moves to or from SPR’s
• ISR’s cannot have breakpoints in the first or last few instructions
• Each processor must have its own BDM connector