[Rev 1.4]

Goals
• Provide an overview of the 860 device
• Allow a quick start of an 860 design cycle
• Gain familiarity with debug issues particular to the 860
• Create the basis to build further experience

Outline
• 860 Architecture
  – Device overview
  – Core CPU
  – SIU
  – CPM
• Debug considerations

[Block diagram: 860 device overview]
• PowerPC Core: 4 KB I-Cache, IMMU, 4 KB D-Cache, DMMU
• System Interface Unit: Memory Controller, Bus Interface Unit, System Functions, Real Time Clock, PCMCIA
• Comm. Processor Module: 32-bit RISC and Program ROM, Internal Memory Space, MAC, Four Timers, Interrupt Controller, Parallel I/O, Baud Rate Generators, Timers, Serial DMAs, Virtual IDMAs, FEC (860T)
• Serial Interface: SCC1, SCC2, SCC3, SCC4, SMC1, SMC2, SPI, I2C, Time Slot Assigner

CPU
• Embedded version of the PowerPC core
• One instruction fetched per clock
• One instruction issued and retired per clock
• Up to three instructions in execution per clock
• Most instructions execute in one clock
• Branches can execute in zero clocks

Programming Model
[Register diagram: all registers are 32 bits wide]
• GPR0 to GPR31
• CR, XER, FPSCR, MSR, PVR
• CTR, LR
• TBU, TBL, DEC
• SRR0, SRR1
• SPRn to SPRx

MSR
Bit 0 is the MSB; bit 31 is the LSB. Bits 0-12, 14, 24, 28, and 29 are shown as 0.
Bit    | Name | Meaning
13     | POW  | Power management enabled
15     | ILE  | Interrupt little endian mode
16     | EE   | External interrupt enable
17     | PR   | Privilege level
18     | FP   | Floating point available
19     | ME   | Machine check enable
20, 23 | 0    | Floating point exception mode [0,1]
21     | SE   | Single step trace enabled
22     | BE   | Branch trace enabled
25     | IP   | Exception [interrupt] prefix
26     | IR   | Instruction address translation enabled
27     | DR   | Data address translation enabled
30     | RI   | Recoverable exception
31     | LE   | Little endian mode

CPU Overview
[Block diagram: Sequential Fetcher, Instruction Queue, and Dispatch within the Instruction Unit; Branch Processing (CTR, CR, LR); Integer Unit (/, +, *, XER); GPR File (R0-R31); GP Rename Regs; Completion Unit; Load/Store Unit; Inst. MMU and Inst. Cache; Data MMU and Data Cache; Main Memory]

Execution Units
• Execution units operate in parallel
  – Fetch / Branch
  – Integer
  – Load / Store
  – Completion

Fetch / Dispatch
• Instructions are fetched individually
• Non-branch instructions enter the instruction queue
• Branch instructions are redirected to the branch unit
• One instruction can be sent to the execution units and one to the branch unit, for a total of two issued instructions per clock
• All instructions “appear” to execute sequentially

[Diagram: instruction queue between the Instruction Cache, the Execution Unit, and Branch Processing (CTR, CR, LR)]
On each CPU clock:
• A 32 bit wide transfer is made from the instruction cache
• Instructions fall through to the first open location in the queue
• The branch instruction closest to the bottom of the queue is issued to the branch unit
• The bottom non-branch instruction is dispatched to an available execution unit

Branch
• Branches are pre-executed, giving an effective execution time of zero clocks
• Instruction queue provides look ahead to determine data dependencies
• Unresolved conditional branches are statically predicted under control of the compiler

Subroutine Control Flow
[Diagram: call and return flow using the LR and a software maintained stack addressed by GPR1]
• Branch to sub: the address of this instruction is placed into the Link Register by the branch function
• Instructions save the LR to the stack to allow nested function calls
• Branch to sub: the LR is reused for another call
• The LR is recalled from the stack to allow a return from subroutine
• Branch to LR: branching to the contents of the LR is a return instruction

Integer
• Integer unit directly accesses the GPR file
• Rename registers prevent stalls and allow instructions to be un-executed
• Most instructions execute in one clock

Load/Store
• Responsible for all transfers between the GPR file and main memory
• Speculative loads are placed in the rename registers
• Speculative stores remain in the store queue

Completion
• Holds instructions executed in parallel or out of order until they can be retired in order
• Retiring an instruction commits its results to the processor state
• Simply discarding an instruction from the completion queue effectively un-executes it
• One instruction can be retired per clock

Instruction Set
• 68K instructions were based on an accumulator, direct memory model
    add (0x00035300).L, D4
[Diagram: the operand at memory address 0x00035300 is added directly into D4; registers D0 to D7 shown]

Instruction Set
• PowerPC instructions are based on a triadic, load/store model
    lwz r2,0x00035300
    add r6,r2,r4
[Diagram: the word at 0x00035300 is loaded into GPR2, then GPR2 + GPR4 is written to GPR6; registers GPR0 to GPR31 shown]

Exceptions
• All exceptions cause processing to vector to a predetermined memory location
• The base address of the vector table is controlled by the [IP] bit in the MSR
• Each vector is placed at a 256-byte (0x100) boundary
• 64 instructions can be placed at a vector before hitting the next vector
• Reset = 0xnnn00100
• Machine Check = 0xnnn00200
• External Interrupt = 0xnnn00500
• Decrementer = 0xnnn00900
• Etc.
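
The vector arithmetic above can be sketched in a few lines. This is an illustrative model only, written in Python rather than target code; the names VECTOR_OFFSETS, vector_base, vector_address, and instructions_per_vector are hypothetical, and the two base values (0x00000000 for MSR[IP] = 0, 0xFFF00000 for MSR[IP] = 1) are taken from the vector placement diagram in this deck.

```python
# Illustrative model of exception vector addressing (not target code).
# Offsets are the ones listed on the slide; all names here are hypothetical.
VECTOR_OFFSETS = {
    "reset": 0x0100,
    "machine_check": 0x0200,
    "external_interrupt": 0x0500,
    "decrementer": 0x0900,
}

INSTRUCTION_BYTES = 4  # every PowerPC instruction is 32 bits wide


def vector_base(msr_ip: int) -> int:
    """Base of the vector table selected by MSR[IP] (the 'nnn' prefix)."""
    return 0xFFF00000 if msr_ip else 0x00000000


def vector_address(name: str, msr_ip: int) -> int:
    """Address the core vectors to for the named exception."""
    return vector_base(msr_ip) + VECTOR_OFFSETS[name]


def instructions_per_vector(spacing_bytes: int = 0x100) -> int:
    """How many instructions fit before the next vector is reached."""
    return spacing_bytes // INSTRUCTION_BYTES


if __name__ == "__main__":
    for ip in (0, 1):
        for name in VECTOR_OFFSETS:
            print(f"MSR[IP]={ip} {name}: 0x{vector_address(name, ip):08X}")
```

The 0x100 spacing between adjacent vectors is what limits a handler to 64 instructions before it must branch elsewhere.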

Exceptions
[Diagram: vector table placement]
• MSR[IP] = 1: vector table in Flash, reset vector at FFF00100
• MSR[IP] = 0: vector table in RAM, reset vector at 00000100
• Vectors shown, 64 instructions each: Reset 100, Machine Check 200, DSI 300, ISI 400, External 500

Exceptions
• Only the Decrementer and the External Interrupt can be masked by the [EE] bit in the MSR
• Machine Check exceptions can vector to a routine or force Checkstop state
• All other exceptions are synchronous (caused by instruction execution) and are unmaskable

Nesting Exceptions
• When an exception occurs, return state is stored in the processor
  – There is no automated stacking of critical registers
  – The address of the return instruction is stored in SRR0
  – The MSR prior to the exception is in SRR1
  – The [EE] bit of the MSR is cleared
• The processor must save these registers and any other GPR’s to a software maintained stack
• The EABI specifies GPR1 to be the stack pointer
• The [RI] bit in the MSR is set by software when enough information is saved to allow recovery from a nested exception

Exception Control Flow
[Diagram: exception entry and exit flow using SRR0, SRR1, and a software maintained stack addressed by GPR1]
• The address of the interrupted instruction is placed into SRR0 by the hardware
• An exception after the completion of this instruction causes flow to be directed to the ISR
• Instructions save the SRR’s to the stack to allow nested exceptions
• The MSR[RI] bit is cleared by the exception hardware and set by software after the SRR’s have been saved
• An exception while MSR[RI] is cleared causes a machine check event
• It is safe for exceptions to occur in the section of code where MSR[RI] is set
• The MSR[RI] bit is cleared by the software just before the SRR’s are restored by the software
• The SRR’s are recalled from the stack to allow a return from the exception (rfi)
• Breakpoints Are Exceptions!

Cache
• Independent instruction and data caches implement an internal Harvard Architecture
• Each cache is 4 Kbyte, two way set associative
  – The 860P has an 8K, four way set associative instruction cache
• Caching of separate memory areas is controlled by the MMU

Cache Organization
[Diagram: the 32-bit address (bit 0 to bit 31) is split into an address tag (20 bits, stored in the cache), a set select (7 bits, 128 sets), and word and byte selects. Each set holds way 0 and way 1; each way holds an address tag, state, and words 0-7. Blocks are numbered 0 to 255.]

Cache Operation
• Each cache block (or line) can be in one of three states (MEI protocol)
  – M = modified (or dirty)
    • Resides in cache and is different than memory
  – E = exclusive (resident and clean)
    • Resides in cache and is identical to memory
  – I = invalid (not resident)
• The “shared” state of the full MESI protocol is not supported
  – Would allow synchronization of multiply cached blocks
• There is no cache snooping to monitor external masters

Cache control
• Hardware implementation dependent registers (HIDn) control cache function
  – Enabling
  – Invalidate
  – Locking
• Supervisor instructions provide block level control
  – Allocate, flush, invalidate, store, touch, zero
• Ability to store a given block of memory into the cache is controlled by the MMU
  – Each block or page in the MMU has WIG bits
    • (Write-through, Inhibited, Guarded)

MMU
• The MMU provides for both memory translation and access control
• The system boots in Real (un-translated) mode
• To effectively use the caches, the MMU must be used in page mode
  – Effectively, a null translation is performed

Protection
• The primary use of the MMU in embedded applications is for cache control and access protection
• The WIG bits are set for each page
  – W = write-through (applicable only to data cache)
  – I = inhibited
  – G = guarded (indicates that memory is ill-behaved)
    • I/O spaces
    • No speculative reads or pre-fetches

Translation
• Page translation provides a virtual memory space of 2^52 bytes
• System must be debugged with RTOS tools
  – Emulators and hardware debuggers don’t support it

Real mode
[Diagram: the 32-bit logical address is used directly as the 32-bit physical address]
• WIG in real mode:
  – W = 0: write-back
  – I = 0: cache enable
  – G = 1: memory is guarded

Page mode
[Diagram: the 32-bit logical address is split into 10-, 10-, and 12-bit fields. The level one descriptor carries W and G; the level two descriptor carries I and a 20-bit page address. The physical address is that 20-bit page address followed by the 12-bit offset.]

Reset operation
Reset source | Reset PLL | System configuration sampled | Clock module reset | HRESET driven | Other internal logic reset | SRESET driven | Core reset
Power-on reset | yes | yes | yes | yes | yes | yes | yes
External hard reset, Debug hard reset, Loss of lock, Software watchdog, Bus monitor, Checkstop | - | yes | yes | yes | yes | yes | yes
External soft reset, Debug soft reset | - | - | - | - | yes | yes | yes

Reset Types
• Power-on reset is used to align all logic from a chaotic state after Vcc stabilizes
  – The PLL then begins to lock
• Hard reset is analogous to the normal reset on other processors
  – The PLL is not affected
• Soft reset can be used to initiate a warm start
  – Not commonly used
  – Not driven or monitored by the emulator
  – Basically, a non-returnable exception to the reset vector

Reset Sequence
• POR asserted: HRESET & SRESET asserted, PLL locks, RSTCONF sampled, internal logic reset, HRESET & SRESET negated
• HRESET asserted: HRESET & SRESET asserted, RSTCONF sampled, internal logic reset, HRESET & SRESET negated
• SRESET asserted: SRESET asserted, internal logic reset, SRESET negated

Memory Map
Startup Boot Map (CS0): At boot, CS0 is active for the entire address space. All other chip selects are invalid.
[Diagram: before software execution, Flash fills the whole map around the IMMR. In the application target map, Flash, the IMMR, I/O (CSi), and RAM (CSx,y,z) each occupy their own region.]

Configuration Word
• Configuration word is latched from the upper 16 bits of the data bus during the reset cycle
• Field order, MSB first: EARB, IIP, 00, BPS, 0, ISB, DBGC, DBPC, EBDF, 0; the lower half is 0000_0000_0000_0000
• EARB – External arbitration
• IIP – Initial core prefix
• BPS – Boot port size
• DBGC – Debug pin configuration
• DBPC – Debug port pins configuration
• EBDF – External bus division factor

Memory Map Implications
• Since the Flash memory accessed by CS0 occupies the entire address space, boot code can be linked to execute in a number of different locations
• Any branches will change the NIA from the boot location to the linked location
• All other chip selects are off
• IMMR RAM is still available
• CS0 must be reduced in scope before activating other chip selects
• Be careful not to pull the rug out from under the boot code when reducing CS0
• BSP re-entry issues:
  – Altering chip select option registers while assuming the value in the Valid bit
  – Can the chip selects to the RAM and Flash be altered while running out of either?

Memory Map Init Issues
• Three different factors can enhance (confuse) the boot process:
  – The MSR[IP]
    • The reset vector can be 0x0000_0100 or 0xfff0_0100
    • Determined by the Reset Configuration Word
    • Not changed by an SRESET
  – CS0 scope
    • CS0 responds to the entire memory map
    • It must be changed while it is being used
    • It may have already been reduced by a previous pass through the BSP
  – Code link results
    • Execution can start in code that is linked to a different address than the boot vector
    • Only the address lines within the memory device are significant
    • PC Relative addressing will solve this, right? WRONG!
    • The first branch will set the NIA MSB’s to the current execution value

RTOS Boot Sequences
[Diagram: Flash holds the boot code, the BSP, and a compressed application image; RAM holds the uncompressed application image plus data, stack, heap, etc.; the IMMR and I/O are mapped separately; an external application image sits outside the board]
• Boot code decompresses and relocates the application from flash, or
• Boot code loads the application over a communication channel or backplane
• Chip Select x: Base Register (Base Address, V) and Option Register (Mask, Options)

Endian Bus Connections
[Diagram: byte-lane connections of 8 bit devices (bits 7-0) on 68K, X86, and PPC buses. The 68K and X86 buses number the MS byte lane 31-24 and the LS byte lane 7-0; the PPC bus numbers the MS byte lane 0-7 and the LS byte lane 24-31.]

Big Endian Bus
860 data bus lane     | 8 Bit device | 16 Bit device | 32 Bit device
0-7 (MS Byte Lane)    | 7-0          | 15-8          | 31-24
8-15                  |              | 7-0           | 23-16
16-23                 |              |               | 15-8
24-31 (LS Byte Lane)  |              |               | 7-0

SIU
• The SIU contains the logic to interface the external system components to the 860
• Contains all of the glue logic needed for a typical embedded application

SIU Overview
[Block diagram: System Interface Unit with Memory Controller, Bus Interface Unit, System Functions, Real Time Clock, PCMCIA]

860 bus cycle
[Timing diagram: Address and Data]

Memory Control
• 8 banks of memory
  – Each can be configured for any type of device
• Glueless support of SRAM, EPROM, Flash
  – Using general purpose chip select machine
• Two user programmable machines

System control
• Clock synthesis
• Reset control
• Interrupt control
• Real time clock
• Periodic interrupt timer
• Bus monitor
• Bus arbiter
• Watchdog timer

Interrupt Control
[Diagram: interrupt sources feeding the CPU INT input]
• SIU Interrupt Controller inputs: IRQ[0-7] (Edge / Level), Timebase, PIT, Realtime clock, PCMCIA, and the CPM Interrupt Controller
• CPM Interrupt Controller inputs: SCC [1:4], SMC [1:4], SPI, I2C, PIP, IDMA [1:2], SDMA, RISC Timers, CPM Timer[1:4], Port C [4:15]
• Software Watchdog Timer: drives IRQ0 or Reset

SIU Interrupt Vectors
• All external interrupts cause processing at 0xnnn00500
  – There is space for 64 instructions to save processor state and resolve the SIU vector
• Vectors are six bits
  – Indirect addressing is used to decommutate to service routines
  – A 16 bit load from the long word address of the SIVEC register will point to a 64 entry array of 1K byte (256 instruction) service routines
  – An 8 bit load will allow a 64 entry jump table of branch instructions
    • A shifting operation can alter the size between these two choices

SIU Interrupt Vector Register
• Bits 0-5: six bit interrupt code
• Bits 6-31: read as 0
• 8 bit, 16 bit, or 32 bit read from address 0xnnnn001C

SIU Interrupt Vectors: 8 bit Read
• The value read is the six bit interrupt code followed by two 0 bits
• Each vector value points to a different branch instruction in a table of branch instructions to ISRs
    _00  ba routine_a
    _04  ba routine_b
    _08  ba routine_c
    _0c  ba routine_d
    _10  ba routine_e
    _14  ba routine_f
    _18  ba routine_g

SIU Interrupt Vectors: 16 bit Read
• The value read is the six bit interrupt code followed by ten 0 bits
• Each vector value points to a block of 256 32-bit instructions
    nnnn0000 to nnnn03ff
    nnnn0400 to nnnn07ff
    nnnn0800 to nnnn0bff
    nnnn0c00 to nnnn0fff

CPM Interrupt Vector Register
• Bits 0-4: 5 bit interrupt code
• IACK bit; all remaining bits read as 0
• 8 bit, 16 bit, or 32 bit read from address 0xnnnn0930

CPM
• Communications processor module
• Direct hardware support for all protocol and application interfaces
  – Ethernet, HDLC, Async HDLC, T1/E1, T3/E3, UART, ISDN, Infrared
  – Parallel I/O
  – Full serial and virtual DMA support

IMMR Format
• All on-chip peripherals are accessed through a single 64K byte area of memory
• The first 8K of address space contains the control registers of the on-chip peripherals and the SI RAM
• Within the second 8K of address space, there are three blocks of dual ported RAM

IMMR Area
May reside on any 64K boundary [16K]
Start address | Contents
0xnnnn_0000 | CPM/SIU Control Registers [3K]
0xnnnn_0C00 | SI RAM [512]
0xnnnn_0E00 | FEC Control Registers [512]
0xnnnn_1000 | Reserved [4K]
0xnnnn_2000 | Dual ported RAM [4K]
0xnnnn_3000 | Reserved for Dual ported RAM expansion [3K]
0xnnnn_3C00 | Parameter RAM [1K]
0xnnnn_4000 | End of area

[Diagram: dual ported RAM map from 0 KB to 8 KB, marked in 1 KB steps: Buffer Descriptors / uCode / Data, then Buffer Descriptors / Data, with Parameter RAM in the last 1 KB]

Dual Ported RAM usage
• The layout of the Dual Ported RAM is determined by the uCode in the CPM
• When the CPM is not in operation, it is nothing more than internal memory
  – During the boot sequence, stack, global data, and heap can reside in this memory
  – Initialization code can be written in C++!
  – A multi-layered boot process can be used
    • First code resides in flash, uses internal RAM to set up chip selects
    • Second code resides in another section of flash and uses external RAM to load main application over a CPM channel
    • Third level is the main application
  – Each level has its own crt0.s function and initializes the EABI from scratch

CPM Overview
[Block diagram: Comm. Processor Module with 32-bit RISC and Program ROM, Internal Memory Space, MAC, Four Timers, Interrupt Controller, Parallel I/O, Baud Rate Generators, Timers, Serial DMAs, Virtual IDMAs, FEC (860T); Serial Interface with SCC1, SCC2, SCC3, SCC4, SMC1, SMC2, SPI, I2C, Time Slot Assigner]

DMA’s
• Serial DMA’s
  – Full bi-directional support of all serial channels
• Virtual DMA
  – 4 channels
  – Uses the serial DMA hardware to generate transfers
  – Memory to memory or memory to/from I/O

CPM Buffer Structure
[Diagram: buffer descriptors BD1, BD2, BD3, up to BD128 in the IMMR, each pointing to a buffer in RAM]

Buffer Descriptor Format
Each entry is 16 bits wide:
• Status and Control
• Data Length
• High Order Pointer
• Low Order Pointer

From Channel to Buffer
[Diagram: Communication Channel hardware, then Parameter RAM, then Dual ported RAM (Buffer Descriptors), then Data Buffers]
• Parameter RAM
  – Location fixed by: hardware channel
  – Format fixed by: protocol
• Buffer Descriptors
  – Location determined by: Parameter RAM value
  – Format of control and status determined by protocol
• Data Buffers
  – Location determined by: value in Buffer Descriptor, and memory controller mapping of Local/603 bus
  – Format determined by: protocol

SCC’s
• The SCC’s implement the following protocols:
  – SDLC/HDLC
  – AppleTalk
  – UART
  – 10-Mbps Ethernet

Ethernet Frame
Field | Size
Preamble | 7 bytes
Start Frame | 1 byte
Destination Address | 6 bytes
Source Address | 6 bytes
Type / Length | 2 bytes
Data | 46 - 1500 bytes
Frame Check | 4 bytes
• Stored by CPM in Receive buffer
• Stored by CPU in Transmit buffer

Ethernet Buffer Descriptor
• Receive Control & Status: E - W I L F - M - LG NO SH CR OV CL
• Transmit Control & Status: R PAD W I L TC DEF HB RC RL RC UN CSL
• Common for Transmit and Receive: Data Length, High Order Pointer, Low Order Pointer

Status and Control Definitions
Receive Control & Status: E - W I L F - M - LG NO SH CR OV CL
Transmit Control & Status: R PAD W I L TC DEF HB RC RL RC UN CSL
• First in Frame: Set by the CPM to inform the CPU that this is the start of a new frame.
• Last in Frame: Set by the CPM or the CPU to inform the other that this is the last buffer of a frame.
• Interrupt: Generate an interrupt after this buffer is used by the CPM.
• Wrap: This is the last BD in this set of BD’s.
• Empty / Ready:
  – 0 = This buffer is owned by the CPU
  – 1 = This buffer is owned by the CPM
• Transmit CRC: Transmit the CRC after this buffer

Transmit Frames
BD 1: R=0 W=0 I=0 L=0 TC=1   (Parameter RAM points to this BD)
BD 2: R=0 W=0 I=0 L=0 TC=1
BD 3: R=0 W=0 I=0 L=0 TC=1
BD 4: R=0 W=0 I=0 L=0 TC=1
BD 5: R=0 W=0 I=1 L=1 TC=1
BD 6: R=0 W=0 I=0 L=0 TC=1
BD 7: R=0 W=0 I=0 L=0 TC=1
BD 8: R=0 W=0 I=1 L=1 TC=1
BD 9: R=0 W=1 I=1 L=1 TC=1
• BD 1 to BD 5: after all buffers are filled, “R” is set to “1” in all BD’s in this list
• BD 6 to BD 8: these BD’s are for the next frame for this channel
• BD 9: this BD is for a single buffer frame

Receive Frames
BD 1: E=1 W=0 I=0 L=0 F=1   (Parameter RAM points to this BD)
BD 2: E=1 W=0 I=0 L=0 F=0
BD 3: E=1 W=0 I=0 L=0 F=0
BD 4: E=1 W=0 I=0 L=0 F=0
BD 5: E=1 W=0 I=0 L=1 F=0
BD 6: E=1 W=0 I=0 L=0 F=1
BD 7: E=1 W=0 I=0 L=0 F=0
BD 8: E=1 W=0 I=0 L=1 F=0
BD 9: E=1 W=1 I=0 L=1 F=1
• BD 1 to BD 5: after all buffers are filled, “E” is set to “1” in all BD’s in this list
• BD 6 to BD 8: these BD’s are for the next frame for this channel
• BD 9: this BD is for a single buffer frame

The [E/R] bits
                 | Initial Value | Operation              | Changed by | Changed to | Operation            | Changed by | Changed to
Transmit [Ready] | 0             | Fill with data by CPU  | CPU        | 1          | CPM transmits buffer | CPM        | 0
Receive [Empty]  | 1             | Fill with data by CPM  | CPM        | 0          | CPU reads buffer     | CPU        | 1
• Polarity can be confusing because the sense is reversed for complementary operations. However, the same level always indicates who [CPU vs. CPM] owns the buffer.
• This bit is the same for all protocols on all channels.

The [W] bits
• The Wrap bit is always set to indicate the last buffer descriptor for the channel
• It does not delineate frames!
• The value of the first buffer descriptor is stored in the channel’s parameter RAM
  – The list of BD’s is bounded by the parameter RAM and the [W] bit
    • Any BD past a BD with the [W] bit set that is not pointed to by parameter RAM is inaccessible by the CPM
• This bit is the same for all protocols on all channels.

The [I] Bits
• The Interrupt bits generate an interrupt to the CPU when the CPM hands the BD to the CPU
  – Whenever the CPM flips the [E/R] bit to “0”
    • A redundant phrase, the CPM can only flip that bit to “0”, right?
• For transmit, it’s common to only receive an interrupt at the end of transmission of the last buffer
• For receive, the last buffer is not known, so it’s more common to receive an interrupt for most buffers on non-frame oriented protocols
  – If a buffer is small enough that it can’t contain an entire frame, then this bit might be cleared
    • The CPU has to stay ahead of the CPM to know when a wrap occurred
  – On Ethernet, the end of frame interrupt is more efficient
• This bit is the same for all protocols on all channels.

The [L] Bits
• The Last bits indicate the end of a frame within the list of buffer descriptors
• Set and cleared by the CPU on transmit frames
  – The CPM only reads this bit for transmit
• Set by the CPM on receive frames
  – Should be cleared by the CPU before the [E] is used to hand the buffer to the CPM
• This bit is not the same for all protocols on all channels.
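
How the [E/R], [W], and [L] bits bound a receive list can be sketched with a small model. This is a Python illustration under stated assumptions, not driver code: the mask values assume the 16 bit status word is numbered with bit 0 as the MSB and that the fields sit in the order shown on the Ethernet BD slide, and the names BufferDescriptor, next_index, and received_frames are hypothetical.

```python
from dataclasses import dataclass

# Assumed masks: bit 0 is the MSB of the 16 bit status and control word,
# fields in the order E/R, -, W, I, L as shown on the Ethernet BD slide.
BD_E_R = 0x8000  # Empty (receive) / Ready (transmit): 1 = owned by the CPM
BD_W = 0x2000    # Wrap: last BD in this channel's list
BD_I = 0x1000    # Interrupt when the CPM hands the BD back
BD_L = 0x0800    # Last buffer of a frame


@dataclass
class BufferDescriptor:
    status: int   # status and control word
    length: int   # data length
    pointer: int  # buffer address (high and low order halves combined)


def next_index(bds, i):
    """Step to the next BD, wrapping to the start of the list at [W]."""
    return 0 if bds[i].status & BD_W else i + 1


def received_frames(bds, start=0):
    """Group CPU-owned receive BDs into frames, each ending at an [L] bit.

    Scanning stops at the first BD still owned by the CPM (E = 1) or
    after one full pass around the list.
    """
    frames, current, i = [], [], start
    while True:
        bd = bds[i]
        if bd.status & BD_E_R:
            break  # the CPM has not filled this buffer yet
        current.append(i)
        if bd.status & BD_L:
            frames.append(current)
            current = []
        i = next_index(bds, i)
        if i == start:
            break
    return frames


if __name__ == "__main__":
    ring = [
        BufferDescriptor(0x0000, 64, 0x1000),           # first buffer of a frame
        BufferDescriptor(BD_L, 20, 0x1040),             # last buffer of that frame
        BufferDescriptor(BD_L, 60, 0x1080),             # single buffer frame
        BufferDescriptor(BD_E_R | BD_W, 0, 0x10C0),     # still empty, end of list
    ]
    print(received_frames(ring))
```

Note that the [W] bit only controls where next_index wraps; frame boundaries come solely from [L], which matches the point that Wrap does not delineate frames.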

The [F] Bits
• The First bit is only present in receive frames
• Set by the CPM to tell the CPU that this buffer starts a frame
  – An underrun, late collision, or aborted frame can cause a new frame in the next buffer without the [L] bit being set in the previous BD
• Not needed for transmit
  – The CPU will control the state of the CPM with the [L] bit
  – An [L] bit set or an underrun will cause the next buffer to be considered the first buffer of a frame
• This bit is not the same for all protocols on all channels.

The [TC] Bits
• The Transmit CRC bits work in conjunction with the [L] bit
• The [TC] bit is ignored if the [L] bit is cleared
• Initializing all [TC] bits to “1” is a good precaution
• Only custom protocols that don’t use hardware generated CRC’s should have this bit cleared
• This bit is not the same for all protocols on all channels.

Subtle points on BD’s
• Frames can span buffers
• Buffers never span frames
  – Unless you have all hardware support turned off and are running transparent
• Be careful with small receive buffers that have the [I] bit set
  – You’ll get hammered with interrupts
• Turn buffers over to the CPM from last to first
  – If an interrupt interferes with the handoff, an underrun / overflow can occur
• Hands off a BD with the [E/R] bit set
  – Unless you like working weekends

FEC (860T)
• The FEC supports:
  – 10/100-Mbps Ethernet through an MII
  – Designed completely in discrete logic, independent of the CPM
  – Operation is similar to SCCs
• Large internal FIFO’s (224 bytes in each direction) reduce the impact of bus latency and keep discarded frames from appearing on the processor bus.
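
Returning to the BD handoff advice under “Subtle points on BD’s”, the last-to-first order can be sketched as follows. This is a Python model for illustration only; the mask values assume bit 0 is the MSB of the status word with fields in the order shown on the Ethernet transmit BD slide, and hand_off_tx_frame is a hypothetical helper, not an existing API.

```python
# Assumed masks: bit 0 is the MSB of the 16 bit status and control word,
# fields in the order R, PAD, W, I, L, TC as shown on the transmit BD slide.
BD_R = 0x8000   # Ready: 1 = owned by the CPM
BD_W = 0x2000   # Wrap: last BD in this channel's list
BD_I = 0x1000   # Interrupt when the CPM hands the BD back
BD_L = 0x0800   # Last buffer of a frame
BD_TC = 0x0400  # Transmit CRC after this buffer (only honoured with L)


def hand_off_tx_frame(status, frame):
    """Give the BDs of one filled transmit frame to the CPM.

    status: list of status words, one per BD in the channel's list.
    frame:  BD indexes of the frame, in transmit order.
    Returns the order in which the [R] bits were set.
    """
    for i in frame:
        status[i] &= BD_W   # keep only Wrap; it marks the list end, not a frame
        status[i] |= BD_TC  # setting TC everywhere is the precaution from the slides
    status[frame[-1]] |= BD_L | BD_I  # end of frame, interrupt once at the end

    order = []
    for i in reversed(frame):
        # Setting [R] on the first BD last means the CPM cannot start the
        # frame until every later buffer is already ready.
        status[i] |= BD_R
        order.append(i)
    return order


if __name__ == "__main__":
    words = [0x0000, 0x0000, BD_W]
    print(hand_off_tx_frame(words, [0, 1, 2]), [hex(w) for w in words])
```

In this ordering an interrupt that delays the CPU mid-handoff leaves the first BD not ready, so the CPM has not yet begun the frame and cannot run into a buffer that is still owned by the CPU.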

FEC Buffer Descriptors
• Identical in format to the SCC’s buffer descriptors
• Except:
  – Buffer descriptors, as well as buffers, are in main memory
  – Pointers to buffer descriptors in the parameter RAM are 32 bits
• Buffer descriptors must still be in consecutive memory locations
• To prevent the FEC from hammering the processor bus, it doesn’t continually poll the BD’s for the ready bit
  – The software must write to a control register to initiate a poll sequence
  – The BD’s will continue to be polled in sequence until one is found to be not ready

SMC’s
• The SMC’s perform basic UART as well as transparent mode transmission
• Buffer descriptor operation is identical to the SCC’s
  – The status and control word has different bit fields pertaining to the protocols
  – Bit fields controlling protocol independent operation are unchanged

Status and Control Definitions [SMC in UART mode]
Receive Control & Status: E - W I - - CM ID - BR FR PR - OV -
Transmit Control & Status: R - W I - - CM P - - - - -
• Idle: Close buffer on reception of idles
• Continuous mode: [E] bit isn’t cleared on buffer reception
• Interrupt: Generate an interrupt after this buffer is used by the CPM.
• Wrap: This is the last BD in this set of BD’s.
• Empty / Ready:
  – 0 = This buffer is owned by the CPU
  – 1 = This buffer is owned by the CPM

MII
[Diagram: signals between a PQ II MPC 8260 FCCn and a Fast Ethernet PHY]
• Transmit Error (Tx_ER)
• Transmit Nibble Data (TxD[3:0])
• Transmit Enable (Tx_EN)
• Transmit Clock (Tx_clk)
• Collision Detect (COL)
• Receive Nibble Data (RxD[3:0])
• Receive Error (Rx_ER)
• Receive Clock (Rx_clk)
• Receive Data Valid (Rx_DV)
• Carrier Sense output (CRS)
• Management Data Clock (MDC)
• Management Data I/O (MDIO)

Debug Considerations
• What is BDM
• Getting out of reset
• The cache is on
• CPM Realities
• Exception Routines
• Tracing at the Bus Cycle Level

What is BDM?
• BDM is a serial command and control port into the CPU
• Commands are sent into the device through the BDM port instead of the instruction fetch path
• BDM accesses to memory are in REAL mode
  – They don’t go through the MMU or the cache

BDM connection
• BDM connector allows for full run control of the processor
Pin | Signal | Pin | Signal
1   | VFLS0  | 2   | SRESET
3   | GND    | 4   | DSCK
5   | GND    | 6   | VFLS1
7   | HRESET | 8   | DSDI
9   | VDD    | 10  | DSDO
• Alternate pinout: pins 1 and 6 both carry FRZ in place of VFLS0 and VFLS1

Getting out of Reset
• Reset Configuration word of vital importance
• What is your Interrupt Prefix?
• Watch for the clock multiplier
  – If the multiplier is too large, power must be removed from the device to recover
  – Reset will not recover from too high of a clock

The Core and Bus
• Four hardware code breakpoints available; two hardware data breakpoints
• Predictive fetching and speculative loading mean that what you see on the bus may not be executed or used.
• VFLS pins used for back trace don’t work in split bus mode.

The Caches are On
• Bus Cycles now appear as bursts
• Fetches are determined by the BIU, not related to instruction execution
• Cache visibility pins don’t work in split bus mode
• Instrumentation required for accurate debug above a 50MHz core
• Caution must be exercised when the boot process performs a code relocation
  – Contents are cached as data during the move
  – Contents are fetched as instructions after the move
  – The instruction queue doesn’t snoop the data cache
  – The load/store unit doesn’t snoop the instruction cache
  – There is no cache coherency

CPM Realities
• The CPM operates independently of the CPU
• The CPM is not debugged yet. Expect the unexpected
• “Last Buffer Interrupt” occurs at the beginning of transmission
• The SPI can overwhelm the CPM and crash the other channels

Exception Routines
• Exception Routines are difficult to debug
• The Recoverability of exceptions is an issue
• Breakpoints and single stepping can’t be used until the RI bit is set

Tracing at the Bus Cycle Level
• The 860 comes in a BGA package
• Connecting to an emulator
• Connecting to an analyzer

Connecting to an Emulator (1)
[Diagram, top to bottom: Connection to Emulator, Buffer Board, Original 860, BGA site to pin socket, Target Adaptor, Pin header, Target board]

Connecting to an Emulator (2)
[Diagram, top to bottom: Connection to Emulator, Buffer Board, Original 860, Customer provided connectors, Target board]

Connecting to an Analyzer
[Diagram: Mictor Connectors beside the 860 on the Target board]

Summary of Undocumented Issues
• Init MMU before turning on caches
• The CPM doesn’t use the MMU’s or the caches
• Don’t single step through moves to or from SPR’s
• ISR’s cannot have breakpoints in the first or last few instructions
• Each processor must have its own BDM connector