Chapter 2 TMS320C6000 Architectural Overview Learning Objectives Chapter 2, Slide 2 Describe C6000 CPU architecture. Introduce some basic instructions. Describe the C6000 memory map. Provide an overview of the peripherals. Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 General DSP System Block Diagram Internal Memory Internal Buses External Memory Central Processing Unit Chapter 2, Slide 3 P E R I P H E R A L S Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Implementation of Sum of Products (SOP) It has been shown in Chapter 1 that SOP is the key element for most DSP algorithms. So let’s write the code for this algorithm and at the same time discover the C6000 architecture. N Y = an * xn n = 1 = a1 * x1 + a2 * x2 +... + aN * xN Two basic operations are required for this algorithm. (1) Multiplication (2) Addition Therefore two basic instructions are required Chapter 2, Slide 4 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Implementation of Sum of Products (SOP) So let’s implement the SOP algorithm! N Y = an * xn n = 1 = a1 * x1 + a2 * x2 +... + aN * xN The implementation in this module will be done in assembly. Two basic operations are required for this algorithm. (1) Multiplication (2) Addition Therefore two basic instructions are required Chapter 2, Slide 5 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Multiply (MPY) N Y = an * xn n = 1 = a1 * x1 + a2 * x2 +... + aN * xN The multiplication of a1 by x1 is done in assembly by the following instruction: MPY a1, x1, Y This instruction is performed by a multiplier unit that is called “.M” Chapter 2, Slide 6 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Multiply (.M unit) 40 Y = an * xn n = 1 .M The . M unit performs multiplications in hardware MPY .M a1, x1, Y Note: 16-bit by 16-bit multiplier provides a 32-bit result. 32-bit by 32-bit multiplier provides a 64-bit result. Chapter 2, Slide 7 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Addition (.?) 40 Y = an * xn n = 1 .M .? Chapter 2, Slide 8 MPY .M a1, x1, prod ADD .? Y, prod, Y Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Add (.L unit) 40 Y = an * xn n = 1 .M .L MPY .M a1, x1, prod ADD .L Y, prod, Y RISC processors such as the C6000 use registers to hold the operands, so lets change this code. Chapter 2, Slide 9 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Register File - A 40 Register File A A0 A1 a1 x1 A2 A3 Y = an * xn n = 1 prod Y .M . . . .L MPY .M a1, x1, prod ADD .L Y, prod, Y A15 32-bits Let us correct this by replacing a, x, prod and Y by the registers as shown above. Chapter 2, Slide 10 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Specifying Register Names 40 Register File A A0 A1 a1 x1 A2 A3 Y = an * xn n = 1 prod Y .M . . . .L MPY .M A0, A1, A3 ADD .L A4, A3, A4 A15 32-bits The registers A0, A1, A3 and A4 contain the values to be used by the instructions. Chapter 2, Slide 11 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Specifying Register Names 40 Register File A A0 A1 a1 x1 A2 A3 Y = an * xn n = 1 prod Y .M . . . .L MPY .M A0, A1, A3 ADD .L A4, A3, A4 A15 32-bits Register File A contains 16 registers (A0 -A15) which are 32-bits wide. Chapter 2, Slide 12 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Data loading Register File A A0 A1 a1 x1 A2 A3 Q: How do we load the operands into the registers? prod Y .M . . . .L A15 32-bits Chapter 2, Slide 13 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Load Unit “.D” Register File A A0 A1 a1 x1 A2 A3 Q: How do we load the operands into the registers? prod Y .M . . . .L A: The operands are loaded into the registers by loading them from the memory using the .D unit. .D A15 32-bits Data Memory Chapter 2, Slide 14 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Load Unit “.D” Register File A A0 A1 a1 x1 A2 A3 It is worth noting at this stage that the only way to access memory is through the .D unit. prod Y .M . . . .L .D A15 32-bits Data Memory Chapter 2, Slide 15 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Load Instruction Register File A A0 A1 a1 x1 A2 A3 Q: Which instruction(s) can be used for loading operands from the memory to the registers? prod Y .M . . . .L .D A15 32-bits Data Memory Chapter 2, Slide 16 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Load Instructions (LDB, LDH,LDW,LDDW) Register File A A0 A1 a1 x1 A2 A3 Q: Which instruction(s) can be used for loading operands from the memory to the registers? prod Y .M . . . .L A: The load instructions. .D A15 32-bits Data Memory Chapter 2, Slide 17 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Using the Load Instructions Before using the load unit you have to be aware that this processor is byte addressable, which means that each byte is represented by a unique address. Data address 00000000 00000002 00000004 00000006 00000008 Also the addresses are 32-bit wide. FFFFFFFF 16-bits Chapter 2, Slide 18 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Using the Load Instructions The syntax for the load instruction is: LD *Rn,Rm Where: Rn is a register that contains the address of the operand to be loaded Data address a1 x1 00000000 00000002 00000004 00000006 00000008 prod Y and Rm is the destination register. FFFFFFFF 16-bits Chapter 2, Slide 19 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Using the Load Instructions The syntax for the load instruction is: LD *Rn,Rm The question now is how many bytes are going to be loaded into the destination register? Data address a1 x1 00000000 00000002 00000004 00000006 00000008 prod Y FFFFFFFF 16-bits Chapter 2, Slide 20 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Using the Load Instructions The syntax for the load instruction is: LD *Rn,Rm The answer, is that it depends on the instruction you choose: Data address a1 x1 00000000 00000002 00000004 00000006 00000008 prod Y • LDB: loads one byte (8-bit) • LDH: loads half word (16-bit) • LDW: loads a word (32-bit) • LDDW: loads a double word (64-bit) Note: LD on its own does not exist. Chapter 2, Slide 21 FFFFFFFF 16-bits Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Using the Load Instructions The syntax for the load instruction is: Data 1 0 0xA 0xB 0xC 0xD 0x2 0x1 Example: 0x4 0x3 If we assume that A5 = 0x4 then: 0x6 0x5 (1) LDB *A5, A7 ; gives A7 = 0x00000001 0x8 0x7 LD *Rn,Rm address 00000000 00000002 00000004 00000006 00000008 (2) LDH *A5,A7; gives A7 = 0x00000201 (3) LDW *A5,A7; gives A7 = 0x04030201 (4) LDDW *A5,A7:A6; gives A7:A6 = 0x0807060504030201 FFFFFFFF 16-bits Chapter 2, Slide 22 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Using the Load Instructions The syntax for the load instruction is: address Data LD *Rn,Rm Question: If data can only be accessed by the load instruction and the .D unit, how can we load the register pointer Rn in the first place? 0xA 0xB 0xC 0xD 0x2 0x1 0x4 0x3 0x6 0x5 0x8 0x7 00000000 00000002 00000004 00000006 00000008 FFFFFFFF 16-bits Chapter 2, Slide 23 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Loading the Pointer Rn The instruction MVKL will allow a move of a 16-bit constant into a register as shown below: MVKL .? a, A5 (‘a’ is a constant or label) How many bits represent a full address? 32 bits So why does the instruction not allow a 32-bit move? All instructions are 32-bit wide (see instruction opcode). Chapter 2, Slide 24 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Loading the Pointer Rn To solve this problem another instruction is available: MVKH eg. MVKH .? a, A5 (‘a’ is a constant or label) Chapter 2, Slide 25 ah al a ah x A5 Finally, to move the 32-bit address to a register we can use: MVKL a, A5 MVKH a, A5 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Loading the Pointer Rn Always use MVKL then MVKH, look at the following examples: Example 1 A5 = 0x87654321 MVKL 0x1234FABC, A5 A5 = 0xFFFFFABC (sign extension) MVKH 0x1234FABC, A5 A5 = 0x1234FABC ; OK Example 2 MVKH A5 = 0x12344321 Chapter 2, Slide 26 0x1234FABC, A5 MVKL 0x1234FABC, A5 A5 = 0xFFFFFABC ; Wrong Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 LDH, MVKL and MVKH Register File A A0 A1 a x A2 A3 A4 prod Y .M . . . .L .D A15 MVKL MVKH pt1, A5 pt1, A5 MVKL MVKH pt2, A6 pt2, A6 LDH .D *A5, A0 LDH .D *A6, A1 MPY .M A0, A1, A3 ADD .L A4, A3, A4 32-bits pt1 and pt2 point to some locations Data Memory Chapter 2, Slide 27 in the data memory. Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Creating a loop So far we have only implemented the SOP for one tap only, i.e. Y= a1 * x1 So let’s create a loop so that we can implement the SOP for N Taps. Chapter 2, Slide 28 MVKL MVKH pt1, A5 pt1, A5 MVKL MVKH pt2, A6 pt2, A6 LDH .D *A5, A0 LDH .D *A6, A1 MPY .M A0, A1, A3 ADD .L A4, A3, A4 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Creating a loop So far we have only implemented the SOP for one tap only, i.e. Y= a1 * x1 With the C6000 processors there are no dedicated instructions such as block repeat. The loop is created using the B instruction. So let’s create a loop so that we can implement the SOP for N Taps. Chapter 2, Slide 29 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 What are the steps for creating a loop 1. Create a label to branch to. 2. Add a branch instruction, B. 3. Create a loop counter. 4. Add an instruction to decrement the loop counter. 5. Make the branch conditional based on the value in the loop counter. Chapter 2, Slide 30 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 1. Create a label to branch to loop Chapter 2, Slide 31 MVKL MVKH pt1, A5 pt1, A5 MVKL MVKH pt2, A6 pt2, A6 LDH .D *A5, A0 LDH .D *A6, A1 MPY .M A0, A1, A3 ADD .L A4, A3, A4 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 2. Add a branch instruction, B. loop Chapter 2, Slide 32 MVKL MVKH pt1, A5 pt1, A5 MVKL MVKH pt2, A6 pt2, A6 LDH .D *A5, A0 LDH .D *A6, A1 MPY .M A0, A1, A3 ADD .L A4, A3, A4 B .? loop Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Which unit is used by the B instruction? Register File A A0 A1 a x .S prod Y .M .M . . . .L .L A2 A3 loop .D .D A15 MVKL MVKH pt1, A5 pt1, A5 MVKL MVKH pt2, A6 pt2, A6 LDH .D *A5, A0 LDH .D *A6, A1 MPY .M A0, A1, A3 ADD .L A4, A3, A4 B .? loop 32-bits Data Memory Chapter 2, Slide 33 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Which unit is used by the B instruction? Register File A A0 A1 a x .S prod Y .M .M . . . .L .L A2 A3 loop .D .D A15 MVKL .S MVKH .S pt1, A5 pt1, A5 MVKL .S MVKH .S pt2, A6 pt2, A6 LDH .D *A5, A0 LDH .D *A6, A1 MPY .M A0, A1, A3 ADD .L A4, A3, A4 B .S loop 32-bits Data Memory Chapter 2, Slide 34 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 3. Create a loop counter. Register File A A0 A1 a x .S prod Y .M .M . . . .L .L A2 A3 loop .D .D A15 32-bits MVKL .S MVKH .S pt1, A5 pt1, A5 MVKL .S MVKH .S MVKL .S pt2, A6 pt2, A6 count, B0 LDH .D *A5, A0 LDH .D *A6, A1 MPY .M A0, A1, A3 ADD .L A4, A3, A4 B .S loop B registers will be introduced later Data Memory Chapter 2, Slide 35 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 4. Decrement the loop counter Register File A A0 A1 a x .S prod Y .M .M . . . .L .L A2 A3 loop .D .D A15 32-bits MVKL .S MVKH .S pt1, A5 pt1, A5 MVKL .S MVKH .S MVKL .S pt2, A6 pt2, A6 count, B0 LDH .D *A5, A0 LDH .D *A6, A1 MPY .M A0, A1, A3 ADD .L A4, A3, A4 SUB .S B0, 1, B0 B .S loop Data Memory Chapter 2, Slide 36 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 5. Make the branch conditional based on the value in the loop counter What is the syntax for making instruction conditional? [condition] Instruction [B1] loop Label e.g. B (1) The condition can be one of the following registers: A1, A2, B0, B1, B2. (2) Any instruction can be conditional. Chapter 2, Slide 37 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 5. Make the branch conditional based on the value in the loop counter The condition can be inverted by adding the exclamation symbol “!” as follows: [!condition] Instruction Label [!B0] B loop ;branch if B0 = 0 [B0] B loop ;branch if B0 != 0 e.g. Chapter 2, Slide 38 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 5. Make the branch conditional MVKL .S2 pt1, A5 MVKH .S2 pt1, A5 Register File A A0 A1 a x .S prod Y .M .M . . . .L .L A2 A3 MVKL .S2 pt2, A6 MVKH .S2 pt2, A6 MVKL .S2 count, B0 loop .D .D A15 32-bits [B0] LDH .D *A5, A0 LDH .D *A6, A1 MPY .M A0, A1, A3 ADD .L A4, A3, A4 SUB .S B0, 1, B0 B .S loop Data Memory Chapter 2, Slide 39 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 More on the Branch Instruction (1) With this processor all the instructions are encoded in a 32-bit. Therefore the label must have a dynamic range of less than 32-bit as the instruction B has to be coded. 32-bit B Case 1: 21-bit relative address B .S1 label Relative branch. Label limited to +/- 220 offset. Chapter 2, Slide 40 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 More on the Branch Instruction (2) By specifying a register as an operand instead of a label, it is possible to have an absolute branch. This will allow a dynamic range of 232. 32-bit 5-bit register B Case 2: code B .S2 register Absolute branch. Operates on .S2 ONLY! Chapter 2, Slide 41 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Testing the code MVKL .S2 pt1, A5 MVKH .S2 pt1, A5 MVKL .S2 pt2, A6 MVKH .S2 pt2, A6 MVKL .S2 count, B0 This code performs the following operations: loop LDH .D *A5, A0 LDH .D *A6, A1 MPY .M A0, A1, A3 However, we would like to perform: ADD .L A4, A3, A4 a0*x0 + a1*x1 + a2*x2 + … + aN*xN SUB .S B0, 1, B0 B .S loop a0*x0 + a0*x0 + a0*x0 + … + a0*x0 [B0] Chapter 2, Slide 42 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Modifying the pointers MVKL .S2 pt1, A5 MVKH .S2 pt1, A5 MVKL .S2 pt2, A6 MVKH .S2 pt2, A6 MVKL .S2 count, B0 The solution is to modify the pointers loop A5 and A6. [B0] Chapter 2, Slide 43 LDH .D *A5, A0 LDH .D *A6, A1 MPY .M A0, A1, A3 ADD .L A4, A3, A4 SUB .S B0, 1, B0 B .S loop Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Indexing Pointers Syntax Description *R Pointer Pointer Modified No In this case the pointers are used but not modified. R can be any register Chapter 2, Slide 44 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Indexing Pointers Syntax Description *R *+R[disp] *-R[disp] Pointer + Pre-offset - Pre-offset Pointer Modified No No No In this case the pointers are modified BEFORE being used and RESTORED to their previous values. [disp] specifies the number of elements size in DW (64-bit), W (32-bit), H (16-bit), or B (8-bit). disp = R or 5-bit constant. R can be any register. Chapter 2, Slide 45 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Indexing Pointers Syntax Description *R *+R[disp] *-R[disp] *++R[disp] *--R[disp] Pointer + Pre-offset - Pre-offset Pre-increment Pre-decrement Pointer Modified No No No Yes Yes In this case the pointers are modified BEFORE being used and NOT RESTORED to their Previous Values. Chapter 2, Slide 46 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Indexing Pointers Syntax Description *R *+R[disp] *-R[disp] *++R[disp] *--R[disp] *R++[disp] *R--[disp] Pointer + Pre-offset - Pre-offset Pre-increment Pre-decrement Post-increment Post-decrement Pointer Modified No No No Yes Yes Yes Yes In this case the pointers are modified AFTER being used and NOT RESTORED to their Previous Values. Chapter 2, Slide 47 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Indexing Pointers Syntax Description *R *+R[disp] *-R[disp] *++R[disp] *--R[disp] *R++[disp] *R--[disp] Pointer + Pre-offset - Pre-offset Pre-increment Pre-decrement Post-increment Post-decrement Chapter 2, Slide 48 Pointer Modified No No No Yes Yes Yes Yes [disp] specifies # elements - size in DW, W, H, or B. disp = R or 5-bit constant. R can be any register. Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Modify and testing the code MVKL .S2 pt1, A5 MVKH .S2 pt1, A5 MVKL .S2 pt2, A6 MVKH .S2 pt2, A6 MVKL .S2 count, B0 This code now performs the following loop operations: a0*x0 + a1*x1 + a2*x2 + ... + aN*xN [B0] Chapter 2, Slide 49 LDH .D *A5++, A0 LDH .D *A6++, A1 MPY .M A0, A1, A3 ADD .L A4, A3, A4 SUB .S B0, 1, B0 B .S loop Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Store the final result MVKL .S2 pt1, A5 MVKH .S2 pt1, A5 MVKL .S2 pt2, A6 MVKH .S2 pt2, A6 MVKL .S2 count, B0 This code now performs the following loop operations: a0*x0 + a1*x1 + a2*x2 + ... + aN*xN [B0] Chapter 2, Slide 50 LDH .D *A5++, A0 LDH .D *A6++, A1 MPY .M A0, A1, A3 ADD .L A4, A3, A4 SUB .S B0, 1, B0 B .S loop STH .D A4, *A7 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Store the final result MVKL .S2 pt1, A5 MVKH .S2 pt1, A5 MVKL .S2 pt2, A6 MVKH .S2 pt2, A6 MVKL .S2 count, B0 loop The Pointer A7 has not been initialised. [B0] Chapter 2, Slide 51 LDH .D *A5++, A0 LDH .D *A6++, A1 MPY .M A0, A1, A3 ADD .L A4, A3, A4 SUB .S B0, 1, B0 B .S loop STH .D A4, *A7 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Store the final result MVKL .S2 pt1, A5 MVKH .S2 pt1, A5 MVKL .S2 pt2, A6 MVKH .S2 pt2, A6 MVKL .S2 pt3, A7 MVKH .S2 pt3, A7 MVKL .S2 count, B0 The Pointer A7 is now initialised. loop [B0] Chapter 2, Slide 52 LDH .D *A5++, A0 LDH .D *A6++, A1 MPY .M A0, A1, A3 ADD .L A4, A3, A4 SUB .S B0, 1, B0 B .S loop STH .D A4, *A7 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 What is the initial value of A4? MVKL .S2 pt1, A5 MVKH .S2 pt1, A5 MVKL .S2 pt2, A6 MVKH .S2 pt2, A6 A4 is used as an accumulator, so it needs to be reset to zero. loop [B0] Chapter 2, Slide 53 MVKL MVKH MVKL ZERO LDH .S2 .S2 .S2 .L .D pt3, A7 pt3, A7 count, B0 A4 *A5++, A0 LDH .D *A6++, A1 MPY .M A0, A1, A3 ADD .L A4, A3, A4 SUB .S B0, 1, B0 B .S loop STH .D A4, *A7 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Increasing the processing power! Register File A A0 A1 A2 A3 A4 .S1 .M1 . . . .L1 How can we add more processing power to this processor? .D1 A15 32-bits Data Memory Chapter 2, Slide 54 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Increasing the processing power! Register File A A0 A1 A2 A3 A4 .S1 .M1 . . . (1) Increase the clock frequency. (2) Increase the number of Processing units. .L1 .D1 A15 32-bits Data Memory Chapter 2, Slide 55 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 To increase the Processing Power, this processor has two sides (A and B or 1 and 2) Register File A A0 A1 A2 A3 A4 . . . A15 Register File B .S1 .S2 .M1 .M2 .L1 .L2 .D1 .D2 32-bits B0 B1 B2 B3 B4 . . . B15 32-bits Data Memory Chapter 2, Slide 56 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Can the two sides exchange operands in order to increase performance? Register File A A0 A1 A2 A3 A4 . . . A15 Register File B .S1 .S2 .M1 .M2 .L1 .L2 .D1 .D2 32-bits B0 B1 B2 B3 B4 . . . B15 32-bits Data Memory Chapter 2, Slide 57 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 The answer is YES but there are limitations. To exchange operands between the two sides, some cross paths or links are required. What is a cross path? A cross path links one side of the CPU to the other. There are two types of cross paths: Chapter 2, Slide 58 Data cross paths. Address cross paths. Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Data Cross Paths Data cross paths can also be referred to as register file cross paths. These cross paths allow operands from one side to be used by the other side. There are only two cross paths: Chapter 2, Slide 59 one path which conveys data from side B to side A, 1X. one path which conveys data from side A to side B, 2X. Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 TMS320C67x Data-Path Chapter 2, Slide 60 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Data Cross Paths Data cross paths only apply to the .L, .S and .M units. The data cross paths are very useful, however there are some limitations in their use. Chapter 2, Slide 61 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Data Cross Path Limitations <dst> .L1 .M1 .S1 A <src> <src> 2x (1) The destination register must be on same side as unit. (2) Source registers - up to one cross path per execute packet per side. 1x B Execute packet: group of instructions that execute simultaneously. Chapter 2, Slide 62 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Data Cross Path Limitations <dst> eg: ADD MPY SUB || ADD .L1 .M1 .S1 .L1x .M1x .S1x .L1x A <src> <src> A0,A1,B2 A0,B6,A9 A8,B2,A8 A0,B0,A2 2x 1x B || Means that the SUB and ADD belong to the same fetch packet, therefore execute simultaneously. Chapter 2, Slide 63 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Data Cross Path Limitations <dst> eg: ADD MPY SUB || ADD .L1 .M1 .S1 .L1x .M1x .S1x .L1x A <src> <src> A0,A1,B2 A0,B6,A9 A8,B2,A8 A0,B0,A2 2x 1x B NOT VALID! Chapter 2, Slide 64 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Data Cross Paths for both sides <dst> <dst> Chapter 2, Slide 65 .L1 .M1 .S1 .L2 .M2 .S2 A <src> <src> 2x <src> <src> 1x B Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Address cross paths Data Addr A .D1 (1) The pointer must be on the same side of the unit. LDW.D1T1 *A0,A5 STW.D1T1 A5,*A0 Chapter 2, Slide 66 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Load or store to either side Data1 DA1 = T1 DA2 = T2 Data2 Chapter 2, Slide 67 A5 .D1 *A0 LDW.D1T1 *A0,A5 LDW.D1T2 *A0,B5 A B B5 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Standard Parallel Loads Data1 DA1 = T1 DA2 = T2 A5 .D1 *A0 .D2 *B0 A B LDW.D1T1 *A0,A5 || LDW.D2T2 *B0,B5 Chapter 2, Slide 68 B5 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Parallel Load/Store using address cross paths Data1 DA1 = T1 DA2 = T2 A5 .D1 *A0 .D2 *B0 A B LDW.D1T2 *A0,B5 || STW.D2T1 A5,*B0 Chapter 2, Slide 69 B5 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Fill the blanks ... Does this work? Data1 DA1 = T1 DA2 = T2 .D1 *A0 .D2 *B0 A B LDW.D1__ *A0,B5 || STW.D2__ B6,*B0 Chapter 2, Slide 70 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Not Allowed! Parallel accesses: both cross or neither cross Data1 DA2 = T2 .D1 *A0 .D2 *B0 B5 A B LDW.D1T2 *A0,B5 B6 || STW.D2T2 B6,*B0 Chapter 2, Slide 71 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Conditions Don’t Use Cross Paths If a conditional register comes from the opposite side, it does NOT use a data or address cross-path. Examples: [B2] [A1] Chapter 2, Slide 72 ADD LDW .L1 .D2 A2,A0,A4 *B0,B5 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 ‘C62x Data-Path Summary CPU Ref Guide Full CPU Datapath (Pg 2-2) Chapter 2, Slide 73 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 ‘C67x Data-Path Summary Chapter 2, Slide 74 ‘C67x Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Cross Paths - Summary Data Address Chapter 2, Slide 75 Destination register on same side as unit. Source registers - up to one cross path per execute packet per side. Use “x” to indicate cross-path. Pointer must be on same side as unit. Data can be transferred to/from either side. Parallel accesses: both cross or neither cross. Conditionals Don’t Use Cross Paths. Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Code Review (using side A only) 40 Y = n = 1 MVK loop: LDH LDH MPY ADD SUB [A2] B STH .S1 .D1 .D1 .M1 .L1 .L1 .S1 .D1 an * xn 40, A2 *A5++, A0 *A6++, A1 A0, A1, A3 A3, A4, A4 A2, 1, A2 loop A4, *A7 ; A2 = 40, loop count ; A0 = a(n) ; A1 = x(n) ; A3 = a(n) * x(n) ; Y = Y + A3 ; decrement loop count ; if A2 0, branch ; *A7 = Y Note: Assume that A4 was previously cleared and the pointers are initialised. Chapter 2, Slide 76 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Let us have a look at the final details concerning the functional units. Consider first the case of the .L and .S units. Chapter 2, Slide 77 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Operands - 32/40-bit Register, 5-bit Constant Operands can be: 5-bit constants (or 16-bit for MVKL and MVKH). 32-bit registers. 40-bit Registers. However, we have seen that registers are only 32-bit. So where do the 40-bit registers come from? Chapter 2, Slide 78 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Operands - 32/40-bit Register, 5-bit Constant A 40-bit register can be obtained by concatenating two registers. However, there are 3 conditions that need to be respected: Chapter 2, Slide 79 The registers must be from the same side. The first register must be even and the second odd. The registers must be consecutive. Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Operands - 32/40-bit Register, 5-bit Constant All combinations of 40-bit registers are shown below: 40-bit Reg : odd even 8 Chapter 2, Slide 80 32 40-bit Reg : odd even 8 32 A1:A0 B1:B0 A3:A2 B3:B2 A5:A4 B5:B4 A7:A6 B7:B6 A9:A8 B9:B8 A11:A10 B11:B10 A13:A12 B13:B12 A15:A14 B15:B14 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Operands - 32/40-bit Register, 5-bit Constant instr .unit 32-bit Reg <src>, <src>, <dst> 5-bit Const 32-bit Reg 40-bit Reg < src > < src > .L or .S < dst > 32-bit Reg Chapter 2, Slide 81 40-bit Reg Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Operands - 32/40-bit Register, 5-bit Constant instr 32-bit Reg 5-bit Const .unit <src>, <src>, <dst> 32-bit Reg 40-bit Reg < src > < src > .L or .S < dst > 32-bit Reg Chapter 2, Slide 82 40-bit Reg Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Operands - 32/40-bit Register, 5-bit Constant instr 32-bit Reg 5-bit Const .unit <src>, <src>, <dst> 32-bit Reg 40-bit Reg < src > < src > .L or .S OR.L1 A0, A1, A2 < dst > 32-bit Reg Chapter 2, Slide 83 40-bit Reg Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Operands - 32/40-bit Register, 5-bit Constant instr 32-bit Reg 5-bit Const .unit 32-bit Reg .L or .S < dst > Chapter 2, Slide 84 40-bit Reg < src > < src > 32-bit Reg <src>, <src>, <dst> OR.L1 ADD.L2 A0, A1, A2 -5, B3, B4 40-bit Reg Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Operands - 32/40-bit Register, 5-bit Constant instr 32-bit Reg 5-bit Const .unit 32-bit Reg .L or .S < dst > Chapter 2, Slide 85 40-bit Reg < src > < src > 32-bit Reg <src>, <src>, <dst> OR.L1 ADD.L2 ADD.L1 A0, A1, A2 -5, B3, B4 A2, A3, A5:A4 40-bit Reg Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Operands - 32/40-bit Register, 5-bit Constant instr 32-bit Reg 5-bit Const .unit 32-bit Reg .L or .S < dst > Chapter 2, Slide 86 40-bit Reg < src > < src > 32-bit Reg <src>, <src>, <dst> 40-bit Reg OR.L1 ADD.L2 ADD.L1 SUB.L1 A0, -5, A2, A2, A1, A2 B3, B4 A3, A5:A4 A5:A4, A5:A4 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Operands - 32/40-bit Register, 5-bit Constant instr 32-bit Reg 5-bit Const .unit 32-bit Reg .L or .S < dst > Chapter 2, Slide 87 40-bit Reg < src > < src > 32-bit Reg <src>, <src>, <dst> 40-bit Reg OR.L1 ADD.L2 ADD.L1 SUB.L1 ADD.L2 A0, A1, A2 -5, B3, B4 A2, A3, A5:A4 A2, A5:A4, A5:A4 3, B9:B8, B9:B8 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Register to register data transfer Chapter 2, Slide 88 To move the content of a register (A or B) to another register (B or A) use the move “MV” Instruction, e.g.: MV A0, B0 MV B6, B7 To move the content of a control register to another register (A or B) or vice-versa use the MVC instruction, e.g.: MVC IFR, A0 MVC A0, IRP Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 TMS320C6000 Instruction Set Chapter 2, Slide 89 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 'C62x Instruction Set (by category) Arithmetic Logical ABS ADD ADDA ADDK ADD2 MPY MPYH NEG SMPY SMPYH SADD SAT SSUB SUB SUBA SUBC SUB2 ZERO AND CMPEQ CMPGT CMPLT NOT OR SHL SHR SSHL XOR Bit Mgmt CLR EXT LMBD NORM SET Data Mgmt LDB/H/W MV MVC MVK MVKL MVKH MVKLH STB/H/W Program Ctrl B IDLE NOP Note: Refer to the 'C6000 CPU Reference Guide for more details. Chapter 2, Slide 90 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 'C62x Instruction Set (by unit) .L Unit .S Unit ADD ADDK ADD2 AND B CLR EXT MV MVC MVK MVKL MVKH MVKLH NEG NOT OR SET SHL SHR SSHL SUB SUB2 XOR ZERO .M Unit MPY MPYH SMPY SMPYH Other NOP Chapter 2, Slide 91 IDLE ABS ADD AND CMPEQ CMPGT CMPLT LMBD MV NEG NORM NOT OR SADD SAT SSUB SUB SUBC XOR ZERO .D Unit ADD ADDA LDB/H/W MV NEG STB/H/W SUB SUBA ZERO Note: Refer to the 'C6000 CPU Reference Guide for more details. Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 ' C6700: Superset of Fixed-Point (by unit) .S Unit .S .L .D ADD ADDK ADD2 AND B CLR EXT MV MVC MVK MVKL MVKH NEG NOT OR SET SHL SHR SSHL SUB SUB2 XOR ZERO ABSSP ABSDP CMPGTSP CMPEQSP CMPLTSP CMPGTDP CMPEQDP CMPLTDP RCPSP RCPDP RSQRSP RSQRDP SPDP .D Unit .M Chapter 2, Slide 92 ADD NEG ADDAB (B/H/W) STB (B/H/W) ADDAD SUB LDB (B/H/W) SUBAB (B/H/W) LDDW ZERO MV .L Unit ABS ADD AND CMPEQ CMPGT CMPLT LMBD MV NEG NORM NOT OR SADD SAT SSUB SUB SUBC XOR ZERO ADDSP ADDDP SUBSP SUBDP INTSP INTDP SPINT DPINT SPRTUNC DPTRUNC DPSP ‘C67x .M Unit MPY MPYH MPYLH MPYHL SMPY SMPYH MPYSP MPYDP MPYI MPYID No Unit Used NOP IDLE Note: Refer to the 'C6000 CPU Reference Guide for more details. Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Superset of Fixed-Point Control Registers Instruction Dispatch Advanced Instruction Packing Instruction Decode Emulation Advanced Emulation Interrupt Control Instruction Fetch Registers (A0 - A15) Registers (B0 - B15) Registers (A16 - A31) Registers (B16 - B31) L1 + + + + S1 + + + + M1 x x x x X D1 + + X D2 + M2 X + X S2 L2 + + x + x + x + + x + + ‘C62x: Dual 32-Bit Load/Store ‘C64x: Dual 64-Bit Load/Store ‘C67x: Dual 64-Bit Load/32-Bit Store Chapter 2, Slide 93 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 'C64x: Superset of ‘C62x .S .D Dual/Quad Arith SADD2 SADDUS2 SADD4 Data Pack/Un PACK2 PACKH2 PACKLH2 PACKHL2 Bitwise Logical UNPKHU4 ANDN UNPKLU4 Shifts & Merge SWAP2 SPACK2 SHR2 SPACKU4 SHRU2 SHLMB SHRMB Dual Arithmetic Mem Access ADD2 LDDW SUB2 LDNW LDNDW Bitwise Logical STDW AND STNW ANDN STNDW OR XOR Load Constant MVK (5-bit) Address Calc. ADDAD Chapter 2, Slide 94 Compares CMPEQ2 CMPEQ4 CMPGT2 CMPGT4 .L Branches/PC BDEC BPOS BNOP ADDKPC Dual/Quad Arith ABS2 ADD2 ADD4 MAX MIN SUB2 SUB4 SUBABS4 Bitwise Logical ANDN .M Average AVG2 AVG4 Shifts ROTL SSHVL SSHVR Data Pack/Un PACK2 PACKH2 PACKLH2 PACKHL2 PACKH4 PACKL4 UNPKHU4 UNPKLU4 SWAP2/4 Multiplies MPYHI Shift & Merge MPYLI SHLMB MPYHIR SHRMB MPYLIR Load Constant MPY2 MVK (5-bit) SMPY2 Bit Operations DOTP2 DOTPN2 BITC4 DOTPRSU2 BITR DOTPNRSU2 DEAL DOTPU4 SHFL DOTPSU4 Move GMPY4 MVD XPND2/4 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 TMS320C6000 Memory Chapter 2, Slide 95 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Memory size per device Devices C6201, C6204, C6701 C6205 Internal EMIFA P D = = 64 kB 64 kB C6202 P D = = 256 kB 128 kB C6203 P D C6211 C6711 = = 52M Bytes (32-bits wide) EMIFB N/A 384 kB 512 kB 128M Bytes (32-bits wide) L1P L1D L2 = = = 4 kB 4 kB 64 kB C6713 L1P L1D L2 = = = 4 kB 4 kB 256 kB 128M Bytes (32-bits wide) N/A C6411 DM642 L1P L1D L2 = = = 16 kB 16 kB 256 kB 128M Bytes (32-bits wide) N/A C6414 C6415 C6416 L1P L1D L2 = = = 16 kB 16 kB 1 MB 256M Bytes (64-bits wide) C6712 Chapter 2, Slide 96 N/A 64M Bytes (16-bits wide) 64M Bytes (16-bits wide) Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Internal Memory Summary Devices Internal (L2) External C6211 C6711 C6713 64 kB 512M (32-bit wide) C6712 256 kB 512M (16-bit wide) Devices Internal (L2) C6414 C6415 C6416 1 MB DM642 256 kB C6411 256 kB External A: 1GB (64-bit) B: 256kB (16-bit) 1GB (64-bit) 256MB (32-bit) LINK: TMS320C6000 DSP Generation Chapter 2, Slide 97 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 TMS320C6000 Peripherals Chapter 2, Slide 98 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 'C6x System Block Diagram Memory External Memory Internal Buses .M1 .M2 .L1 .L2 .S1 .S2 Control Regs Regs (B0-B15) Regs (A0-A15) .D1 .D2 P E R I P H E R A L S CPU Chapter 2, Slide 99 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 ‘C6x Internal Buses Internal Memory A D External Interface A D x32 Peripherals A D ‘C67x x32 Program Addr x32 Program Data x256 Data Addr - T1 x32 Data Data - T1 x32/64 Data Addr - T2 x32 Data Data - T2 x32/64 DMA Addr - Read x32 DMA Data - Read x32 DMA Addr - Write x32 DMA Data - Write x32 PC A regs B regs DMA can perform 64-bit data loads. Chapter 2, Slide 100 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 'C6x System Block Diagram Memory Internal Buses EMIF .M1 .M2 .L1 .L2 .S1 .S2 Control Regs Regs (B0-B15) Regs (A0-A15) Ext’l Memory .D1 .D2 P E R I P H E R A L S CPU Chapter 2, Slide 101 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 'C6x System Block Diagram Program RAM Data Ram Addr Internal Buses D (32) EMIF - Async .M1 .M2 .L1 .L2 .S1 .S2 Control Regs Regs (B0-B15) - Sync Regs (A0-A15) Ext’l Memory .D1 .D2 P E R I P H E R A L S CPU Chapter 2, Slide 102 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 'C6000 Peripherals Parallel Comm Internal Memory GPIO External Memory EMIF Internal Buses Serial Ethernet Video Ports VCP / TCP .D1 .D2 .M1 .M2 .L1 .L2 Register Set B Timers Register Set A DMA, EDMA (Boot) .S1 .S2 CPU PLL Chapter 2, Slide 103 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 EMIF Async SDRAM EMIF Internal Buses .D1 .D2 .M1 .M2 .L1 .L2 Register Set B Register Set A SBSRAM Internal Memory External Memory Interface (EMIF).S1 .S2 Glueless access to async/sync memory CPU Works with PC100/133 SDRAM (cheap, fast, and easy!) Byte-wide data access 16, 32, or 64-bit bus widths Chapter 2, Slide 104 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 HPI / XBUS / PCI Parallel Comm External Memory EMIF Internal Memory Internal Buses Parallel Communication Interfaces .M1 .M2 HPI: Register Set B Register Set A .D1 .D2 Dedicated, slave-only, async 16/32-bit bus allows host-P access to C6000 memory .L1 .L2 XBUS: Similar to HPI but provides … .S1 .S2 Master/slave and sync modes Glueless i/f to FIFOs (up to single-cycle CPUxfer rate) PCI: Standard 32-bit, 33MHz/66MHz PCI interface These interfaces provide means to bootstrap the C6000 Chapter 2, Slide 105 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 GPIO Parallel Comm Internal Memory GPIO External Memory EMIF Internal Buses .M1 .M2 .L1 .L2 Register Set B Register Set A .D1 .D2 .S1 .S2 General Purpose Input/Output (GPIO) CPU C64x and C6713 provide 8-16 bits of general purpose bit I/O Use to observe or control the signal of a single-pin Chapter 2, Slide 106 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 McBSP and Utopia Multi-Channel Buffered Serial Port (McBSP) Parallel Internal 2 (or 3) full-duplex, synchronous serial-ports Comm Memory Up to 100 Mb/sec performance GPIO operation (T1, E1, MVIP, …) Supports multi-channel External Memory EMIF Internal Buses Serial Register Set B Register Set A .D1 .D2 Multi-Channel Audio Serial Port.M1 (McASP) .M2 McBSP features plus more … Up to 8 stereo lines (16 channels) IIC support On DM642, C6713 .L1 .L2 .S1 .S2 CPU Utopia (C64x) ATM connection 50 MHz wide area network connectivity Chapter 2, Slide 107 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 DMA / EDMA Parallel Comm Internal Memory GPIO External Memory EMIF Internal Buses Serial .D1 .D2 .M1 .M2 .L1 .L2 Direct Memory Access (DMA / EDMA) Transfers any set of memory locations to another .S1 .S2 4 / 16 / 64 channels (transfer parameter sets) Transfers can be triggered by any interrupt (sync) CPU Operates independent of CPU On reset, provides bootstrap from memory Chapter 2, Slide 108 Register Set B Register Set A DMA, EDMA (Boot) Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Timer/Counter Parallel Comm Internal Memory GPIO External Memory EMIF Internal Buses Serial Timers Timer / Counter .M1 .M2 .L1 .L2 .S1 .S2 Two (or three) 32-bit timer/counters Can generate interrupts Both input and output pins Chapter 2, Slide 109 .D1 .D2 Register Set B Register Set A DMA, EDMA (Boot) CPU Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Ethernet MAC Parallel Comm Internal Memory GPIO External Memory EMIF Internal Buses Serial Ethernet Video Ports Ethernet (DM642 only) VCP MAC / TCP 10/100 Ethernet .D1 .D2 .M1 .M2 .L1 .L2 Register Set B Timers Register Set A DMA, EDMA (Boot) .S1 .S2 CPU Pins are muxedPLL with PCI TCP/IP stack available from TI Chapter 2, Slide 110 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Video Ports Parallel Comm Internal Memory GPIO External Memory EMIF Internal Buses Serial only) Video Ports (DM642 Register Set B Each configurable for Capture or Display .D1 .D2 DMA, EDMA Dual 8/10-bit BT656 or raw modes (Boot) .M1 .M2 16/20-bit raw modes and 20-bit Y/C for high definition Timers Horz Scaling and Chroma Resampling Support for 8-bit modes .L1 .L2 Supports transport interface mode Ethernet Register Set A Video Ports VCP / TCP .S1 .S2 CPU PLL Chapter 2, Slide 111 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 VCP / TCP -- 3G Wireless Parallel Comm Internal Memory GPIO External Internal Buses EMIF Memory Turbo Coprocessor (TCP) (C6416 only) DMA, EDMA Viterbi Coprocessor (VCP) (C6416 only) .M1 .M2 Register Set B Register Set A Supports 35 data channels at 384 kbps McBSP’s 3GPP / IS2000 Turbo coder .D1and .D2frame length Utopia Programmable parameters include mode, rate (Boot)channels at 8 kbps Supports > 500 voice .L1 .L2 length, Programmable decoder parameters include constraint Timers code rate, and frame length .S1 .S2 Video Ports VCP / TCP CPU PLL Chapter 2, Slide 112 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Phase Locked Loop (PLL) Parallel Comm Internal Memory GPIO External Memory EMIF Internal Buses Serial Timers (Boot) Clock multiplier Ethernet Reduces EMI and cost Timers Rate is Pin selectable Video Ports VCP / TCP Output.L1 .L2 PLL CLKIN .M1 .M2 Register Set B Input .D1 .D2 Register Set A DMA, EDMA (Boot) DMA, EDMA CLKOUT1 .S1 .S2 CLKOUT2 (reduced rate clkout) CPU PLL Chapter 2, Slide 113 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Clock Cycle What is a clock cycle? The time between successive instructions C6000 CLKOUT1 (C6000 clock cycle) CLKIN PLL CLKOUT2 (½, ¼, or 1/6 CLKOUT1) When we talk about cycles ... CLKIN (MHz) PLL Rate CPU Clock Frequency CPU Clock Cycle Time MIPs (max) 60 x12 720 MHz 1.39 ns 5760 30 x10 300 MHz 3.33 ns 2400 50 x4 200 MHz 5 ns 1600 25 x4 100 MHz 10 ns 800 Chapter 2, Slide 114 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 'C6000 Peripherals Summary Parallel Comm Internal Memory GPIO External Memory EMIF Internal Buses Serial Ethernet Timers Video Ports VCP / TCP .D1 .D2 .M1 .M2 .L1 .L2 Register Set B Timers Register Set A DMA, EDMA (Boot) .S1 .S2 CPU PLL Chapter 2, Slide 115 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 ‘C6x Family Part Numbering Example = TMS320LC6201PKGA200 Chapter 2, Slide 116 TMS320 L C6 2 01 PKG A 200 = TI DSP = Place holder for voltage levels = C6x family = Fixed-point core = Memory/peripheral configuration = Pkg designator (actual letters TBD) = -40 to 85C (blank for 0 to 70C) = Core CPU speed in Mhz Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Module 1 Exam 1. Functional Units a. How many can perform an ADD? Name them. b. Which support memory loads/stores? .M 2. Memory Map .S .D .L a. How many external ranges exist on ‘C6201? Chapter 2, Slide 117 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 3. Conditional Code a. Which registers can be used as cond’l registers? b. Which instructions can be conditional? 4. Performance a. What is the 'C6711 instruction cycle time? b. How can the 'C6711 execute 1200 MIPs? Chapter 2, Slide 118 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 5. Coding Problems a. Move contents of A0-->A1 Chapter 2, Slide 119 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 5. Coding Problems a. Move contents of A0-->A1 b. Move contents of CSR-->A1 c. Clear register A5 Chapter 2, Slide 120 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 5. Coding Problems (cont’d) d. A2 = A02 + A1 e. If (B1 0) then B2 = B5 * B6 f. A2 = A0 * A1 + 10 g. Load an unsigned constant (19ABCh) into register A6. Chapter 2, Slide 121 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 5. Coding Problems (cont’d) h. Load A7 with contents of mem1 and postincrement the selected pointer. Chapter 2, Slide 122 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Module 1 Exam (solution) 1. Functional Units a. How many can perform an ADD? Name them. six; .L1, .L2, .D1, .D2, .S1, .S2 b. Which support memory loads/stores? .M 2. Memory Map .S .D .L a. How many external ranges exist on ‘C6201? Four Chapter 2, Slide 123 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 3. Conditional Code a. Which registers can be used as cond’l registers? A1, A2, B0, B1, B2 b. Which instructions can be conditional? All of them 4. Performance a. What is the 'C6711 instruction cycle time? CLKOUT1 b. How can the 'C6711 execute 1200 MIPs? 1200 MIPs = 8 instructions (units) x 150 MHz Chapter 2, Slide 124 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 5. Coding Problems a. Move contents of A0-->A1 or or MV .L1 ADD .S1 MPY .M1 A0, A1 A0, 0, A1 A0, 1, A1 (what’s the problem with this?) Chapter 2, Slide 125 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 5. Coding Problems a. Move contents of A0-->A1 or or MV .L1 ADD .S1 MPY .M1 A0, A1 A0, 0, A1 A0, 1, A1 (A0 can only be a 16-bit value) b. Move contents of CSR-->A1 MVC CSR, A1 c. Clear register A5 or or or or or Chapter 2, Slide 126 ZERO .S1 SUB .L1 MPY .M1 CLR .S1 MVK .S1 XOR .L1 A5 A5, A5, A5 A5, 0, A5 A5, 0, 31, A5 0, A5 A5,A5,A5 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 5. Coding Problems (cont’d) d. A2 = A02 + A1 MPY.M1 ADD.L1 A0, A0, A2 A2, A1, A2 e. If (B1 0) then B2 = B5 * B6 [B1] MPY.M2 B5, B6, B2 f. A2 = A0 * A1 + 10 MPY A0, A1, A2 ADD 10, A2, A2 g. Load an unsigned constant (19ABCh) into register A6. value .equ 0x00019abc mvkl .s1 0x00019abc,a6 mvkl.s1 value,a6 mvkh .s1 0x00019abc,a6 mvkh.s1 value,a6 Chapter 2, Slide 127 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 5. Coding Problems (cont’d) h. Load A7 with contents of mem1 and postincrement the selected pointer. x16 mem A7 mem1 load_mem1: Chapter 2, Slide 128 10h MVKL .S1 mem1, A6 MVKH .S1 mem1, A6 LDH .D1 *A6++, A7 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Architecture Links: Chapter 2, Slide 129 C6711 data sheet: tms320c6711.pdf C6713 data sheet: tms320c6713.pdf C6416 data sheet: tms320c6416.pdf User guide: spru189f.pdf Errata: sprz173c.pdf Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Chapter 2 TMS320C6000 Architectural Overview - End -
© Copyright 2024 Paperzz