ECE 562/468 – Advanced Computer Architecture
Homework #1 Solution – Spring 2015
Instructor – Dr. Honggang Wang

1.1 [10/10/Discussion] <1.5, 1.5> Figure 1.22 gives the relevant chip statistics that influence the cost of several current chips. In the next few exercises, you will be exploring the trade-offs involved between the AMD Opteron, a single-chip processor, and the Sun Niagara, an 8-core chip.
a. [10] <1.5> What is the yield for the AMD Opteron?
b. [10] <1.5> What is the yield for an 8-core Sun Niagara processor?
c. [Discussion] <1.4, 1.6> Why does the Sun Niagara have a worse yield than the AMD Opteron, even though they have the same defect rate?

Solution:
a. From Figure 1.22, for the AMD Opteron:
Defects per unit area = 0.75 per cm²
Die area = 199 mm² = 1.99 cm²
Wafer yield = 1 (100%)
Plugging these values into the die yield formula gives
$\text{Die Yield} = (1 + 0.75 \times 1.99 / 4)^{-4} \approx 0.2813$
b. Similarly, using the die yield formula with a die area of 3.8 cm², the yield for the 8-core Sun Niagara processor is
$\text{Die Yield} = (1 + 0.75 \times 3.8 / 4)^{-4} \approx 0.1163$
c. The die area of the 8-core Sun Niagara is greater than that of the AMD Opteron. Hence, for the same defect rate, each Niagara die is more likely to contain a defect, and its yield is lower than the Opteron's.

1.4 [20/10/20] <1.6> Figure 1.23 presents the power consumption of several computer system components. In this exercise, we will explore how the hard drive affects power consumption for the system.
a. [20] <1.6> Assuming the maximum load for each component, and a power supply efficiency of 70%, what wattage must the server's power supply deliver to a system with a Sun Niagara 8-core chip, 2 GB 184-pin Kingston DRAM, and two 7200 rpm hard drives?
b. [10] <1.6> How much power will the 7200 rpm disk drive consume if it is idle roughly 40% of the time?
c. [20] <1.6> Assume that rpm is the only factor in how long a disk is not idle (which is an oversimplification of disk performance). In other words, assume that for the same set of requests, a 5400 rpm disk will require twice as much time to read data as a 10,800 rpm disk.
What percentage of the time would the 5400 rpm disk drive be idle to perform the same transactions as in part (b)?

Solution:
a. The 2 GB RAM requirement is met using two 1 GB modules.
Actual power required by the system = sum of the power requirements of all components (chip + 2 RAM modules + 2 hard drives).
Taking peak power requirements:
P1 = power required by the Sun Niagara 8-core chip = 79 W
P2 = power required by two 1 GB 184-pin RAM modules = 2 × 3.7 = 7.4 W
P3 = power required by two 7200 rpm hard drives = 2 × 7.9 = 15.8 W
Total power required by the system = P = P1 + P2 + P3 = 79 + 7.4 + 15.8 = 102.2 W
Given a power supply efficiency of 70%, the total power to be supplied = 102.2 × 100/70 = 146 W

b. Power consumed by the 7200 rpm hard drive:
when idle = 4 W
read/seek = 7.9 W
Hence, the average power consumed (idle 40% of the time, active 60%) = 0.6 × 7.9 + 0.4 × 4 = 4.74 + 1.6 = 6.34 W

c. Given that a 5400 rpm hard drive takes twice as long as a 10,800 rpm hard drive for the same job, we can derive the relation:
Ratio of RPMs = 10800/5400 = 2
For the same amount of work, the ratio of the time taken by the 10,800 rpm drive to the time taken by the 5400 rpm drive is
$T_{10800} / T_{5400} = 1/2$
In general, for any two hard drives A and B doing the same amount of work:
$\frac{\text{Time}_A}{\text{Time}_B} = \frac{\text{RPM}_B}{\text{RPM}_A}$
Ratio of RPMs of the 7200 and 5400 rpm hard drives = 7200/5400 = 4/3
Out of 100 units of time, the 7200 rpm HDD does W amount of work in 60 units of time (it is idle 40% of the time). Therefore $T_{7200} = 60$, and the time taken by the 5400 rpm HDD satisfies
$60 / T_{5400} = 3/4$, so $T_{5400} = 80$
Hence the 5400 rpm HDD is busy for 80 units of time, or 80% of the time, and is idle for 20% of the time.

1.13 [10/10/Discussion] <1.8> Imagine that your company is trying to decide between a single-processor system and a dual-processor system. Figure 1.26 gives the performance on two sets of benchmarks: a memory benchmark and a processor benchmark.
You know that your application will spend 30% of its time on memory-centric computations, and 70% of its time on processor-centric computations.
a. [10] <1.8> Calculate the weighted performance of the benchmarks for the Pentium 4 and Athlon 64 X2 3800+.
b. [10] <1.8> How much speedup do you anticipate getting if you move from using a Pentium 4 to an Athlon 64 X2 3800+ on a memory-intensive application suite?
c. [Discussion] <1.8> You are using a dual-core Athlon processor, and you are choosing between two ways to implement the same algorithm. The first is to create a large lookup table to store 4K words of data. When you need the result, you look up the answer. The second method would be to calculate the result in a very tight loop. What are the advantages and disadvantages of each implementation?

Solution:
a. It is given that 30% of the time is memory-centric and 70% is processor-centric. The weighted performance of each processor on the benchmarks is:
Weighted Performance = Memory Performance × 0.30 + Dhrystone Performance × 0.70
Pentium 4: 2731 × 0.3 + 7621 × 0.7 = 6154.0
Athlon 64 X2 3800+: 2941 × 0.3 + 17129 × 0.7 = 12872.6

b. For a memory-intensive suite, the speed-up from the Pentium 4 to the Athlon 64 X2 3800+ can be measured as the ratio of their memory performance:
Speed-up = 2941/2731 ≈ 1.08

c. Either implementation can achieve the same performance for a certain ratio of memory to processor work. The choice depends on the trade-off between cost and performance. An example follows.
Example: Let x be the fraction of memory-centric computation. Then, for equal performance, we can consider the following equation:
$3501x + 11210(1 - x) = 3000x + 15220(1 - x)$
$4511x = 4010$
$x \approx 0.8889$
Thus, the performance of the Pentium 4 570 equals that of the Pentium D 820 when there are 88.89% memory operations and 11.11% processor operations.
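
The arithmetic in problems 1.1, 1.4, and 1.13 can be rechecked with a short script. The following is only a minimal sketch: the function and parameter names are hypothetical (they do not come from the textbook or the course), the exponent of 4 in the die yield model is taken from the solution to 1.1, and all input values are the figures quoted in the solutions above.

```python
# Hypothetical helper names; a sketch for rechecking the numbers in this
# solution set, not an official implementation.

def die_yield(defects_per_cm2, die_area_cm2, wafer_yield=1.0, alpha=4.0):
    """Die yield model from problem 1.1 (alpha = 4, as used in the solution)."""
    return wafer_yield * (1.0 + defects_per_cm2 * die_area_cm2 / alpha) ** (-alpha)


def supply_wattage(component_watts, efficiency):
    """Problem 1.4(a): total component power divided by supply efficiency."""
    return sum(component_watts) / efficiency


def average_disk_power(idle_fraction, idle_watts, active_watts):
    """Problem 1.4(b): time-weighted average of idle and read/seek power."""
    return idle_fraction * idle_watts + (1.0 - idle_fraction) * active_watts


def idle_fraction_at_rpm(idle_fraction_ref, rpm_ref, rpm_new):
    """Problem 1.4(c): busy time scales inversely with rpm (the stated assumption)."""
    busy_new = (1.0 - idle_fraction_ref) * rpm_ref / rpm_new
    return 1.0 - busy_new


def weighted_performance(memory_score, dhrystone_score, memory_weight=0.30):
    """Problem 1.13(a): weight the two benchmark scores by time spent in each."""
    return memory_weight * memory_score + (1.0 - memory_weight) * dhrystone_score


def crossover_memory_fraction(mem_a, cpu_a, mem_b, cpu_b):
    """Problem 1.13(c) example: solve mem_a*x + cpu_a*(1-x) = mem_b*x + cpu_b*(1-x)."""
    return (cpu_b - cpu_a) / ((mem_a - cpu_a) - (mem_b - cpu_b))


if __name__ == "__main__":
    print("1.1a Opteron yield :", round(die_yield(0.75, 1.99), 4))
    print("1.1b Niagara yield :", round(die_yield(0.75, 3.80), 4))
    print("1.4a supply (W)    :", round(supply_wattage([79, 2 * 3.7, 2 * 7.9], 0.70), 1))
    print("1.4b disk power (W):", round(average_disk_power(0.40, 4.0, 7.9), 2))
    print("1.4c 5400 rpm idle :", round(idle_fraction_at_rpm(0.40, 7200, 5400), 2))
    print("1.13a Pentium 4    :", round(weighted_performance(2731, 7621), 1))
    print("1.13a Athlon 64 X2 :", round(weighted_performance(2941, 17129), 1))
    print("1.13b speed-up     :", round(2941 / 2731, 4))
    print("1.13c crossover x  :", round(crossover_memory_fraction(3501, 11210, 3000, 15220), 4))
```

Keeping each formula in its own function makes it easy to swap in values for a different chip or disk from Figures 1.22, 1.23, and 1.26 without redoing the algebra by hand.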