ECE 562/468 – Advanced Computer Architecture Homework #1

Homework #1 Solution – Spring 2015
Instructor – Dr. Honggang Wang
1.1
[10/10/Discussion] <1.5, 1.5> Figure 1.22 gives the relevant chip statistics that
influence the cost of several current chips. In the next few exercises, you will be
exploring the trade-offs involved between the AMD Opteron, a single-chip
processor, and the Sun Niagara, an 8-core chip.
a. [10] <1.5> What is the yield for the AMD Opteron?
b. [10] <1.5> What is the yield for an 8-core Sun Niagara processor?
c. [Discussion] <1.4, 1.6> Why does the Sun Niagara have a worse yield than
the AMD Opteron, even though they have the same defect rate?
Solution:
a.)
From Figure 1.22, for AMD Opteron,
Defects per unit area = 0.75 per cm²
Die Area = 199 mm² = 1.99 cm²
Wafer Yield = 1 (100%)
Plugging these values into the die-yield formula gives
Die Yield = 1 * (1 + 0.75 * 1.99/4)^(-4) ≈ 0.2813
b.)
Similarly, using the formula for Die Yield with a die area of 3.8 cm²,
Yield for 8-core Sun Niagara Processor = (1 + 0.75 * 3.8/4)^(-4) = 0.1163
c.) The die of the 8-core Sun Niagara (3.8 cm²) is larger than that of the AMD Opteron
(1.99 cm²). A larger die is more likely to contain at least one defect, so for the same
defect rate the yield of the Sun processor is lower than that of the AMD Opteron.
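As a quick check, the two yield calculations above can be reproduced with a short Python sketch. The function name `die_yield` and the parameter name `alpha` (for the constant 4 that appears in the formula) are labels chosen here for illustration, not names from the assignment; the inputs are the values used in parts (a) and (b).

```python
def die_yield(defects_per_cm2, die_area_cm2, wafer_yield=1.0, alpha=4.0):
    """Die yield = wafer_yield * (1 + defects * area / alpha) ** (-alpha)."""
    return wafer_yield * (1.0 + defects_per_cm2 * die_area_cm2 / alpha) ** (-alpha)


# Inputs taken from the solution above: 0.75 defects per cm^2 for both chips.
opteron = die_yield(0.75, 1.99)  # 1.99 cm^2 die
niagara = die_yield(0.75, 3.80)  # 3.8 cm^2 die, as used in part (b)

print(f"AMD Opteron yield: {opteron:.4f}")
print(f"Sun Niagara yield: {niagara:.4f}")
```

Because the defect rate and the exponent are identical for both chips, the only input that differs is the die area, which is the point made in part (c).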
1.4 [20/10/20] <1.6> Figure 1.23 presents the power consumption of several computer
system components. In this exercise, we will explore how the hard drive affects power
consumption for the system.
a. [20] <1.6> Assuming the maximum load for each component, and a power supply
efficiency of 70%, what wattage must the server’s power supply deliver to a system with
a Sun Niagara 8-core chip, 2 GB 184-pin Kingston DRAM, and two 7200 rpm hard
drives?
b. [10] <1.6> How much power will the 7200 rpm disk drive consume if it is idle roughly
40% of the time?
c. [20] <1.6> Assume that rpm is the only factor in how long a disk is not idle (which is
an oversimplification of disk performance). In other words, assume that for the same set
of requests, a 5400 rpm disk will require twice as much time to read data as a 10,800 rpm
disk. What percentage of the time would the 5400 rpm disk drive be idle to perform the
same transactions as in part (b)?
Solution:
a.
The 2 GB RAM requirement is met using two 1-GB modules.
Actual power required by system = Sum of power requirements of all components
(Chip + 2 RAM + 2 Hard Drives)
Taking peak power requirements:
P1 = Power required by Sun Niagara 8-core chip = 79 W
P2 = Power required by two 1-GB 184-pin RAMs = 2 * 3.7 = 7.4 W
P3 = Power required by two 7200 rpm hard drives = 2 * 7.9 = 15.8 W
Total power required by system = P = P1 + P2 + P3
= 79 + 7.4 + 15.8 = 102.2 W
Given power supply efficiency = 70%
So, total power to be supplied = 102.2 * 100/70 = 146 W
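The arithmetic in part (a) can be checked with a minimal Python sketch. The variable names are illustrative only; the wattages and the 70% efficiency are the figures quoted in the solution above.

```python
# Peak power figures quoted in the solution (watts).
chip_w = 79.0   # Sun Niagara 8-core chip
dram_w = 3.7    # one 1-GB 184-pin DRAM module
disk_w = 7.9    # one 7200 rpm hard drive (read/seek)
efficiency = 0.70

# Two DRAM modules and two disks, as in the solution.
load_w = chip_w + 2 * dram_w + 2 * disk_w

# The supply is only 70% efficient, so the load is divided by the efficiency.
supply_w = load_w / efficiency

print(f"Component load: {load_w:.1f} W")
print(f"Power to be supplied: {supply_w:.1f} W")
```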
b.
Power consumed by the 7200 rpm hard drive:
when idle = 4 W
when reading/seeking = 7.9 W
Hence, average power consumed (idle 40% of the time, active the other 60%)
= 0.6 * 7.9 + 0.4 * 4
= 4.74 + 1.6 = 6.34 W
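The same time-weighted average can be written as a small Python helper, which makes it easy to try other idle fractions. The function name `average_power` is a hypothetical label; the default wattages are the idle and read/seek figures used above.

```python
def average_power(idle_fraction, idle_w=4.0, active_w=7.9):
    """Time-weighted average power for a disk that is idle part of the time."""
    if not 0.0 <= idle_fraction <= 1.0:
        raise ValueError("idle_fraction must be between 0 and 1")
    return idle_fraction * idle_w + (1.0 - idle_fraction) * active_w


avg_w = average_power(0.40)  # idle 40% of the time, as in part (b)
print(f"Average power: {avg_w:.2f} W")
```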
c.
It is given that a 5400 rpm hard drive takes twice as long as a 10800 rpm
hard drive for the same job.
We can derive the relation:
Ratio of RPM = 10800/5400 = 2,
For the same amount of work, ratio of time taken by 10800 rpm ( T10800 ) and time taken
by 5400 rpm ( T5400) = T10800/T5400=1/2
In general, for any two hard drives A and B doing the same amount of work:
TimeA / TimeB = RPMB / RPMA
Ratio of RPMs of the 7200 and 5400 rpm hard drives = 7200/5400 = 4/3
Out of 100 units of time, the 7200 rpm HDD does ‘W’ amount of work in 60 units of time
(it is idle 40% of the time).
Therefore T7200 = 60
T5400, the time taken by the 5400 rpm HDD for the same work, is such that
T7200/T5400 = 60/T5400 = 3/4
So T5400 = 80. Hence the 5400 rpm HDD is busy for 80 units of time, or 80% of the time,
and idle for 20% of the time.
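The scaling argument in part (c) can be sketched in Python under the stated simplification that busy time is inversely proportional to rpm. The function name `idle_fraction` and its parameters are illustrative, not part of the assignment.

```python
def idle_fraction(ref_rpm, ref_idle, new_rpm):
    """Idle fraction of a disk at new_rpm doing the work of a disk at ref_rpm.

    Assumes busy time scales as 1/rpm (the simplification stated in the problem).
    """
    busy_ref = 1.0 - ref_idle
    busy_new = busy_ref * ref_rpm / new_rpm
    if busy_new > 1.0:
        raise ValueError("the slower disk cannot finish the work in the same period")
    return 1.0 - busy_new


# 7200 rpm disk idle 40% of the time, compared with a 5400 rpm disk.
idle_5400 = idle_fraction(ref_rpm=7200, ref_idle=0.40, new_rpm=5400)
print(f"5400 rpm disk idle fraction: {idle_5400:.0%}")
```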
1.13 [10/10/Discussion] <1.8>
Imagine that your company is trying to decide between a single-processor system
and a dual-processor system. Figure 1.26 gives the performance on two sets of
benchmarks—a memory benchmark and a processor benchmark. You know that
your application will spend 30% of its time on memory-centric computations, and
70% of its time on processor-centric computations.
a. [10] <1.8> Calculate the weighted performance of the benchmarks for the
Pentium 4 and Athlon 64 X2 3800+.
b. [10] <1.8> How much speedup do you anticipate getting if you move from
using a Pentium 4 to an Athlon 64 X2 3800+ on a memory-intensive application
suite?
c. [Discussion] <1.8> You are using a dual-core Athlon processor, and you are
choosing between two ways to implement the same algorithm. The first is to create
a large lookup table to store 4K words of data. When you need the result, you look
up the answer. The second method would be to calculate the result in a very tight
loop. What are the advantages and disadvantages of each implementation?
Solution:
a. It is given that 30% of operations are memory-centric and 70% are CPU-centric.
The weighted performance of each processor on the benchmarks is computed as follows.
Weighted Performance = MemoryPerformance * 0.30 + DhrystonePerformance * 0.70
Pentium 4:
2731 * 0.3 + 7621 * 0.7 = 6154.0
Athlon 64 X2 3800+:
2941 * 0.3 + 17129 * 0.7 = 12872.6
b. For a memory-intensive application suite, the speed-up from the Pentium 4 to the
Athlon 64 X2 3800+ can be measured as the ratio of their memory performance:
Speed-up = 2941/2731 ≈ 1.08
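Parts (a) and (b) can be checked together with a short Python sketch. The dictionary layout and the name `weighted_performance` are choices made here for illustration; the benchmark scores are the ones used in the solution, weighted 30% memory and 70% processor as stated in the problem.

```python
# Benchmark scores used in the solution (memory benchmark, Dhrystone).
scores = {
    "Pentium 4": {"memory": 2731, "dhrystone": 7621},
    "Athlon 64 X2 3800+": {"memory": 2941, "dhrystone": 17129},
}

# Application mix from the problem statement: 30% memory, 70% processor.
weights = {"memory": 0.30, "dhrystone": 0.70}


def weighted_performance(chip_scores):
    """Weighted sum of the benchmark scores for one processor."""
    return sum(weights[name] * chip_scores[name] for name in weights)


weighted = {chip: weighted_performance(s) for chip, s in scores.items()}

# Part (b): memory-intensive suite, so only the memory scores are compared.
memory_speedup = (
    scores["Athlon 64 X2 3800+"]["memory"] / scores["Pentium 4"]["memory"]
)

for chip, value in weighted.items():
    print(f"{chip}: {value:.1f}")
print(f"Memory speed-up: {memory_speedup:.3f}")
```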
c. Either approach can achieve the same performance for a given ratio of memory to
processor operations. The choice depends on the trade-off between cost and
performance. An example follows:
Example:
Let x be the fraction of the computation that is memory-centric, so that (1 - x) is processor-centric.
Then, for equal performance, we can consider the following equation:
3501 * x + 11210 * (1 - x) = 3000 * x + 15220 * (1 - x)
=> 4511 * x = 4010
=> x = 0.8889
Thus, the performance of the Pentium 4 570 equals that of the Pentium D 820 when 88.89%
of the operations are memory operations and 11.11% are processor operations.
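The break-even point in the example can also be solved in Python. This is a sketch only: the function name `break_even_fraction` and the labels "system A" and "system B" are hypothetical, and the four scores are the coefficients that appear in the example's equation.

```python
def break_even_fraction(mem_a, cpu_a, mem_b, cpu_b):
    """Memory fraction x at which two systems have equal weighted performance.

    Solves mem_a*x + cpu_a*(1 - x) = mem_b*x + cpu_b*(1 - x) for x.
    """
    # Collecting the x terms gives x * ((mem_a - mem_b) + (cpu_b - cpu_a)) = cpu_b - cpu_a.
    denominator = (mem_a - mem_b) + (cpu_b - cpu_a)
    if denominator == 0:
        raise ValueError("no unique break-even point for these scores")
    return (cpu_b - cpu_a) / denominator


# Coefficients from the example: system A on the left side, system B on the right.
x = break_even_fraction(mem_a=3501, cpu_a=11210, mem_b=3000, cpu_b=15220)
print(f"Break-even memory fraction: {x:.4f}")
```

A result between 0 and 1 means a real crossover exists; a value outside that range would mean one system is faster for every mix of memory and processor work.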