**Department Of Mathematics**

**Ahmadu Bello University, Zaria**

**First Semester Examination 2011/2012 Session**

**COSC303: Introduction to Computer Architecture**

**Instructions: Answer Any Four Questions. Time Allowed: 2 Hours**

1. Explain the Principle of locality and discuss how caches are used to exploit it for a performance benefit.
2. A computer has a memory hierarchy that consists of four levels: level1-cache, a level2-cache, main memory and a disk used for virtual memory. Let h1, h2, h3, h4 be the miss ratio of level1-cache, level2-cache, main memory and a disk respectively and Let t1, t2, t3, t4 be the access time of level1-cache, level2-cache, main memory and disk respectively.
3. Define an expression for the average memory access time of this system.
4. If t1=20ns, t2=30ns, t3=60ns, t4=1200ns and h1=10%, h2=20%, h3=40%, h4=0% what is the average time in ns required to access a referenced word on this system.
5. Define the terms "structural hazard", "control hazard", and "data hazard" in the context of pipelines. For each of them, suggest a way, either in software or hardware, how the effect could be reduced
6. Consider the execution of the following sequence on a on a five-stage pipeline consisting of IF, ID, OF, IE, and IS.

|  |  |
| --- | --- |
| Instruction number | Instruction |
| 1 | Load -1,R2 |
| 2 | Load 5,R1 |
| 3 | Sub R2,1,R2 |
| 4 | Add R1,R2,R3 |
| 5 | Add R4,R5,R6 |
| 6 | Shift R3 |
| 7 | Add R6,R4,R7 |

1. Identify all the data dependencies and their type between those instructions
2. By using the Gantt’s chart illustrates the progression of these instructions in the pipeline taking into consideration the data dependencies identified in (i) above.
3. What is the speedup gained in introducing pipelining on this machine?
4. What is the throughput?
5. Can data dependency occur in a single processor non pipelined machine? Explain your answer.
6. Describe the fetch-execute cycle
7. Consider a machine with three instruction classes and CPI measurements as shown in (a). Suppose that we measured the code for a given program in two different compilers and obtained the data in (b):

|  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| |  |  | | --- | --- | | Instruction Class | CPI of the instruction class | | A | 2 | | B | 5 | | C | 7 |   (a) | |  |  |  |  | | --- | --- | --- | --- | | Code Sequence | Instruction counts(in millions) | | | | A | B | C | | Compiler1 | 15 | 5 | 3 | | Compiler 2 | 25 | 2 | 2 |   (b) |

* 1. What is the CPI for each code sequence?
  2. Assume that the machine’s clock rate is 500 MHz, Which code sequence will execute faster :
     + - According to MIPS?
       - According to execution time?

1. Consider having a program that runs in 50 s on computer A, which has a 3GHz clock rate. We would like to run the same program on another machine, B, in 20 s. If machine B requires 2 times as many clock cycles as machine A for the same program, what clock rate must machine B have in MHz?
2. Suppose that we have two implementations of the same instruction set architecture. Machine A has a clock rate of 2GHZ and a CPI of 4.0 for some program, and machine B has a clock rate of 1.5GHZ and a CPI of 2.5 for the same program. Which machine is faster and by which factor?
3. Briefly Explain what you understand by the following terms:
4. instruction-level parallelism
5. Pipelining
6. Cache Coherency
7. Random Access Memory
8. Three enhancements with the following speedups are proposed for a new machine:

Speedup(a) = 30, Speedup(b) = 20, and Speedup(c) =15.

Assume that for some set of programs, the fraction of use is 25% for enhancement (a), 30% for enhancement (b), and 45% for enhancement (c).

If only one enhancement can be implemented, which should be chosen to maximize the speedup?

1. Consider a non-pipelined machine with 5 execution stages of lengths 40ns, 40 ns, 60ns, 60ns, and 40ns.
   * 1. Find the instruction latency of this machine
     2. How much time does it take to execute 100 instructions?

Suppose we introduce pipelining on this machine. Assume that when introducing pipelining, the clock skew adds 1ns of overhead to each execution stage

* + 1. What is the instruction latency on the pipelined machine?
    2. How much time does the pipelined machine take to execute 100 instructions?
    3. Can we conclude that an increasing number of stages always provide increasing performance?

1. Suppose you own a computer with 7-stage pipelining of equal length that exhibits the following properties on the programs that you run:

* The pipeline can accept a new instruction every cycle
* The cache can provide data every cycle (i.e. no penalty for cache hits)
* The cache miss rate is 2%
* 20% of instructions are memory instructions
* The cache miss penalty is 20 cycles.

1. What is the average number of cycle taken to execute 100 instructions?
2. If the computer frequency is 2GHZ, what is the average time in second required to execute 100 instructions on this system?