CSIS1120/COMP2120 COMPUTER ORGANIZATION
Date: 22 May 2015
1. (a) What are big-endian and little-endian modes? [2]
(b) A machine uses paging for its main memory. Explain why an array subscript that is out of bounds sometimes causes a Memory Access Violation, and sometimes does not. [3]
(c) Write down two uses of the displacement addressing mode that make it the most commonly used addressing mode. [2]
(d) Most arithmetic operations require 3 operands, two source operands and one destination operand. Explain where the operands are for two-operand, one-operand, and zero-operand instructions. [3]
(e) A CPU executes in either user mode or supervisor mode. In which mode does a user program run, and in which mode does the Operating System run? How can a user program switch to supervisor mode to access system services? [3]
2. (a) Write down the 8-bit one's complement, two's complement and excess-127 representations of the numbers 43 and -43. [4]
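The three encodings named in 2(a) can be checked mechanically. The following is a minimal Python sketch, not part of the original paper; the function names are hypothetical, and excess-127 is assumed to mean that the stored pattern is the value plus 127.

```python
def ones_complement(value: int, bits: int = 8) -> str:
    """Encode value in one's complement: negatives invert every bit of |value|."""
    mask = (1 << bits) - 1
    pattern = value if value >= 0 else (~(-value)) & mask
    return format(pattern, f"0{bits}b")


def twos_complement(value: int, bits: int = 8) -> str:
    """Encode value in two's complement by reducing it modulo 2**bits."""
    return format(value & ((1 << bits) - 1), f"0{bits}b")


def excess_127(value: int, bits: int = 8) -> str:
    """Encode value in excess-127 (assumed: stored pattern = value + 127)."""
    return format(value + 127, f"0{bits}b")


for v in (43, -43):
    print(v, ones_complement(v), twos_complement(v), excess_127(v))
```

For a positive value, the one's complement and two's complement patterns coincide with the plain binary form; only the negative value distinguishes them.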
(b) (i) What is sign extension? [1]
(ii) Write down the algorithm for sign extending a two's complement number from m bits to n bits, m < n. [1]
(iii) Given an 8-bit two's complement value 10110011, extend it to 12 bits. [1]
(iv) Prove that your sign extension algorithm in (ii) provides the correct result. [3]
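One common way to sign extend is to copy the most significant bit into the new high-order positions. The Python sketch below illustrates that approach on bit strings and checks that the numeric value is unchanged; it is an illustration under that assumption, not a model answer, and the function names are hypothetical.

```python
def sign_extend(bits: str, n: int) -> str:
    """Extend an m-bit two's complement bit string to n bits by replicating the sign bit."""
    m = len(bits)
    if n < m:
        raise ValueError("target width must not be smaller than the source width")
    return bits[0] * (n - m) + bits


def twos_complement_value(bits: str) -> int:
    """Interpret a bit string as a two's complement integer."""
    value = int(bits, 2)
    if bits[0] == "1":
        value -= 1 << len(bits)  # the top bit carries weight -2**(m-1)
    return value


original = "10110011"
extended = sign_extend(original, 12)
print(extended)  # 111110110011
print(twos_complement_value(original), twos_complement_value(extended))  # -77 -77
```

The check at the end is only a numeric spot test for one value; a proof as requested in (iv) would have to argue over the bit weights in general.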
(c) A machine uses a 36-bit word to represent single-precision floating point numbers as follows:
The value represented is given by $(-1)^S \times 1.M \times 2^{E-511}$.
(i) Write down the bit pattern corresponding to the value -13.0625. [4]
(ii) Write down the value corresponding to the bit pattern C07E00000. [4]
(iii) Explain how we can increase the range of the representation without increasing the number of bits for the representation. What is the disadvantage of your proposal? [3]
3. (a) (i) What is a two-level cache system? [2]
(ii) A memory hierarchy is to achieve an average memory access time that is close to that of the upper (faster) level. Based on this, explain the performance enhancement in a two-level cache system. [4]
(iii) In a two-level cache system, the level one cache has a hit time of 1ns (inside the CPU), a hit rate of 90%, and a miss penalty of 20ns. The level two cache has a hit rate of 95% and a miss penalty of 220ns. What is the average memory access time? [3]
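The average memory access time of a two-level cache can be sketched in Python as follows. This assumes one common reading of the figures: the level one miss penalty is the time to reach level two, and the level two miss penalty is added only when level two also misses. Other readings give other numbers, and the function name is hypothetical.

```python
def amat_two_level(l1_hit_time, l1_hit_rate, l1_miss_penalty, l2_hit_rate, l2_miss_penalty):
    """AMAT = L1 hit time + L1 miss rate * (L1 miss penalty + L2 miss rate * L2 miss penalty).

    All times are in nanoseconds; the nesting of the two penalties is an assumption.
    """
    l1_miss_rate = 1 - l1_hit_rate
    l2_miss_rate = 1 - l2_hit_rate
    return l1_hit_time + l1_miss_rate * (l1_miss_penalty + l2_miss_rate * l2_miss_penalty)


amat = amat_two_level(l1_hit_time=1, l1_hit_rate=0.90, l1_miss_penalty=20,
                      l2_hit_rate=0.95, l2_miss_penalty=220)
print(f"{amat:.2f} ns")  # 4.10 ns under the stated assumption
```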
(b) Consider a hypothetical machine with 1024 words of cache memory. They are in a two-way set associative organization, with a cache block size of 128 words, using the LRU replacement algorithm. The cache hit time is 9ns. Suppose the machine can access 4 words of memory in parallel, and the time to transfer the first 4 words from main memory to cache is 50ns, while each subsequent 4 words require 10ns.
Consider the following read pattern (in blocks of 128 words, and block id starts from 0):
0 1 2 3 4 3 4 5 3 10 8 7 0 9 2 8 10 7 9 7 11 13 5 4 8 13 15 10 13 12
and assume each block contains an average of 24 memory references.
(i) What is the cache miss penalty (i.e., time to transfer one block of data from main memory to cache memory)? [2]
(ii) Write down the content of the cache memory (for all the blocks) at the end of the memory references, assuming that the cache is empty at the beginning. [5]
(iii) Write down the number of cache misses (the first reading of a block is also considered a miss), and the cache hit rate. [3]
(iv) Calculate the average memory access time. [2]
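The behaviour described in 3(b) can be explored with a small simulation. The Python sketch below is one possible model, not a marking scheme: it assumes that block i maps to set i mod 4 (1024 / 128 = 8 blocks, two per set), that each block read costs at most one miss, and that the 30 block reads stand for 30 x 24 memory references. All function names are hypothetical.

```python
from collections import OrderedDict


def miss_penalty_ns(block_words=128, words_per_transfer=4, first_ns=50, next_ns=10):
    """Time to move one block: the first transfer costs first_ns, each later one next_ns."""
    transfers = block_words // words_per_transfer
    return first_ns + (transfers - 1) * next_ns


def simulate_lru(pattern, num_sets=4, ways=2):
    """Return (miss count, final contents per set, ordered from LRU to MRU)."""
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for block in pattern:
        lines = sets[block % num_sets]      # assumed mapping: set = block id mod num_sets
        if block in lines:
            lines.move_to_end(block)        # hit: mark as most recently used
        else:
            misses += 1
            if len(lines) == ways:
                lines.popitem(last=False)   # evict the least recently used block
            lines[block] = True
    return misses, [list(s) for s in sets]


pattern = [0, 1, 2, 3, 4, 3, 4, 5, 3, 10, 8, 7, 0, 9, 2, 8, 10, 7, 9, 7, 11, 13,
           5, 4, 8, 13, 15, 10, 13, 12]
misses, contents = simulate_lru(pattern)
references = len(pattern) * 24              # assumed: 24 references per block read
hit_rate = (references - misses) / references
amat = 9 + (misses / references) * miss_penalty_ns()
print(miss_penalty_ns(), misses, contents)
print(f"hit rate {hit_rate:.4f}, AMAT {amat:.1f} ns")
```

Changing `num_sets` and `ways` (keeping their product at 8) shows how the same read pattern behaves under a different associativity.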
(c) Consider a Hard Disk with an average seek time of 15ms, 1ms for moving to an adjacent track, and a rotation speed of 5400rpm, with an average of 500 sectors/track, and each sector is 512 bytes.
(i) What is the average rotation latency? [2]
(ii) What is the time required for 1 sector to rotate under the read/write head? [2]
(iii) What is the time to read 10 consecutive tracks entirely? [3]
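The disk timings in 3(c) follow from the rotation speed. The Python sketch below shows the arithmetic; for the multi-track read it assumes one average seek to the first track, an average rotational latency plus one full revolution on every track, and one adjacent-track move between tracks. That accounting is an assumption, since the question does not fix it, and the function names are hypothetical.

```python
def revolution_ms(rpm):
    """Time for one full revolution in milliseconds."""
    return 60_000 / rpm


def avg_rotational_latency_ms(rpm):
    """On average the wanted sector is half a revolution away."""
    return revolution_ms(rpm) / 2


def sector_time_ms(rpm, sectors_per_track):
    """Time for one sector to pass under the read/write head."""
    return revolution_ms(rpm) / sectors_per_track


def read_tracks_ms(rpm, tracks, avg_seek_ms, adjacent_seek_ms):
    """Assumed model: initial seek, then (latency + full revolution) per track, plus track moves."""
    per_track = avg_rotational_latency_ms(rpm) + revolution_ms(rpm)
    return avg_seek_ms + tracks * per_track + (tracks - 1) * adjacent_seek_ms


print(f"{avg_rotational_latency_ms(5400):.3f} ms")  # 5.556 ms
print(f"{sector_time_ms(5400, 500):.4f} ms")        # 0.0222 ms
print(f"{read_tracks_ms(5400, 10, 15, 1):.2f} ms")  # 190.67 ms under the stated assumptions
```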
4. Given the data path of a CPU as shown below:
Figure 1: A simplified CPU
In the data path shown, the S1, S2 and D fields in the instruction represent the two source and one destination operands, and are directly connected to the address port of the register file. Register S1 will be read into RFOUT1, while register S2 will be read into RFOUT2. RFIN will be written into register D. Registers A and B are the input registers and C is the output register of the ALU. There is a single CPU bus in the data path.
(a) Describe how the following instructions will be executed inside the CPU:
(i) The one-word instruction ADD R1,R2,R3 [i.e. R3 <- R1 + R2] [4]
(ii) The 2-word instruction CALL OFF1(R1), which performs a function call, where OFF1(R1) is the displacement addressing mode used to specify the address of the function, and OFF1 is stored in the second word of the instruction. [8]
(b) (i) Write down the equation governing the relation between instruction execution time and system clock rate. Explain briefly how each factor in the equation affects system performance in a modern computer system. [4]
(ii) Is a system running at a 3GHz clock rate always faster than another system running at 2.5GHz? Explain. [2]
(iii) The CPI (Clocks per Instruction) for an ideal pipeline is 1. However, in real life, we only get a CPI value of, say, 1.8. Explain why we cannot achieve the CPI of an ideal pipeline. [2]
(iv) Is it possible to have a machine with CPI < 1? Explain. [2]
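The usual relation between execution time and clock rate is time = instruction count x CPI / clock rate. The Python sketch below evaluates it for two machines; the instruction count and CPI values are made-up illustrative numbers, not figures from the paper, and the function name is hypothetical.

```python
def execution_time_s(instruction_count, cpi, clock_hz):
    """Execution time in seconds = instruction count * cycles per instruction / clock rate."""
    return instruction_count * cpi / clock_hz


# Hypothetical workloads: same instruction count, different CPI and clock rate.
machine_3ghz = execution_time_s(1_000_000, cpi=2.0, clock_hz=3.0e9)
machine_2_5ghz = execution_time_s(1_000_000, cpi=1.5, clock_hz=2.5e9)
print(f"3 GHz:   {machine_3ghz * 1e6:.1f} us")
print(f"2.5 GHz: {machine_2_5ghz * 1e6:.1f} us")
```

With these assumed values the machine with the lower clock rate finishes first, which shows that clock rate alone does not determine execution time.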
5. (a) A driverless car has two sensors to detect the distance between the car and the car in front, and the speed of the car in front. It also has a sensor to detect its own speed. These sensors are connected to the onboard computer with a Control and Status Register VCSR and three Buffer Registers VBRS1 (buffer register for the speed of the driverless car), VBRS2 (buffer register for the speed of the car in front), and VBRD (distance between the two cars). Assume that the speed is measured in centimeters per second, and the distance in centimeters, all in integers. For simplicity, also assume that there are no special cases, such as no car in front, distance overflow, etc.
The VCSR has the following format:
Bit 0 Own speedometer Ready bit, the own speedometer is ready
Bit 1 Set this bit to start reading the own speedometer.
Bit 2 Reading is available in VBRS1, automatically cleared after data read.
Bit 3 Front speedometer Ready bit, the front speedometer is ready
Bit 4 Set this bit to start reading the front speedometer.
Bit 5 Reading is available in VBRS2, automatically cleared after data read.
Bit 6 Distance sensor ready bit, the distance sensor is ready
Bit 7 Set this bit to start reading the distance sensor.
Bit 8 Reading is available in VBRD, automatically cleared after data read.
Bit 9 Apply Brake, with level 0-3, given by Bits 10 and 11.
Bit 10-11 Brake level.
Write an assembly language program using Programmed I/O to read the speed of the two cars, and their distance. If their distance is more than 4 seconds (i.e. distance > (own speed - front speed) x 4), set the brake level to 0.
If their distance is more than 3 seconds, set the brake level to 1. If their distance is more than 2 seconds, set the brake level to 2. If their distance is more than 1 second, set the brake level to 3.
(You may invent your own instruction set as long as it is reasonable, and may assume there is an integer multiply instruction. Comment your program so that it can be understood.) [10]
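The control flow of the polling program can be prototyped in a high-level language before it is written in assembly. The Python sketch below is such a prototype, not an answer in the requested assembly form. `FakeDevice` is a hypothetical software stand-in for VCSR and the buffer registers, and returning level 3 when the distance is 1 second or less is an assumption, since the question does not specify that case.

```python
class FakeDevice:
    """Hypothetical simulation of VCSR and the three buffer registers."""

    def __init__(self, own_speed, front_speed, distance):
        self.vcsr = (1 << 0) | (1 << 3) | (1 << 6)   # all three sensors report ready
        self.buffers = {2: own_speed, 5: front_speed, 8: distance}  # keyed by "available" bit

    def write_vcsr(self, value):
        self.vcsr = value
        for start_bit in (1, 4, 7):                  # a set start bit yields a reading at once
            if value & (1 << start_bit):
                self.vcsr &= ~(1 << start_bit)
                self.vcsr |= 1 << (start_bit + 1)

    def read_buffer(self, avail_bit):
        self.vcsr &= ~(1 << avail_bit)               # cleared automatically after the read
        return self.buffers[avail_bit]


def poll_sensor(dev, ready_bit):
    """Programmed I/O: busy-wait on ready, start the reading, busy-wait on available, read."""
    start_bit, avail_bit = ready_bit + 1, ready_bit + 2
    while not dev.vcsr & (1 << ready_bit):
        pass
    dev.write_vcsr(dev.vcsr | (1 << start_bit))
    while not dev.vcsr & (1 << avail_bit):
        pass
    return dev.read_buffer(avail_bit)


def brake_level(own_speed, front_speed, distance):
    """Level 0 above 4 s of closing distance, 1 above 3 s, 2 above 2 s, otherwise 3 (assumed)."""
    closing_speed = own_speed - front_speed
    for level, seconds in enumerate((4, 3, 2)):
        if distance > closing_speed * seconds:
            return level
    return 3


dev = FakeDevice(own_speed=3000, front_speed=2000, distance=3500)
own, front, dist = poll_sensor(dev, 0), poll_sensor(dev, 3), poll_sensor(dev, 6)
print(brake_level(own, front, dist))  # 1
```

Each `while` loop corresponds to a test-and-branch pair in assembly, and `brake_level` corresponds to a multiply followed by a chain of compare-and-branch instructions.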
(b) (i) There is a lot of overhead for Programmed I/O. What are the other two I/O techniques that require less CPU intervention? Explain briefly what they are in no more than three lines. [4]
(ii) Which technique in part (i) is more suitable for a Gigabit Ethernet Network Controller? Give a one-sentence explanation. [2]