lw $2, 100($3)solution
lw $4, 200($5)
add $6, $2, $4
sw $6, 300($7)
addi $2, $2, 4
addi $5, $5, 4
addi $7, $7, 4
0 1 2 3 4 5 6 7 8 9 0
nop |_|_|_|_|
lw $2, 100($3) |_|_|_|_|_|
nop |_|_|_|_|
lw $4, 200($5) |_|_|_|_|_|
add $6, $2, $4 |_|___|_|_|
sw $6, 300($7) |_|___|_|_|_|
addi $2, $2, 4 |_|_|_|_|
addi $5, $5, 4 |_|_|_|_|_|
addi $7, $7, 4 |_|_|_|_|
nop |_|_|_|_|_|
0 1 2 3 4 5 6 7 8 9 0 (total: 10 cycles)
Show how the following code would be executed on a CDC 6600. Assume all additions are floating-point.
X1 <- X6 + X7Solution
X1 <- X2 * X3
X5 <- X1 * X4
X4 <- X3 + X2
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3
X1 <- X6 + X7 |_|_|_|_|
X1 <- X2 * X3 |_|_|_|_|_|_|_|_|_|_|
X5 <- X1 * X4 |_________________|_|_|_|_|_|_|_|_|_|
X4 <- X3 + X2 |_|_|_|___________|
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 (total: 23 cycles)
Solution : (No spatiality is explored. Each time we bring only one word into cache.)
Address Decomposition:
|
||||||
(Note: Since the references are word addresses and smallest room in cache is a word, so we just do not touch the byte offsets and act as if the addresses are 30 bits long.) |
Hit or Miss:
Reference | tag | index | Miss/Hit |
2 | 0 | 2 | M |
3 | 0 | 3 | M |
11 | 0 | 11 | M |
16 | 1 | 0 | M |
21 | 1 | 5 | M |
13 | 0 | 13 | M |
64 | 4 | 0 | M |
48 | 3 | 0 | M |
19 | 1 | 3 | M |
11 | 0 | 11 | H |
3 | 0 | 3 | M |
22 | 1 | 6 | M |
4 | 0 | 4 | M |
27 | 1 | 11 | M |
6 | 0 | 6 | M |
11 | 0 | 11 | M |
Cache history:
Index | References |
0 | 16 64 48 |
1 | |
2 | 2 |
3 | 3 19 3 |
4 | 4 |
5 | 21 |
6 | 22 6 |
7 | |
8 | |
9 | |
10 | |
11 | 11 11 27 11 |
12 | |
13 | 13 |
14 | |
15 |
Final content:
Index | Valid bit | Tag | Data |
0 | 1 | 3 | mem[48] |
1 | 0 | ||
2 | 1 | 0 | mem[2] |
3 | 1 | 1 | mem[19] |
4 | 1 | 0 | mem[4] |
5 | 1 | 1 | mem[21] |
6 | 1 | 0 | mem[6] |
7 | 0 | ||
8 | 0 | ||
9 | 0 | ||
10 | 0 | ||
11 | 1 | 1 | mem[11] |
12 | 0 | ||
13 | 1 | 0 | mem[13] |
14 | 0 | ||
15 | 0 |
2. Text, question 7.10: Do the same exercise as those in the
previous question except for a direct-mapped cache with four-word blocks
and a total size of 16 words.
Solution : (This time we take advantages of spatiality. Each
time we bring 4 words into cache.)
Address Decomposition:
|
||||||||
(Note: Since the references are word addresses and smallest room in cache is a word, so we just do not touch the byte offsets and act as if the addresses are 30 bits long.) |
Hit or Miss:
Reference | tag | index | block offset | Miss/Hit |
2 | 0 | 0 | 2 | M |
3 | 0 | 0 | 3 | H |
11 | 0 | 2 | 3 | M |
16 | 1 | 0 | 0 | M |
21 | 1 | 1 | 1 | M |
13 | 0 | 3 | 1 | M |
64 | 4 | 0 | 0 | M |
48 | 3 | 0 | 0 | M |
19 | 1 | 0 | 3 | M |
11 | 0 | 2 | 3 | H |
3 | 0 | 0 | 3 | M |
22 | 1 | 1 | 2 | H |
4 | 0 | 1 | 0 | M |
27 | 1 | 2 | 3 | M |
6 | 0 | 1 | 2 | H |
11 | 0 | 2 | 3 | M |
Cache history:
Index | word 0 | word 1 | word 2 | word 3 |
0 | 4 16 64 48 20 4 | 5 17 65 49 21 5 | 2 18 66 50 22 6 | 3 19 67 51 19 3 |
1 | 24 4 | 21 5 | 22 6 | 23 7 |
2 | 12 28 12 | 13 29 13 | 14 30 14 | 11 27 11 |
3 | 16 | 13 | 14 | 15 |
Final content:
Index | tag | word 0 | word 1 | word 2 | word 3 |
0 | 0 | mem[4] | mem[5] | mem[6] | mem[3] |
1 | 0 | mem[4] | mem[5] | mem[6] | mem[7] |
2 | 0 | mem[12] | mem[13] | mem[14] | mem[11] |
3 | 0 | mem[16] | mem[13] | mem[14] | mem[15] |
3. Also, perform the same exercise as above except assuming
a two-way set-associative cache using LRU replacement, with a total of 16
one-word blocks.
Solution : We take advantages of associativity. Each time
we bring one words into cache. But we use LRU placement policy instead of
direct mapping. There are 16/2=8 sets of blocks in the cache.
Address Decomposition:
|
||||||
(Note: Since the references are word addresses and smallest room in cache is a word, so we just do not touch the byte offsets and act as if the addresses are 30 bits long.) |
Hit or Miss:
Reference | tag | set index | Miss/Hit |
2 | 0 | 2 | M |
3 | 0 | 3 |
M |
11 | 1 | 3 | M |
16 | 2 | 0 | M |
21 | 2 | 5 | M |
13 | 1 | 5 | M |
64 | 8 | 0 | M |
48 | 6 | 0 | M |
19 | 2 | 3 | M |
11 | 1 | 3 | H |
3 | 0 | 3 | M |
22 | 2 | 6 | M |
4 | 0 | 4 | M |
27 | 3 | 3 | M |
6 | 0 | 6 | M |
11 | 1 | 3 | M |
Cache history:
Index | Tag | Data | Tag | Data |
0 | 6 | 16 48 | 8 | 64 |
1 | ||||
2 | 0 | 2 | ||
3 | 1 | 3 19 3 11 | 3 | 11 11 27 |
4 | 0 | 4 | ||
5 | 2 | 21 | 1 | 13 |
6 | 2 | 22 | 0 | 6 |
7 |
Final content:
Index | Tag | Data | Tag | Data |
0 | 6 | mem[48] | 8 | mem[64] |
1 | ||||
2 | 0 | mem[2] | ||
3 | 1 | mem[11] | 3 | mem[27] |
4 | 0 | mem[4] | ||
5 | 2 | mem[21] | 1 | mem[13] |
6 | 2 | mem[22] | 0 | mem[6] |
7 |