Assume we have a computer where the CPI is 1.0 when all memory accesses (both data and instruction accesses) hit in the cache. The cache is a unified (data and instruction) cache of size 256 KB, 4-way set associative, with a block size of 64 bytes. Data accesses (loads and stores) constitute 50% of the instructions. The unified cache has a miss penalty of 25 clock cycles and a miss rate of 2%. Assume 32-bit instruction and data addresses.

3.1 How many bits are used for the block offset?
3.2 How many sets are in the cache?
3.3 How many bits are used for the cache index?
3.4 How many bits are used for the tag?
3.5 Calculate the number of stall cycles per instruction.
3.6 How much faster would the computer be if all memory accesses were cache hits?
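
One way to check the arithmetic for 3.1 through 3.6 is a short script. The sketch below is not an official solution. It assumes that 1 KB = 1024 bytes, that addresses are byte addresses, that every instruction makes one instruction fetch plus 0.5 data accesses on average (so 1.5 accesses to the unified cache per instruction), and that the 2% miss rate applies to all of those accesses. The function and constant names are illustrative only.

```python
# Parameters taken from the problem statement.
CACHE_BYTES = 256 * 1024        # assumption: 1 KB = 1024 bytes
BLOCK_BYTES = 64
WAYS = 4
ADDRESS_BITS = 32
BASE_CPI = 1.0                  # CPI when every access hits
DATA_ACCESSES_PER_INSTR = 0.5   # loads and stores are 50% of instructions
MISS_RATE = 0.02
MISS_PENALTY = 25               # clock cycles


def log2_exact(n: int) -> int:
    """Return log2(n) for a positive power of two, else raise ValueError."""
    if n <= 0 or n & (n - 1):
        raise ValueError(f"{n} is not a power of two")
    return n.bit_length() - 1


def cache_geometry(cache_bytes: int, block_bytes: int, ways: int, address_bits: int):
    """Return (offset_bits, num_sets, index_bits, tag_bits)."""
    offset_bits = log2_exact(block_bytes)            # selects a byte within a block
    num_sets = cache_bytes // (block_bytes * ways)   # blocks divided by associativity
    index_bits = log2_exact(num_sets)                # selects a set
    tag_bits = address_bits - index_bits - offset_bits
    return offset_bits, num_sets, index_bits, tag_bits


def stall_cycles_per_instruction(data_accesses_per_instr: float,
                                 miss_rate: float,
                                 miss_penalty: float) -> float:
    """Memory stall cycles per instruction for a unified cache."""
    # Assumption: one instruction fetch per instruction, plus the data accesses.
    accesses_per_instr = 1.0 + data_accesses_per_instr
    return accesses_per_instr * (miss_rate * miss_penalty)


offset_bits, num_sets, index_bits, tag_bits = cache_geometry(
    CACHE_BYTES, BLOCK_BYTES, WAYS, ADDRESS_BITS)
stalls = stall_cycles_per_instruction(
    DATA_ACCESSES_PER_INSTR, MISS_RATE, MISS_PENALTY)
real_cpi = BASE_CPI + stalls
speedup_if_all_hits = real_cpi / BASE_CPI

if __name__ == "__main__":
    print(f"3.1 block offset bits: {offset_bits}")
    print(f"3.2 number of sets:    {num_sets}")
    print(f"3.3 index bits:        {index_bits}")
    print(f"3.4 tag bits:          {tag_bits}")
    print(f"3.5 stall cycles/instr: {stalls:.2f}")
    print(f"3.6 speedup with all hits: {speedup_if_all_hits:.2f}x")
```

Under these assumptions the script gives 6 offset bits, 1024 sets, 10 index bits, 16 tag bits, 0.75 stall cycles per instruction, and a 1.75x speedup for the all-hits machine (CPI 1.0 versus 1.75). If the course defines the miss rate per instruction rather than per memory access, the figure for 3.5 would change accordingly.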