Assignment-
(a) Write the MIPS-like assembly code for the following program segment to run on the 5-stage pipelined processor which you have developed.
sum = 0;
for (i = 0; i <= 7; i = i + 1)
{
    sum = sum + x[i];
}
Convert that assembly code to machine language format and execute it on the 5-stage pipelined processor which you have developed. Find the number of clock cycles and the execution time needed to execute those machine instructions.
(b) Perform maximal loop unrolling as well as instruction reordering on the assembly code segment obtained for part (a). Convert the resulting assembly code segment to machine language format, and run it on the 5-stage pipelined processor to find the number of clock cycles and the execution time.
(c) Compare the execution times found for part (a) and part (b), and explain the effect of loop-carried dependency on the execution time.
(d) Convert the assembly code in part (a) to machine language format and find the number of clock cycles and the execution time needed to find the sum of each of the 4 arrays A, B, C, and D, where each array consists of 8 integer elements.
(e) Using loop fusion of the form given below, write another MIPS-like assembly code for the program segment to run on the 5-stage pipelined processor which you have developed.
sum1 = 0;
sum2 = 0;
sum3 = 0;
sum4 = 0;
for (i = 0; i <= 7; i = i + 1)
{
    sum1 = sum1 + A[i];
    sum2 = sum2 + B[i];
    sum3 = sum3 + C[i];
    sum4 = sum4 + D[i];
}
Convert the assembly code to machine language format, performing the necessary instruction reordering to minimize pipeline stalls. Execute it on the 5-stage pipelined processor, and find the number of clock cycles and the execution time.
(f) Compare the execution times found for part (d) and part (e). Explain the result.
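The source-level transformations named in parts (b) and (e) can be sketched in C before they are translated to assembly. This is only an illustrative sketch: the function names sum_rolled, sum_unrolled and sum_fused and the macro N are hypothetical, and the pairwise regrouping of additions in sum_unrolled is one possible reordering, not the one prescribed by the assignment.

```c
#define N 8  /* array length used in the assignment (hypothetical macro name) */

/* Part (a): rolled loop. Each addition needs the sum produced by the
   previous iteration, which is the loop-carried dependency. */
int sum_rolled(const int x[N]) {
    int sum = 0;
    for (int i = 0; i <= 7; i = i + 1) {
        sum = sum + x[i];
    }
    return sum;
}

/* Part (b): fully unrolled, no loop counter or branch remains.
   The additions are regrouped in pairs so that t0..t3 do not depend
   on each other and can be interleaved to hide pipeline stalls. */
int sum_unrolled(const int x[N]) {
    int t0 = x[0] + x[1];
    int t1 = x[2] + x[3];
    int t2 = x[4] + x[5];
    int t3 = x[6] + x[7];
    int u0 = t0 + t1;
    int u1 = t2 + t3;
    return u0 + u1;
}

/* Part (e): fused loop. One counter update and one branch per iteration
   serve all four arrays, and the four accumulations are independent. */
void sum_fused(const int A[N], const int B[N],
               const int C[N], const int D[N], int out[4]) {
    int sum1 = 0, sum2 = 0, sum3 = 0, sum4 = 0;
    for (int i = 0; i <= 7; i = i + 1) {
        sum1 = sum1 + A[i];
        sum2 = sum2 + B[i];
        sum3 = sum3 + C[i];
        sum4 = sum4 + D[i];
    }
    out[0] = sum1;
    out[1] = sum2;
    out[2] = sum3;
    out[3] = sum4;
}
```

For an array holding the values 1 to 8, sum_rolled and sum_unrolled both return 36, so the transformation changes only the instruction schedule and not the result.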
CE/CZ 3001: Lab Project
For the rest of the lab work after Lab-3, you are required to do a project. The project consists of 3 parts. You are required to do the coding and synthesis, and to demonstrate each part of the project. Write a project report that briefly describes the working of the design of each part. The report should also include the timing report and the waveform generated by simulating the testbench of each part of the project. For Part-3, you will be required to find the minimum execution time and the reduction in CPI which you achieve for the given program.
Project Part-1: Modify the 4-stage pipelined processor of Lab-3 to include BEQ, LW, and SW instructions and convert it into a 5-stage pipelined processor.
Project Part-2: Modify the processor designed in Part-1 of the project to include jump register (JR), jump (J), and jump and link (JAL) instructions.
Project Part-3: Each group of students will be given a program that is slowed down by pipeline stalls. You are required to modify the program to remove the hazards and thereby reduce the number of pipeline stalls. Finally, you will estimate the reduction in CPI and execution time which you achieve.
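The quantities asked for in Part-3 can be cross-checked against the simulation with a small calculation. The sketch below assumes an ideal pipeline in which the first instruction takes as many cycles as there are stages, every later instruction completes one cycle after the previous one, and each stall adds exactly one cycle. The function names are hypothetical, the instruction count is the number of instructions actually executed (loop bodies counted once per iteration), and the clock period must be taken from your own timing report.

```c
/* Total cycles for an ideal pipeline with `stages` stages:
   executed instructions + pipeline fill (stages - 1) + stall cycles. */
unsigned long total_cycles(unsigned long instructions,
                           unsigned long stall_cycles,
                           unsigned stages) {
    return instructions + (stages - 1) + stall_cycles;
}

/* Execution time = number of clock cycles x clock period. */
double execution_time_ns(unsigned long cycles, double clock_period_ns) {
    return (double)cycles * clock_period_ns;
}

/* Average cycles per executed instruction. */
double cpi(unsigned long cycles, unsigned long instructions) {
    return (double)cycles / (double)instructions;
}

/* Speedup of the modified program relative to the original one. */
double speedup(double time_original, double time_modified) {
    return time_original / time_modified;
}
```

As a purely hypothetical example, 10 executed instructions with 3 stall cycles on a 5-stage pipeline give 10 + 4 + 3 = 17 cycles, and with an assumed 10 ns clock period the execution time is 170 ns. The cycle count observed in the testbench waveform remains the authoritative figure; the model only helps to spot counting mistakes.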
PROJECT- GROUP REPORT FORMAT
Part 1
1. Provide the testbench screenshots for the execution of a program which involves LW, SW, and BEQ instructions along with R-type and I-type instructions.
2. Explain the working of LW, SW, and BEQ instructions in the same program.
Part 2
1. Provide the testbench screenshots for the execution of a program which involves J, JR, and JAL instructions along with R-type and I-type instructions and the Part 1 instructions.
2. Explain the working of J, JR, and JAL instructions in the same program.
Part 3
Section-1
1. Assembly code of the original program segment given in the assignment.
2. Machine code (The input to the Imem txt file)
3. Report the number of clock cycles taken for the execution of the program (where hazards are taken care of)
4. Execution time of the given program segment
Section-2
5. Modification of the given program segment (loop unrolling and loop fusion separately) as mentioned in the question.
a. Modified assembly code and Machine code (The input to the Imem txt file).
b. Report the number of clock cycles taken for the modified program segment (where hazards are taken care of).
c. Execution time of the modified code.
6. Speedup of the modified code compared to the original code in Section-1
Bonus topics-
Any substantial improvement made in the hardware to improve the performance of the system.
a. Data forwarding
b. Control hazard removal
c. Cache implementation
Attachment: Assignment.zip