Solved: Write a two-pass assembler for a subset of the MIPS instruction set (Assembly Language)

Project

This assignment will reinforce your knowledge of the assembly process. You will need to go through all of the steps of converting an assembly source file to object code.

Your goal is to write a two-pass assembler for a subset of the MIPS instruction set. It should be able to read an assembly file from the command line and write the object code to standard output. You can make the following assumptions:

- The code segment will precede the data segment
- The source file will contain no more than 32768 distinct instructions
- The source file will define no more than 32768B of data
- The source file will not contain comments
- There will be no whitespace between arguments in each instruction
- Each line may have a symbolic label, terminated with a colon
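
One way to picture the first pass under the assumptions above: walk the source once, strip any label, and record where it lands. This is a minimal sketch, not the reference solution. The names `Symbol`, `SymbolTable`, `firstPass`, `itemSize`, and `trim` are hypothetical, and it assumes that each segment is numbered from offset 0 and that `.space n` advances the data offset by exactly n bytes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <string>
#include <vector>

// Hypothetical symbol record: the segment a label lives in and its byte offset there.
struct Symbol {
    bool inData;
    unsigned offset;
};
typedef std::map<std::string, Symbol> SymbolTable;

// Remove leading and trailing blanks from a piece of a source line.
std::string trim(const std::string& s) {
    std::size_t first = s.find_first_not_of(" \t\r");
    if (first == std::string::npos) return "";
    std::size_t last = s.find_last_not_of(" \t\r");
    return s.substr(first, last - first + 1);
}

// Bytes occupied by a statement once its label has been stripped.
unsigned itemSize(const std::string& stmt, bool inData) {
    if (stmt.empty() || stmt == ".text" || stmt == ".data") return 0;
    if (stmt.compare(0, 6, ".word ") == 0) {
        // n comma-separated values occupy n words
        unsigned commas = static_cast<unsigned>(std::count(stmt.begin(), stmt.end(), ','));
        return 4 * (commas + 1);
    }
    if (stmt.compare(0, 7, ".space ") == 0)
        return static_cast<unsigned>(std::strtoul(stmt.c_str() + 7, nullptr, 10));
    assert(!inData && "instructions are expected only in the text segment");
    return 4; // every instruction is one 32-bit word
}

// Pass 1: record every label with its segment and offset; emit nothing yet.
SymbolTable firstPass(const std::vector<std::string>& lines) {
    SymbolTable table;
    bool inData = false;
    unsigned textOffset = 0, dataOffset = 0;
    for (std::size_t i = 0; i < lines.size(); ++i) {
        std::string stmt = trim(lines[i]);
        std::size_t colon = stmt.find(':');
        if (colon != std::string::npos) {
            Symbol s = { inData, inData ? dataOffset : textOffset };
            table[trim(stmt.substr(0, colon))] = s;
            stmt = trim(stmt.substr(colon + 1));
        }
        if (stmt == ".data") inData = true;
        else if (stmt == ".text") inData = false;
        (inData ? dataOffset : textOffset) += itemSize(stmt, inData);
    }
    return table;
}
```

The second pass can then look up each label in the table while encoding, which is why forward references such as a branch to a later label pose no problem.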

Table 1 provides a list of the assembly directives that your assembler must recognize. Table 2 provides a list of the instructions that your assembler must recognize. Be sure that you note the arguments for each instruction. It may be helpful to refer to Appendix A.10 when writing your parser.

Table 1. List of Assembly Directives

Directive	Explanation
.text	Place items following this directive in the user text segment
.data	Place items following this directive in the data segment
.word w1,w2,...,wn	Store n 32b integer values in successive words in memory
.space n	Allocate n bytes of space in memory, initialized to zero
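
As an illustration of the `.word` directive, the sketch below turns an operand list such as `1,9,12` into 32-bit values in source order. The helper name `parseWordList` is hypothetical, and the sketch assumes the values are written in decimal with no whitespace, as in the sample input; it does not say anything about how `.space` is padded in the output.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: convert the operands of ".word w1,w2,...,wn"
// into n 32-bit values, in the order they appear.
std::vector<std::uint32_t> parseWordList(const std::string& operands) {
    std::vector<std::uint32_t> words;
    const char* p = operands.c_str();
    while (*p != '\0') {
        char* end = nullptr;
        long value = std::strtol(p, &end, 10);   // decimal only (an assumption)
        if (end == p) break;                     // no digits found: stop instead of looping forever
        // Negative values wrap to their two's complement bit pattern.
        words.push_back(static_cast<std::uint32_t>(value));
        p = (*end == ',') ? end + 1 : end;       // step over the separating comma
    }
    return words;
}
```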

Table 2. List of MIPS Instructions

Mnemonic	Format	Args	Descriptions
addiu	I		Add immediate with no overflow
addu	R	3 (rd, rs, rt)	Add with no overflow
and	R	3 (rd, rs, rt)	Bitwise logical AND
beq	I		Branch when equal
bne	I		Branch when not equal
div	R	2 (rs, rt)	Signed integer divide
j	J		Jump
lw	I		Load 32b word
mfhi	R	1 (rd)	Move from hi register
mflo	R	1 (rd)	Move from low register
mult	R	2 (rs, rt)	Signed integer multiply
or	R	3 (rd, rs, rt)	Bitwise logical OR
slt	R	3 (rd, rs, rt)	Set when less than
subu	R	3 (rd, rs, rt)	Subtract with no overflow
sw	I		Store 32b word
syscall	R	0	System call
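
The R and I formats in Table 2 pack their fields into one 32-bit word. The sketch below shows one possible way to do that packing; the function and constant names (`encodeR`, `encodeI`, `F_ADDU`, `OP_ADDIU`, and so on) are hypothetical, while the field layouts and the opcode and funct values are the standard MIPS32 ones. Registers are passed as numbers, so a separate name lookup is assumed.

```cpp
#include <cassert>
#include <cstdint>

// Standard MIPS32 field layouts:
//   R: opcode(6) rs(5) rt(5) rd(5) shamt(5) funct(6)   (opcode is 0 for Table 2)
//   I: opcode(6) rs(5) rt(5) immediate(16)
enum Funct {
    F_SYSCALL = 0x0c, F_MFHI = 0x10, F_MFLO = 0x12, F_MULT = 0x18, F_DIV = 0x1a,
    F_ADDU = 0x21, F_SUBU = 0x23, F_AND = 0x24, F_OR = 0x25, F_SLT = 0x2a
};
enum Opcode {
    OP_J = 0x02, OP_BEQ = 0x04, OP_BNE = 0x05, OP_ADDIU = 0x09, OP_LW = 0x23, OP_SW = 0x2b
};

// Pack an R-format instruction; unused register fields are passed as 0.
std::uint32_t encodeR(unsigned rs, unsigned rt, unsigned rd, unsigned funct) {
    assert(rs < 32 && rt < 32 && rd < 32 && funct < 64);
    return (rs << 21) | (rt << 16) | (rd << 11) | funct; // opcode and shamt stay 0
}

// Pack an I-format instruction; the immediate is truncated to 16 bits,
// so negative values end up in two's complement form.
std::uint32_t encodeI(unsigned opcode, unsigned rs, unsigned rt, int imm) {
    assert(opcode < 64 && rs < 32 && rt < 32);
    return (opcode << 26) | (rs << 21) | (rt << 16) |
           (static_cast<std::uint32_t>(imm) & 0xffffu);
}
```

For example, `encodeR(17, 18, 8, F_SLT)` corresponds to `slt $t0,$s1,$s2` and `encodeI(OP_ADDIU, 0, 2, 5)` to `addiu $v0,$zero,5`; both match the words shown for those instructions in the sample object code.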

In addition to the instructions above, your assembler must be able to resolve symbolic labels. These labels may be targets used for changes in the control flow (branch or jump instructions) or as names for memory elements. The way labels are handled differs depending on their usage. Targets for branch instructions should be referenced as the location of the target in memory relative to the current instruction (remember that the PC points to the next instruction). For example, consider the code below:

00400400 <L4>:
  400400:	1100000c	beqz	t0,400434
  400404:	00000000	nop
  400408:	01084021	addu	t0,t0,t0
  40040c:	1100fffc	beqz	t0,400400
  400410:	00000000	nop
  400414:	01084021	addu	t0,t0,t0
  400418:	1100fff9	beqz	t0,400400
  40041c:	00000000	nop
  400420:	01084021	addu	t0,t0,t0
  400424:	1100fff6	beqz	t0,400400
  400428:	00000000	nop
  40042c:	11000001	beqz	t0,400434
  400430:	00000000	nop

00400434 <L5>:
  400434:	00000000	nop

You can see that the forward branches to L5 (at 400400 and 40042c) have distances of 12 and 1. If you count the instructions from each of those branches to the target, the actual numbers of instructions are 13 and 2; the encoded distance is one less because the PC will have already advanced to the next instruction. The same is true for the backward branches to L4 (at 40040c, 400418, and 400424). The branches use two's complement for the target calculations, so the first backward branch, 0x1100fffc, stores an offset of 0xfffc. If you calculate the decimal value, you should get -4, which is the distance of the label from the PC, measured in instructions.
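
The offset arithmetic described above can be sketched as follows. The function name `branchOffset` is hypothetical; the calculation itself (target minus the address of the next instruction, counted in words and truncated to 16 bits) follows the description and the listing.

```cpp
#include <cassert>
#include <cstdint>

// Compute the 16-bit branch field for a branch located at branchAddr
// whose label resolves to targetAddr (both are byte addresses).
std::uint16_t branchOffset(std::uint32_t branchAddr, std::uint32_t targetAddr) {
    assert(branchAddr % 4 == 0 && targetAddr % 4 == 0); // instructions are word aligned
    // The PC already points at the instruction after the branch.
    std::int32_t diff = static_cast<std::int32_t>(targetAddr) -
                        static_cast<std::int32_t>(branchAddr + 4);
    // Convert bytes to instructions, then keep the low 16 bits (two's complement).
    return static_cast<std::uint16_t>((diff / 4) & 0xffff);
}
```

With the addresses from the listing, `branchOffset(0x400400, 0x400434)` gives 0x000c and `branchOffset(0x40040c, 0x400400)` gives 0xfffc, matching the low halves of 1100000c and 1100fffc.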

Targets for jump instructions should use the absolute location of the target. For example, assume that label L1 is located in memory at 0x400370. The instruction j L1 will resolve to j 400370.
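
In the encoded word, the J format stores the absolute byte address divided by four in its 26-bit target field. A minimal sketch, with the hypothetical name `encodeJump`:

```cpp
#include <cassert>
#include <cstdint>

// Encode "j target": opcode 0x02 in the top 6 bits, followed by the
// word index of the target (byte address >> 2) in the low 26 bits.
std::uint32_t encodeJump(std::uint32_t targetAddr) {
    assert(targetAddr % 4 == 0); // jump targets are word aligned
    return (0x02u << 26) | ((targetAddr >> 2) & 0x03ffffffu);
}
```

For the label at 0x400370 this yields 0x081000dc. The sample object code contains 08000005 for `j L1`, which is what this function returns for byte address 0x14; that suggests the sample numbers its text segment from address 0, though this is an inference from the output rather than something the assignment states.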

Data labels should be referenced by their offset from the global pointer, $gp, which is assumed to point to the start of the data segment.
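
For a reference such as `n($gp)` in `sw $v0,n($gp)`, the parser has to separate the label from the base register before the label's offset can go into the immediate field. The sketch below shows one way to do that, along with a register name lookup; `MemOperand`, `parseMemOperand`, and `registerNumber` are hypothetical names, and the register numbering is the conventional MIPS one.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical result of splitting a memory operand such as "n($gp)".
struct MemOperand {
    std::string label; // "n"
    std::string base;  // "$gp"
};

// Split "label(base)" into its two parts; returns false if the
// parentheses are missing or out of order.
bool parseMemOperand(const std::string& text, MemOperand& out) {
    std::size_t open = text.find('(');
    std::size_t close = text.find(')');
    if (open == std::string::npos || close == std::string::npos || close < open)
        return false;
    out.label = text.substr(0, open);
    out.base = text.substr(open + 1, close - open - 1);
    return true;
}

// Map a conventional register name to its number, or -1 if unknown.
int registerNumber(const std::string& name) {
    static const char* const names[32] = {
        "$zero", "$at", "$v0", "$v1", "$a0", "$a1", "$a2", "$a3",
        "$t0", "$t1", "$t2", "$t3", "$t4", "$t5", "$t6", "$t7",
        "$s0", "$s1", "$s2", "$s3", "$s4", "$s5", "$s6", "$s7",
        "$t8", "$t9", "$k0", "$k1", "$gp", "$sp", "$fp", "$ra"
    };
    for (int i = 0; i < 32; ++i)
        if (name == names[i]) return i;
    return -1;
}
```

Once the label is known, its offset from the start of the data segment (as recorded during the first pass) is the value to place in the instruction.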

You should use the linprog servers for all of your compilation and testing. Your output should match mine exactly. You can determine if the results are identical by calculating the md5sum or by using diff. You must use C/C++ as your language and your solution should be a single file (e.g. ch03c.pr01.c or ch03c.pr01.cpp). You should submit this file through Blackboard. Your program should have comments inline and a header at the top. For example:

/**
* @file main.cpp
* @author hughes <>, (C) 2014, 2015, 2016
* @date 05/11/16
* @brief Simple MIPS assembler
*
* @section DESCRIPTION
* This program implements an assembler for a subset
* of the MIPS assembly language. Can compile with debug
* by including -DDEBUG in the compiler options.
************************************************************/

Please test your output against the results from the sample binary before submission. The test script uses md5 and diff to compare your output with the baseline. Your submissions will also be processed for plagiarism. The script will use the following for compilation: g++ -Werror -mtune=generic -O0 -std=c++11

If you write it in C instead of C++, the script will use gcc -Werror -mtune=generic -O0 -std=c11

You can access my binary using the following command:
~chughes/cda3101/assembler

There is an example assembly program below in Figure 1 along with the machine code. You can access the assembly source at ~chughes/cda3101/test01.s and the object code at ~chughes/cda3101/test01.obj. You should note that the machine code is in hexadecimal.

.text
    addu $s0,$zero,$zero
    addu $s1,$zero,$zero
    addiu $v0,$zero,5
    syscall
    sw $v0,n($gp)
L1:
    lw $s2,n($gp)
    slt $t0,$s1,$s2
    beq $t0,$zero,L2
    addiu $v0,$zero,5
    syscall
    addu $s0,$s0,$v0
    addiu $s1,$s1,1
    j L1
L2:
    addu $a0,$s0,$zero
    addiu $v0,$zero,1
    syscall
    addiu $v0,$zero,10
    syscall
.data
n: .word 0
m: .word 1,9,12
q: .space 10

00008021
00008821
24020005
0000000c
af820000
8f920000
0232402a
11000005
24020005
0000000c
02028021
26310001
08000005
02002021
24020001
0000000c
2402000a
0000000c
00000000
00000001
00000009
0000000c
00000000

Figure 1 - Sample source code (first listing) and object code (second listing)
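
Since the output has to match byte for byte, the formatting of each word matters. The sketch below prints each 32-bit word as eight lowercase hexadecimal digits on its own line; this layout is an assumption based on the sample object code, so check it against the reference binary. The names `formatWord` and `writeObject` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Format one word as exactly eight lowercase hex digits, zero padded.
std::string formatWord(std::uint32_t word) {
    char buf[9]; // 8 digits plus the terminating '\0'
    std::snprintf(buf, sizeof buf, "%08x", static_cast<unsigned>(word));
    return std::string(buf);
}

// Write the assembled words to standard output, one per line (assumed layout).
void writeObject(const std::vector<std::uint32_t>& words) {
    for (std::size_t i = 0; i < words.size(); ++i)
        std::printf("%s\n", formatWord(words[i]).c_str());
}
```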

A second test file, test02.s, is included in the same directory. These are samples and are not the inputs that will be used for grading. Feel free to write your own inputs and share them via the discussion boards. If you find an error in my assembler, please let me know (extra credit)!

While you are free to use any string parsing method you choose, you may find it helpful to use the getline function. getline extracts characters from an input stream and stores them in a string until a delimiter is reached or a newline character is found.

istream& getline (istream& is, string& str);

For example, the code below discards whitespace at the current pointer, reads a line from the input, and pushes the line to a list as a string type.

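// Skip leading whitespace, then read one line; the loop ends when no line is left,
// so nothing is pushed after a failed read at end of file.
while(std::getline(asmFile >> std::ws, lineIn))
{
    sourceCode.push_back(lineIn); //add to the list of instructions from source
}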

You may also find the Boost tokenizer class useful. The tokenizer parses an input sequence and breaks it into pieces at each delimiter character. The code below takes an input string, input, and separates it based on the characters defined in delimeter. The for-loop then iterates through those tokens.

boost::char_separator<char> delimeter(", ()");
boost::tokenizer< boost::char_separator<char> > tokens(input, delimeter);

for(boost::tokenizer< boost::char_separator<char> >::iterator it = tokens.begin(); it != tokens.end(); ++it)
{
    //*it is the current token (a std::string)
}

These are just some of the tools that I used in my solution; you are not required to use them! C/C++ has plenty of functions that you may find useful such as fgets and sscanf. Be creative!
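
If Boost is not available, the same splitting can be done with the standard library alone. A minimal sketch, with the hypothetical name `tokenize`, that breaks a line at any of the given delimiter characters and drops empty tokens:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Split input at any character found in delims; consecutive delimiters
// are treated as one, so no empty tokens are produced.
std::vector<std::string> tokenize(const std::string& input, const std::string& delims) {
    std::vector<std::string> tokens;
    std::size_t start = input.find_first_not_of(delims);
    while (start != std::string::npos) {
        std::size_t end = input.find_first_of(delims, start);
        // If end is npos, substr simply takes the rest of the string.
        tokens.push_back(input.substr(start, end - start));
        start = input.find_first_not_of(delims, end);
    }
    return tokens;
}
```

Calling `tokenize("lw $s2,n($gp)", ", ()")` yields the four tokens `lw`, `$s2`, `n`, and `$gp`.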

I don't know how many pages it would be, since it is a programming task; the details are in the file I uploaded.
