Programming Assignment: Linker
You are to implement a two-pass linker and submit the source code, which we will compile and run. Submit your source code together with a Makefile as a ZIP file, with its directory, through the NYU Classes assignment. Please do not submit inputs or outputs. Your program must take one input parameter, which is the name of the input file to be processed. All output should go to standard output.
The languages of choice for this first lab are C/C++/Java. All subsequent labs will be C/C++ only. You may develop your lab on any machine you wish, but you must ensure that it compiles and runs on the NYU system assigned to the course (energon1/2), where it will be graded. It is your responsibility to make sure it executes on those machines.
Note that on energon1 and energon2 the default GCC/G++ compiler is v4.4.7. If you use advanced features, versions 4.6 and 4.8 are also available; use gcc46, gcc48, g++46 or g++48 instead. We realize that you code on your own machine and that transferring to energon occasionally exposes linker errors (such as not finding the appropriate libraries at runtime). In that case use static linking.
In general, a linker takes individually compiled code/object modules and creates a single executable. It does so by placing the modules' object code at global addresses and then resolving external symbol references (e.g. variables and functions) and module-relative addresses to those global addresses.
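The address assignment described above can be sketched as the first of the two passes. This is only a minimal illustration in C++11, not the reference implementation: the `Module` struct and the `buildSymbolTable` function are hypothetical names, and the sketch assumes the input has already been parsed into memory.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory form of one parsed module (an assumption of this
// sketch; the actual input format is defined by the assignment).
struct Module {
    int size;  // number of instruction words in the module
    std::vector<std::pair<std::string, int> > defs;  // (symbol, module-relative address)
};

// Pass one: place the modules back to back and give every defined symbol
// the absolute address "module base + relative address".
std::map<std::string, int> buildSymbolTable(const std::vector<Module>& modules,
                                            std::vector<int>& bases) {
    std::map<std::string, int> table;
    int base = 0;
    bases.clear();
    for (const Module& m : modules) {
        assert(m.size >= 0);
        bases.push_back(base);  // global address of this module's first word
        for (const auto& def : m.defs) {
            // insert() leaves an existing entry untouched, so a repeated
            // definition keeps the value from its first occurrence.
            table.insert(std::make_pair(def.first, base + def.second));
        }
        base += m.size;  // the next module starts right after this one
    }
    return table;
}
```

For example, with a first module of size 3 that defines `xy` at relative address 2 and a second module of size 4 that defines `z` at relative address 1, the bases are 0 and 3, `xy` resolves to 2 and `z` resolves to 4. Pass two can then use the same bases to relocate relative addresses.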
We assume a target machine with the following properties: (a) word addressable, (b) addressable memory of 512 words, and (c) each word consisting of 4 decimal digits. [I know that is a really strange machine].
Other requirements: error detection, limits, and space used.
To receive full credit, you must check the input for various errors. All errors/warnings should follow the message catalog provided below. We will do a textual difference against a reference implementation to grade your program. Any reported difference indicates non-compliance with the instructions provided; it is counted as an error and results in deductions.
You should continue processing after encountering an error/warning (other than a syntax error) and you should be able to detect multiple errors in the same run.
1. If a syntax error is detected in the input, print a syntax error message with the line number and the character offset in the input file where it was observed, then stop processing and exit. A syntax error is defined as a missing token (e.g. 4 used symbols are declared but only 3 are given) or an unexpected token.
2. If a symbol is defined multiple times, print an error message and use the value given in the first definition. The error message should appear as part of printing the symbol table (following the symbol=value printout on the same line).
3. If a symbol is used in an E-instruction but not defined, print an error message and use the value zero.
4. If a symbol is defined but not used, print a warning message and continue.
5. If an address appearing in a definition exceeds the size of the module, print a warning message and treat the address given as 0 (relative to the module).
6. If an external address is too large to reference an entry in the use list, print an error message and treat the address as immediate.
7. If a symbol appears in a use list but is not actually used in the module (i.e., not referred to in an E-type address), print a warning message and continue.
8. If an absolute address exceeds the size of the machine, print an error message and use the absolute value zero.
9. If a relative address exceeds the size of the module, print an error message and use the module relative value zero (that means you still need to remap that "0" to the correct absolute address).
10. If an illegal immediate value (I) is encountered (i.e. more than 4 numerical digits), print an error and convert the value to 9999.
11. If an illegal opcode is encountered (i.e. more than 4 numerical digits), print an error and convert the instruction to 9999.
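Rules 3, 6 and 8 through 11 all act on a single instruction during pass two, so they can be sketched as one function. This is a hedged illustration only. It assumes that each instruction carries an addressing type letter (`I`, `A`, `R` or `E`), that a word splits into an opcode (first digit) and an operand (last three digits), and that valid addresses run from 0 to size minus 1. None of these details are spelled out in this section. The names `Resolved` and `resolve` are hypothetical, and the error strings are placeholders rather than the official message catalog.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

const int MACHINE_SIZE = 512;  // addressable words of the target machine

struct Resolved {
    int word;           // final word to emit
    std::string error;  // empty if no rule fired (placeholder text only)
};

Resolved resolve(char type, int word, int base, int moduleSize,
                 const std::vector<std::string>& useList,
                 const std::map<std::string, int>& symbols) {
    assert(word >= 0);
    if (word > 9999) {  // rules 10 and 11: more than 4 digits
        return {9999, type == 'I' ? "illegal immediate value" : "illegal opcode"};
    }
    int opcode = word / 1000;   // assumed split: first digit
    int operand = word % 1000;  // assumed split: last three digits
    switch (type) {
    case 'A':  // rule 8: fall back to absolute address zero
        if (operand >= MACHINE_SIZE)
            return {opcode * 1000, "absolute address exceeds machine size"};
        break;
    case 'R':  // rule 9: relative zero still gets remapped by the module base
        if (operand >= moduleSize)
            return {opcode * 1000 + base, "relative address exceeds module size"};
        return {opcode * 1000 + operand + base, ""};
    case 'E': {
        if (operand >= static_cast<int>(useList.size()))  // rule 6: keep as immediate
            return {word, "external address exceeds length of use list"};
        std::map<std::string, int>::const_iterator it = symbols.find(useList[operand]);
        if (it == symbols.end())  // rule 3: undefined symbol resolves to zero
            return {opcode * 1000, useList[operand] + " is not defined"};
        return {opcode * 1000 + it->second, ""};
    }
    default:  // 'I': an immediate value is emitted unchanged
        break;
    }
    return {word, ""};
}
```

Under these assumptions, in a module with base address 10 and size 5, the relative word 1002 becomes 1012, while 1009 triggers rule 9 and becomes 1010. Returning the error text together with the word lets the caller print both on the same output line and keep processing.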
The following exact limits are in place.
a) Accepted symbols should be up to 16 characters long (not including terminators, e.g. '\0'); any longer symbol names are erroneous.
b) A uselist or deflist should support up to 16 entries; if a list has more than 16, an error should be raised.
c) The number of instructions is unlimited (hence the two-pass system), but in reality it is limited by the machine size.
d) Symbol table should support at least 256 symbols.
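Both the syntax error report (line number and character offset) and the symbol length limit depend on the tokenizer knowing where each token starts. The following is a minimal sketch under stated assumptions: tokens are taken to be separated by spaces, tabs and newlines, both positions are reported 1-based, and the names `Token`, `Tokenizer` and `symbolTooLong` are hypothetical. How to report a missing token at the very end of the file is not covered here.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

const std::size_t MAX_SYMBOL_LEN = 16;  // limit (a), terminator not counted

struct Token {
    std::string text;
    int line;    // 1-based line number (assumed convention)
    int offset;  // 1-based character offset within the line (assumed convention)
};

class Tokenizer {
public:
    explicit Tokenizer(std::istream& in) : in_(in), lineNo_(0), pos_(0) {}

    // Fetches the next token; returns false once the input is exhausted.
    bool next(Token& tok) {
        for (;;) {
            // Skip delimiters on the current line.
            pos_ = line_.find_first_not_of(" \t", pos_);
            if (pos_ != std::string::npos) break;
            // Current line is used up: read the next one.
            if (!std::getline(in_, line_)) return false;
            ++lineNo_;
            pos_ = 0;
        }
        assert(pos_ < line_.size());  // the loop only exits at a token start
        std::size_t end = line_.find_first_of(" \t", pos_);
        if (end == std::string::npos) end = line_.size();
        tok.text = line_.substr(pos_, end - pos_);
        tok.line = lineNo_;
        tok.offset = static_cast<int>(pos_) + 1;
        pos_ = end;
        return true;
    }

private:
    std::istream& in_;
    std::string line_;
    int lineNo_;
    std::size_t pos_;
};

// True if a symbol name violates the 16-character limit.
bool symbolTooLong(const std::string& name) {
    return name.size() > MAX_SYMBOL_LEN;
}
```

Feeding it a `std::istringstream` holding `"1 xy 2\n\n  2 z"` yields `xy` at line 1, offset 3 and `z` at line 3, offset 5. Because every token carries its own position, a parser built on top can report an unexpected token without rescanning the file.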