Discuss the below:
In the first part of the project, you need to implement a simple lexical analyzer in C++.
1. Input is a text file with source code in it.
2. Spaces, tabs, newlines and comments (/* ... */) should be ignored by the lexical analyzer.
3. The lexical analyzer should be able to identify
a) Integer literals, e.g. 34
b) Following keywords: for, while, do, if, else, public, private
c) Any user defined name, e.g. balance, a, b
d) Other single-character punctuation and symbols, e.g. %, +, =, ; and so on
e) Special multi-character symbols, including ==, <=, >= (only these three)
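
Items 2 and 3 can be sketched as a single pass over the source text. The version below is only a minimal sketch: the helper names isKeyword and splitLexemes are hypothetical, and it assumes that a user defined name starts with a letter or underscore and continues with letters, digits or underscores (the assignment does not spell this out). It also assumes an unterminated comment simply runs to the end of the input.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: true if `word` is one of the seven keywords in item 3b.
bool isKeyword(const std::string& word) {
    static const char* const kKeywords[] = {
        "for", "while", "do", "if", "else", "public", "private"};
    for (const char* k : kKeywords)
        if (word == k) return true;
    return false;
}

// Hypothetical helper: splits source text into raw lexemes, dropping
// whitespace and /* ... */ comments along the way.
std::vector<std::string> splitLexemes(const std::string& src) {
    std::vector<std::string> out;
    std::size_t i = 0;
    const std::size_t n = src.size();
    while (i < n) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }          // space, tab, newline
        if (c == '/' && i + 1 < n && src[i + 1] == '*') { // comment
            std::size_t end = src.find("*/", i + 2);
            // ASSUMPTION: an unterminated comment swallows the rest of the input.
            i = (end == std::string::npos) ? n : end + 2;
            continue;
        }
        std::size_t start = i;
        if (std::isdigit(c)) {                            // integer literal
            while (i < n && std::isdigit(static_cast<unsigned char>(src[i]))) ++i;
        } else if (std::isalpha(c) || c == '_') {         // keyword or name
            while (i < n && (std::isalnum(static_cast<unsigned char>(src[i])) ||
                             src[i] == '_')) ++i;
        } else if ((c == '=' || c == '<' || c == '>') &&
                   i + 1 < n && src[i + 1] == '=') {      // ==, <=, >=
            i += 2;
        } else {                                          // any other single character
            ++i;
        }
        out.push_back(src.substr(start, i - start));
    }
    return out;
}
```

Checking for the two-character symbols before falling back to a single character is what keeps `<=` from being split into `<` and `=`. A word lexeme can then be passed to isKeyword to decide between the keyword and user defined name categories.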
4. In the program, tokens for all the keywords and user defined names need to be stored in a simple table. No matter how many times a keyword or user defined name appears in the code, its token should appear only once in the table.
5. Token object structure and display format
| | | Integer Literal | Keywords | User defined Name | Single character symbol | Multi-character symbol |
|---|---|---|---|---|---|---|
| Object Contents | tag (integer) | 256 | 257 | 258 | ASCII | 259, 260, 261 for ==, <=, >= respectively |
| | v (integer) | Numeric value | N/A | N/A | N/A | N/A |
| | s (String) | N/A | Keyword string | Name string | N/A | N/A |
| Display Format | Display format | | | | <+> | <==> |
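
One possible way to represent the token object from the table in C++ is sketched below. The tag values come from the table; the names Token, makeInt, makeWord, makeSymbol and display are hypothetical. Note that the table as posted only shows the display format for symbols (`<+>` and `<==>`), so the `<tag, value>` form used here for integer literals, keywords and names is purely an assumption, as is extending the `<==>` pattern to `<=` and `>=`.

```cpp
#include <string>

// Tag values taken from the table above.
const int TAG_INT = 256;      // integer literal
const int TAG_KEYWORD = 257;  // keyword
const int TAG_NAME = 258;     // user defined name
const int TAG_EQ = 259;       // ==
const int TAG_LE = 260;       // <=
const int TAG_GE = 261;       // >=

// One token object: tag, numeric value (integer literals only),
// and string (keywords and user defined names only).
struct Token {
    int tag;
    int v;
    std::string s;
};

Token makeInt(int value) { return Token{TAG_INT, value, ""}; }

Token makeWord(const std::string& word, bool keyword) {
    return Token{keyword ? TAG_KEYWORD : TAG_NAME, 0, word};
}

// `sym` must be non-empty; a single character gets its ASCII code as its tag.
Token makeSymbol(const std::string& sym) {
    if (sym == "==") return Token{TAG_EQ, 0, ""};
    if (sym == "<=") return Token{TAG_LE, 0, ""};
    if (sym == ">=") return Token{TAG_GE, 0, ""};
    return Token{static_cast<unsigned char>(sym[0]), 0, ""};
}

std::string display(const Token& t) {
    switch (t.tag) {
        // ASSUMPTION: the table does not show these two formats.
        case TAG_INT:
            return "<" + std::to_string(t.tag) + ", " + std::to_string(t.v) + ">";
        case TAG_KEYWORD:
        case TAG_NAME:
            return "<" + std::to_string(t.tag) + ", " + t.s + ">";
        // <==> is given in the table; the other two follow the same pattern.
        case TAG_EQ: return "<==>";
        case TAG_LE: return "<<=>";
        case TAG_GE: return "<>=>";
        // Single character symbol: the tag is the character itself.
        default: return std::string("<") + static_cast<char>(t.tag) + ">";
    }
}
```

Starting the special tags at 256 keeps them clear of every ASCII code, so a single int field is enough to tell all five categories apart.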
6. Output to screen:
a) List the tokens in the table
b) Display the token sequence derived from the input file on the screen following the format in the table above.
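
The "appear only once" requirement for the table, together with the table listing asked for in the output step, could be handled along these lines. This is a rough sketch rather than a required design: the class name WordTable and its methods are hypothetical, and the listing layout (tag, a space, then the string, one entry per line) is an assumption since the assignment does not fix one.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical table: stores each keyword or user defined name once,
// in order of first appearance.
class WordTable {
    struct Entry {
        std::string s;  // keyword string or name string
        int tag;        // 257 for keywords, 258 for user defined names
    };
    std::vector<Entry> entries_;                // keeps insertion order
    std::map<std::string, std::size_t> index_;  // word -> position in entries_

public:
    // Returns the position of `word`, inserting it only if it is not yet present.
    std::size_t intern(const std::string& word, int tag) {
        std::map<std::string, std::size_t>::const_iterator it = index_.find(word);
        if (it != index_.end()) return it->second;
        entries_.push_back(Entry{word, tag});
        index_[word] = entries_.size() - 1;
        return entries_.size() - 1;
    }

    std::size_t size() const { return entries_.size(); }

    // ASSUMPTION: one "tag string" pair per line; the layout is not specified.
    std::string listing() const {
        std::string out;
        for (const Entry& e : entries_)
            out += std::to_string(e.tag) + " " + e.s + "\n";
        return out;
    }
};
```

The lexer would call intern every time it sees a keyword or name, and print listing() to the screen (for example with std::cout) once the whole input file has been read. The map only serves as a lookup; the vector is what preserves the order of first appearance for the listing.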