Your program must read 8-bit ascii strings from standard


Programming Assignment-

1. Program Specification

1. Your program must read 8-bit ASCII strings from standard input -- for instance, using the cin object in C++, or stdin in C. You must consume all input from standard input.

2. You are expected to tokenize the input stream using the input specification except that:

a. You are no longer going to be outputting tokens (you'll use that token stream as input to your parser).

b. You need to recognize two new token types:

Lexical Pattern

Numeric Type

English Type

Value of output

'('

33

(

None

')'

34

)

None

3. You then are expected to use the token stream and the BNF specification on the next page to parse the input into a parse tree.

4. Note the following about the above specification:

a. Literals are put in single quotes, and all match back to the tokens.

b. Non-terminals are in angle-brackets.

c. The character ε signifies the "empty string", meaning the rule matches on no input.

d. IF there are any ambiguities in the grammar where multiple productions could be valid for a parse, you are expected to parse using the top-most production.

e. The grammar above cannot be directly parsed using recursive descent. I recommend you try to refactor the grammar in places where recursive descent wouldn't work (per our discussion in class).

5. If the input string would have (in Assignment 1) generated an ERR* token, then your program should output the string: "Lex error" (without the quotes), followed by a single newline, and then it should exit with error code 0 (success). This string is case sensitive, and must be output to standard output.

6. If the token stream contains COMMENT tokens, NEWLINE tokens or WS tokens, ignore those tokens. They are not errors, and should never generate any output if they exist in a valid input.

7. If the token stream is valid per Assignment 1 (i.e., no ERR tokens would have been generated), but the resulting token stream does have tokens that are NOT USED by the grammar above, then your program should output the string: "Unimplemented error" (without the quotes), followed by a single newline, and then it should exit with error code 0 (success). This string is case sensitive, and must be output to standard output.

8. If the input string is not a string in the language represented by the grammar above (i.e.., if parsing fails), then you should output the string: "Parse error" (without the quotes), followed by a single newline, and then it should exit with error code 0 (success). This string is case sensitive, and must be output to standard output.

9. All specified output is CASE SENSITIVE.

10. Assuming the input is valid, your program must output a parse tree. The parse trees you output must have the following properties:

a. Each node must either represent either a terminal or a non-terminal in the grammar above (or an equivalent transformation of the above grammar - you may make any transformation of the grammar for your own purposes that does not change the language matched - you can rename non-terminals and you can add productions that do not change the semantics-for instance, if needed to make the language parsable via recursive descent).

b. Terminals must be either your tokens or the empty string (ε). Note that if you do not need the empty string in your graphs, that is fine!

c. Terminal nodes must not have any children in the graph.

d. There must be a single root node, and all nodes in the parse must be reachable from that root node.

e. The graph must have left-to-right ordering-when you output the graph and I traverse it from left to right, I must be able to find all the tokens used in the parse IN ORDER.

f. Your tree must be an unambiguous representation of the correct parse of the program. For instance, if you are parsing the expression: 1 + 2 * 3, it must be possible to determine that your graph properly represents the rules of precedence that are encoded in the grammar above.

11. Graphs will be output in a subset of XML described below. White space (spaces, horizontal tabs and newlines) in your output is unimportant between XML tokens, unless explicitly mentioned below.

12. All XML objects will be represented in the form: Value

a. The element name, and must match the regular expression: [_a-zA-Z][_a-zA-Z0-9]*

b. The "close tag" (e.g., must always be present-there is no "implicit close" allowed in your output.

13. When asked to output a terminal node representing ε, output:

14. When asked to output a terminal node representing a token from the input, then output a node entity in the form: "%s", where you should substitute %s with the following required elements (which may appear in any order):

a. id, which must contain the Token ID for the associated token, as set per Assignment 1. E.g., the first token (assuming it is not a whitespace token), should output: 1

b. typenum. The entity's contents should be the integer representing the token type from the chart in Assignment 1 (or in the supplemental table above).

c. typename. The contents should be the type name associated with the integer token type.

d. position. The contents should be the integer representing the offset of the start of the token into the original input string, as per Assignment 1.

e. length. The contents should be the integer representing the length of the token, as per assignment 1.

f. value. The contents should be the same as the associated output in Assignment 1. If there would have been no value output in Assignment 1, then this element is optional in your output. But if you do provide it, the contents must be empty.

15. For the above elements, they may exist no more than once in a single terminal node.

16. Additional elements may exist if you desire, but will be ignored.

17. For the above elements, all must exist in every token node, except as otherwise specified in 14f.

18. When outputting the value element for a token of type STR, you must not add additional white space into the element. For instance, if the string is "foo", your value element must be exactly: foo, and may not be: foo . This helps ensure that I can accurately recreate the original string contents.

19. When asked to output a non-terminal node, then output a node entity of the form: "%s", where you should substitute %s with the following required elements (which may appear in any order):

a. nt, which must contain a string representing the name of your non-terminal (you can use names from the provided grammar, or change them if necessary).

b. children, which must contain the full representation of all children nodes in the tree, IN THEIR LEFT TO RIGHT ORDER.

20. When your program finishes parsing, using the above rules, print the root node of your parse tree to stdout. Since the above specification is recursive, you will actually end up printing the entire parse tree.

21. When you have output your entire parse tree, your program must exit with error code 0 (success).

22. Command line arguments to your program shall be IGNORED.

2. Other Requirements

You will receive a 0 on this if any of these requirements are not met!

23. The assignment is due on March 14 at 8am Eastern time. Late assignments will lose one letter grade per 24 hours.

24. The program must be written entirely in C or C++.

25. You must submit a single source code file, unless you choose to use multiple files, in which case you must submit a single ZIP file, and nothing else.

26. If submitting a ZIP file, when the file unzips, your source files must unzip into the same directory (including any header files you need).

27. If submitting a ZIP file, there must not be ANY other files contained within the ZIP file. Again, you will get a 0 if there are.

28. If your program is written in C, it must compile ON MY REFERENCE ENVIRONMENT into an executable with the following command line: cc *.c -o assignment2

29. If your program is written in C++, it must compile ON MY REFERENCE ENVIRONMENT into an executable with the following command line (note this is slightly different than last time): c++ -std=c++11 *.cpp -o assignment2

30. You may not use any 3rd party code for parser generation. I will inspect assignments to see if they are generated by any common tools.

31. Your program should print nothing to stderr under any circumstances.

32. Your program's output will be tested in the reference environment only. Even if it works on your desktop, if it doesn't work in the reference environment, you will get a 0.

With C and C++ this is a common occurrence due to memory errors, so be sure to test in the reference environment!

33. You must submit the homework through the course website, unless otherwise pre-approved by the professor.

34. You may not give or receive any help from other people on this assignment.

35. You may NOT use code from any other program, no matter who authored it.

3. Test Cases

Below are five sample test cases for you, which I will use in my testing. Note that, because you will need to transform the input grammar, the trees that you output will probably look different to the ones below. However, they must still correctly capture the syntax of the input string in tree form.

I will definitely use the below cases. I strongly recommend you create your own test harness and come up with a large number of test cases to help you get the best possible grade.

For test cases, what one would type on the command line is BLACK, input is in GREEN, and output is in BLUE.

Note that all the below test cases end with a final newline, and have no spaces before or after each line of input.

Attachment:- Programming Assignment.rar

Request for Solution File

Ask an Expert for Answer!!
C/C++ Programming: Your program must read 8-bit ascii strings from standard
Reference No:- TGS01600151

Expected delivery within 24 Hours