Lexical Analyzer:
Symbol Table - Holds the symbols accepted by the lexical analyzer or parser.
Each symbol may be a terminal or a non-terminal. Terminal symbols are listed first in the table ahead of non-terminals.
Attributes:
- Count - Number of symbols in the table
Child items:
Symbol - Individual symbol accepted by the lexical analyzer or parser.
Attributes:
- Index - Unique symbol identifier in the symbol table
- Name - Name assigned to symbol in the grammar
- Kind - Indicates the type of symbols. (0- nonterminal, 1 - terminal, other types are self evident)
Char Sets - Groups of characters that can trigger one (or more) DFA state transitions.
Attributes:
- Index - Unique identifier for a given character set
- Count - Number of characters in a given character set
Child items:
Char - Specifies the individual character (no array needed)
Attributes:
- UnicodeIndex - Character code in Unicode (ASCII)
DFA Table - Holds all the DFA states that make up the lexical analyzer.
Attributes:
- Count - Number of DFA states
- Initial state
Child Items:
DFA State - Enumerates each DFA state and the edges that transition from that state.
Attributes:
- Index - Unique identifier (order in the DFA table)
- EdgeCount - Number of edges from the current state to other DFA states
- AcceptSymbol - Is this an accept state?
- -1=false
- Otherwise, it gives the number of the symbol index the state accepts
Child items:
DFA Edge - Transition from one DFA state to another
Attributes:
- Char Set Index - Points to the character set whose input causes the transition. Thus, a single transition may be cause by multiple characters.
- Target - Points to the destination state for the transition
LALR Parser:
Rule Table - Contains definitions of production rules within the grammar.
Attributes:
- Count - Number of production rules in the table
Child items:
Rule - Defines a given production rule in the grammar. Each production rule has a Left Hand Side (LHS) which is a single non-terminal) and a Right Hand Side (RHS), which can be a mix of terminals and non-terminals. This structure can be used to determine how many and which symbols to pop off the parser stack during a reduction. The LHS symbol must be placed on the parser stack to complete the rule reduction.
Attributes:
- Index - Unique identifier of rule in the rule table
- NonTerminalIndex - Provides the index for the non-terminal symbol (LHS) in the symbol table.
- Symbol Count - number of symbols in the RHS of the production
Child items:
Rule Symbol - Defines a terminal or non-terminal in the RHS of the rule. Symbols are listed in order of appearance on the RHS (from leftmost to rightmost).
Attributes:
- SymbolIndex - Provides the index into the symbol table for the particular symbol on the RHS of the production
LALR Table - This structure represents the LALR parse table. Recall from class that the LALR table has symbols (terminals and non-terminals) on its X-Axis and LALR states on its Y-Axis (refer to class slides). This table is defined in terms of states, which call out the appropriate symbols that are applicable to that state.
Attributes:
- Count - Total number of LALR states in the table
- Initial State - Start state for the parser (nominally state 0)
Child items:
LALR State - This item defines a given state in the LALR parse table. Each state is mapped to one or more symbols. For each mapping, an action is specified (shift, reduce, goto, or accept). If a state does not reference a symbol, then no action is specified when the parser is in that state and the symbol is at the front of the input queue-in short, this is a parser error.
Attributes:
- Index - State number within the table
- Action count - Number of actions associated with a given state (at least one)
Child items:
LALR Action - This item defines the action for a given state-symbol pair. Since the LALR Action is a child of LALR State, all listed actions are associated with the parent state. Each pair will have a single mapping associated with it of type: shift, reduce, goto, or accept.
Attributes:
- SymbolIndex - The index of the symbol mapped to the LALR state for this action.
- Action - The type of action being accomplished. Actions: 1- Shift, 2- Reduce, 3 - Goto, 4 - Accept.
- Value - The meaning of this value is dependent on the action type. For a shift or goto action, the value refers to the next LALR state. For a reduce action, the value refers to the rule index in the Rule Table. For an accept action, the value is meaningless (nominally 0) as it ends the parsing process for a given statement.
Best Programming Strategy:
- Mimic the GPB XML structure
- Implement each structure that has items as an array of objects