Source programme --> Front-end --> IR --> Back-end --> Target programme
Source programme --> Lexical analysis --> Tokens --> Syntax analysis --> AST --> Semantic analysis --> Target
enum kind {IF, LPAREN, ID, INTLIT, ...};
struct token {
    enum kind k;    /* token class */
    char *lexeme;   /* matched source text */
};
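A minimal sketch of how such a token might be built, assuming the lexeme is copied onto the heap. The helpers `make_token` and `kind_name` are hypothetical names introduced here for illustration, and the enum lists only the four kinds named above.

```c
#include <stdlib.h>
#include <string.h>

/* Illustrative subset of token kinds; a real lexer has many more. */
enum kind { IF, LPAREN, ID, INTLIT };

struct token {
    enum kind k;    /* token class */
    char *lexeme;   /* matched source text, heap-allocated copy */
};

/* Hypothetical helper: build a token that owns a copy of its lexeme.
   On allocation failure the lexeme is left as NULL. */
struct token make_token(enum kind k, const char *text)
{
    struct token t;
    t.k = k;
    t.lexeme = malloc(strlen(text) + 1);
    if (t.lexeme != NULL)
        strcpy(t.lexeme, text);
    return t;
}

/* Hypothetical helper: printable name of a token kind. */
const char *kind_name(enum kind k)
{
    switch (k) {
    case IF:     return "IF";
    case LPAREN: return "LPAREN";
    case ID:     return "ID";
    case INTLIT: return "INTLIT";
    }
    return "?";
}
```

Under these assumptions, the input `if (x` would be represented as three tokens: IF, LPAREN, and ID with lexeme "x". The caller is responsible for freeing each lexeme.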
A hand-written lexer is very complex to write and error-prone,
but the approach is very influential.
A lexer produced by a generator can be built very quickly,
but its details cannot be controlled.
Example
token nextToken()
{
    c = getChar();
    switch (c) {
    case '<':
        c = getChar();              /* look at one more character */
        switch (c) {
        case '=': return LE;        /* "<=" */
        case '>': return NE;        /* "<>" */
        default:  rollback(); return LT;   /* put the character back */
        }
    case '=':
        return EQ;
    case '>':
        c = getChar();
        switch (c) {
        case '=': return GE;        /* ">=" */
        default:  rollback(); return GT;
        }
    }
}
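The example above can be made runnable with a small amount of scaffolding. The following is a sketch only: reading from an in-memory string, the `setInput` function, the `ERR` result for unrecognised characters, and returning a bare `enum relop` instead of a full token are all assumptions added here, not part of the original example.

```c
#include <stddef.h>
#include <string.h>

enum relop { LT, LE, NE, EQ, GT, GE, ERR };

/* Assumed input source: a string plus a read position. */
static const char *src = "";
static size_t len = 0, pos = 0;

void setInput(const char *s) { src = s; len = strlen(s); pos = 0; }

/* Return the next character, or '\0' once the input is exhausted.
   pos advances even at the end so that rollback() stays symmetric. */
int getChar(void)
{
    int c = (pos < len) ? (unsigned char)src[pos] : '\0';
    pos++;
    return c;
}

/* Undo the most recent getChar(). */
void rollback(void) { pos--; }

enum relop nextToken(void)
{
    int c = getChar();
    switch (c) {
    case '<':
        c = getChar();
        switch (c) {
        case '=': return LE;
        case '>': return NE;
        default:  rollback(); return LT;
        }
    case '=':
        return EQ;
    case '>':
        c = getChar();
        switch (c) {
        case '=': return GE;
        default:  rollback(); return GT;
        }
    default:
        return ERR;   /* not a relational operator, or end of input */
    }
}
```

With the input "<=<><", successive calls would return LE, NE, LT, and then ERR at the end of the input. The rollback matters for the last operator: after reading '<' the lexer peeks one character ahead, finds nothing that extends the operator, and puts that character back.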
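Keywords are a subset of identifiers: every keyword also matches the identifier pattern.
Keywords: if, while, else, …
After scanning an identifier, we can look it up in a hash table to check whether it is a keyword.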
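One possible way to do the keyword check is a small fixed-size hash table with linear probing. This is a sketch under assumptions: the names `initKeywords`, `addKeyword`, and `classify`, the table size, and the hash function are all illustrative choices, and only the three keywords listed above are registered.

```c
#include <stddef.h>
#include <string.h>

enum kind { ID, IF, WHILE, ELSE };

#define TABLE_SIZE 16   /* assumed size, comfortably larger than the keyword count */

struct entry { const char *word; enum kind k; };
static struct entry table[TABLE_SIZE];   /* zero-initialised: word == NULL means empty */

/* Simple multiplicative string hash, reduced to a table index. */
static unsigned hash(const char *s)
{
    unsigned h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h % TABLE_SIZE;
}

/* Insert a keyword, probing linearly past occupied slots. */
void addKeyword(const char *word, enum kind k)
{
    unsigned i = hash(word);
    while (table[i].word != NULL)
        i = (i + 1) % TABLE_SIZE;
    table[i].word = word;
    table[i].k = k;
}

/* Return the keyword kind if the lexeme is a keyword, otherwise ID. */
enum kind classify(const char *lexeme)
{
    unsigned i = hash(lexeme);
    while (table[i].word != NULL) {
        if (strcmp(table[i].word, lexeme) == 0)
            return table[i].k;
        i = (i + 1) % TABLE_SIZE;
    }
    return ID;
}

/* Register the keywords once, before lexing starts. */
void initKeywords(void)
{
    addKeyword("if", IF);
    addKeyword("while", WHILE);
    addKeyword("else", ELSE);
}
```

The lexer would first scan the longest identifier-shaped lexeme and then call `classify` on it, so "if" yields IF while "iff" or "x" yield ID. This keeps the transition diagram small, since keywords need no states of their own.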