Overview
The Parser class performs syntactic analysis: it checks that the token sequence conforms to the language grammar and builds an Abstract Syntax Tree (AST). It is implemented as a recursive descent parser.
Class Definition
class Parser:
    def __init__(self, tokens: List[Token])
Constructor Parameters
tokens (List[Token]): List of tokens produced by the Scanner
Attributes
tokens (List[Token]): The input token list
actual (int): Index of the current token being processed
errores (List[str]): List of syntax errors found
Public Methods
parsear()
Parses the entire token list and returns the Abstract Syntax Tree.
def parsear(self) -> Programa
Returns: Programa, the complete AST representing the program structure
Example:
scanner = Scanner("let x = 5 + 3; print x;")
tokens = scanner.escanear_tokens()
parser = Parser(tokens)
programa = parser.parsear()
if parser.errores:
    for error in parser.errores:
        print(error)
else:
    print(f"Successfully parsed {len(programa.sentencias)} statements")
Language Grammar
The Parser implements the following grammar:
program → statement*
statement → declaration | print_stmt
declaration → "let" IDENTIFIER "=" expression ";"
print_stmt → "print" expression ";"
expression → addition
addition → multiplication (('+' | '-') multiplication)*
multiplication → primary (('*' | '/') primary)*
primary → NUMBER | IDENTIFIER | '(' expression ')'
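As an illustration of how these rules map onto recursive descent, the following minimal sketch gives each expression rule its own function. It is self-contained and does not use the real Parser, Token, or AST classes: tokens are plain strings, nodes are nested tuples, and the function names (parse_expression, parse_addition, parse_multiplication, parse_primary) are hypothetical.

```python
from typing import List, Tuple, Union

# Hypothetical stand-ins: tokens are plain strings and AST nodes are
# nested tuples, not the Token/Expresion classes documented here.
Node = Union[int, str, Tuple]


def parse_expression(tokens: List[str], pos: int = 0) -> Tuple[Node, int]:
    # expression → addition
    return parse_addition(tokens, pos)


def parse_addition(tokens: List[str], pos: int) -> Tuple[Node, int]:
    # addition → multiplication (('+' | '-') multiplication)*
    left, pos = parse_multiplication(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        op = tokens[pos]
        right, pos = parse_multiplication(tokens, pos + 1)
        left = (op, left, right)  # the loop makes the operator left-associative
    return left, pos


def parse_multiplication(tokens: List[str], pos: int) -> Tuple[Node, int]:
    # multiplication → primary (('*' | '/') primary)*
    left, pos = parse_primary(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("*", "/"):
        op = tokens[pos]
        right, pos = parse_primary(tokens, pos + 1)
        left = (op, left, right)
    return left, pos


def parse_primary(tokens: List[str], pos: int) -> Tuple[Node, int]:
    # primary → NUMBER | IDENTIFIER | '(' expression ')'
    if pos >= len(tokens):
        raise SyntaxError("expected an expression, found end of input")
    tok = tokens[pos]
    if tok.isdigit():
        return int(tok), pos + 1
    if tok.isidentifier():
        return tok, pos + 1
    if tok == "(":
        inner, pos = parse_expression(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected ')'")
        return ("group", inner), pos + 1
    raise SyntaxError(f"expected an expression, found {tok!r}")


tree, _ = parse_expression(["3", "+", "4", "*", "2"])
print(tree)  # ('+', 3, ('*', 4, 2))
```

Because parse_addition calls parse_multiplication for its operands, a product is always fully built before it becomes the operand of a sum, which is how the grammar encodes precedence without an explicit precedence table.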
AST Node Types
Statements
DeclaracionVariable
Represents a variable declaration.
@dataclass
class DeclaracionVariable(Sentencia):
    token_let: Token      # The 'let' keyword token
    nombre: Token         # The variable name token
    expresion: Expresion  # The assigned value expression
Example: let x = 10 + 5;
SentenciaPrint
Represents a print statement.
@dataclass
class SentenciaPrint(Sentencia):
    token_print: Token    # The 'print' keyword token
    expresion: Expresion  # The expression to print
Example: print x + 5;
Expressions
NumeroLiteral
A literal number in the code.
@dataclass
class NumeroLiteral(Expresion):
    token: Token  # The number token
    valor: int    # The numeric value
Example: 42 in let x = 42;
Identificador
A variable reference.
@dataclass
class Identificador(Expresion):
    token: Token  # The identifier token
    nombre: str   # The variable name
Example: x in print x;
ExpresionBinaria
A binary operation (two operands with an operator).
@dataclass
class ExpresionBinaria(Expresion):
    izquierda: Expresion  # Left operand
    operador: Token       # Operator (+, -, *, /)
    derecha: Expresion    # Right operand
Example: 5 + 3 creates:
ExpresionBinaria(
    izquierda=NumeroLiteral(5),
    operador=Token(SUMA, '+'),
    derecha=NumeroLiteral(3)
)
ExpresionAgrupada
An expression in parentheses.
@dataclass
class ExpresionAgrupada(Expresion):
    expresion: Expresion  # The inner expression
Example: (5 + 3) in let x = (5 + 3) * 2;
Operator Precedence
The Parser applies the standard arithmetic operator precedence, from highest to lowest:
- Highest: Parentheses ( )
- High: Multiplication *, Division /
- Low: Addition +, Subtraction -
Example:
# Expression: 3 + 4 * 2
# Parsed as: 3 + (4 * 2) = 11
# Not as: (3 + 4) * 2 = 14
scanner = Scanner("let x = 3 + 4 * 2;")
parser = Parser(scanner.escanear_tokens())
ast = parser.parsear()
# The multiplication binds tighter, so 4 * 2 becomes the right operand of the addition
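To see why the shape of the tree matters, the sketch below evaluates two hand-built trees for 3 + 4 * 2. It uses simplified stand-ins for the documented node classes (token fields are omitted and operador is a plain string rather than a Token), and evaluar is a hypothetical helper, not part of the Parser.

```python
from dataclasses import dataclass
from typing import Union


# Simplified stand-ins for the documented node classes: token fields are
# omitted and `operador` is a plain string rather than a Token.
@dataclass
class NumeroLiteral:
    valor: int


@dataclass
class ExpresionBinaria:
    izquierda: "Expresion"
    operador: str
    derecha: "Expresion"


Expresion = Union[NumeroLiteral, ExpresionBinaria]


def evaluar(nodo: Expresion) -> float:
    """Evaluate an expression tree bottom-up (hypothetical helper)."""
    if isinstance(nodo, NumeroLiteral):
        return nodo.valor
    izq = evaluar(nodo.izquierda)
    der = evaluar(nodo.derecha)
    if nodo.operador == "+":
        return izq + der
    if nodo.operador == "-":
        return izq - der
    if nodo.operador == "*":
        return izq * der
    if nodo.operador == "/":
        return izq / der
    raise ValueError(f"unknown operator: {nodo.operador!r}")


# 3 + 4 * 2 as the grammar shapes it: the product is the right child of the sum.
correcto = ExpresionBinaria(
    NumeroLiteral(3), "+",
    ExpresionBinaria(NumeroLiteral(4), "*", NumeroLiteral(2)),
)
# The tree a precedence-unaware parser would build instead.
incorrecto = ExpresionBinaria(
    ExpresionBinaria(NumeroLiteral(3), "+", NumeroLiteral(4)),
    "*", NumeroLiteral(2),
)

print(evaluar(correcto))    # 11
print(evaluar(incorrecto))  # 14
```

The parser itself evaluates nothing; precedence is expressed purely by which node ends up nested inside which.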
Error Handling
Error Recovery
When a syntax error is detected:
- The error is recorded in the errores list
- The parser synchronizes to a safe recovery point
- Parsing continues to detect multiple errors
code = """
let x = ; // Syntax error
let y = 10; // This will still be parsed
"""
scanner = Scanner(code)
parser = Parser(scanner.escanear_tokens())
programa = parser.parsear()
for error in parser.errores:
    print(error)
# Output: Error de sintaxis en línea 1, columna 9: Se esperaba una expresión. Se encontró ';'
ErrorSintaxis Exception
class ErrorSintaxis(Exception):
    pass
Raised internally when a syntax error is detected. The Parser catches these exceptions and records them in the errores list.
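The raise, catch, record, and resynchronize cycle described in this section can be sketched as follows. This is not the Parser's actual code: tokens are plain strings, the only statement form is `let NAME = NUMBER ;`, and the helpers expect, parse_statement, synchronize, and parse_all are hypothetical. It also assumes that the safe recovery point is the token just past the next ';', which the documentation does not specify.

```python
from typing import Callable, List, Tuple


class ErrorSintaxis(Exception):
    pass


def expect(tokens: List[str], pos: int,
           predicate: Callable[[str], bool], description: str) -> Tuple[str, int]:
    """Consume one token matching `predicate` or raise ErrorSintaxis."""
    if pos >= len(tokens) or not predicate(tokens[pos]):
        found = tokens[pos] if pos < len(tokens) else "end of input"
        raise ErrorSintaxis(f"expected {description}, found {found!r}")
    return tokens[pos], pos + 1


def parse_statement(tokens: List[str], pos: int) -> Tuple[tuple, int]:
    """Parse `let NAME = NUMBER ;` (a deliberately tiny rule for the sketch)."""
    _, pos = expect(tokens, pos, lambda t: t == "let", "'let'")
    name, pos = expect(tokens, pos, str.isidentifier, "an identifier")
    _, pos = expect(tokens, pos, lambda t: t == "=", "'='")
    value, pos = expect(tokens, pos, str.isdigit, "a number")
    _, pos = expect(tokens, pos, lambda t: t == ";", "';'")
    return ("let", name, int(value)), pos


def synchronize(tokens: List[str], pos: int) -> int:
    """Skip ahead to just past the next ';' (assumed recovery point)."""
    for i in range(pos, len(tokens)):
        if tokens[i] == ";":
            return i + 1
    return len(tokens)


def parse_all(tokens: List[str]) -> Tuple[List[tuple], List[str]]:
    statements, errores = [], []
    pos = 0
    while pos < len(tokens):
        try:
            stmt, pos = parse_statement(tokens, pos)
            statements.append(stmt)
        except ErrorSintaxis as error:
            errores.append(str(error))        # record the error
            pos = synchronize(tokens, pos)    # resume at the next statement
    return statements, errores


statements, errores = parse_all("let x = ; let y = 10 ;".split())
print(statements)  # [('let', 'y', 10)]
print(errores)     # ["expected a number, found ';'"]
```

Using an exception lets a deeply nested rule abandon the current statement in one step, while the top-level loop decides where parsing resumes.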
Reserved Words Handling
The reserved words leo and diego are recognized but ignored:
code = """
leo
let x = 5;
diego
print x;
"""
scanner = Scanner(code)
parser = Parser(scanner.escanear_tokens())
programa = parser.parsear()
# Only the 'let' and 'print' statements are included in the AST
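One way such skipping could work is to drop the reserved words in the statement loop before any statement rule runs. The sketch below shows that idea on plain string tokens; IGNORED_WORDS and split_statements are hypothetical names, and the real Parser may handle these tokens differently.

```python
from typing import List

# Hypothetical set of words that are recognized but skipped;
# tokens are plain strings here, not Token objects.
IGNORED_WORDS = {"leo", "diego"}


def split_statements(tokens: List[str]) -> List[List[str]]:
    """Group tokens into ';'-terminated statements, dropping ignored words."""
    statements: List[List[str]] = []
    current: List[str] = []
    for tok in tokens:
        if tok in IGNORED_WORDS:
            continue  # recognized, but contributes nothing to the output
        current.append(tok)
        if tok == ";":
            statements.append(current)
            current = []
    return statements


tokens = "leo let x = 5 ; diego print x ;".split()
print(split_statements(tokens))
# [['let', 'x', '=', '5', ';'], ['print', 'x', ';']]
```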
Usage Example
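# Assumes the AST node classes are exported by the same module as Scanner and Parser
from compfinal import Scanner, Parser, DeclaracionVariable, SentenciaPrint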
# Source code
code = """
let a = 5;
let b = 10;
let c = a + b * 2;
print c;
"""
# Scan and parse
scanner = Scanner(code)
tokens = scanner.escanear_tokens()
if scanner.errores:
    print("Lexical errors found!")
    exit(1)
parser = Parser(tokens)
programa = parser.parsear()
if parser.errores:
    print("Syntax errors found:")
    for error in parser.errores:
        print(f" - {error}")
    exit(1)
print(f"✓ Successfully parsed {len(programa.sentencias)} statements")
# Access the AST
for sentencia in programa.sentencias:
    if isinstance(sentencia, DeclaracionVariable):
        print(f"Variable: {sentencia.nombre.lexema}")
    elif isinstance(sentencia, SentenciaPrint):
        print("Print statement found")
AST Visualization
The AST for let x = 5 + 3; looks like:
Programa
└── DeclaracionVariable
    ├── nombre: 'x'
    └── expresion:
        └── ExpresionBinaria
            ├── operador: '+'
            ├── izquierda:
            │   └── NumeroLiteral(5)
            └── derecha:
                └── NumeroLiteral(3)
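A diagram in this style can be produced by a small recursive printer. The sketch below works on a hypothetical generic shape, (label, children) tuples, rather than on the documented dataclasses; converting real AST nodes into that shape is left out, and the child label under DeclaracionVariable follows the dataclass field name expresion.

```python
from typing import List, Tuple

# Hypothetical generic tree shape: (label, children). The real AST uses the
# dataclasses documented above; converting them to this shape is omitted.
Tree = Tuple[str, list]


def render(tree: Tree, prefix: str = "",
           is_root: bool = True, is_last: bool = True) -> List[str]:
    """Return the lines of a box-drawing diagram for `tree`."""
    label, children = tree
    if is_root:
        lines = [label]
        child_prefix = ""
    else:
        lines = [prefix + ("└── " if is_last else "├── ") + label]
        # Keep a vertical bar under branches that still have siblings below.
        child_prefix = prefix + ("    " if is_last else "│   ")
    for i, child in enumerate(children):
        lines += render(child, child_prefix, False, i == len(children) - 1)
    return lines


ast = ("Programa", [
    ("DeclaracionVariable", [
        ("nombre: 'x'", []),
        ("expresion:", [
            ("ExpresionBinaria", [
                ("operador: '+'", []),
                ("izquierda:", [("NumeroLiteral(5)", [])]),
                ("derecha:", [("NumeroLiteral(3)", [])]),
            ]),
        ]),
    ]),
])

print("\n".join(render(ast)))
```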
See Also