Discussion forum for David Beazley

Parser return value in case of a syntax error

Hi there,
When I invoke the YACC parser and it encounters a syntax error in the input text, it calls the error function. If I don’t implement any error recovery, the return value of the parser call is None in this case.
Is there a way to return something more meaningful, like a tuple
(None, <error message>)
or the like?
Thanks for any help.

Have you tried writing a top-level error matching rule in the grammar? For example::

def p_program(p):
    '''
    program : error
    '''
    p[0] = None

(Note: I’m not sure whether this will work, but during error recovery the parser does try to match against the error token, so it might.)

Thanks for the suggestion. Unfortunately, it doesn’t work, or I’m not doing it correctly. Actually, I think the problem is that the error rule can’t return anything that the top-level rule could pick up.

My error rule looks like this:

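def p_error(t):
    print("Syntax error", t)
    if not t:
        # t is None when the error occurs at end of input
        return
    print(" in line %d at position %d (%s) while reading token %s " % (t.lineno, t.lexpos, t.value, t.type))
    # Skip ahead to the next COLON token (or to end of input)
    while True:
        tok = parser.token()             # Get the next token
        if not tok or tok.type == 'COLON':
            break
    # Discard the parsing stack and start over
    parser.restart()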

How can it return something that the top-level rule can use?

The parser always discards all successfully parsed commands up to the error.

If the p_error() function needs to communicate some piece of data for use in a grammar rule, you’ll need to save it someplace, for example in a global variable or on an instance.
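A minimal sketch of that idea, under assumptions: the ErrorLog class, its record() method and the error_log name are made up for illustration and are not part of PLY. The only things taken from PLY are that p_error() receives a token with lineno, lexpos, value and type attributes, or None when the error is at end of input.

```python
class ErrorLog:
    """Hypothetical holder for messages that p_error() wants to pass on."""

    def __init__(self):
        self.messages = []

    def record(self, t):
        # t is None when the syntax error is detected at end of input.
        if t is None:
            self.messages.append("Syntax error at end of input")
            return
        self.messages.append(
            "Syntax error in line %d at position %d (%r) while reading token %s"
            % (t.lineno, t.lexpos, t.value, t.type)
        )


# One shared instance; a plain global list would serve the same purpose.
error_log = ErrorLog()


def p_error(t):
    # Store the message instead of printing it, so other code can read it later.
    error_log.record(t)
```

Anything that runs later (a grammar rule, or the code that called the parser) can then read error_log.messages.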

Hi Dave, thanks for the suggestion. I should have come up with this on my own, sorry.

Actually my original idea of communicating back to the caller not only regular parser output but also error messages looks a lot like a workaround to me now.

Wouldn’t it be useful to have a method that could be called like this:

result = parser.parse(input_string)
if result is None:
    errormsg = parser.get_error()

For that, instead of printing messages in the error rule (or anywhere else), there would have to be another method that appends messages, maybe with a severity flag, to an internal buffer in the parser, like

parser.put_msg(SEVERITY_ERROR, "Syntax error in line %d at pos %d" % (t.lineno, t.lexpos))

Hi Dave, I tried assigning to a global variable in my p_error function, but if a syntax error occurs, the top-level rule of my grammar is not processed, which prevents me from using the contents of the global variable in the parser return value. Instead, the parser just returns None.

My error function looks like this:

def p_error(t):
    global err_msg
    err_msg = "Syntax error"

Why isn’t the top-level rule processed?

The top-level grammar rule isn’t going to be processed because there was no successful parse. You might be able to get a top-level rule that matches the “error” token to run (see earlier suggestion in this thread). If that doesn’t work, you would need to check err_msg after the call to parse() (basically outside the parser itself).
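Checking outside the parser could look roughly like the sketch below. The names err_msgs and parse_with_errors are hypothetical helpers, not PLY functions; the only assumption about the parser is that it has a parse() method that returns None after a failed parse, as described in this thread.

```python
err_msgs = []  # filled by p_error(), read after parse() returns


def p_error(t):
    # t is None when the error is at end of input.
    if t is None:
        err_msgs.append("Syntax error")
    else:
        err_msgs.append("Syntax error at %r" % (t.value,))


def parse_with_errors(parser, text):
    """Return (result, messages); result is None if the parse failed."""
    del err_msgs[:]              # forget messages from a previous call
    result = parser.parse(text)  # p_error() may run during this call
    return result, list(err_msgs)
```

The caller then gets back a tuple such as (None, ["Syntax error"]) instead of a bare None, which is close to what the original question asked for.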