[Django]-Python regex to match SQL INSERT statements

0👍

If the format of the statements is fixed, there is little point in using regular expressions. Just use simple string parsing:

parts = statement.split(' ', 4)

print(parts[2])
print(parts[3][1:-1].split(','))
print(parts[4][7:-2].split(','))

Example output:

ref_geographic_region
['continent_id', 'name']
['8', "'Europe (Western)'"]

4👍

Stop trying to parse SQL with regex. This is roughly as bad as parsing HTML with regex, since SQL is a context-free language that regexes are ill-equipped to handle. This can be accomplished far more easily with a proper parsing module like PyParsing

from pyparsing import Regex, QuotedString, delimitedList

# Object names and numbers match these regular expression
object_name = Regex('[a-zA-Z_]+')
number = Regex('-?[0-9]+')
# A string is just something with quotes around it - PyParsing has a built in
string = QuotedString("'") | QuotedString('"')

# A term is a number or a string
term = number | string

# The values we want to capture are either delimited lists of expressions we know about...
column_list = (delimitedList(object_name)).setResultsName('columns')
term_list = (delimitedList(term)).setResultsName('terms')

# Or just an expression we know about by itself
table_name = object_name.setResultsName('table')

# And an SQL statement is just all of these pieces joined together with some string between them
sql_stmt = "INSERT INTO " + table_name + "(" + column_list + ") VALUES(" + term_list + ");"


if __name__ == '__main__':
    res = sql_stmt.parseString("""INSERT INTO ref_geographic_region (continent_id,name) VALUES(8,'Europe (Western)');""")
    print res.table         # ref_geographic_region
    print list(res.columns) # ['continent_id', 'name']
    print list(res.terms)   # ['8', 'Europe (Western)']

This is a quick half-hour strawman – I’d recommend reading through its docs and getting a proper understanding of how it works. In particular, PyParsing has some strange behaviour with whitespace that it’s worth understanding before you properly strike out.

Leave a comment