Discussion forum for David Beazley

Parsing a huge zipped file in SLY

Hi,

I have a huge gzip-compressed file to parse. The following code uses a very large amount of memory:

from sly import Lexer
import gzip

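class MyLexer(Lexer):
    ...  # token definitions omitted in the original post

if __name__ == '__main__':
    lexer = MyLexer()
    # Text mode ("rt"): SLY matches str patterns, so the lexer needs str, not bytes
    f = gzip.open("huge_zipped_file.gz", "rt")
    data = f.read()  # reads the entire decompressed file into memory
    for tok in lexer.tokenize(data):
        ...  # token handling omitted in the original post
    f.close()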

This happens because it reads the whole decompressed file into memory and passes the contents to the lexer in one piece.

Is there a way to reduce the memory usage?

Thanks a lot.

Paul
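
One possible direction, offered only as a rough sketch: if no token in the grammar ever spans a line break, the file could be decompressed and tokenized one line at a time, so that only a single line is held in memory. The helper names iter_lines and tokenize_stream below are made up for illustration and are not part of SLY. The tokenize argument is any callable that takes a string and returns an iterable of tokens. The assumption is that lexer.tokenize would be passed in, but the sketch itself does not depend on SLY.

```python
import gzip


def iter_lines(path, encoding="utf-8"):
    """Yield decompressed text lines one at a time (hypothetical helper)."""
    # gzip.open in "rt" mode decompresses lazily, so the whole file
    # is never held in memory at once.
    with gzip.open(path, "rt", encoding=encoding) as f:
        for line in f:
            yield line


def tokenize_stream(tokenize, lines):
    """Run a tokenizer callable over each line and yield its tokens
    (hypothetical helper). Assumes no token spans a line break."""
    for line in lines:
        yield from tokenize(line)
```

With a SLY lexer this would presumably be used as tokenize_stream(lexer.tokenize, iter_lines("huge_zipped_file.gz")). Two caveats: each call to the tokenizer starts from scratch, so line numbers and positions reported on the tokens would restart on every line, and multi-line constructs such as block comments would not be recognized.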