Flex error "start-condition stack underflow" with "%option full" when parsing files with Chinese characters
I have a flex scanner that has run correctly for a long time, including on files that contain Chinese characters. To make it faster, I added "%option full", and it is indeed about 3x faster. However, it may now fail on files whose comments contain Chinese characters.
The error message is "start-condition stack underflow".
I added print statements to the lex source code and found that the scanner prints the error while in start condition sc, even though it has not run the code segment containing "yy_push_state(sc)". I think there may be an overflow in the flex buffer.
So what should I do?
For historical reasons (or something), if you use %option full or %option fast, flex defaults to producing a 7-bit scanner (i.e. %option 7bit).
That's unsafe, since a 7-bit scanner does not attempt to verify that the scanned text consists only of 7-bit ("ASCII") characters, and the behaviour when it encounters a character with the high-order bit set is undefined. That will happen if the input is UTF-8 or some other multibyte encoding.
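To check whether a given input file can trigger this, it is enough to look for bytes above 0x7F. Here is a minimal sketch in C; count_high_bit_bytes is a hypothetical helper written for illustration and is not part of flex.

```c
#include <stddef.h>

/* Hypothetical helper (not part of flex): count the bytes in buf[0..len)
   that have the high-order bit set. A 7-bit scanner is not built to
   handle any of these bytes. */
static size_t count_high_bit_bytes(const char *buf, size_t len)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        /* Cast to unsigned char so the comparison does not depend on
           whether plain char is signed on this platform. */
        if ((unsigned char)buf[i] > 0x7F)
            n++;
    }
    return n;
}
```

A Chinese character encoded in UTF-8 is normally three bytes, all with the high bit set, so any non-zero result for a file means a 7-bit scanner should not be used on it.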
So you need to specify %option 8bit full. That will increase the size of the scanner tables, but these days that might not matter much. You might also want to try %option 8bit full ecs as an intermediate setting. (I'm pretty sure that 8bit is not necessary with ecs, but it can't hurt.)