C++ - Correctly parsing comments with a regex
I am creating a compiler and am having trouble handling multi-line comments (/* */). The issue is that my regex needs fixing. I believe it looks for an opening comment token (/*) and then accepts any closing comment token (*/), even one that is not part of the same comment's scope.
There is also an issue with strings: if a comment token appears within a string, the regex will still try to comment it out. I haven't implemented a fix for this yet, so any help would be appreciated.
The regex I am using is:
[/][*](.|\n)*[*][/]
Examples:
Input:
int main(/* text */) { int = 0; /* hello world */ return 1; }
Output:
int main( return 1; }
And for strings, the input would be:
int main() { printf("/* hi there */\n"); return 1; }
Output:
int main() { printf("\n"); return 1; }
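I'm not sure which regex library you're using, but you need what's called a non-greedy match.
Try this:
\/\*(.|\n)*?\*\/
The ? after the * quantifier makes the match ungreedy, so it stops at the first */ it finds instead of the last one.
You can visualize how it works here.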
Note that this is Perl-compatible regular expression (PCRE) syntax, which I'm assuming is what you're using. If you're using POSIX regular expressions, this won't work.
You don't need to put / and * inside a character class ([...]); you just need to escape them.
You can use the PCRE_DOTALL flag to make . match \n or \r as well, which can simplify your regex.
PCRE_DOTALL
If this bit is set, a dot metacharacter in the pattern matches a character of any value, including one that indicates a newline. However, it only ever matches one character, even if newlines are coded as CRLF. Without this option, a dot does not match when the current position is at a newline. This option is equivalent to Perl's /s option, and it can be changed within a pattern by a (?s) option setting. A negative class such as [^a] always matches newline characters, independent of the setting of this option.
Then our regex would be:
\/\*.*?\*\/
You can also make the entire regex ungreedy using the PCRE_UNGREEDY flag:
PCRE_UNGREEDY
This option inverts the "greediness" of the quantifiers so that they are not greedy by default, but become greedy if followed by "?". It is not compatible with Perl. It can also be set by a (?U) option setting within the pattern.
In that case, this would work (with PCRE_DOTALL still set):
\/\*.*\*\/