.NET regex matches $ at the end of the string rather than the end of the line, even with Multiline enabled

I'm trying to highlight Markdown code, but I ran into this odd behavior of the .NET regex Multiline option.

The following expression: ^(#+).+$ works fine on any online regex testing tool:

But it refuses to work in .NET:

It doesn't seem to take the $ anchor into account and simply highlights everything up to the end of the string, no matter what. Here is my C#:

RegExpression = new Regex(@"^(#+).+$", RegexOptions.Multiline);

What am I missing?

Comments
  • 圈圈那叉叉

    What you have is good. The only thing you're missing is that . doesn't match newline characters, even with the multiline option. You can get around this in two different ways.

    The easiest is to use the RegexOptions.Singleline flag, which causes newlines to be treated like any other character. That way, ^ still matches the start of the string, $ matches the end of the string, and . matches everything, including newlines.

    The other way to fix this (although I wouldn't recommend it for your use case) is to modify your regex to explicitly allow newlines. To do this, replace any . with (?:.|\n), which means either any character or a newline. For your example, you would end up with ^(#+)(?:.|\n)+$. If you want to ensure that there's a non-linebreak character first, add an extra dot: ^(#+).(?:.|\n)+$
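
    To see what these two variants do in practice, here is a minimal sketch. The class name SinglelineDemo and the sample text are made up for illustration. Note that in both cases the match runs across the line break to the end of the input, so a heading match no longer stops at the end of its own line.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SinglelineDemo
{
    // Hypothetical sample input: a heading line and a body line separated by LF.
    public const string Sample = "# Title\nbody text";

    // Variant 1: RegexOptions.Singleline makes '.' match '\n' as well.
    public static string WithSingleline(string input) =>
        Regex.Match(input, @"^(#+).+$", RegexOptions.Singleline).Value;

    // Variant 2: no options; newlines are allowed explicitly via (?:.|\n).
    public static string WithExplicitNewline(string input) =>
        Regex.Match(input, @"^(#+)(?:.|\n)+$").Value;

    public static void Main()
    {
        // Both comparisons print True: each variant matches the whole sample,
        // including the second line.
        Console.WriteLine(WithSingleline(Sample) == Sample);
        Console.WriteLine(WithExplicitNewline(Sample) == Sample);
    }
}
```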

  • cut

    It is clear your text contains a linebreak other than LF. In .NET regex, a dot matches any char but LF (a newline char, \n).

    See the Multiline Mode section of the MSDN regex reference:

    By default, $ matches only the end of the input string. If you specify the RegexOptions.Multiline option, it matches either the newline character (\n) or the end of the input string. It does not, however, match the carriage return/line feed character combination. To successfully match them, use the subexpression \r?$ instead of just $.
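
    The effect of the line-break style can be checked with a small sketch like the one below. It uses the pattern and option from the question; the class name LineEndingDemo and the sample strings are assumptions made for illustration, not the asker's actual input.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class LineEndingDemo
{
    // Pattern and option taken from the question.
    private static readonly Regex Heading =
        new Regex(@"^(#+).+$", RegexOptions.Multiline);

    // Returns the text of every match found in the input.
    public static string[] Matches(string input) =>
        Heading.Matches(input).Cast<Match>().Select(m => m.Value).ToArray();

    public static void Main()
    {
        // LF line breaks: two clean matches, "# One" and "## Two".
        string[] lf = Matches("# One\ntext\n## Two\nmore");

        // CRLF line breaks: still two matches, but '.' also consumes the CR,
        // so each match value ends with '\r'.
        string[] crlf = Matches("# One\r\ntext\r\n## Two\r\nmore");

        // CR-only line breaks: there is no '\n' for '$' to stop at, so a
        // single match runs to the end of the string.
        string[] cr = Matches("# One\rtext\r## Two\rmore");

        Console.WriteLine(lf.Length);            // 2
        Console.WriteLine(crlf[0].EndsWith("\r")); // True
        Console.WriteLine(cr.Length);            // 1
    }
}
```

    The CR-only case reproduces the "highlights everything until the end of the string" symptom, while the CRLF case shows the stray carriage return that the quoted documentation warns about.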

    So use

    @"^(#+).+?\r?$"
    

    The .+?\r?$ will match lazily any one or more chars other than LF up to the first CR (that is optional) right before a newline.

    Or just use a negated character class:

    @"^(#+)[^\r\n]+"
    

    The [^\r\n]+ will match one or more chars other than CR/LF.
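
    As a rough comparison of the two patterns on CRLF text, consider the sketch below. The class name HeadingFixDemo and the sample string are hypothetical, and the second capture group in the lazy variant is an addition made here for illustration; it is not part of the patterns above. It is there because \r? still consumes the carriage return, so the overall match value ends with '\r' even though the heading text itself does not.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class HeadingFixDemo
{
    // Hypothetical sample with Windows-style CRLF line breaks.
    public const string Sample = "# One\r\ntext\r\n## Two\r\nmore";

    // Negated character class: the match itself stops before any CR or LF.
    public static string[] WithNegatedClass(string input) =>
        Regex.Matches(input, @"^(#+)[^\r\n]+", RegexOptions.Multiline)
             .Cast<Match>().Select(m => m.Value).ToArray();

    // Lazy variant with an extra group (added for this sketch) around the
    // heading text. Group 2 excludes the CR that \r? consumes.
    public static string[] WithLazyGroup(string input) =>
        Regex.Matches(input, @"^(#+)(.+?)\r?$", RegexOptions.Multiline)
             .Cast<Match>().Select(m => m.Groups[2].Value).ToArray();

    public static void Main()
    {
        Console.WriteLine(string.Join("|", WithNegatedClass(Sample))); // # One|## Two
        Console.WriteLine(string.Join("|", WithLazyGroup(Sample)));    //  One| Two
    }
}
```

    For highlighting by match position and length, the negated character class is probably the simpler choice, since its match never includes a line-break character.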