Using Regular Expressions

Cheatsheet

Character classes

  • . any character except newline
  • \w\d\s word, digit, whitespace
  • \W\D\S not word, digit, whitespace
  • [abc] any of a, b, or c
  • [^abc] not a, b, or c
  • [a-g] character between a & g
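
A minimal JavaScript sketch of the character classes above. The sample string and variable names are made up for illustration, and the results assume a standard JavaScript engine (e.g. Node.js).

```javascript
// Made-up sample text, used only for illustration.
const text = "cab-42 Gig";

// [abc] matches any single a, b, or c.
const anyOfAbc = text.match(/[abc]/g);   // ["c", "a", "b"]

// [^abc] matches any single character that is NOT a, b, or c.
const notAbc = text.match(/[^abc]/g);    // ["-", "4", "2", " ", "G", "i", "g"]

// [a-g] is a range; it is case-sensitive, so "G" is not included.
const aToG = text.match(/[a-g]/g);       // ["c", "a", "b", "g"]

// \d digit, \W non-word character, \s whitespace.
const digits = text.match(/\d/g);        // ["4", "2"]
const nonWord = text.match(/\W/g);       // ["-", " "]
const whitespace = text.match(/\s/g);    // [" "]

// . matches any character except a newline.
const dot = "a\nb".match(/./g);          // ["a", "b"]

console.log(anyOfAbc, notAbc, aToG, digits, nonWord, whitespace, dot);
```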

Anchors

  • ^abc$ start / end of the string
  • \b \B word, not-word boundary
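
The anchors above can be tried out roughly like this in JavaScript. The test strings are invented for the example.

```javascript
// ^ and $ pin the pattern to the start and end of the string.
const exact = /^abc$/.test("abc");             // true
const prefixed = /^abc$/.test("xabc");         // false: "x" precedes "abc"
const suffixed = /^abc$/.test("abcd");         // false: "d" follows "abc"

// \b matches at a word boundary, so "cat" must stand alone as a word.
const wholeWord = /\bcat\b/.test("a cat sat");       // true
const insideWord = /\bcat\b/.test("concatenate");    // false

// \B matches where there is NO word boundary, i.e. inside a word.
const embedded = /\Bcat\B/.test("concatenate");      // true

console.log(exact, prefixed, suffixed, wholeWord, insideWord, embedded);
```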

Escaped characters

  • \. \* \\ escaped special characters
  • \t \n \r tab, linefeed, carriage return
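
A short JavaScript sketch of why escaping matters, using made-up inputs. Note that inside a JavaScript string literal a backslash must itself be written as "\\".

```javascript
// Unescaped, "." is a wildcard; escaped, it matches only a literal dot.
const wildcardDot = /^a.b$/.test("axb");     // true
const literalDotMiss = /^a\.b$/.test("axb"); // false
const literalDotHit = /^a\.b$/.test("a.b");  // true

// \* matches a literal asterisk instead of acting as a quantifier.
const starIndex = "1*2".search(/\*/);        // 1

// \\ matches a single literal backslash ("C:\\dir" is the string C:\dir).
const hasBackslash = /\\/.test("C:\\dir");   // true

// \t, \n and \r match tab, linefeed and carriage return.
const parts = "a\tb\nc\r".split(/[\t\n\r]/); // ["a", "b", "c", ""]

console.log(wildcardDot, literalDotMiss, literalDotHit, starIndex, hasBackslash, parts);
```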

Groups & Lookaround

  • (abc) capture group
  • \1 backreference to group #1
  • (?:abc) non-capturing group
  • (?=abc) positive lookahead
  • (?!abc) negative lookahead
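
The groups and lookarounds above, sketched in JavaScript. All sample strings are invented for illustration.

```javascript
// (…) captures: each group is available by position on the match result.
const dateMatch = "2024-05".match(/(\d{4})-(\d{2})/);
const year = dateMatch[1];    // "2024"
const month = dateMatch[2];   // "05"

// \1 refers back to whatever group #1 matched, here a doubled character.
const hasDouble = /(\w)\1/.test("hello");    // true ("ll")
const noDouble = /(\w)\1/.test("world");     // false

// (?:…) groups without capturing, so the result holds only the full match.
const repeated = "abcabc".match(/(?:abc)+/);
const fullMatch = repeated[0];               // "abcabc"
const resultLength = repeated.length;        // 1 (no capture groups)

// (?=…) requires the following text to match, without consuming it.
const usdAmounts = "100USD 200EUR".match(/\d+(?=USD)/g);   // ["100"]

// (?!…) requires the following text NOT to match.
const qWithoutU = /q(?!u)/.test("Iraq");     // true
const qWithU = /q(?!u)/.test("quit");        // false

console.log(year, month, hasDouble, noDouble, fullMatch, resultLength, usdAmounts, qWithoutU, qWithU);
```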

Quantifiers & Alternation

  • a* a+ a? 0 or more, 1 or more, 0 or 1
  • a{5} a{2,} exactly five, two or more
  • a{1,3} between one & three
  • a+? a{2,}? match as few as possible
  • ab|cd match ab or cd
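
A rough JavaScript illustration of the quantifiers and alternation above, including the difference between greedy and lazy matching. The inputs are made up for the example.

```javascript
// * allows zero occurrences, + needs at least one, ? makes an item optional.
const starEmpty = /^a*$/.test("");            // true
const plusEmpty = /^a+$/.test("");            // false
const optionalU = /^colou?r$/.test("color") && /^colou?r$/.test("colour"); // true

// {n}, {n,} and {n,m} bound the repetition count.
const exactlyFive = /^a{5}$/.test("aaaaa");   // true
const twoOrMore = /^a{2,}$/.test("a");        // false: only one "a"
const oneToThree = /^a{1,3}$/.test("aaaa");   // false: four is too many

// Greedy quantifiers take as much as possible; adding ? makes them lazy.
const greedy = "<b>x</b>".match(/<.+>/)[0];   // "<b>x</b>"
const lazy = "<b>x</b>".match(/<.+?>/)[0];    // "<b>"
const lazyCount = "aaaa".match(/a{2,}?/)[0];  // "aa"

// | matches either the left or the right alternative.
const either = "xcdy".match(/ab|cd/)[0];      // "cd"

console.log(starEmpty, plusEmpty, optionalU, exactlyFive, twoOrMore, oneToThree, greedy, lazy, lazyCount, either);
```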

Flags

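  • /g global match: finds every match; without /g, matching stops after the first match.
  • /m multiline match: ^ and $ match at the start and end of each line; without /m they match only at the start and end of the whole string.
  • /i case-insensitive match.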
  • /i - (ignore case) - Makes the whole expression case-insensitive. For example, /aBc/i would match AbC.
  • /g - (global search) - Retain the index of the last match, allowing subsequent searches to start from the end of the previous match. Without the global flag, subsequent searches will return the same match. RegExr only searches for a single match when the global flag is disabled to avoid infinite match errors.
  • /m - (multiline) - When the multiline flag is enabled, beginning and end anchors (^ and $) will match the start and end of a line, instead of the start and end of the whole string. Note that patterns such as /^[\s\S]+$/m may return matches that span multiple lines because the anchors will match the start/end of any line.
  • /u - (unicode) - When the unicode flag is enabled, you can use extended unicode escapes in the form \u{FFFFF}. It also makes other escapes stricter, causing unrecognized escapes (e.g. \j) to throw an error.
  • /y - (sticky) - The expression will only match from its lastIndex position and ignores the global (g) flag if set. Because each search in RegExr is discrete, this flag has no further impact on the displayed results.
  • /s - (dotall) - Dot (.) will match any character, including newline.
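
The flags above can be compared side by side in JavaScript. This is only a sketch with invented inputs; it assumes an engine that supports the s, u and y flags (any recent Node.js or browser).

```javascript
// g: return every match instead of only the first; i: ignore case.
const first = "aAa".match(/a/)[0];           // "a" (first match only)
const allLower = "aAa".match(/a/g);          // ["a", "a"]
const allAnyCase = "aAa".match(/a/gi);       // ["a", "A", "a"]

// With g, exec() remembers lastIndex and resumes from there.
const re = /a/g;
const firstIndex = re.exec("aba").index;     // 0
const secondIndex = re.exec("aba").index;    // 2

// m: ^ and $ match at each line, not just at the ends of the string.
const lines = "one\ntwo".match(/^\w+$/gm);   // ["one", "two"]
const noLines = "one\ntwo".match(/^\w+$/g);  // null

// s: dot also matches newline.
const dotPlain = /a.b/.test("a\nb");         // false
const dotAll = /a.b/s.test("a\nb");          // true

// u: treat the string as code points and allow \u{…} escapes.
const smiley = "\u{1F600}";
const oneUnit = /^.$/.test(smiley);          // false: two UTF-16 code units
const onePoint = /^.$/u.test(smiley);        // true
const byEscape = /\u{1F600}/u.test(smiley);  // true

// y: match only at exactly lastIndex, with no scanning ahead.
const sticky = /foo/y;
sticky.lastIndex = 3;
const stickyHit = sticky.test("barfoo");     // true: "foo" starts at index 3
sticky.lastIndex = 1;
const stickyMiss = sticky.test("barfoo");    // false: no "foo" at index 1

console.log(first, allLower, allAnyCase, firstIndex, secondIndex, lines, noLines,
  dotPlain, dotAll, oneUnit, onePoint, byEscape, stickyHit, stickyMiss);
```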