






| Application | Unwanted String |
|---|---|
| Microsoft Word | "Microsoft Word -" |
| Notepad | " - notepad" |
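
As a rough illustration of what removing these strings looks like, the sketch below strips them from a title with Python's `re` module. It is not Zan Image Printer's own code: the function name `clean_title`, the sample titles, and the case-insensitive matching are all assumptions made for the example.

```python
import re

# Unwanted strings from the table above, treated as literal text.
UNWANTED = ["Microsoft Word -", " - notepad"]

def clean_title(title: str) -> str:
    """Remove each unwanted string and trim leftover spaces (hypothetical helper)."""
    for text in UNWANTED:
        # re.escape keeps characters such as "-" from acting as metacharacters.
        # Ignoring case is an assumption, not documented behavior.
        title = re.sub(re.escape(text), "", title, flags=re.IGNORECASE)
    return title.strip()

print(clean_title("Microsoft Word - report.doc"))  # report.doc
print(clean_title("notes.txt - Notepad"))          # notes.txt
```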

| Metacharacter | Description | Example/Comment |
|---|---|---|
| . | matches exactly one character | "r.t" matches "rat", "rut", "r t", but not "root" |
| ? | matches the preceding character or subexpression zero or one times | "colou?r" matches "color" and "colour" |
| * | matches the preceding character or subexpression zero or more times | "zo*" matches "z" and "zoo" |
| + | matches the preceding character or subexpression one or more times | "ab+c" matches "abc", "abbc", but not "ac" |
| \ | escape character, used to match a metacharacter literally, such as a period or a plus sign | "[0-9]\+" matches a digit followed by a plus character, but "[0-9]+" matches one or more digits |
| \| | an OR operator that separates two alternative expressions | "x\|y" matches an instance of "x" or "y" |
| $ | matches the position at the end of the input string | In Zan Image Printer, all newline characters are removed before parsing, so the $ metacharacter is not used |
| ^ | negates a character set when it is the first character inside the brackets | "[^abc]" matches any character except "a", "b" and "c" |
| - | indicates a range of characters | "[a-z]" matches any lowercase alphabetic character in the range 'a' through 'z' |
| [ ] | indicates character class, matches any character inside the brackets | "[abc]" matches "a", "b" and "c" |
| {x} | the preceding character or subexpression must occur exactly x times | "[0-9]{3}-[0-9]{4}" matches a 7-digit phone number such as "555-1234" |
| {x,} | the preceding character or subexpression must occur at least x times | "x{3,}" matches 3 or more consecutive occurrences of "x" |
| {x,y} | the preceding character or subexpression must occur at least x times, but no more than y times | "[0-9]{4,6}" matches any sequence of 4, 5 or 6 digits |
| ( ) | indicates a match group (backreference) for later reuse; captured groups are stored in the order they are encountered, from left to right. In Zan Image Printer, "[%0]" is the substring that matched the entire pattern, "[%1]" is the substring that matched the pattern enclosed in the first set of parentheses, and so on. | "www\.([a-z]+)\.com" matches "www.mycompany.com" and sets the first backreference, "[%1]", to "mycompany" |
| \d | matches a digit character | "\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}" matches any dotted-quad IPv4 address, although it does not check that each number is 255 or less |
| \D | matches a non-digit character | "a\Dz" matches "abz", "aTz", but not "a2z" |
| \s | matches a whitespace character | "a\sz" matches any three-character string that starts with "a", ends with "z", and has a whitespace character (such as a space or tab) in between |
| \S | matches a non-whitespace character | "a\Sz" matches any three-character string that starts with "a", ends with "z", and has a non-whitespace character in between |
| \b | matches a word boundary | "\bcomput" matches "computer", "computing", but not "supercomputer" |
| \B | matches a nonword boundary | "\Bcomput" matches "supercomputer", but not "computing" |
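
Several of the examples in the table can be checked in any engine that follows the same conventions. The sketch below uses Python's `re` module as a stand-in. It is an assumption that Zan Image Printer's engine behaves the same way for these constructs, and `found` is a hypothetical helper introduced only for this example. Python exposes captured groups through `group(0)` and `group(1)` rather than "[%0]" and "[%1]".

```python
import re

def found(pattern: str, text: str) -> bool:
    """Return True if pattern matches anywhere in text (hypothetical helper)."""
    return re.search(pattern, text) is not None

# "." matches exactly one character.
dot_hits = [w for w in ["rat", "rut", "r t", "root"] if found(r"r.t", w)]

# "+" means one or more; "\+" is a literal plus sign.
plus_hits = [w for w in ["abc", "abbc", "ac"] if found(r"ab+c", w)]
literal_plus = found(r"[0-9]\+", "7+") and not found(r"[0-9]\+", "77")

# {x} repeats the preceding item exactly x times.
phone = found(r"[0-9]{3}-[0-9]{4}", "555-1234")

# \D matches any non-digit character.
non_digit = [w for w in ["abz", "aTz", "a2z"] if found(r"a\Dz", w)]

# Parentheses capture a group; group(0) and group(1) play the role of
# "[%0]" and "[%1]" in Zan Image Printer.
m = re.search(r"www\.([a-z]+)\.com", "www.mycompany.com")
whole, first = m.group(0), m.group(1)

print(dot_hits)      # ['rat', 'rut', 'r t']
print(plus_hits)     # ['abc', 'abbc']
print(literal_plus)  # True
print(phone)         # True
print(non_digit)     # ['abz', 'aTz']
print(whole, first)  # www.mycompany.com mycompany
```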