Sideway
output.to from Sideway
Draft for Information Only

Regular Expression Features

The features of patterns used in regular expressions can be grouped into several categories.

Literals

In a general sense, literals are characters, printable or unprintable, that stand for themselves. A literal used in a regular expression can be an alphanumeric character, an ASCII character, a character given by an octal, hexadecimal, or Unicode escape, or another escaped special character. A regular expression engine tries to match each individual literal against the searched string. Each literal matches exactly one character, so a run of literals in the expression matches the same run of characters, in the same order, in the searched string.
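The one-literal-to-one-character rule can be illustrated with a minimal sketch. It assumes Python's `re` module as the engine, which is a choice made here for illustration; the text above does not name a particular engine.

```python
import re

# Each literal in the pattern consumes exactly one character of the input.
pattern = re.compile(r"cat")

match = pattern.search("concatenate")
print(match.group())   # cat
print(match.span())    # (3, 6)

# A pattern of three literals always yields a three-character match.
print(len(match.group()) == len("cat"))   # True
```

The match starts at the first position where all three literals line up with three consecutive characters of the searched string.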

  • Alphanumeric characters match themselves literally, i.e. A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z, a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9.
  • \* matches the special metacharacter * literally.
  • \+ matches the special metacharacter + literally.
  • \? matches the special metacharacter ? literally.
  • \^ matches the special metacharacter ^ literally.
  • \$ matches the special metacharacter $ literally.
  • \. matches the special metacharacter . literally.
  • \| matches the special metacharacter | literally.
  • \{ matches the special metacharacter { literally.
  • \} matches the special metacharacter } literally.
  • \\ matches the special metacharacter \ literally.
  • \[ matches the special metacharacter [ literally.
  • \] matches the special metacharacter ] literally.
  • \( matches the special metacharacter ( literally.
  • \) matches the special metacharacter ) literally.
  • \f matches a nonprinting form feed character.
  • \n matches a nonprinting new line character.
  • \r matches a nonprinting carriage return character.
  • \t matches a nonprinting horizontal tab character.
  • \v matches a nonprinting vertical tab character.
  • \xn matches the ASCII character specified by n literally. n must be a hexadecimal value of exactly two digits, i.e. \x00-\xFF.
  • \cx matches the ASCII control character specified by x literally. x must be in the range A-Z or a-z, i.e. \cA-\cZ and \ca-\cz; otherwise c is treated as a literal "c" character, that is, a simple escaped character. The characters represented are equivalent to \x01-\x1A. A lowercase letter is equivalent to the corresponding uppercase letter.
  • \nml matches the extended ASCII character specified by nml literally, unless a backreference with that number exists. nml must be the octal value of an extended ASCII character, i.e. 0-377 or 000-377.
  • \un matches the Unicode character specified by n literally. n must be a hexadecimal value of exactly four digits, i.e. \u0000-\uFFFF.
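The escapes listed above can be tried out with a short sketch. It assumes Python's `re` module, which is not named in the text and differs from the list in one respect: it has no \cx escape, so the control character is computed by hand instead. The sample strings are made up for illustration.

```python
import re

# An escaped metacharacter matches itself; unescaped, "." matches any character.
text = "Total: $9.99 (approx.)"
escaped = re.search(r"\$9\.99 \(approx\.\)", text)
print(escaped.group())                      # $9.99 (approx.)

print(bool(re.search(r"9.99", "9x99")))     # True: "." acts as a wildcard
print(bool(re.search(r"9\.99", "9x99")))    # False: "\." needs a real dot

# Four spellings of the same literal "A" (decimal 65):
# plain, hexadecimal, octal, and Unicode.
spellings = [r"A", r"\x41", r"\101", r"\u0041"]
print([bool(re.fullmatch(p, "A")) for p in spellings])   # [True, True, True, True]

# Python's re has no \cx escape, so the equivalent character is computed:
# \cA and \ca both denote code 1, the letter's code with the high bits cleared.
ctrl_upper = chr(ord("A") & 0x1F)
ctrl_lower = chr(ord("a") & 0x1F)
print(ctrl_upper == ctrl_lower == "\x01")   # True
```

In Python a three-digit octal escape such as `\101` is always read as a character, while shorter numeric escapes that do not start with 0 are read as backreferences. Other engines resolve this ambiguity differently.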

The following table compares the literal forms of the extended ASCII characters; n/a stands for not applicable.

      literals
DEC Description of Unprintable Control Codes SYM char escaped char HEX \xn \cx OCT \nml UNICODE \un
0 Null char NUL     00   000 0000
1 Start of Heading SOH     01 A, a 001 0001
2 Start of Text STX     02 B,b 002 0002
3 End of Text ETX     03 C,c 003 0003
4 End of Transmission EOT     04 D,d 004 0004
5 Enquiry ENQ     05 E,e 005 0005
6 Acknowledgment ACK     06 F,f 006 0006
7 Bell BEL     07 G,g 007 0007
8 Back Space BS     08 H,h 010 0008
9 Horizontal Tab HT   \t 09 I,i 011 0009
10 Line Feed LF   \n 0A J,j 012 000A
11 Vertical Tab VT   \v 0B K,k 013 000B
12 Form Feed FF   \f 0C L,l 014 000C
13 Carriage Return CR   \r 0D M,m 015 000D
14 Shift Out / X-On SO     0E N,n 016 000E
15 Shift In / X-Off SI     0F O,o 017 000F
16 Data Line Escape DLE     10 P,p 020 0010
17 Device Control 1 (oft. XON) DC1     11 Q,q 021 0011
18 Device Control 2 DC2     12 R,r 022 0012
19 Device Control 3 (oft. XOFF) DC3     13 S,s 023 0013
20 Device Control 4 DC4     14 T,t 024 0014
21 Negative Acknowledgement NAK     15 U,u 025 0015
22 Synchronous Idle SYN     16 V,v 026 0016
23 End of Transmit Block ETB     17 W,w 027 0017
24 Cancel CAN     18 X,x 030 0018
25 End of Medium EM     19 Y,y 031 0019
26 Substitute SUB     1A Z,z 032 001A
27 Escape ESC     1B   033 001B
28 File Separator FS     1C   034 001C
29 Group Separator GS     1D   035 001D
30 Record Separator RS     1E   036 001E
31 Unit Separator US     1F   037 001F
DEC Description of Printable Codes except char 127 SYM char escaped char HEX \xn \cx OCT \nml UNICODE \un
32 Space       20   040 0020
33 Exclamation mark !     21   041 0021
34 Double quotes (or speech marks) "     22   042 0022
35 Number #     23   043 0023
36 Dollar $   \$ 24   044 0024
37 Percent sign %     25   045 0025
38 Ampersand &     26   046 0026
39 Single quote '     27   047 0027
40 Open parenthesis (or open bracket) (   \( 28   050 0028
41 Close parenthesis (or close bracket) )   \) 29   051 0029
42 Asterisk *   \* 2A   052 002A
43 Plus +   \+ 2B   053 002B
44 Comma ,     2C   054 002C
45 Hyphen -     2D   055 002D
46 Period, dot or full stop .   \. 2E   056 002E
47 Slash or divide /     2F   057 002F
48 Zero 0 0 n/a 30   060 0030
49 One 1 1 n/a 31   061 0031
50 Two 2 2 n/a 32   062 0032
51 Three 3 3 n/a 33   063 0033
52 Four 4 4 n/a 34   064 0034
53 Five 5 5 n/a 35   065 0035
54 Six 6 6 n/a 36   066 0036
55 Seven 7 7 n/a 37   067 0037
56 Eight 8 8 n/a 38   070 0038
57 Nine 9 9 n/a 39   071 0039
58 Colon :     3A   072 003A
59 Semicolon ;     3B   073 003B
60 Less than (or open angled bracket) <     3C   074 003C
61 Equals =     3D   075 003D
62 Greater than (or close angled bracket) >     3E   076 003E
63 Question mark ?   \? 3F   077 003F
64 At symbol @     40   100 0040
65 Uppercase A A A   41   101 0041
66 Uppercase B B B n/a 42   102 0042
67 Uppercase C C C   43   103 0043
68 Uppercase D D D n/a 44   104 0044
69 Uppercase E E E   45   105 0045
70 Uppercase F F F   46   106 0046
71 Uppercase G G G   47   107 0047
72 Uppercase H H H   48   110 0048
73 Uppercase I I I   49   111 0049
74 Uppercase J J J   4A   112 004A
75 Uppercase K K K   4B   113 004B
76 Uppercase L L L   4C   114 004C
77 Uppercase M M M   4D   115 004D
78 Uppercase N N N   4E   116 004E
79 Uppercase O O O   4F   117 004F
80 Uppercase P P P   50   120 0050
81 Uppercase Q Q Q   51   121 0051
82 Uppercase R R R   52   122 0052
83 Uppercase S S S n/a 53   123 0053
84 Uppercase T T T   54   124 0054
85 Uppercase U U U   55   125 0055
86 Uppercase V V V   56   126 0056
87 Uppercase W W W n/a 57   127 0057
88 Uppercase X X X   58   130 0058
89 Uppercase Y Y Y   59   131 0059
90 Uppercase Z Z Z   5A   132 005A
91 Opening bracket [   \[ 5B   133 005B
92 Backslash \   \\ 5C   134 005C
93 Closing bracket ]   \] 5D   135 005D
94 Caret - circumflex ^   \^ 5E   136 005E
95 Underscore _     5F   137 005F
96 Grave accent `     60   140 0060
97 Lowercase a a a   61   141 0061
98 Lowercase b b b n/a 62   142 0062
99 Lowercase c c c n/a 63   143 0063
100 Lowercase d d d n/a 64   144 0064
101 Lowercase e e e   65   145 0065
102 Lowercase f f f n/a 66   146 0066
103 Lowercase g g g   67   147 0067
104 Lowercase h h h   68   150 0068
105 Lowercase i i i   69   151 0069
106 Lowercase j j j   6A   152 006A
107 Lowercase k k k   6B   153 006B
108 Lowercase l l l   6C   154 006C
109 Lowercase m m m   6D   155 006D
110 Lowercase n n n n/a 6E   156 006E
111 Lowercase o o o   6F   157 006F
112 Lowercase p p p   70   160 0070
113 Lowercase q q q   71   161 0071
114 Lowercase r r r n/a 72   162 0072
115 Lowercase s s s n/a 73   163 0073
116 Lowercase t t t n/a 74   164 0074
117 Lowercase u u u n/a 75   165 0075
118 Lowercase v v v n/a 76   166 0076
119 Lowercase w w w n/a 77   167 0077
120 Lowercase x x x n/a 78   170 0078
121 Lowercase y y y   79   171 0079
122 Lowercase z z z   7A   172 007A
123 Opening brace {   \{ 7B   173 007B
124 Vertical bar |   \| 7C   174 007C
125 Closing brace }   \} 7D   175 007D
126 Equivalency sign - tilde ~     7E   176 007E
127 Delete     7F   177 007F
DEC Description of extended ASCII codes (Windows-1252 & ISO 8859-1) SYM char escaped char HEX \xn \cx OCT \nml UNICODE \un
128 Euro sign €     80   200 20AC
129         81   201  
130 Single low-9 quotation mark ‚     82   202 201A
131 Latin small letter f with hook ƒ     83   203 0192
132 Double low-9 quotation mark „     84   204 201E
133 Horizontal ellipsis …     85   205 2026
134 Dagger †     86   206 2020
135 Double dagger ‡     87   207 2021
136 Modifier letter circumflex accent ˆ     88   210 02C6
137 Per mille sign ‰     89   211 2030
138 Latin capital letter S with caron Š     8A   212 0160
139 Single left-pointing angle quotation mark ‹     8B   213 2039
140 Latin capital ligature OE Œ     8C   214 0152
141         8D   215  
142 Latin capital letter Z with caron Ž     8E   216 017D
143         8F   217  
144         90   220  
145 Left single quotation mark ‘     91   221 2018
146 Right single quotation mark ’     92   222 2019
147 Left double quotation mark “     93   223 201C
148 Right double quotation mark ”     94   224 201D
149 Bullet •     95   225 2022
150 En dash –     96   226 2013
151 Em dash —     97   227 2014
152 Small tilde ˜     98   230 02DC
153 Trade mark sign ™     99   231 2122
154 Latin small letter s with caron š     9A   232 0161
155 Single right-pointing angle quotation mark ›     9B   233 203A
156 Latin small ligature oe œ     9C   234 0153
157         9D   235  
158 Latin small letter z with caron ž     9E   236 017E
159 Latin capital letter Y with diaeresis Ÿ     9F   237 0178
160 Non-breaking space       A0   240 00A0
161 Inverted exclamation mark ¡     A1   241 00A1
162 Cent sign ¢     A2   242 00A2
163 Pound sign £     A3   243 00A3
164 Currency sign ¤     A4   244 00A4
165 Yen sign ¥     A5   245 00A5
166 Pipe, Broken vertical bar ¦     A6   246 00A6
167 Section sign §     A7   247 00A7
168 Spacing diaeresis - umlaut ¨     A8   250 00A8
169 Copyright sign ©     A9   251 00A9
170 Feminine ordinal indicator ª     AA   252 00AA
171 Left double angle quotes «     AB   253 00AB
172 Not sign ¬     AC   254 00AC
173 Soft hyphen ­     AD   255 00AD
174 Registered trade mark sign ®     AE   256 00AE
175 Spacing macron - overline ¯     AF   257 00AF
176 Degree sign °     B0   260 00B0
177 Plus-or-minus sign ±     B1   261 00B1
178 Superscript two - squared ²     B2   262 00B2
179 Superscript three - cubed ³     B3   263 00B3
180 Acute accent - spacing acute ´     B4   264 00B4
181 Micro sign µ     B5   265 00B5
182 Pilcrow sign - paragraph sign ¶     B6   266 00B6
183 Middle dot - Georgian comma ·     B7   267 00B7
184 Spacing cedilla ¸     B8   270 00B8
185 Superscript one ¹     B9   271 00B9
186 Masculine ordinal indicator º     BA   272 00BA
187 Right double angle quotes »     BB   273 00BB
188 Fraction one quarter ¼     BC   274 00BC
189 Fraction one half ½     BD   275 00BD
190 Fraction three quarters ¾     BE   276 00BE
191 Inverted question mark ¿     BF   277 00BF
192 Latin capital letter A with grave À     C0   300 00C0
193 Latin capital letter A with acute Á     C1   301 00C1
194 Latin capital letter A with circumflex Â     C2   302 00C2
195 Latin capital letter A with tilde Ã     C3   303 00C3
196 Latin capital letter A with diaeresis Ä     C4   304 00C4
197 Latin capital letter A with ring above Å     C5   305 00C5
198 Latin capital letter AE Æ     C6   306 00C6
199 Latin capital letter C with cedilla Ç     C7   307 00C7
200 Latin capital letter E with grave È     C8   310 00C8
201 Latin capital letter E with acute É     C9   311 00C9
202 Latin capital letter E with circumflex Ê     CA   312 00CA
203 Latin capital letter E with diaeresis Ë     CB   313 00CB
204 Latin capital letter I with grave Ì     CC   314 00CC
205 Latin capital letter I with acute Í     CD   315 00CD
206 Latin capital letter I with circumflex Î     CE   316 00CE
207 Latin capital letter I with diaeresis Ï     CF   317 00CF
208 Latin capital letter ETH Ð     D0   320 00D0
209 Latin capital letter N with tilde Ñ     D1   321 00D1
210 Latin capital letter O with grave Ò     D2   322 00D2
211 Latin capital letter O with acute Ó     D3   323 00D3
212 Latin capital letter O with circumflex Ô     D4   324 00D4
213 Latin capital letter O with tilde Õ     D5   325 00D5
214 Latin capital letter O with diaeresis Ö     D6   326 00D6
215 Multiplication sign ×     D7   327 00D7
216 Latin capital letter O with slash Ø     D8   330 00D8
217 Latin capital letter U with grave Ù     D9   331 00D9
218 Latin capital letter U with acute Ú     DA   332 00DA
219 Latin capital letter U with circumflex Û     DB   333 00DB
220 Latin capital letter U with diaeresis Ü     DC   334 00DC
221 Latin capital letter Y with acute Ý     DD   335 00DD
222 Latin capital letter THORN Þ     DE   336 00DE
223 Latin small letter sharp s - ess-zed ß     DF   337 00DF
224 Latin small letter a with grave à     E0   340 00E0
225 Latin small letter a with acute á     E1   341 00E1
226 Latin small letter a with circumflex â     E2   342 00E2
227 Latin small letter a with tilde ã     E3   343 00E3
228 Latin small letter a with diaeresis ä     E4   344 00E4
229 Latin small letter a with ring above å     E5   345 00E5
230 Latin small letter ae æ     E6   346 00E6
231 Latin small letter c with cedilla ç     E7   347 00E7
232 Latin small letter e with grave è     E8   350 00E8
233 Latin small letter e with acute é     E9   351 00E9
234 Latin small letter e with circumflex ê     EA   352 00EA
235 Latin small letter e with diaeresis ë     EB   353 00EB
236 Latin small letter i with grave ì     EC   354 00EC
237 Latin small letter i with acute í     ED   355 00ED
238 Latin small letter i with circumflex î     EE   356 00EE
239 Latin small letter i with diaeresis ï     EF   357 00EF
240 Latin small letter eth ð     F0   360 00F0
241 Latin small letter n with tilde ñ     F1   361 00F1
242 Latin small letter o with grave ò     F2   362 00F2
243 Latin small letter o with acute ó     F3   363 00F3
244 Latin small letter o with circumflex ô     F4   364 00F4
245 Latin small letter o with tilde õ     F5   365 00F5
246 Latin small letter o with diaeresis ö     F6   366 00F6
247 Division sign ÷     F7   367 00F7
248 Latin small letter o with slash ø     F8   370 00F8
249 Latin small letter u with grave ù     F9   371 00F9
250 Latin small letter u with acute ú     FA   372 00FA
251 Latin small letter u with circumflex û     FB   373 00FB
252 Latin small letter u with diaeresis ü     FC   374 00FC
253 Latin small letter y with acute ý     FD   375 00FD
254 Latin small letter thorn þ     FE   376 00FE
255 Latin small letter y with diaeresis ÿ     FF   377 00FF
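
The numeric columns of the table can be regenerated programmatically. The following is a minimal sketch, assuming Python; the helper names `escape_forms` and `cp1252_code_point` are hypothetical and introduced here only for illustration. The Unicode value for codes 128-159 is looked up through Python's built-in `cp1252` codec, since in that range the Windows-1252 code differs from the Unicode code point.

```python
def escape_forms(code):
    """Return the escape spellings of one table row (hypothetical helper)."""
    if not 0 <= code <= 0xFF:
        raise ValueError("code must be in the range 0-255")
    forms = {
        "hex": "\\x{:02X}".format(code),   # exactly two hexadecimal digits
        "oct": "\\{:03o}".format(code),    # three octal digits, 000-377
    }
    if 1 <= code <= 26:
        # Control codes 1-26 correspond to \cA-\cZ.
        forms["ctrl"] = "\\c" + chr(code + 64)
    return forms


def cp1252_code_point(code):
    """Return the four-digit Unicode value of a Windows-1252 code, or None."""
    try:
        char = bytes([code]).decode("cp1252")
    except UnicodeDecodeError:
        return None                        # unassigned, e.g. 129
    return "{:04X}".format(ord(char))


print(escape_forms(9))          # {'hex': '\\x09', 'oct': '\\011', 'ctrl': '\\cI'}
print(cp1252_code_point(128))   # 20AC
print(cp1252_code_point(129))   # None
print(cp1252_code_point(192))   # 00C0
```

Whether a given engine interprets \x80 as the Windows-1252 character or as the Unicode code point U+0080 depends on the engine and its encoding settings, so the sketch only reproduces the table and makes no claim about matching behaviour.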

©sideway

ID: 160800020 Last Updated: 8/14/2016 Revision: 0


Copyright © 2000-2024 Sideway . All rights reserved Disclaimers last modified on 06 September 2019