Tuesday, June 17, 2008

Regular expression

A regular expression is a set of characters that specifies a pattern describing some text. Regular expressions are used when you want to search for specific lines of text containing a particular pattern. A regular expression can match a specific word or string of characters, but it can also describe more general patterns: you can search for words of a certain size, or for words with four or more vowels that end with an "s".
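As a rough illustration of the two searches just mentioned, here is a minimal sketch using Python's standard re module. The sample text and variable names are made up for this example, and the patterns shown are only one of several ways to write such searches.

```python
import re

# Hypothetical sample text, made up for this illustration.
text = "expressions bananas sequoias rhythms"

# A word with four or more vowels that ends with "s":
#   \b                    start at a word boundary
#   (?:[a-z]*[aeiou]){4}  four vowels, each preceded by any lowercase letters
#   [a-z]*s\b             any remaining letters, then a final "s" ending the word
vowel_words = re.findall(r"\b(?:[a-z]*[aeiou]){4}[a-z]*s\b", text)

# Words of a certain size, here exactly seven letters.
seven_letter_words = re.findall(r"\b[a-z]{7}\b", text)

print(vowel_words)
print(seven_letter_words)
```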

There are three important parts to a regular expression: anchors, character sets, and modifiers. Anchors specify the position of the pattern in relation to a line of text: ^ indicates that what follows it must be at the beginning of the string, and $ indicates that what precedes it must be at the end of the string. Character sets match any one of a set of characters at a single position: [a-z] means any single lowercase letter, and [0-9] means any single digit from 0 to 9.
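The behaviour of anchors and character sets can be sketched in Python with the standard re module. The sample strings and variable names below are assumptions chosen only for illustration.

```python
import re

# ^ requires the match to begin at the start of the string.
a_first = re.search(r"^A", "ADB") is not None      # "A" is the first character
a_not_first = re.search(r"^A", "BAD") is not None  # "A" is present, but not first

# $ requires the match to finish at the end of the string.
b_last = re.search(r"B$", "ADB") is not None
b_not_last = re.search(r"B$", "BAD") is not None

# A character set matches exactly one character taken from the set.
lowercase_letters = re.findall(r"[a-z]", "Room 42b")
digits = re.findall(r"[0-9]", "Room 42b")

print(a_first, a_not_first, b_last, b_not_last)
print(lowercase_letters, digits)
```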

Modifiers specify how many times the previous character set is repeated: * means zero or more times, + means one or more times, and ? means zero or one time.
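In Python's re module, * allows zero or more repetitions, + one or more, and ? zero or one. The short sketch below compares the three on the same made-up candidate strings; the list and variable names are assumptions for illustration.

```python
import re

# Hypothetical candidate strings with zero, one and two "D" characters.
candidates = ["AB", "ADB", "ADDB"]

# re.fullmatch succeeds only if the whole string fits the pattern.
star = [s for s in candidates if re.fullmatch(r"AD*B", s)]      # zero or more "D"
plus = [s for s in candidates if re.fullmatch(r"AD+B", s)]      # one or more "D"
optional = [s for s in candidates if re.fullmatch(r"AD?B", s)]  # zero or one "D"

print(star)
print(plus)
print(optional)
```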

For example, ^AD*B$ matches a string that starts with ‘A’, ends with ‘B’, and contains zero or more ‘D’s and nothing else in between: AB, ADB, ADDB, ADDDB, ...
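The example pattern can be checked with a few lines of Python. This is only a sketch: the helper name matches and the test strings are hypothetical, chosen to show both strings that fit the pattern and strings that do not.

```python
import re

pattern = re.compile(r"^AD*B$")

def matches(s: str) -> bool:
    """Return True if the string fits ^AD*B$ (hypothetical helper)."""
    return pattern.search(s) is not None

# Strings listed in the example above.
accepted = [s for s in ["AB", "ADB", "ADDB", "ADDDB"] if matches(s)]

# Strings that break one of the rules: wrong ending, wrong start,
# extra trailing character, a letter other than "D" inside, wrong case.
rejected = [s for s in ["ADC", "XADB", "ADBB", "AXB", "adb"] if not matches(s)]

print(accepted)
print(rejected)
```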

Regular expressions are case sensitive by default; most engines offer an option to ignore case.
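A small Python sketch of case sensitivity follows. The re.IGNORECASE flag is Python's way of switching it off; other tools use their own options. The sample string is an assumption made for this example.

```python
import re

sample = "REGEX tutorial"  # hypothetical sample text

# By default the lowercase pattern does not match the uppercase text.
default_match = re.search(r"regex", sample)

# With re.IGNORECASE the same pattern matches regardless of case.
ignore_case_match = re.search(r"regex", sample, re.IGNORECASE)

print(default_match)
print(ignore_case_match.group())
```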

There are also some shorthand character classes:
\w - Matches a word character (a letter, a digit or an underscore)
\W - Matches a non-word character
\s - Matches a whitespace character, such as a space, a tab or a newline
\d - Matches a digit character
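The four classes listed above can be tried out in Python as follows. This is a minimal sketch; the sample strings and variable names are made up for illustration.

```python
import re

# \w+ matches runs of word characters (letters, digits, underscore).
words = re.findall(r"\w+", "hello, world_1!")

# \W matches each single character that is not a word character.
non_word = re.findall(r"\W", "a-b c")

# \s+ matches runs of whitespace, so it can be used to split on
# spaces, tabs and newlines alike.
fields = re.split(r"\s+", "one two\tthree\nfour")

# \d matches each single digit.
digits = re.findall(r"\d", "a1 b22")

print(words)
print(non_word)
print(fields)
print(digits)
```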

