Regular Expressions Tutorial

Regular Expressions use special (and, at first, somewhat confusing) codes to detect patterns in strings of text. For example, if you’re presenting your visitors with an HTML form to enter their details, you might have one field for their phone number. Now let’s face it: some site visitors are better at following instructions than others. Even if you put a little hint next to the text field indicating the required format of the phone number (e.g.: “(XXX) XXX-XXXX” for North American numbers), some people are going to get it wrong. Writing a script to check every character of the entered string to ensure that all the numbers are where they belong, with parentheses and a dash in just the right spots, would be a pretty tedious bit of code to write. And a telephone number is a relatively simple case! What if you had to check that a user had indeed entered an email address or, worse yet, a URL?

Regular Expressions provide a quick and easy way of matching a string to a pattern. In our phone number example, we could write a simple regular expression and use it to check – in one quick step – whether or not any given string is a properly formatted phone number. We’ll explore this example a little further, once we’ve taken care of a few technical details.

Regular Expressions can look fairly complex at times, but when it comes right down to it they’re actually just text strings themselves. The following, for example, is a regular expression that searches for the text “JavaScript” (without the quotes):
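As a minimal sketch, here is that pattern written as a JavaScript regular expression literal (the slash-delimited syntax is covered later in this tutorial). The variable names are illustrative assumptions, and test() is a standard RegExp method used here only to show whether a string matches.

```javascript
// A regular expression that looks for the literal text "JavaScript".
var jsRE = /JavaScript/;

// test() reports whether the pattern occurs anywhere in the string.
var foundJS = jsRE.test("I am learning JavaScript today"); // true
var foundJava = jsRE.test("I am learning Java today");     // false
```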


Not much to it, is there? Any string containing the text “JavaScript” is said to match this regular expression. Thus, this regular expression allows us to detect strings containing this particular string of text.

First of all, a caret (^) may be used to indicate the beginning of the string, while a dollar sign ($) is used to mark the end:
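A small sketch of how these two anchors might be used, assuming the literal text “JavaScript” as the pattern; the variable names are hypothetical.

```javascript
var startsWithJS = /^JavaScript/; // "JavaScript" must be at the very start
var endsWithJS = /JavaScript$/;   // "JavaScript" must be at the very end
var exactlyJS = /^JavaScript$/;   // the whole string must be "JavaScript"

var atStart = startsWithJS.test("JavaScript rules"); // true
var atEnd = endsWithJS.test("I love JavaScript");    // true
var whole = exactlyJS.test("JavaScript rules");      // false: extra text follows
```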

Obviously, you may sometimes want to use ^, $, or other special characters to represent the corresponding character in the search string rather than the special meaning implied by regular expression syntax. To remove the special meaning of a character, prefix it with a backslash:
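For instance, a pattern that looks for a literal dollar sign or caret might be sketched like this; the sample strings and variable names are assumptions made for illustration.

```javascript
var priceRE = /\$5/; // a literal "$" followed by "5"
var caretRE = /\^/;  // a literal "^"

var hasPrice = priceRE.test("It costs $5"); // true
var hasCaret = caretRE.test("2^8");         // true

// Without the backslash, $ means "end of string", so nothing can follow it.
var unescaped = /$5/.test("It costs $5");   // false
```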

Square brackets may be used to define a set of characters that may match. For example, the following regular expression will match any digit from 1 to 5 inclusive.
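One way to write such a set, as a sketch with a hypothetical variable name, is to list each allowed character between the brackets:

```javascript
// Matches any single character that is 1, 2, 3, 4, or 5.
var oneToFive = /[12345]/;

var hasThree = oneToFive.test("3"); // true
var hasSeven = oneToFive.test("7"); // false
```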

Ranges of numbers and letters may also be specified.
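A range is written with a dash between its first and last character. The patterns below are illustrative sketches rather than the tutorial's original examples.

```javascript
var digit1to5 = /[1-5]/;       // same set as [12345]
var lowercase = /[a-z]/;       // any lowercase letter
var alphanum = /[0-9a-zA-Z]/;  // any digit or letter, either case

var nineInRange = digit1to5.test("9"); // false
var upperQ = lowercase.test("Q");      // false: range is lowercase only
var anyQ = alphanum.test("Q");         // true
```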

By putting a ^ immediately following the opening square bracket, you can invert the set of characters, meaning the set will match any character not listed:
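As a sketch, an inverted set that matches anything other than a letter could look like this (the variable name is hypothetical):

```javascript
// Matches any single character that is NOT a letter.
var nonLetter = /[^a-zA-Z]/;

var lettersOnly = nonLetter.test("abc");  // false: every character is a letter
var withDigit = nonLetter.test("abc1");   // true: "1" is not a letter
```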

The characters ?, +, and * also have special meanings. Specifically,

? means “the preceding character is optional”,
+ means “one or more of the previous character”,
* means “zero or more of the previous character”.
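The three quantifiers above can be compared side by side. This is a minimal sketch with made-up sample patterns; the anchors ^ and $ force the whole string to match.

```javascript
var optionalU = /^colou?r$/; // "u" may appear once or not at all
var oneOrMore = /^a+$/;      // at least one "a"
var zeroOrMore = /^a*$/;     // any number of "a"s, including none

var american = optionalU.test("color");   // true
var british = optionalU.test("colour");   // true
var emptyPlus = oneOrMore.test("");       // false: + needs at least one
var emptyStar = zeroOrMore.test("");      // true: * accepts zero
```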

Parentheses may be used to group strings together to apply ?, +, or * to them as a whole.
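To illustrate, here is a sketch in which + applies to the two-letter group “na” rather than to a single character; the pattern is an assumed example.

```javascript
// "ba" followed by one or more repetitions of "na".
var bananaRE = /^ba(na)+$/;

var banana = bananaRE.test("banana");     // true: "na" twice
var bananana = bananaRE.test("bananana"); // true: "na" three times
var justBa = bananaRE.test("ba");         // false: "na" must appear at least once
```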

Parentheses also let you define several strings that may match, using the pipe (|) character to separate them.
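The pattern ^(ba|na)+$, which this tutorial refers back to later, can be tried out as follows. The sample strings and variable names are assumptions.

```javascript
// One or more repetitions of either "ba" or "na", and nothing else.
var baNaRE = /^(ba|na)+$/;

var bananaOK = baNaRE.test("banana");   // true: ba + na + na
var nababaOK = baNaRE.test("nababa");   // true: na + ba + ba
var bandanaOK = baNaRE.test("bandana"); // false: "nd" is neither alternative
```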

Here are a few special codes that can be used for matching characters in regular expressions:
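The original list of codes is not reproduced here. As a hedged stand-in, the sketch below exercises a few codes that are standard in JavaScript regular expressions: \d (any digit), \w (any letter, digit, or underscore), \s (any whitespace character), \D (any non-digit), and . (any character except a line break).

```javascript
var timeRE = /^\d\d:\d\d$/; // two digits, a colon, two digits
var wordRE = /^\w+$/;       // one or more "word" characters
var spaceRE = /\s/;         // at least one whitespace character
var nonDigitRE = /\D/;      // at least one non-digit
var anyThenAt = /^.at$/;    // any one character followed by "at"

var isTime = timeRE.test("12:30");          // true
var isWord = wordRE.test("user_42");        // true
var hasSpace = spaceRE.test("two words");   // true
var hasNonDigit = nonDigitRE.test("12345"); // false
var isCat = anyThenAt.test("cat");          // true
```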

There are more special codes and syntax tricks for regular expressions, all of which should be covered in any complete reference. For now, we have more than enough for our purposes.

Using regular expressions in JavaScript is so easy that it’s a wonder more people don’t know that it can be done. You can create a regular expression in JavaScript as follows:

Where regexp is the regular expression code, as described above. For example, the following creates the first example regular expression I presented in the previous section, the one that detects the string “JavaScript”:
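A sketch of that statement, assuming the literal form in which the pattern sits between two forward slashes; the variable name myRE is a placeholder.

```javascript
// The pattern goes between a pair of forward slashes.
var myRE = /JavaScript/;

// The result is a RegExp object; its source property holds the pattern text.
var isRegExp = myRE instanceof RegExp; // true
var patternText = myRE.source;         // "JavaScript"
```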

Similarly, here’s how to create the last example:

By default, JavaScript regular expressions are case sensitive and only search for the first match in any given string. By adding the g (for global) and i (for ignore case) modifiers after the second /, you can make a regular expression search for all matches in the string and ignore case, respectively. Here are a few example regular expressions. For each, I’ve indicated what portion(s) of the string “test1 Test2 TEST3” they would match:
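The original examples are not reproduced here, so the following is a sketch of how the modifiers behave on that string. It uses the match() method described in the next section, and the chosen patterns are assumptions.

```javascript
var sample = "test1 Test2 TEST3";

// No modifiers: case sensitive, first match only.
var firstExact = sample.match(/test/)[0];  // "test" (from "test1")

// i only: ignores case, but still stops at the first match.
var firstAnyCase = sample.match(/TEST/i)[0]; // "test" (from "test1")

// g only: all matches, but still case sensitive.
var allExact = sample.match(/test/g);      // ["test"]

// g and i together: every match, regardless of case.
var allAnyCase = sample.match(/test/gi);   // ["test", "Test", "TEST"]
```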

Using a regular expression is easy. Every JavaScript variable containing a text string supports three methods (or functions, if you aren’t used to object-oriented terminology) for working with regular expressions: match(), replace(), and search().


match() takes a regular expression as a parameter and returns an array of what it found in the string under consideration: with the g modifier, the array holds all of the matching strings; without it, the first element of the array is the first match. If no matches are discovered, then match() returns null. Returning to our original example, let’s say that we wanted a function that can check that a string entered by the user as his or her phone number is of the form (XXX) XXX-XXXX. The following code would do the trick:
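Here is a sketch of such a function, reconstructed from the description in this tutorial. The names checkPhoneNumber, phoneNo, and phoneRE come from the text; the wording of the message and the optional warn callback are assumptions added so that the function also runs outside a browser, where alert() does not exist.

```javascript
// warn is a hypothetical optional callback used to report a problem;
// when it is omitted and alert() is available, alert() is used instead.
function checkPhoneNumber(phoneNo, warn) {
  // (XXX) XXX-XXXX, anchored to the start and end of the string.
  var phoneRE = /^\(\d\d\d\) \d\d\d-\d\d\d\d$/;

  // match() returns an array (truthy) on success and null (falsy) on failure.
  if (phoneNo.match(phoneRE)) {
    return true;
  }

  var message = "The phone number entered is invalid!";
  if (typeof warn === "function") {
    warn(message);
  } else if (typeof alert === "function") {
    alert(message);
  }
  return false;
}
```

Calling checkPhoneNumber("(555) 123-4567") returns true, while checkPhoneNumber("555-123-4567") reports the problem and returns false.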

As a first order of business, this function defines a regular expression. Let’s break it down to understand how it works. The regular expression begins with ^, to indicate that any match must begin at the start of the string. Next is \(, which will just match the opening parenthesis. We prefixed the character with a backslash to remove its special meaning in regular expression syntax (where it would mark the start of a group or a set of alternatives). As mentioned previously, \d is a special code that matches any digit; thus, \d\d\d matches any three digits. We could have written [0-9][0-9][0-9] to achieve the same effect, but this is shorter. The rest of the pattern should be pretty self-explanatory. \) matches the closing parenthesis, the space matches the space that must be left in the phone number, then \d\d\d-\d\d\d\d matches three digits, followed by a dash, followed by four more digits. Finally, the $ indicates that any match must end at the end of the string.

Incidentally, we could shorten this regular expression to the following, by using another shortcut that we did not mention above. If you can see how this works, you’re a natural!
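The shortened pattern itself is not reproduced here. One shortcut that fits the description, offered as an assumption rather than as the original, is the {n} quantifier, which means “exactly n of the preceding character”:

```javascript
// \d{3} stands for exactly three digits, \d{4} for exactly four.
var shortPhoneRE = /^\(\d{3}\) \d{3}-\d{4}$/;

var goodNumber = shortPhoneRE.test("(555) 123-4567"); // true
var tooShort = shortPhoneRE.test("(555) 123-456");    // false: only three final digits
```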

Our function then checks whether phoneNo.match(phoneRE) evaluates to true or false. In other words, it checks whether or not the string contained in phoneNo matches our regular expression: a successful match returns an array, which JavaScript treats as true in a condition, while a failed match returns null, which it treats as false. If a match is detected, our function returns true to certify that the string is indeed a phone number. If not, a message is displayed warning of the problem and the function returns false.

The most common use for this type of function is in validating user input to a form before allowing it to be submitted. By calling our function in the onSubmit event handler for the form, we can prevent the form from being submitted if the information entered is not properly formatted. Here’s a simple example demonstrating the use of our checkPhoneNumber() function:

The user will be unable to submit this form unless a properly formatted phone number has been entered. Any attempt to do so will produce the error message generated by our checkPhoneNumber() function.


As its name would suggest, replace() lets you replace matches to a given regular expression with some new string. Let’s say you were a spelling nut and wanted to enforce the old adage “I before E, except after C” to correct such misspellings as “acheive” and “cieling”. What we’d need is a function that takes a string and performs two search-and-replace operations. The first would replace “cie” with “cei”. Here’s the code:
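A sketch of that first replacement, assuming the text to correct is held in a variable named theString (a hypothetical name):

```javascript
var theString = "The cieling needs paint.";

// Replace every "cie" with "cei", ignoring case.
theString = theString.replace(/cie/gi, "cei");
// theString is now "The ceiling needs paint."
```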

Simple enough, right? The first parameter is the regular expression that we’re searching for (notice that we’ve set it to “ignore case” and to be “global” so that it finds all occurrences, not just the first), and the second parameter is the string that we want to replace any matches with.

The second replacement is a little more complicated. We want to replace “xei” with “xie” where ‘x’ is any letter except ‘c’. The regular expression to detect instances of “xei” is fairly easy to understand:
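Going by the description that follows, the pattern might be sketched like this. The match() calls only demonstrate what it detects, and the variable names are assumptions.

```javascript
// Any letter except "c", followed by "ei"; global and case-insensitive.
var xeiRE = /[abd-z]ei/gi;

var inAcheive = "acheive".match(xeiRE); // ["hei"]
var inReceive = "receive".match(xeiRE); // null: the "ei" follows a "c"
```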

This just detects any letter except ‘c’ (‘a’, ‘b’, and ‘d’ to ‘z’ inclusive), followed by “ei”, and does it in a global, case-insensitive manner.

The complexity comes in defining our replacement string. Obviously, we want to replace the match with “xie”, but the difficulty comes in writing the ‘x’. Remember, we have to replace ‘x’ with whatever letter appears in the matching string. To do this, we need to learn a new trick.

Earlier on, I showed you how parentheses could be used to define a set of alternatives in a regular expression (e.g. ^(ba|na)+$). Well, as it turns out, parentheses have another meaning, too. They let us “remember” part of a match, so that we can use it in the replacement string. In this case, we want to remember the portion of the match that corresponds to the [abd-z] in the regular expression. Thus, we surround it with parentheses:

Now, when specifying the replacement string, we put $1 where we want to insert the portion of the string corresponding to the parenthesised portion of the regular expression. Thus, the code for performing the required substitutions is as follows:
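A sketch of that substitution, with the [abd-z] portion wrapped in parentheses so that $1 can refer to it; the variable names are hypothetical.

```javascript
var misspelled = "acheive";

// ([abd-z]) remembers the letter before "ei"; $1 puts it back before "ie".
var respelled = misspelled.replace(/([abd-z])ei/gi, "$1ie");
// respelled is "achieve"
```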

To sum it up, here’s the complete function for performing our auto-correction:
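Putting the two replacements together gives something like the sketch below. The function name autoCorrect and the parameter name theString are assumptions, since the original listing is not reproduced here.

```javascript
// Applies "I before E, except after C" to a string and returns the result.
function autoCorrect(theString) {
  // After "c", the order should be "ei".
  theString = theString.replace(/cie/gi, "cei");
  // After any other letter, the order should be "ie".
  theString = theString.replace(/([abd-z])ei/gi, "$1ie");
  return theString;
}

var fixedSentence = autoCorrect("I will acheive a high cieling");
// fixedSentence is "I will achieve a high ceiling"
```

Note that the rule is applied blindly: a correctly spelled word such as “weird” comes back as “wierd”.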

Before you go and use this function on your page, realize that there are exceptions to the “I before E except after C” rule. Weird, huh?


The search() function is similar to the well-known indexOf() function, except it takes a regular expression instead of a string. It then searches the string for the first match to the given regular expression and returns an integer indicating the position in the string (e.g. 0 if the match is at the start of the string, 9 if the match begins with the 10th character in the string). If no match is found, the function returns a value of -1.
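The three cases described above can be sketched as follows; the sample strings and variable names are assumptions.

```javascript
var atStartPos = "JavaScript rocks".search(/JavaScript/);    // 0
var tenthCharPos = "Learning JavaScript".search(/JavaScript/); // 9
var missingPos = "Learning Python".search(/JavaScript/);     // -1
```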

Summing it up, regular expressions are an invaluable tool for verifying user input. By taking advantage of support for regular expressions in JavaScript, much of that verification can be done in the browser, before the data is ever submitted. Catching mistakes there gives visitors immediate feedback and spares your Web server from processing submissions that are obviously malformed. Because JavaScript can be disabled or bypassed, though, your server-side scripts should still check the data they receive rather than assume that it is valid.
