Validating fields

In this section, we will explore how to validate different types of fields manually, such as name, e-mail, website URL, and so on.

Matching a complete name

To get our feet wet, let's begin with a simple name field. It's something we have gone through briefly in the past, so it should give you an idea of how our system will work. The following code goes inside the script tags, but only after everything we have written so far:

function process_name() {
    var field = document.getElementById("name_field");
    var name = field.value;

    var name_pattern = /^(\S+) (\S*) ?\b(\S+)$/;

    if (name_pattern.test(name) === false) {
        alert("Name field is invalid");
        return false;
    }

    var res = name_pattern.exec(name);
    data.first_name = res[1];
    data.last_name = res[3];

    if (res[2].length > 0) {
        data.middle_name = res[2];
    }

    return true;
}

fns.push(process_name);

We get the name field in much the same way as we got the form; then we extract its value and test it against a pattern that matches a full name. If the name doesn't match the pattern, we alert the user and return false to let the form handler know that validation has failed. If the name is in the correct format, we set the corresponding fields on the data object (remember, the middle name is optional here). The last line adds this function to the array of functions, so it will be called when the form is submitted.
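The contract between a validator and the form handler can be sketched without a browser. This is only an illustration under assumptions: data and fns stand in for the objects set up earlier in the chapter, while run_validators and require_non_empty are hypothetical names introduced here, and the real handler may differ in its details.

```javascript
// Stand-ins for the shared state created earlier in the chapter.
var data = {};
var fns = [];

// Hypothetical validator factory: takes the value as an argument
// instead of reading it from the DOM, so it can run anywhere.
function require_non_empty(key, value) {
    return function () {
        if (value.trim() === "") {
            return false;          // tell the handler validation failed
        }
        data[key] = value;         // record the accepted value
        return true;
    };
}

// Hypothetical handler: call each validator in order and stop at
// the first one that returns false.
function run_validators(validators) {
    for (var i = 0; i < validators.length; i++) {
        if (validators[i]() === false) {
            return false;
        }
    }
    return true;
}

fns.push(require_non_empty("name", "Ada Lovelace"));
var ok = run_validators(fns);
console.log(ok, data.name);
```

Each validator only needs to return true or false and write to data on success, which is why new fields can be added by pushing one more function onto the array.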

The last thing required to get this working is to add HTML for this form field, so inside the form tags (right before the submit button), you can add this text input:

Name: <input type="text" id="name_field" /><br />

Open this page in your browser and test it by entering different values into the Name box. If you enter a valid name, you should get the data object printed out with the correct parameters; otherwise, you should see an alert with the message "Name field is invalid".

Understanding the complete name Regex

Let's go back to the regular expression used to match the name entered by a user:

/^(\S+) (\S*) ?\b(\S+)$/

The following is a brief explanation of the Regex:

  • The ^ character asserts its position at the beginning of the string
  • The first capturing group (\S+)
    • \S matches any non-whitespace character [^\r\n\t\f ]
    • The + quantifier repeats it between one and unlimited times
  • A literal space separates the first name from what follows
  • The second capturing group (\S*)
    • \S matches any non-whitespace character [^\r\n\t\f ]
    • The * quantifier repeats it between zero and unlimited times
  • " ?" matches the space character
    • The ? quantifier repeats it between zero and one time
  • \b asserts its position at a word boundary (^\w|\w$|\W\w|\w\W)
  • The third capturing group (\S+)
    • \S matches any non-whitespace character [^\r\n\t\f ]
    • The + quantifier repeats it between one and unlimited times
  • $ asserts its position at the end of the string
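To see the capturing groups in action, the following sketch runs the pattern against a few sample names. It assumes the pattern includes the \b word-boundary assertion described in the list above; split_name is a hypothetical helper introduced for illustration and is not part of the chapter's code.

```javascript
// The name pattern, including the word-boundary assertion.
var name_pattern = /^(\S+) (\S*) ?\b(\S+)$/;

// Hypothetical helper: returns the name parts, or null when the
// input does not look like "First [Middle] Last".
function split_name(full_name) {
    var res = name_pattern.exec(full_name);
    if (res === null) {
        return null;
    }
    var parts = { first_name: res[1], last_name: res[3] };
    if (res[2].length > 0) {
        parts.middle_name = res[2];   // the middle name is optional
    }
    return parts;
}

console.log(split_name("John Smith"));        // first_name "John", last_name "Smith"
console.log(split_name("John Quincy Smith")); // also has middle_name "Quincy"
console.log(split_name("Cher"));              // null: at least two words are required
```

Without \b, the greedy (\S*) would be free to take most of the last name and leave only its final letter for the third group, so "John Smith" would be split incorrectly.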

Matching an e-mail with Regex

The next type of field we may want to add is an e-mail field. E-mail addresses may look pretty simple at first glance, but there is a large variety of them out there. You may just think of creating a simple name@domain.suffix pattern, but the first section can contain many additional characters besides letters, the domain can be a subdomain, and the suffix can have multiple parts (such as .co.uk for the UK).

Our pattern will simply look for a first section made up of characters that are neither whitespace nor the @ symbol. We then want an @ symbol, followed by another set of characters containing at least one period, followed by the suffix, which may itself contain another suffix. This can be accomplished in the following manner:

/[^\s@]+@[^\s@.]+\.[^\s@]+/

Note

The pattern in our example is very simple and does not follow the official rules for e-mail addresses exactly. The official syntax for e-mail addresses is defined in a standard called RFC 5322. For more information, please read http://www.regular-expressions.info/email.html.

So, let's add the field to our page:

Email: <input type="text" id="email_field" /><br />

We can then add this function to verify it:

function process_email() {
    var field = document.getElementById("email_field");
    var email = field.value;

    var email_pattern = /^[^\s@]+@[^\s@.]+\.[^\s@]+$/;

    if (email_pattern.test(email) === false) {
        alert("Email is invalid");
        return false;
    }

    data.email = email;
    return true;
}

fns.push(process_email);

Note

There is an HTML5 field type specifically designed for e-mails, but here we are verifying manually, as this is a Regex book. For more information, please refer to http://www.w3.org/TR/html-markup/input.email.html.

Understanding the e-mail Regex

Let's go back to the regular expression used to match the e-mail address entered by the user:

/^[^\s@]+@[^\s@.]+\.[^\s@]+$/

The following is a brief explanation of the Regex:

  • ^ asserts its position at the beginning of the string
  • [^\s@]+ matches a single character that is not present in the following list:
    • \s matches any whitespace character [\r\n\t\f ]
    • @ matches the @ character literally
    • The + quantifier repeats it between one and unlimited times
  • @ matches the @ character literally
  • [^\s@.]+ matches a single character that is not present in the following list:
    • \s matches any whitespace character [\r\n\t\f ]
    • @ and . match the @ and . characters literally
    • The + quantifier repeats it between one and unlimited times
  • \. matches the . character literally
  • [^\s@]+ matches a single character that is not present in the following list:
    • \s matches any whitespace character [\r\n\t\f ]
    • @ matches the @ character literally
    • The + quantifier repeats it between one and unlimited times
  • $ asserts its position at the end of the string
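As a quick check, the sketch below runs the anchored pattern against a handful of made-up addresses and compares it with the unanchored version. The sample strings and the is_email helper are illustrative assumptions, not part of the chapter's code.

```javascript
// Anchored pattern, as used in process_email.
var email_pattern = /^[^\s@]+@[^\s@.]+\.[^\s@]+$/;
// The same pattern without ^ and $, for comparison.
var unanchored_pattern = /[^\s@]+@[^\s@.]+\.[^\s@]+/;

// Hypothetical helper wrapping the test call.
function is_email(value) {
    return email_pattern.test(value);
}

console.log(is_email("user@example.com"));              // true
console.log(is_email("first.last@mail.example.co.uk")); // true
console.log(is_email("user@localhost"));                // false: no period after the @
console.log(is_email("user@@example.com"));             // false: two @ symbols
console.log(is_email("user name@example.com"));         // false: contains a space

// Without anchors, an address buried inside other text still matches.
var sentence = "write to user@example.com today";
console.log(unanchored_pattern.test(sentence));          // true
console.log(email_pattern.test(sentence));               // false
```

The anchors are what turn the pattern from "contains something e-mail-like" into "is entirely an e-mail-like string", which is the behavior a form field needs.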

Matching a Twitter name

The next field we are going to add is for a Twitter username. For the unfamiliar, a Twitter username is in the @username format, but when people enter it, they sometimes include the leading @ symbol and sometimes write only the username by itself. Internally, we would like everything to be stored uniformly, so we will extract the username with or without the @ symbol and then prepend one ourselves; either way, the end result will look the same.

So again, let's add a field for this:

Twitter: <input type="text" id="twitter_field" /><br />

Now, let's write the function to handle it:

function process_twitter() {
    var field = document.getElementById("twitter_field");
    var username = field.value;

    var twitter_pattern = /^@?(\w+)$/;

    if (twitter_pattern.test(username) === false) {
        alert("Twitter username is invalid");
        return false;
    }

    var res = twitter_pattern.exec(username);
    data.twitter = "@" + res[1];
    return true;
}

fns.push(process_twitter);

If a user inputs the @ symbol, it will be ignored, as we will add it manually after checking the username.
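The normalization step can be tried outside the browser with a small sketch. The normalize_twitter helper below is a hypothetical name introduced for illustration; it applies the same pattern and prepends the @ symbol to the captured group.

```javascript
// Optional leading @, then one or more word characters.
var twitter_pattern = /^@?(\w+)$/;

// Hypothetical helper: returns the username in @username form,
// or null when the input is not a valid username.
function normalize_twitter(input) {
    var res = twitter_pattern.exec(input);
    if (res === null) {
        return null;
    }
    return "@" + res[1];   // res[1] never includes the @ symbol
}

console.log(normalize_twitter("@example_user")); // "@example_user"
console.log(normalize_twitter("example_user"));  // "@example_user"
console.log(normalize_twitter("@@example_user")); // null: only one @ is allowed
console.log(normalize_twitter("two words"));     // null: spaces are not word characters
```

Because the @ sits outside the capturing group, both input styles produce the same stored value.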

Understanding the Twitter username Regex

Let's go back to the regular expression used to match the Twitter username entered by the user:

/^@?(\w+)$/

This is a brief explanation of the Regex:

  • ^ asserts its position at the start of the string
  • @? matches the @ character literally
    • The ? quantifier repeats it between zero and one time
  • The first capturing group (\w+)
    • \w matches a word character [a-zA-Z0-9_]
    • The + quantifier repeats it between one and unlimited times
  • $ asserts its position at the end of the string

Matching passwords

Another popular field, which can have some unique constraints, is a password field. Not every password field is interesting; you may allow just about anything as a password, as long as the field isn't left blank. However, there are sites where you need to have at least one letter from each case, a number, and at least one other character. Considering all the ways these can be combined, a single pattern that validates all of this could be quite complex. A much better solution, and one that allows us to be more specific with our error messages, is to create four separate patterns and make sure the password matches each of them.

For the input, it's almost identical:

Password: <input type="password" id="password_field" /><br />

The process_password function is not very different from the previous examples, as we can see from its code:

function process_password() {
    var field = document.getElementById("password_field");
    var password = field.value;

    var contains_lowercase = /[a-z]/;
    var contains_uppercase = /[A-Z]/;
    var contains_number = /[0-9]/;
    var contains_other = /[^a-zA-Z0-9]/;

    if (contains_lowercase.test(password) === false) {
        alert("Password must include a lowercase letter");
        return false;
    }

    if (contains_uppercase.test(password) === false) {
        alert("Password must include an uppercase letter");
        return false;
    }

    if (contains_number.test(password) === false) {
        alert("Password must include a number");
        return false;
    }

    if (contains_other.test(password) === false) {
        alert("Password must include a non-alphanumeric character");
        return false;
    }

    data.password = password;
    return true;
}

fns.push(process_password);

All in all, you may say that this is a pretty basic validation and something we have already covered, but I think it's a great example of working smart as opposed to working hard. Sure, we probably could have created one long pattern that would check everything together, but it would be less clear and less flexible. So, by breaking it into smaller and more manageable validations, we were able to make clear patterns, and at the same time, improve their usability with more helpful alert messages.
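The same idea can also be written as a table of rules, which makes it possible to report every failed requirement at once. This is a sketch of one possible variation, not the chapter's implementation: password_rules and check_password are hypothetical names, and the combined lookahead pattern is included only to illustrate what "one long pattern" might look like.

```javascript
// Each rule pairs a small pattern with the message shown when it fails.
var password_rules = [
    { pattern: /[a-z]/, message: "Password must include a lowercase letter" },
    { pattern: /[A-Z]/, message: "Password must include an uppercase letter" },
    { pattern: /[0-9]/, message: "Password must include a number" },
    { pattern: /[^a-zA-Z0-9]/, message: "Password must include a non-alphanumeric character" }
];

// Hypothetical helper: returns the messages for all failed rules
// (an empty array means the password passed every check).
function check_password(password) {
    return password_rules
        .filter(function (rule) { return !rule.pattern.test(password); })
        .map(function (rule) { return rule.message; });
}

// For comparison: a single pattern using four lookaheads. It gives the
// same yes/no answer but cannot say which requirement was missed.
var combined_pattern = /^(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])(?=.*[^a-zA-Z0-9]).+$/;

console.log(check_password("Secret1!"));        // []
console.log(check_password("secret").length);   // 3
console.log(combined_pattern.test("Secret1!")); // true
console.log(combined_pattern.test("secret"));   // false
```

Adding a new requirement then means adding one entry to the array rather than editing a long expression.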

Matching URLs

Next, let's create a field for the user's website; the HTML for this field is:

Website: <input type="text" id="website_field" /><br />

A URL can have many different protocols, but for this example, let's restrict it to only http or https links. Next, we have the domain name with an optional subdomain, and we need to end it with a suffix. The suffix itself can be a single segment, such as .com, or it can have multiple segments, such as .co.uk.

All in all, our pattern looks similar to this:

/^(?:https?:\/\/)?\w+(?:\.\w+)?(?:\.[A-Z]{2,3})+$/i

Here, we are using multiple non-capturing groups, both for sections that are optional and for segments that we want to repeat. You may have also noticed the case-insensitive flag (i) at the end of the regular expression, as links can be written in lowercase or uppercase.

Now, we'll implement the actual function:

function process_website() {
    var field = document.getElementById("website_field");
    var website = field.value;

    var pattern = /^(?:https?:\/\/)?\w+(?:\.\w+)?(?:\.[A-Z]{2,3})+$/i;

    if (pattern.test(website) === false) {
        alert("Website is invalid");
        return false;
    }

    data.website = website;
    return true;
}

fns.push(process_website);

At this point, you should be pretty familiar with the process of adding fields to our form and adding a function to validate them. So, for our remaining examples let's shift our focus a bit from validating inputs to manipulating data.

Understanding the URL Regex

Let's go back to the regular expression used to match the website URL entered by the user:

/^(?:https?:\/\/)?\w+(?:\.\w+)?(?:\.[A-Z]{2,3})+$/i

This is a brief explanation of the Regex:

  • ^ asserts its position at the start of the string
  • (?:https?:\/\/)? is a non-capturing group
    • The ? quantifier repeats the group between zero and one time
    • http matches the http characters literally (case-insensitive)
    • s? matches the s character literally (case-insensitive)
      • The ? quantifier repeats it between zero and one time
    • : matches the : character literally
    • \/ matches the / character literally
    • \/ matches the / character literally
  • \w+ matches a word character [a-zA-Z0-9_]
    • The + quantifier repeats it between one and unlimited times
  • (?:\.\w+)? is a non-capturing group
    • The ? quantifier repeats the group between zero and one time
    • \. matches the . character literally
    • \w+ matches a word character [a-zA-Z0-9_]
      • The + quantifier repeats it between one and unlimited times
  • (?:\.[A-Z]{2,3})+ is a non-capturing group
    • The + quantifier repeats the group between one and unlimited times
    • \. matches the . character literally
    • [A-Z]{2,3} matches a single character present in this list
      • The {2,3} quantifier repeats it between 2 and 3 times
      • A-Z is a single character in the range between A and Z (case-insensitive)
  • $ asserts its position at the end of the string
  • The i modifier makes the match case-insensitive, meaning that letters in the pattern match both a-z and A-Z
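The following sketch runs the pattern against a few made-up URLs to show both what it accepts and where it is deliberately strict. The is_website helper and the sample strings are assumptions introduced for illustration.

```javascript
// Optional http:// or https://, a name, an optional second segment,
// then one or more two- or three-letter suffix segments.
var website_pattern = /^(?:https?:\/\/)?\w+(?:\.\w+)?(?:\.[A-Z]{2,3})+$/i;

// Hypothetical helper wrapping the test call.
function is_website(value) {
    return website_pattern.test(value);
}

console.log(is_website("http://example.com"));        // true
console.log(is_website("https://www.example.co.uk")); // true
console.log(is_website("example.com"));               // true: the protocol is optional
console.log(is_website("HTTP://EXAMPLE.COM"));        // true: thanks to the i flag

// Inputs this simple pattern rejects:
console.log(is_website("ftp://example.com"));         // false: only http and https
console.log(is_website("http://my-site.com"));        // false: \w does not include hyphens
console.log(is_website("http://example.com/about"));  // false: paths are not covered
console.log(is_website("example.info"));              // false: suffix longer than three letters
```

The last three rejections are limitations of this simple pattern rather than invalid URLs, so a production form would need a more permissive expression.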