6.11. Numbers with Thousand Separators

Problem

You want to match numbers that use the comma as the thousand separator and the dot as the decimal separator.

Solution

Mandatory integer and fraction:

^[0-9]{1,3}(,[0-9]{3})*\.[0-9]+$
Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby

Mandatory integer and optional fraction. Decimal dot must be omitted if the fraction is omitted.

^[0-9]{1,3}(,[0-9]{3})*(\.[0-9]+)?$
Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby

Optional integer and optional fraction. Decimal dot must be omitted if the fraction is omitted.

^([0-9]{1,3}(,[0-9]{3})*(\.[0-9]+)?|\.[0-9]+)$
Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby

The preceding regex, edited to find the number in a larger body of text:

\b[0-9]{1,3}(,[0-9]{3})*(\.[0-9]+)?\b|\.[0-9]+\b
Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
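
As a rough illustration of searching a larger body of text, here is a minimal Python sketch. It assumes the decimal dot is escaped as \. so that it matches only a literal dot, and that word boundaries (\b) keep a match from starting or ending in the middle of a longer run of digits. The names NUMBER_IN_TEXT and find_numbers are hypothetical and chosen only for this example.

```python
import re

# Escaped dot: matches a literal decimal point only.
# \b is assumed here so a match cannot begin or end inside a longer digit run.
# Non-capturing groups (?:...) are used because we only need the whole match.
NUMBER_IN_TEXT = re.compile(
    r"\b[0-9]{1,3}(?:,[0-9]{3})*(?:\.[0-9]+)?\b|\.[0-9]+\b"
)

def find_numbers(text):
    """Return every number with thousand separators found in text."""
    return [m.group(0) for m in NUMBER_IN_TEXT.finditer(text)]

if __name__ == "__main__":
    print(find_numbers("Sold 1,234,567.89 units at .5 each, total 12"))
```

With the word boundaries in place, an ungrouped string such as 12345 yields no match at all, rather than a partial match on its first three digits.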

Discussion

Since these are all regular expressions for matching floating-point numbers, they use the same techniques as the previous recipe. The only difference is that instead of simply matching the integer part with [0-9]+, we now use [0-9]{1,3}(,[0-9]{3})*. This regular expression matches between 1 and 3 digits, followed by zero or more groups that consist of a comma and 3 digits.
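
The behavior of the integer part can be checked in isolation. The following Python sketch is only an illustration; the names INTEGER_PART and is_grouped_integer are hypothetical, and re.fullmatch stands in for the ^ and $ anchors.

```python
import re

# One to three digits, then zero or more groups of a comma plus exactly
# three digits. The group is non-capturing since we only test for a match.
INTEGER_PART = re.compile(r"[0-9]{1,3}(?:,[0-9]{3})*")

def is_grouped_integer(s):
    """True if s is an integer with correctly placed thousand separators."""
    return INTEGER_PART.fullmatch(s) is not None

if __name__ == "__main__":
    for candidate in ["1", "999", "1,000", "12,345,678", "1234", "1,23", ",123"]:
        print(candidate, is_grouped_integer(candidate))
```

The first four candidates are accepted. The last three are rejected: 1234 has four digits without a separator, 1,23 has a group of only two digits, and ,123 has no leading digits.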

We cannot use [0-9]{0,3}(,[0-9]{3})* to make the integer part optional, because that would match numbers with a leading comma (e.g., ,123). It’s the same trap of making everything optional, explained in the previous recipe. To make the integer part optional, we don’t change the part of the regex for the integer, but instead make it optional in its entirety. The last two regexes in the solution do this using alternation. The regex for a mandatory integer and optional fraction is alternated with a regex that matches the fraction without the integer. That yields a regex where either the integer or the fraction can be omitted, but not both at the same time.
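
To see the trap concretely, the following Python sketch compares the tempting {0,3} variant with the alternation approach. It is an illustration only: the names TOO_LOOSE, OPTIONAL_INTEGER, and accepts are hypothetical, the decimal dot is escaped as \. so that it matches only a literal dot, and re.fullmatch stands in for the ^ and $ anchors.

```python
import re

# Flawed variant: {0,3} lets the leading digits vanish, so a string can
# start with a comma, and even the empty string is accepted.
TOO_LOOSE = re.compile(r"[0-9]{0,3}(?:,[0-9]{3})*(?:\.[0-9]+)?")

# Alternation: either a mandatory integer with an optional fraction,
# or a fraction on its own. At least one digit is always required.
OPTIONAL_INTEGER = re.compile(
    r"[0-9]{1,3}(?:,[0-9]{3})*(?:\.[0-9]+)?|\.[0-9]+"
)

def accepts(pattern, s):
    """True if the whole string s matches the compiled pattern."""
    return pattern.fullmatch(s) is not None

if __name__ == "__main__":
    for candidate in [",123", "", ".5", "1,234", "1,234.5", "1,234."]:
        print(repr(candidate),
              accepts(TOO_LOOSE, candidate),
              accepts(OPTIONAL_INTEGER, candidate))
```

The loose pattern accepts both ,123 and the empty string, while the alternation rejects them and still accepts .5, 1,234, and 1,234.5. Both reject 1,234. because a decimal dot must be followed by at least one digit.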

See Also

All the other recipes in this chapter show more ways of matching different kinds of numbers with a regular expression. Recipe 6.12 shows how you can add thousand separators to numbers that don’t have them.

Techniques used in the regular expressions in this recipe are discussed in Chapter 2. Recipe 2.1 explains which special characters need to be escaped. Recipe 2.3 explains character classes. Recipe 2.5 explains anchors. Recipe 2.6 explains word boundaries. Recipe 2.8 explains alternation. Recipe 2.9 explains grouping. Recipe 2.12 explains repetition.
