Monday, 15 April 2013

Depending on the Java Matcher method you use, your regex may not get what you want

We had a system that you could configure with regexes to parse
incoming data. One of them was a pattern to look at a user's email,
and was configured like:


And it wasn't matching the values we had, even though the values were
of the form:

Then I found that the code we used to find a match was:


Now the javadoc for matches() says:

"Attempts to match the entire region against the pattern."

So if you look at the ENTIRE string, then

would never match.

If you use Matcher.find() instead, it looks for any substring that
matches the pattern, and so this would succeed.
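
To make the difference concrete, here is a minimal sketch. The pattern
and the email value are hypothetical stand-ins that I made up for
illustration, not the ones from our system; only Pattern.compile(),
Matcher.matches() and Matcher.find() are the real java.util.regex API.

```java
import java.util.regex.Pattern;

public class MatchesVsFind {
    // Hypothetical domain-only pattern, assumed for illustration.
    static final Pattern DOMAIN = Pattern.compile("@example\\.com");

    // matches(): true only if the ENTIRE input matches the pattern.
    static boolean wholeStringMatches(String input) {
        return DOMAIN.matcher(input).matches();
    }

    // find(): true if ANY substring of the input matches the pattern.
    static boolean containsMatch(String input) {
        return DOMAIN.matcher(input).find();
    }

    public static void main(String[] args) {
        String email = "alice@example.com"; // hypothetical value

        System.out.println(wholeStringMatches(email));          // false
        System.out.println(containsMatch(email));               // true
        System.out.println(wholeStringMatches("@example.com")); // true
    }
}
```

The first call fails because "alice" is left over once the pattern has
been applied, and matches() does not tolerate leftovers.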

The fix was to change the regex to


Yes, it also means that invalid username values will be matched, but
we have other filters that check whether the entire string is in
valid email format. All we care about is the domain, for this bit of
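
A fix of this kind might look roughly like the sketch below. The
regex is again a hypothetical one, assumed for illustration: a
leading ".*" soaks up whatever comes before the domain, so that
matches() can cover the whole string.

```java
import java.util.regex.Pattern;

public class WholeStringDomainPattern {
    // Hypothetical pattern: ".*" absorbs the username part, valid or not.
    static final Pattern DOMAIN = Pattern.compile(".*@example\\.com");

    // Uses matches(), so the pattern must account for the entire input.
    static boolean hasDomain(String input) {
        return DOMAIN.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(hasDomain("alice@example.com"));       // true
        System.out.println(hasDomain("not valid!!@example.com")); // true
        System.out.println(hasDomain("alice@example.org"));       // false
    }
}
```

The second call shows the trade-off: a nonsense username still passes,
so validating the rest of the address has to happen elsewhere.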

