“Data validation is a common chore in programming any user interface. The Java language’s regular-expression support can make data validation easier. You can define a regular expression that describes valid data and then let the Java runtime see if it matches. But certain types of data have different formats in different locales. The ResourceBundle class lets you work with locale-specific data in an elegant way. This article shows how to combine the two techniques to solve a common data-entry problem.”
Validating using regexes seems like a risky method. It is quite easy to make a mistake that still compiles.
Much better to use explicit logic. E.g., if a string must be at most 5 chars long:
if (the_string.length() > 5)
    the_string = the_string.substring(0, 5);
Rather than:
the_string = the_string.regex("^(.{,5}).*$");
or something…
They can become really long, complicated, and unreadable.
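For reference, here is a minimal Java sketch of the two approaches being compared. The class and method names are made up for illustration. Note that Java's regex syntax does not accept "{,5}"; the lower bound has to be spelled out as "{0,5}".

```java
class TruncateDemo {
    // Explicit logic: keep at most the first five characters.
    static String truncateExplicit(String s) {
        return s.length() > 5 ? s.substring(0, 5) : s;
    }

    // Regex version: capture up to five leading characters and drop the rest.
    // The (?s) flag lets '.' match line terminators as well.
    static String truncateRegex(String s) {
        return s.replaceAll("(?s)^(.{0,5}).*$", "$1");
    }

    public static void main(String[] args) {
        System.out.println(truncateExplicit("abcdefgh"));
        System.out.println(truncateRegex("abcdefgh"));
    }
}
```

Both methods return the same result for the same input; the difference is only in how easy each one is to read and to get wrong.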
the_string = the_string.regex("^(.{,5}).*$");
How about
the_string.regex(".{5,}");
No, because “.{5}” (the extra comma destroys the notion of “exactly 5 chars”) would only specify that 5 characters occur somewhere in the string.
Using ^ and $ explicitly states the beginning and end of the string.
My bad. I misread it as “at least 5 characters.”
I do know my regex though
the_string = the_string.regex("(.{,5}?)");
should work though, unless there is something about regex in Java that I am missing.
You’re missing the beginning and end of string characters. “.{,5}” only states that the string contains up to 5 characters somewhere. A string with ten million characters meets that criteria.
You’re right. I’m not thinking today.
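On the question of what Java does here: whether the anchors are needed depends on which method you call. A small sketch, with hypothetical class and method names, using only java.util.regex: Matcher.find() looks for a match anywhere in the input, while Matcher.matches() (and String.matches()) requires the whole input to match, as if ^ and $ were present.

```java
import java.util.regex.Pattern;

class AnchorDemo {
    // Unanchored pattern: "zero to five of any character".
    private static final Pattern UP_TO_FIVE = Pattern.compile(".{0,5}");

    // find() succeeds if the pattern matches anywhere, so any string passes.
    static boolean foundSomewhere(String s) {
        return UP_TO_FIVE.matcher(s).find();
    }

    // matches() requires the entire input to match the pattern.
    static boolean wholeInputMatches(String s) {
        return UP_TO_FIVE.matcher(s).matches();
    }

    public static void main(String[] args) {
        String tenChars = "abcdefghij";
        System.out.println(foundSomewhere(tenChars));
        System.out.println(wholeInputMatches(tenChars));
        System.out.println(wholeInputMatches("abc"));
    }
}
```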
I understand what you’re saying, though I’d be seriously worried if someone checked string length using a regex!
However, once you get beyond trivial examples, regexes are very powerful tools – although they come with some responsibility.
For example, I have to deal with things like validation of postcodes, e-mail addresses, URLs, keywords, even price ranges. Having to write explicit checks can often lead to very large and complicated sets of statements. Large, complicated code is more likely to introduce bugs, reduces readability, and makes maintenance more time-consuming.
My colleagues and I have found that a well-written regex can provide a more readable, more maintainable and more efficient way to validate the kinds of data we are having to process. In some rare cases, it can be the only suitable way to express the validation code.
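As a rough illustration of what "well-written" can mean in Java, here is a sketch that uses Pattern.COMMENTS so the regex can carry its own explanation. The price-range format ("10-20" or "10.50 - 20.00") and all names are assumptions made up for the example, not anything from the article.

```java
import java.math.BigDecimal;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PriceRangeDemo {
    // COMMENTS mode ignores whitespace in the pattern and treats '#' as a comment start.
    private static final Pattern PRICE_RANGE = Pattern.compile(
        "(\\d+(?:\\.\\d{2})?)   # lower bound, optional two-digit cents\n" +
        "\\s*-\\s*              # dash separator, optional surrounding spaces\n" +
        "(\\d+(?:\\.\\d{2})?)   # upper bound, same format\n",
        Pattern.COMMENTS);

    // The regex checks the shape; explicit logic checks that lower <= upper.
    static boolean isValidRange(String text) {
        Matcher m = PRICE_RANGE.matcher(text);
        if (!m.matches()) {
            return false;
        }
        BigDecimal lower = new BigDecimal(m.group(1));
        BigDecimal upper = new BigDecimal(m.group(2));
        return lower.compareTo(upper) <= 0;
    }

    public static void main(String[] args) {
        System.out.println(isValidRange("10.50 - 20.00"));
        System.out.println(isValidRange("20-10"));
    }
}
```

The ordering check is the kind of rule that is awkward to express in a regex, so it stays in ordinary code.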
Validating using regexes seems like a risky method. It is quite easy to make a mistake that still compiles.
Yeah, but it is good for checking a lot of data quickly without having to run through it twice in your program. I can imagine situations where you would want to check the data as a whole first before starting to process it, because you don’t want to get stuck halfway through, for reasons of data integrity (I’m thinking banking applications and the like).
Of course you can still include extra checks using explicit logic in the actual processing code.
Also, an important part of the article you missed is the fact that the checks here vary depending on locale. I don’t think you want to write an ‘if’ in each of your checks for each locale, do you?
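To make the locale point concrete, here is a minimal sketch of one lookup replacing a chain of per-locale ‘if’ statements. The article uses ResourceBundle for this; the sketch substitutes a plain Map so it runs without external files. The class name and the two postal-code patterns are simplified assumptions, not complete validation rules for either country.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Pattern;

class LocalePostalCodes {
    // One pattern per locale; adding a locale means adding an entry, not a branch.
    private static final Map<Locale, Pattern> PATTERNS = new HashMap<>();
    static {
        PATTERNS.put(Locale.US, Pattern.compile("\\d{5}(-\\d{4})?"));
        PATTERNS.put(Locale.CANADA, Pattern.compile("[A-Z]\\d[A-Z] ?\\d[A-Z]\\d"));
    }

    static boolean isValid(String code, Locale locale) {
        Pattern pattern = PATTERNS.get(locale);
        if (pattern == null) {
            throw new IllegalArgumentException("No pattern for locale " + locale);
        }
        return pattern.matcher(code).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("90210", Locale.US));
        System.out.println(isValid("K1A 0B1", Locale.CANADA));
        System.out.println(isValid("K1A 0B1", Locale.US));
    }
}
```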
You have a problem.
You use regex to solve it.
You now have two problems.
Heh… was I the first person to misread that title as “Vandalize Data with Regular Expressions”? Guess I was scrolling a little too fast.
Truly, there are good uses for regular expressions, but I try to avoid using them for mission-critical data validation. IMHO, much better to think of your data entry attributes as types, and thus use class or data structure definitions to validate your entries. Of course, this still depends on your designing these properly, thinking in terms of domains (the set of all possible values for a type or variable).
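A small sketch of what the type-based approach might look like, assuming a made-up Percentage type whose domain is the integers 0 to 100. The point is that an invalid value can never exist as an instance, so code that receives a Percentage does not need to re-check it.

```java
final class Percentage {
    private final int value;

    // The constructor enforces the domain: only 0..100 is representable.
    Percentage(int value) {
        if (value < 0 || value > 100) {
            throw new IllegalArgumentException("Percentage out of range: " + value);
        }
        this.value = value;
    }

    int value() {
        return value;
    }

    // Parsing user input; Integer.parseInt throws NumberFormatException
    // (a subclass of IllegalArgumentException) on non-numeric text.
    static Percentage parse(String text) {
        return new Percentage(Integer.parseInt(text.trim()));
    }

    public static void main(String[] args) {
        System.out.println(Percentage.parse(" 42 ").value());
    }
}
```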
Edited 2005-12-27 22:58
I’ve used regular expressions for just such tasks….like validating email and IP addresses because their syntax fit certain patterns.
Good article….very practical (I’ve done these very things in enterprise applications). I just need to get more familiar with Eclipse…
Two benefits of regular expressions over hard-coding as shown above:
1. Writing the logic in the code can be very time-consuming and can take lots of code (both of which result in more errors).
2. You can’t easily externalize such code into a properties file like you can with a regular expression. And once it’s in a properties file (rather than the code), it’s easily changed, which is not the case when such logic is in the code.
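A sketch of the second point, assuming a hypothetical key "ipv4.pattern" and using a string in place of a real properties file so the example is self-contained. The pattern only checks the dotted shape of an IPv4 address, not that each number is at most 255. It uses [.] and [0-9] because backslashes would need to be doubled inside a properties file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;
import java.util.regex.Pattern;

class ExternalizedPattern {
    // Stand-in for the contents of a properties file (hypothetical key and value).
    static final String FILE_CONTENTS = "ipv4.pattern=[0-9]{1,3}([.][0-9]{1,3}){3}\n";

    // Loads the regex stored under the given key and compiles it.
    static Pattern loadPattern(String contents, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(contents));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String regex = props.getProperty(key);
        if (regex == null) {
            throw new IllegalArgumentException("Missing property: " + key);
        }
        return Pattern.compile(regex);
    }

    public static void main(String[] args) {
        Pattern ipv4 = loadPattern(FILE_CONTENTS, "ipv4.pattern");
        System.out.println(ipv4.matcher("192.168.0.1").matches());
        System.out.println(ipv4.matcher("192.168.0").matches());
    }
}
```

Changing the rule then means editing the properties entry rather than recompiling the code.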
PEOPLE ON ONSEWS DOT COM ARE STUPID I HATE YOU ALL