Tuesday, October 26, 2010

Greasemonkey script for HiFi Regex evaluator

In my last post, I had mentioned that HiFi Regex Evaluator is great for playing around with regular expressions. Some of the UI features that I really like are:

  1. Uncluttered interface: Just like the Google search interface, this site has the regex input field right in the top center, focusing my attention.
  2. No submit button: After typing in a regular expression, I don't have to click a button to evaluate it on the text. It does that as I keep typing. Like Google Suggest.
  3. Color coding: The matches are split by capturing group, displayed separately, and color coded. So it's easy to see what each capturing group matched.
  4. Cheatsheet: A handy cheat sheet is embedded on the right side for quick reference. So, no switching tabs to refresh my memory about syntax.
With these features, it is a good fit for prototyping and debugging fairly long regular expressions. Almost like in a REPL session, you can change a small part of the regex, get feedback about the change, tweak again, and go on till your regex is ready. However, since I was debugging a huge regular expression (some of which was not JS-compliant regex syntax), I wanted the ability to evaluate partial chunks of it. Initially I started off by putting the big regex in a text file, copying and pasting chunks into HiFi Regex, and seeing their effect. However, switching back and forth between HiFi Regex and the text file was distracting. So, I decided to add on to the site's functionality using Greasemonkey.

My idea was to have an additional text field at the top into which I would paste the bigger regular expression, just once. When I selected a chunk of this original, the script should automatically transfer the chunk into the site's own HiFi Regex text field and trigger a keyup event (programmatically). This would eliminate the need to switch between apps for copying chunks. With Greasemonkey and jQuery this was easily done.
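The "selected chunk" part of that idea can be sketched roughly as below. This is only an illustration, not the code from my script: `selectedChunk` is a made-up helper name, and it assumes the extra field is an input or textarea, which expose `value`, `selectionStart` and `selectionEnd`.

```javascript
// Hypothetical helper (not taken from the actual user script):
// returns the part of a text field's value that is currently selected.
// `field` is assumed to be an <input> or <textarea>, or any object that
// exposes value, selectionStart and selectionEnd the same way.
function selectedChunk(field) {
  return field.value.substring(field.selectionStart, field.selectionEnd);
}

// Stand-in for a real field so the sketch can run outside a browser.
var fakeField = { value: '(foo|bar)+baz', selectionStart: 1, selectionEnd: 8 };
console.log(selectedChunk(fakeField)); // foo|bar
```

In a user script, something like this would be called from a mouseup or select handler on the extra field, and the result copied into the site's own regex field.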

However, during development, I learned that handling the DOM can be a bit tricky within Greasemonkey, since the DOM objects are wrapped in XPCNativeWrapper objects for security reasons. After reading the excellent articles on avoiding pitfalls, I understood that jQuery's trigger would not work as expected on an XPCNativeWrapper object and that I had to use the dispatchEvent method instead.
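To give a feel for the dispatchEvent approach, here is a minimal sketch. It is not the code from my script: `fireKeyup` is a made-up name, and the sketch uses the modern `Event` constructor so that it runs in current browsers and recent Node.js versions. Browsers of that era would have built the event with `document.createEvent` and `initEvent` instead.

```javascript
// Hypothetical helper (an illustration, not the script's actual code):
// fires a synthetic keyup on `target` through the native dispatchEvent
// method instead of jQuery's trigger.
function fireKeyup(target) {
  // Assumption: the modern Event constructor is available.
  var evt = new Event('keyup', { bubbles: true, cancelable: true });
  // dispatchEvent returns false only if a listener called preventDefault().
  return target.dispatchEvent(evt);
}

// A bare EventTarget stands in for the regex input field here.
var field = new EventTarget();
field.addEventListener('keyup', function () {
  console.log('keyup handler ran');
});
fireKeyup(field);
```

The listeners the page registered on its input field see this event much like a real key release, which is what makes the site re-evaluate the regex.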

If you want to give it a try, please install the Greasemonkey add-on for Firefox, install the script, and visit the HiFi Regex site. Please let me know if it helps you in regex construction.

Saturday, October 23, 2010

Regex strings in Javascript

This week at work, I was asked whether I could fix some form validation rules in response to the defects found during testing. I found out that whoever created the form had the good sense to include the jQuery and jQuery-validate libraries. So I volunteered for the job, thinking it was going to be a 5-minute, piece-of-cake job. But, gosh, it took almost 2 days of my time, thanks to some crazy regexes (I found out that .NET regular expressions cannot always be injected as-is into JS because of differences in syntax), changing requirements, and silly mistakes I made.

Speaking of silly mistakes, here's one I made. Here's a snippet from my regex, [0-9]+\W{2,5}, and this is what I was trying:

var str = '100%';
var reg = new RegExp('[0-9]+\W$');
alert(reg.test(str)); // should be true but I was getting false

The above code didn't work... My regex was way bigger than this, and I thought it was broken somewhere. So I spent a good part of an hour tearing the regex apart to see what was wrong. The online regex checker - http://www.gethifi.com/tools/regex - helped me a lot here. Although I found a few gaps in the regex, I finally got it working in the regex checker for all my test input. I was at my wit's end after wasting an hour. I checked the JS source of the checker and found the same RegExp object being used and instantiated in the exact same way. Next I fired up Firebug and tried a few console logging statements. I ended up no wiser than when I started, since there were no errors and all my logging statements were coming up alright.

I took a break and, as usual, hit a lucky break when I was back. I tried logging the RegExp object and, lo and behold, the backslash before the 'W' was missing. I realized that when the string literal was parsed, the \ was interpreted as an escape character, so it was no longer part of the pattern by the time the regular expression object was created. But when the regex checker read the same regex string from a text field, no string-literal parsing was involved, and that's why it was working there. To resolve it, we can either:

  1. Use the literal form for creating the regex
    var reg = /[0-9]+\W$/;
    
  2. Escape the backslash character
    var reg = new RegExp('[0-9]+\\W$');
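
Here is a small sketch that puts the broken version and the two fixes side by side. The variable names are made up for the illustration. Logging the `source` property shows the pattern each RegExp object actually ended up with.

```javascript
var str = '100%';

// Broken: inside a string literal, \W is not a recognized escape, so the
// backslash is dropped and the pattern becomes [0-9]+W$.
var unescaped = new RegExp('[0-9]+\W$');

// Fix 1: a regex literal is not parsed as a string, so \W survives.
var literal = /[0-9]+\W$/;

// Fix 2: a doubled backslash leaves one real backslash in the string.
var escaped = new RegExp('[0-9]+\\W$');

console.log(unescaped.source);    // [0-9]+W$
console.log(escaped.source);      // [0-9]+\W$
console.log(unescaped.test(str)); // false
console.log(literal.test(str));   // true
console.log(escaped.test(str));   // true
```

Logging `source` like this is the quickest check I know of when a pattern built from a string misbehaves.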