A Grab Bag of Intermediate Perl Regex Tricks

Below we begin exploring some of the finer points of Perl’s syntax. The rabbit hole goes much further down than this, so don’t expect to have your mind blown. These are just a few tricks to help you start breaking out of the bottom 30% of the language.

First we count the number of occurrences of a character using the return value of the translate (tr) command. (I’d been doing that the hard way up ’til now.) We demonstrate how to turn off case sensitivity for selected parts of a regex. We show how to comment complex regexes with the /x option. Finally, we show a few pattern match variables and also how to use the /e option to replace patterns with bona fide Perl expressions. (Just for fun we play around with the ‘gee’ option.)

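### The translate (tr) and substitute (s) commands have return values
### that can occasionally be useful: tr returns the number of characters
### it matched, and s returns the number of substitutions it made.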
my $string = "ABABBCCAA";
my $count = ($string =~ tr/A/A/);
print "The number of A's that appear in \"$string\" is $count.\n";

my $count2 = ($string =~ s/B/X/g);
print "The s changed it to \"$string\" and returned $count2.\n";

### The match command returns a value, too, but in scalar context it is
### just 1 for success, not the number of matches.
my $count3 = ($string =~ m/C/);
print "Matching C's on \"$string\" I get $count3.\n";

### You can turn off case sensitivity for pieces of your regex
### instead of ignoring it for the whole thing.
### (Also, a failed match returns an empty string, not a zero.)
my $a = "John Jacob JiNgLeHeImEr Smith";
my $count4 = ($a =~ m/(?i:heimer) Smith/);
my $count5 = ($a =~ m/(?i:heimer) SMITH/);
print "The good one returned a $count4 and the bad one a $count5.\n";

### use the /x option to add whitespace and comments to your regex
my $count6 = ($a =~ m/(\w{5})   # a five letter word for \1
                      \s        # a space
                      (\w{12})  # a twelve letter word for \2
                     /x);
print "The count was $count6 and found \"$1 $2\".\n";

### The e option evaluates the right side of the substitution
### as an expression before doing the replacement.
my $text2 = "Hello world!";
$text2 =~ s/world/5 + 2 * 3/;
my $text3 = "Hello world!";
$text3 =~ s/world/5 + 2 * 3/e;
my $value = 3;
my $text4 = "Hello world!";
$text4 =~ s/world/5 + 2 * $value/e;
print "No e: '$text2' ...\nWith e: '$text3' ...\nWith e and variable: '$text4'\n";

### If our expression is stored in a scalar, we can use the
### ee option to force its evaluation after the interpolation.
my $text6 = "Hello world!";
my $expression = "5 + 2 * 3";
$text6 =~ s/world/$expression/ee;
print "This is what happened: '$text6'.\n";

### You can look at matching information with $`, $&, and $'
### A replacement evaluated with the e (or ee) option can execute code as well...
### and the pos function will tell you where you've left off.
### By combining these three features we can 'step through' exactly
### what a substitute command is doing:
my $text = "one two three four seventeen";
$text =~ s/\w{4}/print "before[$`] matched[$&] after[$'] position " . pos($text) . ".\n"/gee;

### Each e option after the first is equivalent to calling eval.
### A single e option by itself is a standard perl expression
### and not necessarily evil.
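
One detail the $count3 example can hide: in scalar context a match only reports success or failure, so it yields 1 even though the string contains two C's. The block below is a minimal sketch, not part of the original script, assuming you want the actual number of matches. It uses the common idiom of running the match with /g in list context and counting the resulting list; the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Sample data reused from the script above; the variable names below
# are hypothetical and chosen only for this sketch.
my $string = "ABABBCCAA";

# Scalar context: the match only reports success (1) or failure ("").
my $matched = ($string =~ m/C/);

# List context with /g returns every match. Assigning that list to an
# empty list, and then to a scalar, yields the number of matches.
my $c_count = () = $string =~ m/C/g;

# tr with an empty replacement list counts characters and leaves the
# string unmodified (same effect as tr/A/A/).
my $a_count = ($string =~ tr/A//);

# A failed match yields the empty string: false, but not the number 0.
my $failed = ($string =~ m/Z/);

print "matched=$matched c_count=$c_count a_count=$a_count failed='$failed'\n";
```

For single characters tr is usually the simpler choice; the list-context idiom is useful when the thing being counted is a real pattern rather than a fixed character.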