Archive for the ‘Closures’ Category

Learning Perl by Reading and Stealing Code from CPAN…

October 10, 2008

DadHacker has recently posted about interviewing people that simply can’t name the “most interesting code they’ve read.” At work I mostly look at mediocre accounting application code, but I tend to try to stick to cutting SQL and parsing report text files. That doesn’t lead to a lot of interesting code reading. Books give you the basics of syntax and concepts… but the pressures of the real world are very different. There’s definitely a hole in my education here, so it’s time to start surfing source code files instead of programming blogs. CPAN makes it easy. Let’s see what we can turn up today….

Starting with something simple that I’ve implemented many times myself, let’s look at the dice rolling code first….

Philip Newton’s Games::Dice is pretty nifty. His regex is broken down with the /x option to make it easy to understand. I like how he uses ‘||’ to default functions that don’t pull anything from the regex:

$sign = $2 || ”;

In the past I’ve used two or three lines to do the same thing. It might be cryptic to certain classes of non-Perl programmers, but it seems to be typical of more idiomatic Perl. Another thing you see him do with his function variables is declare them all at the top. (I tend to declare them willy nilly….) Finally, you see him explicitly returning undef if the argument criteria aren’t met. I’ll keep that one in my ditty bag….

David Cantrell’s Games::Dice::Advanced uses a pretty flexible object oriented style interface. He blithely tosses in a folding operation in there:

sub foldl {
  my($f, $z, @xs) = @_;
  $z = $f->($z, $_) foreach(@xs);
  return $z;
}

sub sum { foldl(sub { shift() + shift(); }, @_); }
print "This sum is ... " .  sum(1, 2, 3, 4) . "\n";

To make sure I understood what this one was doing I wrote a little function to illuminate what was happening:

sub visible_sum {
    foldl( sub { my $x = shift;
                 my $y = shift;
                 print "adding $x and $y\n";
                 return $x + $y; },
           @_);
}

visible_sum(1, 2, 3, 4);

Here’s a gratuitous attempt to emulate the style of SICP in Perl:

sub myfold {
    my($function, $first, @rest) = @_;
    return $first unless @rest;
    return myfold($function,
                  $function->($first, shift(@rest)),
                  @rest);
}

my $add = sub { $_[0] + $_[1]; };

print "myfold returns " . myfold( $add,
                                  1, 2, 3, 4)
                        . "\n";

Just messing around…. I like how you basically get first and rest functions for “free”. The anonymous function syntax of Perl is perfectly workable, but not very pretty. I checked through Higher Order Perl hoping to find something better than the shifting and the array references we used above, but was disappointed. Sigh. You can’t have it all.

I like David’s idea about making reusable dice objects. Without taking a lot of time to try to read it, I’m not going to get around to understanding his code, though. I’m not too fond of ‘bless’ or Perl’s object system in general…. On the other hand, Philip’s code is running the regex to parse the dice text every single time you call his functions. Hmm….

OOP is overkill here. If ever there’s a time to trot out closures, this is it. In my apps, I’d probably steal Philip’s well-documented regex’s and roll them out with something like this:

sub roll_maker ($) {
    my($line, $num, $type);

    $line = shift;

    return undef unless $line =~ m{
                 ^      # beginning of line
                 (\d+)? # optional count in $1
                 [dD]   # 'd' for dice
                 (      # type of dice in $2:
                    \d+ # either one or more digits
                  |     # or
                    %   # a percent sign for d% = d100
                 )
              }x;       # whitespace allowed

    $num    = $1 || 1;
    $type   = $2;

    $type  = 100 if $type eq '%';

    return sub {
        my $result;
        for( 1 .. $num ) {
            $result += int (rand $type) + 1;
        }
        return $result;
    }
}

my $roller = roll_maker("3d6");
print "Roll 3d6: " . $roller->() . "\n";

Philip Newton’s code allows for some customization by taking on an expression to modify it. It seems strange to have to write an impromptu parser just for this. I think we should use the regex to validate what comes into the function, here… but this may be an acceptable use of eval. Just like the closure above saving the function definition for later, we’ll use eval just a tiny tiny bit here and save it’s results for later use, too. (This needs a couple of tweeks– I’m only handling + and * properly just yet…. This is just to show the idea…. Don’t have time to fix this, now….)

sub define_sub {
  my $text = shift;
  return eval "sub { $text }";
}

sub roll_expression_maker ($) {
    my($line, $dice_string, $sign, $offset, $sum, $roller, $exp);

    $line = shift;

    return undef unless $line =~ m{
                 ^              # beginning of line
                 (              # dice string in $1
                   (?:\d+)?     # optional count
                   [dD]         # 'd' for dice
                   (?:          # type of dice:
                      \d+       # either one or more digits
                    |           # or
                      %         # a percent sign for d% = d100
                   )
                 )
                 (?:            # grouping-only parens
                   ([-+xX*/bB]) # a + - * / b(est) in $2
                   (\d+)        # an offset in $3
                 )?             # both of those last are optional
              }x;               # whitespace allowed

    $dice_string = $1;
    $sign        = $2 || '';
    $offset      = $3 || 0;

    $sign        = lc $sign;

    $roller = roll_maker( $dice_string );
    return undef unless $roller;

    $exp = define_sub('my $arg = shift; return $arg ' . $sign . ' ' . $offset . ';');
    return undef unless $roller;
    
    return sub { $exp->($roller->()); };
}

my $funny = roll_expression_maker("3d6+20");
print "funny " . $funny->() . "\n";

If we’re calling these functions a lot with the same types of requests, then it would make sense to memoize them. That would cut back on the times we parse these regex’s and run the evil eval….

Roll-your-own OO system with Closures and a Dispatch Table

September 25, 2008

Okay… this isn’t an article or an essay… just more rambling notes from a random guy working on learning Perl. This is not a plea for help, so don’t feel like you have to jump in and rescue me or anything. I should develop my ideas further before posting, but I figure I’ll quit working on this if I “go dark” too soon.

I write these little programs and then I try to turn them into classes and try to instantiate them. Somehow, I don’t understand references, variable scope, and packages well enough to grok why I’m falling on my face here. (I remember a few years ago trying to read an early edition of Learning Perl and giving up pretty fast. Programming Perl was just plain frightening then. I can understand a lot more of it now, but I still get lost when I try to apply it.) I’m mad enough at the OO system that I’m willing to roll my own just to avoid the issue: I’d rather have a system that I understand completely than use one that’s much more robust but that does things that seem weird or wrong to me. Of course, I’d sacrifice some much-needed compiler help in the process, by there you go…. I have played with Moose, but my standard brain-dead OO use case that I want to roll with isn’t in their recipe list. It’s probably there somewhere, but I’ve used it enough that I know I need to understand more about regular Perl OO to use it—just so I can pick up the syntax and know what’s going on under the hood.

In any case… my view of objects may be shifting. I don’t know. I think an object should be completely malleable at runtime. I want to be able to add and remove properties and methods at will. Reflection should be drop-dead easy. Maybe I want to reinvent Smalltalk or something, I don’t know…. For some reason this just seems too hard to me in most OO systems. There’s probably some technical reason for this– or maybe I’m just lazy. Who cares. Meh.

In my last bit of Perl code, I had a table object glommed in with the rest of my code—it’s just crying to be abstracted out. I use dispatch tables all the time now, so a dispatch table class seems like a good idea. How many times do I need to cut and paste that code before I roll it into a module file?? And finally, we have the beginnings of the widget that I’m sketching out below. (Yep… totally done with closures—fun, fun, fun! For some reason I couldn’t figure out how to pass arguments to the dispatch table subs. I spent hours messing with it and I just don’t understand why I couldn’t. That’s why it’s using lame global variables in there for that….) At any rate, if I thought I could just pick up the Perl OO system and just run with it, I was wrong.

Time to sit down and RTFM– back to Programming Perl and Intermediate Perl for the time being…. Then we can revisit class definitions for data tables, widgets, dispatch tables, and my parser if I don’t come to my senses by then….

#!/usr/bin/perl
use warnings;
use strict;
my $show_debug = 0;

my @current_arguments;
my $current_stuff;

my %dispatch_data = (

    'Create Field'  => {
        code    => sub {  my ($field, $regex, $default) = @current_arguments;
                          $current_stuff->{$field} = create_field($regex, $default);
                          return $current_stuff->{$field};
                       },
        debug   => sub { print "Create Field called\n" },
    }

    );

my $dispatch_table = {};
while ( my ( $command, $dispatch ) = each %dispatch_data ) {
    $dispatch_table->{ $command } = sub {
        $dispatch->{ code  }->()  if exists $dispatch->{ code };
        $dispatch->{ debug }->()  if $show_debug;
    } if exists $dispatch->{ code } or ($show_debug and exists $dispatch->{ debug });
}

### return a closure that handles get and set methods for the field
sub create_field {
    my ($regex, $default) = @_;
    my $value;
    $value = $default if $default;
    return sub { my ($v) = @_;
                 if ($v) {
		     if ($regex) {
			 die "Value $v does not match |$regex|." unless ($v =~ /^$regex$/);
		     }
		     $value = $v;
                     return $value;
		 } else {
		     return $value;
		 }
               };
}

sub create_widget {
    my %stuff;
    return sub { my $command = shift;
                 if ($dispatch_table->{$command}) {
                     @current_arguments = @_;
                     $current_stuff = \%stuff;
                     return $dispatch_table->{$command}->();
                 } else {
                     my $thing = $stuff{$command};
                     die "No such thing '$command' in our stuff!" unless $thing;
                     my ($field, $value) = @_;
                     return $thing->($field, $value);
                 }
               };
}

my $f = create_field('\d+');
my $x = $f->(42);
$f->(77);
my $y = $f->();
print "x is $x and y is $y\n";

my $w = create_widget();
$w->('Create Field', 'Prop', '\d+', 43);
my $z = $w->('Prop');
$w->('Prop', 78);
my $a = $w->('Prop');
print "z is $z and a is $a";

using the sphincter-sigil to abuse Perl

September 23, 2008

(Sorry about the title. Don’t you hate it when you’ve got a syntax question and you don’t even know what to google to find out what it means…?)

draegtun dropped by recently to offer suggestions for tuning up some of my Perl code… and some of his code appeared to do some strange things. (Thanks for the help, as always….) Maybe it was something to do with closures and maybe it was operating directly on the internal function table of Perl. So I decided to mess with it some before rolling it out to my mini-project. I’m sure draegtun ‘s a nice guy and all, but I remember all to well the poor wannabe lisp hacker that picked up code for formatting his hard drive via some random use-group advice…. You just can’t be too careful…. ;) Yeah, this is silly… but this is what passes for understanding code snippets before plugging them in….

Anyways, flipping through the Camel book I quickly came across a similar example in the section on closures. I couldn’t find out what the weird * sigil meant even after flipping around in the index. Logically, Perl is a postmodern language. Therefore the * sigil means the same thing as it did in Kurt Vonnegut’s Breakfast of Champions. Obviously, applying the * to a scalar shoves its associated reference up Perl’s hind end. Perl’s DWIM features should cause it to psychically intuit which internal table you meant it to go to based on the type of reference you’re passing….  (Well, I figure if I can think like Larry, then learning Perl will be a lot easier.  That’s my theory, anyway.  Let’s see if this is right….)

#/usr/bin/perl
use strict;
use warnings;

sub hotwire {
  my $number = 42;
  my $string = "foo";
  my $subname = "blarney";
  no strict 'refs';
  *$subname = sub { print "Yes, it's $number and $string.\n"; };
}

hotwire();
my $number = 12;
my $string = "bar";
blarney();

Okay, this is fun…. Now, we know from the opening chapter of Intermediate Perl that you can use eval as a substitute for the type of error-catching try/catch block activity you might have seen in other languages. Of course, as in other languages, using eval for what it’s supposed to be used is generally stupid and a form of language abuse. However we feel it is our divine right to abuse whatever language we have at our disposal, so let’s just check to see that we can load evaluated text into our environment as a function:

sub loadme {
  my ($name, $code) = @_;
  no strict 'refs';
  *$name = eval $code;
}

loadme("noway", 'sub { print "yadda yadda yadda\n"; }');
noway();

Wow… it works! Woo-hoo! Okay…. Let’s try adding in our own syntactic sugar. We’ll create a function and we’ll search for a pattern and replace it with our function call:

sub check {
  my ($a, $b, $c) = @_;
  return "$a...$b...$c...";
}

sub loadme2 {
  my ($name, $code) = @_;
  $code =~ s/(\w+)~(\w+)~(\w+)/check('$1','$2','$3')/g;
  #print "$code\n";
  no strict 'refs';
  *$name = eval $code;
}

loadme2("yesway", 'sub { my $a = 99; print try~this~out, "\n", $a, "\n"; }');
yesway();

Dumb dumb dumb! There are surely better ways to accomplish this! But yeah, it’s nice to know that (in a pinch) we can do evil nasty things like this…. Let’s try one more variation that loads the text in from a file. Also, let’s store our reference in a scalar variable instead of hotwiring the package’s table of functions….

open TEXTFILE, "< loadthis";
my $file;
while (<TEXTFILE>) {
  $file .= $_;
}
close TEXTFILE;

loadme2("crazy", $file);
crazy();

sub loadme3 {
  my ($code) = @_;
  $code =~ s/(\w+)~(\w+)~(\w+)/check('$1','$2','$3')/g;
  return eval $code;
}

my $C = loadme3($file);
print "Now we'll try running the code from a scalar...\n";
$C->();

There we go!

Now… this is evil… this is wrong… this is bogus and brain dead. But but by marrying this to our table parsing code, we can create our own Frankenstein style scripting language. We already can create specialized data structures can initialize them with data parsed from a pseudo-DSL. We can customize these structures with perl code tacked on however we like. This can be basic event and override code like you see in GUI frameworks. I figure that the code is already interpreted for us into compiled-ish code when the references are read in, so you won’t see any real performance hit when you call these things– they should be indistinguishable from any other sub. Just look for things to slow down at startup as you add more and more of these to the script files. (Yeah, I know. I need to learn about hardware and assembly language– and I need to go pick up a comp sci degree, too– I’ll get around to it sometime. Right now I just want something that works….)

Back in Lisp we made the mistake of trying to create an all-new language that ran on top of Lisp. This was bad because we couldn’t take advantage of Lisp’s language features in our hacked/custom dsl-thing. (We should have gone with the grain of the language and modified Lisp slightly so that it became our lisp-ish DSL by itself.) In Blub, we hacked someone’s custom evaluator to go even further to craft our own language from scratch. This proved to be an unsustainable amount of work beyond a certain level of complexity. With Perl, we are compromising by creating special language-interfaces for initializing data structures… and then switching over to (maybe) slightly-abused Perl code where it suits. There’s better ways to do this, but this may be sufficient for a certain class of problems….

Probably what we’re doing is rolling our own Meta-object protocol. (After a fashion, anyway.) That’s not necessarily a complete waste of time: now maybe if we study it, we’ll have a use for it….

Reading On Lisp: Then and Now…

December 27, 2007

“In The Mythical Man-Month, Frederick Brooks proposed that the productivity of a group of programmers does not grow linearly with its size. As the size of the group increases, the productivity of individual programmers goes down. The experience of Lisp programming suggests a more cheerful way to phrase this law: as the size of the group decreases, the productivity of individual programmers goes up. A small group wins, relatively speaking, simply because it’s smaller.” (On Lisp, p 5.)

“Sophisticated users now demand so much from software that we can’t possibly anticipate all their needs. They themselves can’t anticipate all their needs. But if we can’t give them software which does everything they want right out of the box, we can give them software which is extensible. We transform our software from a mere program into a programming language, and advanced users can build upon it the extra features that they need.” (On Lisp, p 5.)

“So yes, reading a bottom-up program requires one to understand all the new operators defined by the author. But this will nearly always be less work than having to understand all the code that would have been required without them.” (On Lisp, p 59.)

I imagine that most people picking up On Lisp these days are mainly interested in two things: how do you use macros and what can lisp do that other languages can’t do? But after an inspiring introduction that deftly weaves in themes that are regularly discussed and debated even fifteen years after the book’s publication, the reader is faced with several dry chapters that focus on functions and utilities. To someone with little experience with functional programming, basic computer science, or extensive lisp coding, much of it will seem pointlessly tedious and pedantic.

I was just such a person last May just before starting this blog. I really wanted to skip ahead to the good stuff and savor Graham’s prose, but I was like Luke Skywalker cutting out on his training to go fight Darth Vader. (I was just about to get my face smashed by gigantic futuristic heavy things as I tried to wrap my mind around a lot of mind-blowing material.  Or something like that.) I strained myself through several chapters of On Lisp even though I understood little of it and finally exhausted myself. And yet, I have code that I wrote about that time that violates the spirit of everything Graham was trying to say! I was completely lost.

There’s really only a few core ideas that you need to know about, though. You need to know what functions can do inside and out before you can understand when to use macros. You need to know about recursion, iteration, accumulators, and mapping. You need to be able to reflexively see the skeleton of a function, know how to abstract it out, and know how create customizable functions that take functions for arguments. And you need to have coded enough of this stuff in Lisp such that you’ve done certain things over and over and over again. If you get to chapter 4 and don’t immediately think, “man, these utility functions are all stuff I wish was already built into the language!” then you probably haven’t done enough Lisp coding to appreciate the book. SICP covers all of these things and more. If you can work through the first two chapters of SICP, then you’ll be more than prepared for On Lisp.

Reading On Lisp now after working through parts of SICP and doing a few small practice projects, I can finally understand what it’s saying. It’s really satisfying to finally “get” what Paul says about closures. Here’s a brief summary of the points about closures he covers:

* Closures emerge from lexical scoping. (Lexical scoping forces the system to save copies of variables when functions are defined and then associate those copies with the functions.)

* Example 1: Create a function that uses a lambda function that references one of the original arguments. (Make an implicit “function factory”.)

* Example 2: Define two functions that share the same counter variable. (Eliminate the need for messy global variables.)

* Example 3: Create a function that returns a customized function based on the arguments. (Make an explicit “function factory”.)

* Example 4: Alter example three so that the customized functions can change their state after they’ve been created. (Use optional arguments on a closure to allow the users to change it’s behavior later– sort of like properties on an object.)

* Example 5: Create a function that returns a group of closures that all use the same data objects.

* Example 6 (Chapter 5): Create a function that takes a function as its argument and returns a function for its value. (In the examples, we get one for Complement and another for Composition. These functions are mainly used to simplify and refactor the core Common Lisp library. SICP‘s coverage of Newton’s Method is much more exciting and as a bonus, it will help you understand the famous fast inverse sqrt routine.)

* Example 7 (Chapter 6): Elegance and speed with closures: Instead of defining a data structure and then defining a set of functions to access it, you can define a “smart” data structure by using closures that keep their own state and reference other closures. The line between code and data can be further blurred here by writing routines to compile your closure-based data structures. “In Lisp we can sometimes use closures as a representation. Within a closure, variable bindings can store information, and can also play the role that pointers play in constructing complex data structures. By making a group of closures which share bindings, or can refer to one another, we can create hybrid objects which combine the advantages of data structures and programs.” (On Lisp, p 76.)

Looking at these examples together, it becomes clear that a typical OOP style emphasizes a top-down approach, while closures encourage a more finely grained bottom-up style. Keep in mind that the examples of closures here are not the final iterations that Graham wants you to take away from the book. In later sections he will improve the code using macros to increase the efficiency and generality of the samples.

What You Don’t Know About Closures Can Hurt You

October 25, 2007

Some of you out there may still be wondering why it is that closures and macros are so great. “What exactly can you do with them that I can’t do in my more widely used Turing-complete language,” you ask. That’s reasonable. Well, I don’t think you’ll fully appreciate closures until you grasp some of the basics of functional programming, so let’s start there.

Functional programming languages accord a first-class status to procedures:

* They may be named by variables.
* They may be passed as arguments to procedures.
* They may be returned as the results of procedures.
* They may be included in data structures.

In other words, you can treat procedures as primitives. In the same way that you operate on integers, juggle them, shuffle them, combine them, transform lists of them, do anything you want with them… a functional programmer does the exact same thing with procedures.

Now if you’ve never thought much about this or experimented with these ideas, this will sound bizarre. You might have a hard time imagining how this can possibly be so useful. The only way I can think of to describe it is to imagine that you lived in a world where there was no such thing as subroutines, yet– and then suddenly some experimental language came out that had that feature. People would not admit that it was particularly useful and they would point to their terrible hacks and workarounds as evidence that such “ivory tower” concepts were useless in the Real World. Of course, you’ve seen almost the exact same thing in your workplace already, more than likely. When you see code repeated across the code base and you abstract it out into a class that is referenced by multiple projects, suddenly your coworkers are peeved at you because they can’t understand a section of code that they’re looking at without hitting “right-click — Definition” to jump to your new class file. This is why the default code architecture for a crummy business app is to open up a hundred separate form files and freely mix, match, and repeat business logic, data access code, and gui instructions all together. It’s a wonder that these guys even use subroutines…. (Oh wait, you see that, too: gigantic subs that go on for screen after screen that you can’t understand….)

Anyways, when you get procedures being treated as first-class citizens of the language, new abstractions are possible. You can create procedures that take procedures as an argument and that return procedures as a value. This means you can take several similar functions and abstract out the basic process that they all share. The procedures that you use as arguments can be used customize as many other procedures as you like. So on the one hand, you get code reuse via the basic “skeleton” procedures… and on the other, you get opportunities to reuse the code that is being used to customize the “skeletons.” Anonymous functions can be used to in-line the customizations when they’re not going to be reused, so you end up with cleaner, more expressive code.

Before I looked much into it, functional programming sounded like a step backward. I thought that the whole idea of separating code and data out and encapsulating it into classes was the answer to everything. Of course, you don’t have to give up on classes altogether. You can use functional style techniques to expand the configurability of your objects. In some cases this might be a more appropriate choice than inheritance, for example.

Now, one thing I’ve noticed with my OOP designs is that I end up spending a lot of time tinkering with my interfaces. It seems like I spend more time fiddling with the properties and methods lists and avoiding breaking the Law of Demeter than I do actually working on real code. And then I go back and break apart classes and rework the object model some more. With first class procedures, you can often skip the tedious object instantiation ritual and the maintenance of a separate class file and just jump right to what you’re trying to do. And if you can apply your procedure to a wide variety of data structures, so much the better.

Along into this scene arrives the closure. Again, if you aren’t jumping up and down with excitement for what first class procedures can do for you, you won’t really appreciate what closures add in to the mix. A closure is just a procedure that carries an environment with it. It’s like a class that has just one method and as many private variables as you’d want to cram into it. If you’re entrenched in an OOP mindset, you might think that’s ridiculous– you can make such things in your current language, you think to yourself. But this is one of those cases where less is more… and when mixed with other functional techniques you get some powerful options at your disposal.

So a closure turns out to be just a procedure with state. It’s sort of like an extremely stripped down object. The first thing people tend to show off when they’re going over closures is how you can add a counter to a procedure. Not very exciting, eh? But yeah, you can actually use that simple counter to package all kinds of logging and diagnostic functionality with your procedures. You can make a procedure that takes a procedure as an argument and returns that exact same function with your diagnostic code bolted onto the inside of it. The rest of your code doesn’t care if you’re using the plain vanilla procedure or the one with the extra logging utilities running with it.

Another thing that attaching a counter to a procedure makes possible is the creation of iterators. You might have seen an article recently that talked about how one of these days you just might write your last for loop. This, my friends, is how you get rid of for loops: you write a procedure that takes a procedure and an integer as arguments and returns a closure as its value. Each time you call the procedure, the counter gets incremented and it returns the next value from the procedure. When the counter reaches the same value as the integer argument, the closure returns a null value. Then what you need is a means of applying your other procedures to the values that are kicked out of your iterators– and with that you are now set free from the tedium of cryptic for-next loop code.

Now you may be thinking that there’s nothing wrong with your trusty for-next loop. And if you’re an OOPy type of person, you might think that your for-each loop does the job just fine. But how would you say “for each prime number from 200 to 300, execute the following function with that number as an argument.” Functional programming gives you a very expressive way to code exactly that. Another situation where iterators are useful is where you’re working with infinite lists or loops that repeat back on themselves. You might not be able to store an entire list of objects in a collection all at once– but with iterators you can simulate that very thing. You probably already use something very similar to an iterator when you open up a text file and operate on it on a line by line basis.

It occurs to me as I write this, that there was a time when I’d write “do while not rs.eof” at least a dozen times a day. I’d crash my IDE at least once every week or so because I forgot to put in “rs.movenext” at the back end of loop. That redundant code peppered throughout the application was a major liability– especially when we ended up having to move to a different data access paradigm. Oh the pain! What we needed was an iterator that could be instantiated with a SQL string. Then we could have written procedures to operate on those iterators. No more annoying do-loops! As a special bonus, if we had been writing unit tests back then, we could have made dummy iterators that don’t even call a live database– they just ould have pulled in a stream of test data instead!

There’s a lot more we could delve into on this topic, but the point is that with even just a smidgen of functional programming techniques, we could have made our code much more concise. We would have had additional means of separating out redundant code, and it would have been much easier to “rewire” the code to change its behavior.

Now, I mentioned briefly already that in functional programming we’ll strive for making our procedures able to operate on a wide range of data structures. It turns out that the key to making your functions really powerful in a wide range of situations is to use recursion. In your Imperative/OOP world, you may have only used recursion in a few rare cases– like searching through directories or setting up a tree view. Well, in functional programming, the default approach to just about everything is to use recursion. Your data structures are often defined recursively, so your functions that operate on them are consequently recursive as well. Recursion becomes your meat and drink and even the air that you breath… and you become intimate with every variety of it.

And just as closures are always introduced with a lame “counter” example, so recursion is introduced with fibonacci numbers. This is a really good bad example, because it shows not only how recursion can be the easiest way to express something, but also that recursion can spin dangerously out of control if it is mishandled.

Anyways, as you study recursion, it often takes a lot of trial and error to get it right. You test your function with several arguments to make sure it’s coming out right. You add in code to print stuff out so that you can get a feel for what’s going on. You take out the text code when you think you’re done and you put it back in when you find a new problem you need to fix. There has to be a better way. Wouldn’t it be great if we could write code to help us write code? Sounds like a job for a macro!

What we’ve got for show and tell in today’s code example is a tool called memoization. When you memoize a procedure, you create a closure that contains the procedure and a hashtable to go with it. Each time you call it, the closure checks to see if the argument is already in the hash table. If it is, it returns what’s in the hash table instead of running the potentially time consuming code to determine the value in the usual way. If the argument is in the hashtable, the closure finds the value using your procedure and then stores it in the hash table. In the case of the typical fibonacci function, this results in a dramatic performance increase.

Looking at the code, we have two defuns, a defmacro, and a macro call. The testing-memoize function takes a function as an argument and returns a closure. The print statements we use to show us what our recursive function is doing are here. We also added a “back door” to allow us access to the hash table that is attached to the function. The other function, tree-substitutes, is a recursive function that searches a hierarchical list for a specific symbol and splices in a list of things in its place. The macro takes the same arguments as a defun, but sets up a parameter to hold the closure, defines the a slightly modified version of the function using the tree-substitutes function, stores the closure in the parameter, and then redefines the function to call the closure instead. Finally, we have an example of how to apply our memoization macro with our ubiquitous fibonacci procedure.

If we like the behavior of our recursive function, we can turn off the memoization by changing “memoize” to “defun” in our procedure definition. If we like the performance gains and want to keep them, we can create our closures with a different momoization technique that doesn’t print out diagnostic text or expose a back door.

So, we can see that closures give us a means of abstracting away any extraneous or temporary behavior in our procedures. These techniques don’t replace subroutines and libraries, but they do provide an additional means of eliminating redundant code. We also see that macros can be used to set up all kinds of behind-the-scenes code by giving us a means of establishing additional ways of defining functions. We could potentially set up “testing” and “production” variants of our definition macros to radically change the behavior of our application by switching between which package we reference.

Hopefully you now have a better idea of what you can gain from studying functional programming, closures and macros. We’ve only scratched the surface, of course. If you’d like to know more, I highly recommend the classic Structure and Interpretation of Computer Programs. If you’d rather learn from a cool Perl hacker (instead of a “smug Lisp weenie”) then you might try Mark Jason Dominus’s Higher Order Perl instead.

(By the way, if you take a peek at the hash table after running the fibonacci code, it makes a pretty picture for you at the REPL prompt.)

Closures, Objects, Forklifts, and the quest for truth

July 23, 2007

My first post on Closures has so far netted 2,235 hits. My second one took in 1,346 while my CodeMunger post has picked up a similar 1,338. The majority of this traffic has come in from Reddit with a smaller chunk coming in from YCombinator’s Reddit-like “Startup News” page.

I wrote these posts because the subject matter was interesting to me. I really had no idea that they’d be read this extensively. (At the time I wrote them I was planning on increasing my readership by doing a few more Emacs articles in order to pick up an extra 10 or 20 hits a day from Google!) The response was largely positive, due in no small part to the timing of Joel’s recent piece on the general “dribble of morbid, meaningless, thoughtless comments” that blog entries so often provoke. There was one guy, however, that was so angered by my overall stupidity (and the fact that I’ve introduced a lot of noise into the search engines) that he needed to go into graphic detail about the violent things he wanted to do to me. (Dude… I’m not the one that posted the article on Reddit. Sheesh.) I have felt pretty lame for posting my ideas to the web anonymously– it’s rather cowardly– but given my unanticipated infamy I think I’ll continue to hang back for a while.

At the same time, traffic such as this does not necessarily equate to readers. It certainly doesn’t mean thoughtful attention. Even Sean Ross’s balanced and well stated post seems to miss the things that I think are key to the discussion:

1) The entire point of the closure/OOP articles is to explore what Paul Graham meant when he wrote on page 2 of ANSI Common Lisp, “With macros, closures, and run-time typing, Lisp transcends object oriented programming.” So yes, I’m writing more from a explorative standpoint. I haven’t yet come to a conclusion about what I think about these things– but these first insights and ah-ha moments that I’m getting by pursuing this line of thought have been more fun than any other programming exercise I’ve ever undertaken.

2) And no, it’s not about closures vs. CLOS or anything like that: closures and objects appear to have an unusual tao-like relationship. And OOP itself is such a slippery concept that the idea that there is “one true implementation/approach” is… well… unrealistic. (And it’s pretty cool that you could, if you wanted to, implement in Lisp whatever vision of OOP you prefer. You don’t currently have the freedom to go the other way in the current reigning OOP languages.)

3) My 7 years of professional life have been predicated on the assumption that OOP should be the fundamental to all that you do in your programming labors. Some really smart people have done just fine without OOP… and see no need to incorporate it into their designs. How is it that folks like that are solving problems with closures and so forth? Are they imitating OOP techniques when they solve real problems? Or is it the other way around? Do advanced OOP programs tend toward having a veneer covering a hacked-out typeless stew of ugliness that belies the core premises built into the design of their languages?  What’s going on here?

And this brings me to Miklos Hollender’s quiet plea for sanity. Am I missing something important in my current quest for beauty and truth? Yes our tools might suck and our languages might hobble us… but shouldn’t I find a zen like contentment in my professional life in spite of all this? My answer to this is two part. One: if you are a professional programmer and you lack a fluency in the core concepts of real computer science, you ought to be doing something to remedy your ignorance. (And if ignorance doesn’t make you feel stupid, then there’s not much anyone can do for you.) Two: normal people in any other line of work would be angry if they were forced to do a job with markedly inferior tools. Programmers should not be content to carry water with pails and a yoke when forklifts and fifty gallon drums are available. While indeed there are many zen-like pleasures to be had in the simple things of life, maintaining badly architected code is not one of them. (At the same time, Miklos probably has a good point that meeting our clients’ needs with tools that are reasonably suitable for the task may not be “fulfilling”… but there’s no reason it can’t be a pretty cool job that we programmers execute well in spite of some of these “academic” issues.)

Nevertheless… there are questions that seem to have gotten lost in the curriculum and tool-kits and language wars. I, for one, would like to look into them. Of course… the next time I post something that difficult and controversial that I can expect to get page views in the thousands… well… I just might take the time to clean it up a little more. (And maybe avoid such shameless hyperbole, too.)

Closures + Lambda = The key to OOP in Lisp

July 18, 2007

As Truls Becken remarked in a comment on a previous post, “when it comes to encapsulation, the very foundation is that all state must be accessible only from inside the object’s methods.” On the one hand, it was loads of fun for us to be able to add methods to our “obi” on the fly at run-time. But on the other hand… those methods would have no clue about the state of the object. It seemed like a fatal blow to my approach. Maybe closures weren’t the key to home-grown object orientation in Lisp after all….

I puzzled over this for a while and tried to think about how I could solve this. I’ll never know if I could have figured it out on my own or not because I happened to pick up Peter Norvig’s Paradigms of Artifical Intelligence Programming last night. Chapter thirteen is all about objects and it just so happens that closures are the key to making things happen. Unlike Paul Graham, who began his OOP example from ANSI Common Lisp with inheritance (and multiple inheritance!), Norvig emphasizes encapsulation (via closures) as the cornerstone to his OOP design. If you ever wondered about the phrase “Lambda the Ultimate,” well… this is one example that illustrates why lambda really is the ultimate… uh… thing.

Here is a proof of concept of from the REPL prompt demonstrating that we’ve solved Truls’ problem:

CL-USER> (setf example (create-obi))
#<COMPILED-CLOSURE CREATE-OBI-1>
CL-USER> (setf setter (funcall example ‘set-size ‘() ‘()))
#<COMPILED-CLOSURE CREATE-OBI-1-1>
CL-USER> (funcall setter ‘huge)
HUGE
CL-USER> (setf shower (funcall example ‘show-size ‘() ‘()))
#<COMPILED-CLOSURE CREATE-OBI-1-2>
CL-USER> (funcall shower)
The size is HUGE
NIL
CL-USER> (funcall setter ‘extra-large)
EXTRA-LARGE
CL-USER> (funcall shower)
The size is EXTRA-LARGE

As you can see, the variables referenced in the set-size and show-size methods are the same… and they are protected from general access by the closure.

And here is my current code file. You’ll see there that I’ve been experimenting with reflection and a recursive approach to evaluating long chains of object references. Nothing earth-shattering, though. Yeah, I know… I still need to add class definitions and inheritance into it. And I know that I’ve got at least three conflicting strands of thought emerging there right now and that a lot will have to change in the coming iterations. If you really want to see how to do things right, go read Graham and Norvig.

But what I really wanted to say about all of this is that I’ve been puzzling over some of Paul Graham’s remarks for some time. How is it that closures can be so powerful? When I wrote my last post on closures and OOP… it had just come to me in a flash of insight that you could do stuff with closures. (It wasn’t really written to change the world or tell people how they should do things… and I was shocked that it was even read at all, much less taken seriously by anybody.)

And macros. Yeah… I could see that you can define looping constructs and so forth… and that such things couldn’t be done with a mere function. But my question always was, when will I ever want to do this? I hear lots and lots of people on the web droning on about their lisp-illumination, but I don’t see so much practical advice on how to get illuminated and what you can do with your new powers once you have them. I know what the Blub paradox is… but… how do you get out of it? Well for me it began when I realized that I could implement my own vision of OOP in Lisp if I wanted to. (That’s something that’s impossible in my professional toolset, now. And as a result, I think of that as a sort of thing that only ‘real’ programmers with hairy chests can ever even think about futzing with.) And as I first started experimenting, I saw that there was going to end up being a lot of repetitive code to get classes to work right. And then it sunk in… that I could write some code to write that code for me. And it would of course be macros that I would use to do that. And I could make things as elaborate as I wanted and all the work would be done at compile time… my constructs would hardly impact performance at run time.

And that leads inexorably to a third Ah-ha moment. With lambda, closures, and macros… I can build not just my own implementation of OOP… but I can build anything. I finally see what Alan Kay meant when he said, “Lisp isn’t a language, it’s a building material.” Lisp is amazing not because of what the Paul Grahams of the world do with it, but because it allows average programmers to solve problems that they thought were only solvable by geniuses.

So anyway… if you want to get illuminated… what do you do? Stay away from the web a bit. A lot of it will just confuse you unnecessarily. Stick with the very best books on Lisp… and tackle the hardest problems you can think of. You know, the one’s you’ve always felt were above your abilities. It will be like starting to learn programming all over again. But the sinking feeling in your stomach that you get at that realization will be offset by the fact that programming will again be as fun as it was when you were first learning.

If you are bored with your current set of tools, languages, and projects… that’s a sign that it’s time for you to start learning Lisp.

—————————————

Related Articles:

Closures + Hash Tables = As Much OOP as You’ll Ever Need

Closures, Objects, Forklifts, and the quest for truth

Closures + Hash Tables = As Much OOP as You’ll Ever Need

July 9, 2007

“In another thirty years people will laugh at anyone who tries to invent a language without closures, just as they’ll laugh now at anyone who tries to invent a language without recursion.” — Mark Jason Dominus

You might be wondering what all of the talk about closures is about, really. You might be scratching your head at the boring examples that just run a simple little counter inside a function. You might be wondering how functional programming can be more fundamental than (and even able to subsume) object oriented programming. You might be disappointed that Mark Jason Dominus hasn’t gotten his book up on-line, yet. You might be willing to settle for a quick dumb example in Common Lisp written by an average developer. If any of the previous guesses are right, you came to the right place.

In this text file, you’ll find a mere 16 13 lines of code that define a small chunk of OOP in Common Lisp. The code should not only be understandable to beginners, but it was actually written by a beginner– so it should also hopefully give an indication of how easy it is to build “new” fundamental-ish programming concepts onto Common Lisp.

An “obi,” as I define it here, is somewhat like an object, but there are some key differences. (I call my version of an object an “obi” because I’m implementing just a very small piece of OOP.) An “obi” is a hash table wrapped up in a closure. The closure provides one of the main pillars of OOP: encapsulation. Nothing in your program can mess with the variables inside of closure without interacting through the function call that the closure provides. The hash table provides a place to store all of your properties and methods. Because Lisp is dynamically typed and because functions are first-class objects, there’s really no need to make a difference between properties and methods. In an “obi,” functions are just one of the many information types that can be stored in the hash table.

In the example below, we see an obi being used from the REPL prompt. Its color and shape properties are set and also a function called add is put in and called. The “.” is used in Common Lisp to notate a dotted list, so I used ~ to represent “get”, ~> to represent the create/set a property/method, and ~! is used to represent a call to an obi’s function:

; SLIME: The Superior Lisp Interaction Mode for Emacs
CL-USER> (setf f (create-obi))
#<COMPILED-CLOSURE CREATE-OBI-1>
CL-USER> (~> f ‘color ‘red)
RED
CL-USER> (~> f ‘shape ‘square)
SQUARE
CL-USER> (~ f ‘color)
RED
T
CL-USER> (~ f ‘shape)
SQUARE
T
CL-USER> (~> f ‘add #’+)
#<SYSTEM-FUNCTION +>
CL-USER> (~! f ‘add ‘(1 2 3 4 5 6 7))
28

Obi’s lack any form of “type”– in fact, you can add additional properties and methods to them at any time. Even if you don’t intend to use Common Lisp in a production application, this is an option that should be in your mental tool box. You may not always need to resort to using inheritance or subclassing to create a variation of an object you defined. If you think a little bit more functionally, you can abstract out the bulk of the design and then allow other developers to customize its use by setting or overriding some of the key functions. Of course, in a wide-open typeless setting, that’s a feature you get whether you think you want it or not!

For a much more elegant and substantive example of how to implement OOP in Common Lisp with an insanely small amount of code, see Paul Graham’s Ansi Common Lisp.

PS If anyone knows how to trick WordPress into letting me display Lisp code without destroying the indenting, please let me know.

PPS Looking at the search engine traffic coming in here, I see that (besides all of the people trying to figure out carriage returns in Emacs) there are a few coming to the site for stuff like “lisp closures,” “oop in lisp” and “lisp hash tutorial.” Hopefully this post will provide a little more substance for those folks than the previous drivel I’ve spewed forth here in my clumsy efforts to get used to some of these new ideas.

PPPS Please forgive the hyperbole in the title.

——————————

Related Articles:

Closures + Lambda = The key to OOP in Lisp

Closures, Objects, Forklifts, and the quest for truth


Follow

Get every new post delivered to your Inbox.