A little experimentation with dispatch tables…

The basic model of development that’s been my hobby for the past year has been to use a custom scripting language to configure an object model which is then consumed by a GUI or something– eliminating large amounts of maintenance and opening up a form of tool-oriented development that allows for an order of magnitude increase in code reuse. That was my big thing after getting halfway through SICP– and the trojan horse by which I could sneak in some of the benefits of Lisp into a more Blubby environment. Doing that sort of thing got me into some serious parsing, which made me realize I really needed to bone up on regular expressions. I messed around with sed and awk enough that I finally broke down and decided to learn Perl. Now I’m working through O’Reilly’s Intermediate Perl book so I can understand and use more stuff from Dominus’s book Higher Order Perl.

The code below was mainly done so that I could get a better handle on references in Perl. I’m having a hard time getting up to speed with it, but hopefully after a few more projects like this I’ll get the hang of it. I also wanted to try to improve on the parser design that I thrashed out for Blub a while back. Finally, I wanted to play around with the concept of a dispatch table before rereading Dominus’s chapter on the subject.

Anyways… what I’ve done is constructed my first complex data structure in Perl: a set of nested hash tables and arrays that describe a table of data. We’re loading up the tables by parsing a text file. I’m tired of parsing beginning and ending brackets, so I’m trying maybe a more Python style approach of just using whitespace to indicate the end of a section. Here’s an example table.

Table Stuff:
	Key/[A-Z]/,     Description
	A,              Fooie
	B,              Barrie

The dispatch table (and Perl’s core features) allow us to treat code as data. With just a few language features built in to our parser, we can allow the users of our scripting language to extend and redefine the parser from within their scripts. Pretty cool…. (Dominus goes much further into this than I do, of course. And I’m crazy, so don’t judge his book by my bad code– I’m just learning, here, with my lame “3rd grade” level of Perl fluency.)

As I go further into the book Intermediate Perl, I can maybe come back and revisit this by turning it into an object. Also, I’m not using the regex’s I’ve tacked onto the table columns, so maybe I could write a validation routine that checks that the column data matches the expression. Another change might be to have a second dispatch table to handle the different state changes… but that may be overcomplicating it. [Actually, after refactoring a bit, it seems okay.] I’m sure I might come up with a different architecture if I read some chapters on compiler-like code….

#/bin/perl
use strict;
use warnings;
my %tables;
my $read_type = 0;
my $current_table = "None";
my $show_debug = 0;

sub cell {
    my ($table, $key, $column) = @_;
    my $rs = rows($table);
    my $r = $rs->{$key};
    return $r->{$column};
}

### returns an array of column names
sub columns {
    my ($table) = @_;
    my $t = $tables{$table};
    return  $t->{"Columns"};
}

### returns a hash of row hashes
sub rows {
    my ($table) = @_;
    my $t = $tables{$table};
    return $t->{"Rows"}
}

### returns an array of column regex's for validating cells
sub column_regexes {
    my ($table) = @_;
    my $t = $tables{$table};
    return $t->{"ColumnRegexes"}
}

sub add_table {
    my ($key, $table) = @_;
    $tables{$key} = $table;
}

sub add_row {
    my ($table, $key, $row) = @_;
    rows($table)->{$key} = $row;
}

### pass an array of columns and an array of column regex's
### and get a table hash back
sub initialize_table {
    my ($columns, $regexes) = @_;
    my %table;
    $table{"Columns"} = $columns;
    $table{"ColumnRegexes"} = $regexes;
    my %empty_rows;
    $table{"Rows"} = \%empty_rows;
    return \%table;
}

### pass a comma delimited header line from a table definition
### and get two array references describing the table structure
sub parse_column_header {
    my ($line) = @_;
    my @fields = split /,/, $line;
    my $column_number = 0;
    my @columns;
    my @regexes;
    print "reading columns to table $current_table: $line\n" if $show_debug;
    foreach(@fields){
        my $field = $_;
        $field =~ s/^\s+|\s+$//g; # trim field
        if($field =~ /^([^\/]*)\/([^\/]*)\//){
            $field = $1;
            $regexes[$column_number] = $2;
        }
        $columns[$column_number] = $field;
        $column_number++;
    }
    return (\@columns, \@regexes);
}

### pass a table name and a comma delimited header line from a table definition
### and get the row's key and a hash of detail data
sub parse_row_detail {
    my ($table, $line) = @_;
    my @fields = split /,/, $line;
    print "reading rows to table $current_table: $line\n" if $show_debug;
    my %row;
    my $column_number = 0;
    my $rowkey;
    foreach(@fields){
        my $field = $_;
        $field =~ s/^\s+|\s+$//g; # trim field
        if ($rowkey){
            $row{columns($current_table)->[$column_number]} = $field;
        } else {
            $rowkey = $field;
        }
        $column_number++;
    }
    return ($rowkey, \%row);
}

sub reading_table_header {
    my ($line) = @_;
    my ($columns, $regexes) = parse_column_header($line);
    add_table($current_table, initialize_table($columns,$regexes));
    $read_type = 2;
}

sub reading_table_detail {
    my ($line) = @_;
    my ($rowkey, $row) = parse_row_detail($current_table, $line);
    add_row($current_table, $rowkey, $row);
}

my $dispatch_table =
    { '^Table ([A-Za-z]+):'     => sub { $current_table = $1; $read_type = 1; print "(reading table $1)\n" if $show_debug; },
      '^\#(.*)'                 => sub { print "found a comment: $1\n" if $show_debug; },
      '^[\w]*$'                 => sub { print "(Whitespace line)\n" if $show_debug; $read_type = 0; }
    };

my $alternate_dispatch_table =
    { 1 => \&reading_table_header,
      2 => \&reading_table_detail
    };

while(<>){
    my $line = $_;
    my $success = 0;
    my $key;
    foreach $key (sort keys %{$dispatch_table}) {
        if ($line =~ /$key/){
            $dispatch_table->{$key}->();
            $success = 1;
            last;
        }
    }

    if ($success == 0 and $read_type > 0) {
        chomp($line);
        my $altcode = $alternate_dispatch_table->{$read_type};
        $altcode->($line);
    }
}

my $a = cell("Stuff", "B", "Description");
print "a is $a\n";
my $cols = columns("Junk");
print "Look at this: $cols->[0], $cols->[1], $cols->[2]\n";

Advertisements

6 Responses to “A little experimentation with dispatch tables…”

  1. Erik Says:

    You may also find the following piece on using state machines to parse structured files interesting reading as well:

    http://gnosis.cx/TPiP/chap4.txt

  2. lispy Says:

    Thanks…

    I *did* refactor the code to break it down to a lot of smaller subs and also… I coded the state-changing aspect of the as a second dispatch table. Thinking about it, each state may ultimately need its own dispatch table…. Hmm. Ah, well. Will read and think some more and come back to this later. That article gets deep fast….

  3. bm3719 Says:

    Did you give up on SICP? I ask because I’ve just started working through it myself, and I only saw references to chapter 1 while perusing. It is definitely a slow process. You spend a few minutes reading, then several hours solving the exercises. A couple I had to stop and think about for a day or two.

  4. lispy Says:

    No didn’t give up. 🙂 I had to take a break to apply the principles in a real world project before continuing. That project turned out to be my most successful professional project. (My resume looks so much better now… woo-hoo!)

    I was originally reading SICP so I could understand PAIP. Now I’m reading HOP so I can better grok SICP when I go back to it. I use Perl at work now. Also the Blub project I used to work out some ideas from SICP is much better when it’s reworked in Perl. Also Perl hackers are just plain cool folks in general. Also my cool hacker friends that would never use Blub for anything– they all use Perl and don’t care so much about Lisp/Scheme. Also it appears that my mega-mondo-big-dream project is pretty doable with some intermediate Perl hacks.

    If I solve my “dream problem”, then I will need to go back to SICP and PAIP so I can get a new dream. (Unless Mark Miller succeeds in getting me to try to learn Smalltalk next year…. Doh!)

    Basically… I’m aiming a little more towards getting something moderately significant finished (and in a form that other people can use) before going too much deeper into Lisp. Being the guy that studies Lisp but has never finished anything kinda stinks. So my priorities have shifted since last year; the serendipitous intersection between Perl, friends, and work seems to be a pushing the tipping point somewhere….

    That doesn’t mean I’m not jealous of Eli Bendersky, though….

  5. draegtun Says:

    Hi again Lispy,

    You doing some good stuff here and if you continue reading HOP then I’m sure it will get better and better.

    I think you right that you could encapsulate this all very nicely into an object at some point. However in the meantime here are some things which might prick your interest (unless you have already done these by now!).

    Some of your subs could be built dynamically…..

    for my $sub_name qw/columns rows column_regexes/ {    
        no strict 'refs';
        *$sub_name = sub {
            my ( $table ) = @_;
            return $tables{ $table }->{ camel_case( $sub_name ) };
        };
    }
    
    sub camel_case {
        my ( $text ) = @_;
        $text =~ s/_(\w)/\U$1/g;    # _a  to A
        return ucfirst $text;
    }
    

    NB. *$sub_name = sub {} is the key. The asterix is a typeglob and this line assigns a subroutine to the symbol table using name in $sub_name.

    Yes dispatch tables are a very powerful tool. Thought it might be nice to only build the debug parts when necessary…..

    my $do_table = sub { $current_table = $1; $read_type = 1 };
    
    my $dispatch_table = { 
        '^Table ([A-Za-z]+):'     => $do_table,
        '^[\w]*$'                 => sub { $read_type = 0 },
    };
    
    if ( $show_debug ) {
        $dispatch_table->{ '^Table ([A-Za-z]+):' } = sub { $do_table->(); print "(reading table $1)\n" };
        $dispatch_table->{ '^\#(.*)'             } = sub { print "found a comment: $1\n" },
        $dispatch_table->{ '^[\w]*$'             } = sub { print "(Whitespace line)\n"; $read_type = 0; };
    }
    

    Thats ok but I think this is even flexible……

    my %dispatch_data = (
        
        '^Table ([A-Za-z]+):'  => { 
            code    => sub { $current_table = $1; $read_type = 1 },
            debug   => sub { print "(reading table $1)\n" },
        },
        
        '^\#(.*)' => {
            code  => {},
            debug   => sub { print "found a comment: $1\n" },  
        },
        
        '^[\w]*$' => {
            code    => sub { $read_type = 0 },
            debug   => sub { print "(Whitespace line)\n" },
        },
    );
    
    # build the dispatch table (unlike one below builds dispatch precisely)
    my $dispatch_table = {};
    while ( my ( $regex, $dispatch ) = each %dispatch_data ) {
        $dispatch_table->{ $regex } = sub {
            $dispatch->{ code  }->()  if exists $dispatch->{ code };
            $dispatch->{ debug }->()  if $show_debug;
        }; 
    }
    

    The last part could be amended so that it only creates the dispatch regex if there is some code to run like in my first attempt (useful if you have a large dispatch table).

    Hope this helps. Keep up the excellent blogging.

    /I3az/

  6. using the sphincter-sigil to abuse Perl « Learning Lisp Says:

    […] Learning Lisp (notes from an average programmer studying the hard stuff) « A little experimentation with dispatch tables… […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s


%d bloggers like this: