How Studying SICP Made Me a Better Programmer

While working through the problems of SICP section 2-2 this week, I had a flash of insight. Things that had frustrated and stymied me for years were suddenly becoming trivial. One data structures problem that had particularly bugged me had become pretty easy, too– though I had to apply everything I’d learned in SICP about recursion, first class functions, and the closure property to do it. Here’s a quick tour of my proof of concept from the REPL prompt:

CL-USER> (setf c (make-hash-table))
#S(HASH-TABLE EQL)
CL-USER> (create-widget c ‘division-a ‘mega-corp ‘(factory-1 factory-2 factory-3))
(DIVISION-A MEGA-CORP (FACTORY-3 FACTORY-2 FACTORY-1) #S(HASH-TABLE EQL))
T
CL-USER> (create-widget c ‘division-b ‘mega-corp ‘(dist-1 dist-2 dist-3 main-office))
(DIVISION-B MEGA-CORP (MAIN-OFFICE DIST-3 DIST-2 DIST-1) #S(HASH-TABLE EQL))
T

We’re starting out by defining a hash-table to store all of our work in. I guess I could have stored everything in a global variable so that I wouldn’t have to keep referencing the store all of the time, but I figured it was worth it not to have all of my functions hard coded to a particular variable. With the above code we assigned division-a and division-b to a company we like to call Mega-Corp. We created some factories, distribution centers, and a main office while we were at it.

CL-USER> (gethash ‘mega-corp c)
(MEGA-CORP NIL (DIVISION-B DIVISION-A) #S(HASH-TABLE EQL))
T
CL-USER> (gethash ‘division-a c)
(DIVISION-A MEGA-CORP (FACTORY-3 FACTORY-2 FACTORY-1) #S(HASH-TABLE EQL))
T
CL-USER> (gethash ‘factory-1 c)
(FACTORY-1 DIVISION-A NIL #S(HASH-TABLE EQL))
T

Here we take a look at what we’re actually storing in our hash table. Each item in the hash is a list in the format of (WidgetName Parent-Widget Children-Widgets-List Properties-Hash-Table). You can see above that Mega-Corp has children, but no parent. Division-a has both a parent and children. And Factory-1 has a parent but no children.

CL-USER> (create-widget c ‘office-staff-f1 ‘factory-1 ‘(bob fred joe))
(OFFICE-STAFF-F1 FACTORY-1 (JOE FRED BOB) #S(HASH-TABLE EQL))
T
CL-USER> (create-widget c ‘office-staff-d1 ‘dist-1 ‘(james mike sally bert))
(OFFICE-STAFF-D1 DIST-1 (BERT SALLY MIKE JAMES) #S(HASH-TABLE EQL))
T
CL-USER> (gethash ‘office-staff-f1 c)
(OFFICE-STAFF-F1 FACTORY-1 (JOE FRED BOB) #S(HASH-TABLE EQL))
T
CL-USER> (gethash ‘bert c)
(BERT OFFICE-STAFF-D1 NIL #S(HASH-TABLE EQL))
T

Here we extend our data structure even further by adding some office staff groups. We can extend the structure as much we want– and in principle, there’s nothing preventing us from writing procedures to copy and combine such data structures however we choose. That wide-open combinatorial possibility is what the closure property is all about.

CL-USER> (setprops c ‘(bob fred joe james mike sally bert) ‘salary ‘(40100 55050 50010 35025 33099 75042 20010))
(40100 55050 50010 35025 33099 75042 20010)
CL-USER> (setprops c ‘(factory-1 factory-2 factory-3 dist-1 dist-2 dist-3 main-office) ‘assets ‘(800000 950000 725000 225000 235000 230000 125000))
(800000 950000 725000 225000 235000 230000 125000)
CL-USER> (getprop c ‘factory-2 ‘assets)
950000
T
CL-USER> (getprop c ‘sally ‘salary)
75042
T

These functions use mapcar to set a slew of property values all at once– without having lots of repetitive code. The getprop procedure allows us to read those settings back out again.

CL-USER> (getval #’+ 0 c ‘sally ‘salary)
75042
CL-USER> (getval #’+ 0 c ‘office-staff-d1 ‘salary)
163176
CL-USER> (getval #’+ 0 c ‘office-staff-f1 ‘salary)
145160
CL-USER> (getval #’+ 0 c ‘mega-corp ‘salary)
308336
CL-USER> (getval #’+ 0 c ‘mega-corp ‘assets)
3290000
CL-USER> (getval #’+ 0 c ‘division-a ‘assets)
2475000
CL-USER> (getval #’+ 0 c ‘division-b ‘assets)
815000

Here we’re using a recursive procedure to apply the plus function to all of the property values from a certain point in the hierarchy down. The zero in the s-expressions there is an identity value– just like the ones we coded in SICP section 1-3. I had to implement that just in case there were all NIL’s in my hash table for a certain property. With a careful use of property names, you can get some benefits of stratified design (covered in SICP section 2-2) in this kind of data structure. We can code in all kinds of information about the separate divisions at one level… and way down the hierarchy we can have all kinds of other information stored about the employees. While working at either end, we can effectively ignore all of the other levels. At the same time we can “roll up” related values however we like at any level we care to focus on.

(There’s probably a few other things I could have done to make this better, faster, or slicker. If you’re working through SICP, too, please check out this complete code file and let me know if you have any tips or suggestions. I used Common Lisp instead of Scheme, though. I really needed to stop a while and apply what I was learning in SICP to problems that I’d developed a personal stake in. Most people tend to abandon the book by section 2-2, so I better get back to work if I want to finish it!)

Back in the day, I would have avoided a recursive data structure like this if I was working on a similar task. I would have (horrors!) hard-coded the hierarchy at maybe 3 or 4 levels deep, coded all my tables as separate entities, and more than likely built in some work-arounds for the places where such choices were inadequate. (Thinking that OOP was the “right way,” I would have coded a class file for companies, divisions, offices, and employees– and then been stuck storing and retrieving data in as many separate RDBMS tables and figuring out how to map them back and forth somehow. Writing nested hash tables to a file, on the other hand, is trivial in Common Lisp.) The end result would have been a maintenance nightmare…. It’s a good thing I was too lazy to finish any of those pet projects back then! Funny, I could see clearly the problems of coding 50 slightly different forms when one generic form could be rigged up to do the same thing, but I couldn’t break through my mental limitations to think three-dimensionally about data structures. Thanks to many painful hours spent working through SICP, I can now synthesize recursion, first class functions, and the closure property to make more elegant solutions to things that used to be next-to-impossible for me.

Update 10/26/07: I did manage to follow this post up here.

This entry was posted on October 13, 2007 at 3:52 am and is filed under Computer Science, LISP, Programming. You can follow any responses to this entry through the RSS 2.0 feed. You can leave a response, or trackback from your own site.

8 Responses to “How Studying SICP Made Me a Better Programmer”

Eli Says:
October 14, 2007 at 6:23 pm | Reply
FYI, such data structures are the bread-and-butter of Perl, that even has special “notation” for naming them. For example: HoHoA is Hash of Hash of Array, or a Hash table where each value is a Hash table, in which each value is an array.
ken Says:
October 14, 2007 at 8:50 pm | Reply
Yeah but he’s trying to become a *better* programmer. 😉
Jon Says:
October 14, 2007 at 9:18 pm | Reply
“And Factory-1 has children but no parent.”

should be: “And Factory-1 has a parent but children.”

[Oops; fixed that. Thanks. — lispy]
Jon Says:
October 14, 2007 at 9:55 pm | Reply
It doesn’t seem to me that the structure you describe here is itself problematic with OOP. It could be handled adequately with a something like the XML DOM. The difficulty arises when you want to have different semantics for different entities in the hierarchy. What can Lisp do for us here? You hint at “wide-open combinatorial possibility” but it would be nice to see some examples.

I can see that generic functions might create some possibilities – especially multiple dispatch when it comes to parent-child relationships; but I’m not sure that takes us much beyond what a dynamically-typed OO language provides (I only have a superficial familiarity with Lisp).

How does Lisp support the application of operations to this data structure. Are you planning a follow-up post?
lispy Says:
October 14, 2007 at 10:57 pm | Reply
Hi Jon,

You’re probably right about OOP not being at fault. Not being fluent in recursion would of course limit my ability to do things like this in any language. On the other hand, the desire to make a fancy object model using lots of inheritance works against the kind of generic approach that seems “obvious” after a couple of chapters of SICP. Nevertheless, that doesn’t prevent me from taking what I learn in Lisp and using it become a more savvy OOP guy.

There’s several other things I’d like to do with this structure beyond just storing property values. As I learn more from SICP, I will be sure to try to apply it and detail my progress in follow up posts on this theme.
lispy Says:
October 15, 2007 at 12:57 am | Reply
Jon,

Thinking about this some more, I realize you’ve put your finger on something that SICP is trying to teach me, but that I’m shying away from. I’ll try not to wimp out on this….
How Studying SICP Made Me a Better Programmer, part II « Learning Lisp Says:
October 27, 2007 at 2:22 am | Reply
[…] SICP Made Me a Better Programmer, part II Okay, I told “Jon” that I would follow this up, so here […]
SICP in Python | Pedro Kroger Says:
August 31, 2010 at 6:56 pm | Reply
[…] is considered one of the great computer science books. Some people claim it will make you a better programmer. It was the entry-level computer science subject at MIT and it’s still used in universities […]