Archive for September, 2007

Mastering Higher Order Programming Techniques with SICP Chapter 1

September 22, 2007

Section 1.1 was deceptively simple, but things rapidly got heavy in section 1.2. The process of tree recursion was exceedingly difficult to visualize. Exercise 1.13 asked us to “use induction” to prove something. Plugging through exercise 1.20 was downright painful back when I was working on my own. The march of technology added some complications to the later problems that made me dread going on… and the Miller-Rabin problem was completely mystifying. (I hacked out a Common Lisp version of the algorithm described on Wikipedia, but even that exercise failed to give me the insight necessary to understand the question. I haven’t been that befuddled since that problem from PAIP chapter 2….) Collectively, these experiences might lead one to believe that working through SICP was a lost cause….

A couple of things can really improve your chances of getting through it, though. One thing is to definitely watch the videos— they really make things interesting and more accessible. Another thing is just to have a friend or two that you can bounce things off of. “I tried x on problem y… is that what you think they wanted there? Also, are you getting something like z on problem w?” A few encouragements from a peer can really help you focus.

Continuing on into section 1.3, things got much easier. The lecture was outstanding. It is probably the definitive sales pitch for higher order thinking. If you ever wondered what people mean when they talk about “higher order procedures,” this is the place to go. The problems for the most part were straightforward or of about medium difficulty. A lot of them push you to alter or imitate clear examples, so they are pretty “fair” as exercises go. Others require a modicum of persistence and experimentation, but are still pretty straightforward. I got through most of them with only a couple of minor questions. (Simpson’s rule in 1.29 didn’t work for me as it was advertised. The instruction in exercise 1.35 to show that the golden ratio is a fixed point of the transformation x -> 1 + 1/x is perhaps beyond my math skills. I wasn’t sure how to test my answer in 1.44, and I’m not quite sure where to go with 1.46.) But if the goal here was to get me to understand when and how to implement functions that take functions as arguments and that return a function for their value… then I’m there. They push the level of abstraction… and then push it some more… and then push it until I want the merry-go-round to stop. I used to think that mapcar was pretty darn cool. Now I have an entire new mental apparatus that I can apply regardless of the language or platform.
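
To give a taste of what that means in practice, here’s average-damp from section 1.3.4: a procedure that takes a procedure as its argument and returns a brand new procedure as its value. (My own transcription, with a throwaway square helper so the snippet stands alone.)

(define (square x) (* x x))

;; average-damp takes a procedure f and returns a new procedure that,
;; given x, averages x with (f x). It gets used all over the fixed-point
;; exercises later in section 1.3.
(define (average-damp f)
  (lambda (x) (/ (+ x (f x)) 2)))

;; ((average-damp square) 10) => 55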

So I’ve got the bulk of the problems from the first chapter completed and it’s all culminated in a deeper understanding of how to apply and implement functional style abstractions. I’m really pleased with how far they’ve brought me in the text, but I begin to get anxious: I want to go code something… and I’m afraid of how far into the stratosphere they’re going to push my mind. I wanted at first to learn more powerful methods of creating general solutions… but really, I only wanted to get maybe half an order of magnitude better as a programmer. I had no idea that they would push my mind so far beyond its comfort zone. I had no idea that I even had a mental comfort zone! The experience gives me a new respect for MIT… and I’m glad I have a chance to experience a bit of what they put their computer science students through. (I also want to give my thanks to the SICP study group for providing me the inspiration and the encouragement to continue working through this book!)

Two minor things stand out for me, as far as mental processes go. One is the “faith oriented programming” that Sussman advocates in the lecture: if you’re not sure how to code a complex routine, just assume the hard parts are already done. If you can put a name on what’s missing, you’ll be able to easily address it later. Thankfully, DrScheme doesn’t auto-highlight “syntax errors” in this case! This worked surprisingly well for me when I went back and reworked the Pascal’s Triangle problem from section 1.2. Also, in section 1.3 they formalize the “guess-fudge” method that helped me do well in so many math classes in the past. The iterative approaches in the section mirror my own problem solving techniques: if you’re not sure of the answer, guess… then make small improvements or adjustments until it’s good enough. Combining these techniques gives you the key to solving recursion problems: start with a rough outline that covers the trivial cases… then “guess-fudge” the missing piece into existence with a function or two. This works for many of the problems in the text.
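
Here’s roughly how that played out on the Pascal’s Triangle problem (a sketch from memory rather than my exact solution): the trivial cases are the edges of the triangle, and the “missing piece” is just the faith that pascal already works for the previous row.

;; Exercise 1.12, "faith-oriented" style: cover the trivial edge cases,
;; then assume the rest of the triangle is already solved.
(define (pascal row col)
  (if (or (= col 0) (= col row))
      1                                   ; the edges of the triangle are 1s
      (+ (pascal (- row 1) (- col 1))     ; everything else is the sum of
         (pascal (- row 1) col))))        ; the two entries above it

;; (pascal 4 2) => 6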

Anyways, if you got overwhelmed in section 1.2, just hang tight… it gets easier. Or maybe you get smarter. Something. They keep the pressure on, but they are very steady in section 1.3. It’s a lot of work– almost like learning to program all over again– but everything they make you do is part of their perfect plan to transform the way you think about algorithms. I can’t for the life of me imagine why so many people have complained about Scheme: it’s an exceedingly beautiful language that gets out of the way so that you can express raw thought in its most elegant form. It’s hard to believe how much can be accomplished with only a handful of primitives and parentheses….

For those that want to compare notes, here’s my code for exercises 1.29 to 1.33, 1.35 to 1.39, and 1.40 to 1.46. Also, here’s my algorithm for the Miller-Rabin test based on the instructions from this Wikipedia entry.

Recursive Discontents: Grahamian Macho Programmers are Restricted and Narrow-minded

September 14, 2007

Michael Harrison recently wondered why recursion was emphasized so much in Scheme and noted that it was perfectly natural for a language designed to model algorithms to do so. Notice the repetitive application inherent in Euclid’s algorithm: “Given two natural numbers a and b, not both equal to zero: check if b is zero; if yes, a is the gcd. If not, repeat the process using, respectively, b, and the remainder after dividing a by b.” Notice that step 6 of the Sieve of Eratosthenes is to “Repeat steps 4 and 5 until no more numbers are left in List A.” Step three of the Babylonian method for calculating square roots is “Repeat step 2 until r and x / r are as close as desired.” From the dawn of mathematical thought, recursion has naturally emerged in the study of algorithms. A computer language built to express algorithms concisely would necessarily include recursion.
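
Transcribed into Scheme, Euclid’s description above is practically a program already (this is essentially the gcd procedure from section 1.2.5 of the book):

;; Euclid's algorithm, almost word for word: if b is zero, a is the gcd;
;; otherwise repeat with b and the remainder of a divided by b.
(define (gcd a b)
  (if (= b 0)
      a
      (gcd b (remainder a b))))

;; (gcd 206 40) => 2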

But can’t we do without recursion in a programming language? Aren’t FOR…NEXT loops and DO…UNTIL loops sufficient? Those are nice friendly abstractions, of course, but they are just two special case forms of iteration. As we see in SICP lecture 1b (and section 1-2 in the book), recursive functions can be used to define any recursive or iterative process. Simpler language concepts will allow you to get by on most business programming assignments, but sooner or later you’ll come to a problem where you will actually need recursion to solve it. It is better to be fluent in the more general tool than to limit your mind to expressing itself with a small subset of what is possible.
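
As a quick illustration (my own toy example, not the book’s): both of the procedures below are written recursively, but the second one describes an iterative process, since all of its state is carried along in the arguments.

;; A recursive procedure generating a recursive process: the additions
;; pile up until the base case is reached.
(define (sum-to n)
  (if (= n 0)
      0
      (+ n (sum-to (- n 1)))))

;; A recursive procedure generating an iterative process: the running
;; total is threaded through the arguments, so nothing is left pending.
(define (sum-to-iter n)
  (define (iter count total)
    (if (> count n)
        total
        (iter (+ count 1) (+ total count))))
  (iter 1 0))

;; (sum-to 5) => 15, and (sum-to-iter 5) => 15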

But there’s something deeper here than the convulsions of Blub programmers sparring with smug functional dweebs. Recursion lies at the very heart of the “accurate reckoning for inquiring into things, and the knowledge of all things, mysteries… all secrets.” The functions that can be computed mechanically are exactly the recursive functions: Alonzo Church developed his lambda calculus in order to explore function definition, application, and recursion. Thanks to his work on the lambda calculus and to the work of Kurt Gödel, he was able to demonstrate that Hilbert’s famous Entscheidungsproblem was impossible to solve. All known computational processes are equivalent to the lambda calculus and Turing machines… and both of those have been shown to be equivalent to the general recursive functions.

Later on, John McCarthy (who was a student of Church) used the lambda calculus as a basis for the Lisp programming language. Recursion is inescapable in any dialect of Lisp. Not only is the list data type itself defined recursively (as Chuck Hoffman recently pointed out), but as we are reminded in SICP lecture 1b, the evaluation process of Lisp is defined recursively as well. I expect recursion to be a continuing theme in the course as our work culminates in an understanding of the Metacircular Evaluator. At any rate, given the title of McCarthy’s original paper on Lisp (“Recursive Functions of Symbolic Expressions and Their Computation by Machine”), it is evident that the language itself was designed primarily as a means of expressing recursive functions.

As Matthias Benkard said, “Lisp, in its most general sense, is a uniform notation for programs.” SICP will gradually develop our ability to think and implement code in metalinguistic terms. Theoretically, this will allow us to conceptualize and solve problems without having our minds hindered by the limitations of procedural, OOP, or even functional methodologies– even while we cherry-pick from those approaches whenever we feel inclined. Lisp’s homoiconic and multi-paradigmatic nature would seem to facilitate such thinking. It seems to me that someone who had mastered such techniques would be quite far from being “restricted and narrow-minded.”

SICP Exercises 1.22-24 are Somewhat Out of Date

September 13, 2007

(This is the point where I broke down and finally started using Scheme.  Going without my Emacs shortcuts in DrScheme is painful, but the IDE-like environment is probably a lot more friendly to the raw recruits.  I’m too lazy to rustle up SLIME48/Scheme48 for now and just want to finish some problems….)

Starting with exercise 1.22 in SICP, we are given code with a reference to “runtime.” This does not exist in the MzScheme language I was using under DrScheme…. Thanks to a quick tip from Sridhar Ratnakumar, I didn’t have to bang my head against the wall figuring this out. By defining runtime thusly, (define (runtime) (current-milliseconds)), the code from SICP will run as-is.

One of the first things we have to do here is to write a procedure search-for-primes that “checks the primality of consecutive odd integers in a specified range.” If you think in Basic, you’ll probably want a FOR-NEXT loop for this. Or if you’re really hip, you might want a DO-UNTIL loop. I looked at some options for doing that, but I have to admit that a recursively defined iterative function is much simpler. Much simpler than trying to keep track of yet-another-language’s control construct syntax….
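
The shape of it is something like this (a sketch rather than my exact code; it assumes the book’s timed-prime-test is already defined):

;; Check the primality of consecutive odd integers starting at n.
;; count is how many odd candidates to examine; timed-prime-test comes
;; straight from the book.
(define (search-for-primes n count)
  (cond ((= count 0) 'done)
        ((even? n) (search-for-primes (+ n 1) count))
        (else (timed-prime-test n)
              (search-for-primes (+ n 2) (- count 1)))))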

Exercise 1.22 asks us to compare the runtimes of our prime checker on primes just above 1000, 10,000, 100,000, and 1,000,000. On my machine, these all came back as taking 0 milliseconds– apparently my box is a little faster than the ones running back in the eighties. So I started my comparison at 100,000,000… and did a quick munge to make the code only give output when we actually hit a prime. We can see the output of my code below. Since the prime test has SQRT(n) growth, each time my range goes up by a factor of 10 we expect the time it takes to find a prime to increase by a factor of about SQRT(10), or 3.1622. Is that what’s happening here?  (Here’s my code file if you want to try this at home.)

> (search-for-primes 100000000 100)

100000007 *** 0
100000037 *** 0
100000039 *** 15
100000049 *** 0
100000073 *** 16
100000081 *** 0
> (search-for-primes 1000000000 100)

1000000007 *** 16
1000000009 *** 16
1000000021 *** 31
1000000033 *** 16
1000000087 *** 16
1000000093 *** 15
1000000097 *** 32
> (search-for-primes 10000000000 100)

10000000019 *** 141
10000000033 *** 313
10000000061 *** 140
10000000069 *** 156
10000000097 *** 312
> (search-for-primes 100000000000 100)

100000000003 *** 656
100000000019 *** 641
100000000057 *** 656
100000000063 *** 656
100000000069 *** 656
100000000073 *** 656
100000000091 *** 672
> (search-for-primes 1000000000000 100)

1000000000039 *** 2656
1000000000061 *** 2281
1000000000063 *** 2124
1000000000091 *** 2124
> (sqrt 10)
3.1622776601683795
> (* 16 (sqrt 10))
50.59644256269407
> (* 32 (sqrt 10))
101.19288512538814
> (* 140 (sqrt 10))
442.71887242357315
> (* 313 (sqrt 10))
989.7929076327027
> (* 656 (sqrt 10))
2074.454145070457

Just eyeballing it here, it appears that the further we get down the list, the closer the time factor gets to SQRT(10).

Let’s try that last one a few more times to see if there’s any variation:

> (search-for-primes 1000000000000 100)

1000000000039 *** 1968
1000000000061 *** 1937
1000000000063 *** 1969
1000000000091 *** 1922
> (search-for-primes 1000000000000 100)

1000000000039 *** 1937
1000000000061 *** 1968
1000000000063 *** 1937
1000000000091 *** 1984

> (search-for-primes 1000000000000 100)

1000000000039 *** 1953
1000000000061 *** 1999
1000000000063 *** 1968
1000000000091 *** 1999

Okay… so we’re pushing 2000 now for these primes. Go up another order of magnitude and we should get around 2000 * SQRT(10), or roughly 6300.

> (search-for-primes 10000000000000 100)

10000000000037 *** 6280
10000000000051 *** 6217
10000000000099 *** 6311
> (search-for-primes 10000000000000 100)

10000000000037 *** 6326
10000000000051 *** 6670
10000000000099 *** 6295

Alright! Spot on. Now let’s try it with the changes suggested in Exercise 1.23.
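
For reference, the change amounts to roughly the following sketch (it assumes the book’s square and divides? procedures are already defined): instead of stepping the test divisor by 1, find-divisor steps by next, which skips every even candidate after 2.

;; Exercise 1.23: after 2, only odd numbers are worth testing as divisors.
(define (next test-divisor)
  (if (= test-divisor 2)
      3
      (+ test-divisor 2)))

(define (find-divisor n test-divisor)
  (cond ((> (square test-divisor) n) n)
        ((divides? test-divisor n) test-divisor)
        (else (find-divisor n (next test-divisor)))))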

> (search-for-primes 10000000000000 100)

10000000000037 *** 3374
10000000000051 *** 3546
10000000000099 *** 3389
> (search-for-primes 10000000000000 100)

10000000000037 *** 3375
10000000000051 *** 3358
10000000000099 *** 3531

And it looks like we’re not quite halving the time it takes to solve the problems now…. Nice.

Moving on to Exercise 1.24… we end up with problems. The random number generator only accepts numbers within a certain range– and because our machine is running so fast, we’ve bumped things up to some *really* large numbers….

random: expects argument of type <exact integer in [1, 2147483647]>; given 9999999999999

Another problem is that the compiler doesn’t grok the book’s true and false constants. So besides changing start-prime-test to call fast-prime?, we need to use #t and #f instead.
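
Concretely, the change looks something like this (a sketch; fermat-test and report-prime come straight from the book, and the display of the candidate gets munged into the successful branch so that only primes produce output):

;; Exercise 1.24: use the probabilistic fast-prime? in start-prime-test,
;; with #t/#f in place of the book's true/false constants.
(define (fast-prime? n times)
  (cond ((= times 0) #t)
        ((fermat-test n) (fast-prime? n (- times 1)))
        (else #f)))

(define (start-prime-test n start-time)
  (when (fast-prime? n 100)               ; 100 Fermat trials per candidate
    (newline)
    (display n)
    (report-prime (- (runtime) start-time))))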

Let’s run our code without fast-prime? for the upper bound of our random number generator.

> (search-for-primes 2147483547 100)

2147483549 *** 32
2147483563 *** 31
2147483579 *** 31
2147483587 *** 31
2147483629 *** 32
2147483647 *** 31

And now let’s try it with fast-prime?, with 100 Fermat tests being run for each prime.

> (search-for-primes 2147483547 100)

2147483549 *** 0
2147483563 *** 16
2147483579 *** 15
2147483587 *** 0
2147483629 *** 16
2147483647 *** 16

Even with that insane number of tests, we can identify the primes in either half the time or instantaneously. Let’s shift the number of Fermat tests down to 10 and see if there’s much difference.

> (search-for-primes 2147483547 100)

2147483549 *** 0
2147483563 *** 0
2147483579 *** 0
2147483587 *** 0
2147483629 *** 0
2147483647 *** 0

We get the same results… but all are practically instantaneous now.

I don’t think I have enough information to give a really good answer to Exercise 1.24, but it’s clear that log(n) growth is the way to go if you can manage it. The fact that you have to choose the number of Fermat tests to run adds some additional complexity to the question… and the combination of a fast computer with a limited random number generator makes it difficult to determine what’s going on here experimentally.

Special Iteration Constructs are Useful only as Syntactic Sugar

September 12, 2007

Michael Harrison has been trying to wrap his head around the reasons why Scheme emphasizes recursion and discourages looping constructs like loop and while. SICP actually digs into this very topic in section 1.2:

“One reason that the distinction between process and procedure may be confusing is that most implementations of common languages (including Ada, Pascal, and C) are designed in such a way that the interpretation of any recursive procedure consumes an amount of memory that grows with the number of procedure calls, even when the process described is, in principle, iterative. As a consequence, these languages can describe iterative processes only by resorting to special-purpose ‘looping constructs’ such as do, repeat, until, for, and while. The implementation of Scheme we shall consider in chapter 5 does not share this defect. It will execute an iterative process in constant space, even if the iterative process is described by a recursive procedure. An implementation with this property is called tail-recursive. With a tail-recursive implementation, iteration can be expressed using the ordinary procedure call mechanism, so that special iteration constructs are useful only as syntactic sugar.”

(I thought it was cool that his thoughts on recursion ended up being tied together with my ramblings on syntactic sugar in this second section of SICP.)

So one reason why you see looping constructs being the “default” in other languages is that those languages aren’t always optimized for handling recursion. Lisp programmers in general are more concerned with aesthetics and with being concise… and because SICP embraces the “right way” approach, I expect to see code that looks more like mathematical proofs than the typical code-munging “get ‘r done” style you’re liable to run into in the real world.

Meanwhile, in “real” Lisp programs, I don’t expect to see so much compulsive recursion. I don’t even expect to see people writing macros to develop their own custom looping constructs. What I do expect to see is people using “the mighty loop” instead. (You can see in my last little project how the lisp hacker eliminated almost all of my gratuitous recursion– and how he used loop in a nifty read macro to take care of things like “roll 2d6” and “roll 1d6.”)

At any rate, in the same way that it is cool that you can roll-your-own OOP system in Lisp with closures and lambda… I think it is important for programmers to be able to conceptualize iterative processes in terms of recursion. To think deep thoughts, you need to be free from the crutches of less concise ways of imagining things. This is just the sort of thing that “average people” in math classes everywhere whine about: “when am I going to use this stuff?” But how much real thinking do they do, anyways?! They avoid it like the plague!

Seriously though, which kind of world do you want to live in? One in which the mathematics that determines who gets what land is the exclusive domain of the Egyptian priesthood? Or one in which someone like Euclid has expressed the underlying principles of that mathematics… thereby making it possible for lesser men to discover and conceptualize things he never imagined? Do you want to spend the rest of your career merely implementing the ideas of others? Or would you like to have the option of doing creative work in spite of the fact that you’re not necessarily a genius?

My hunch is that SICP is The Elements of our generation.  That’s why they’re boiling things down as far as they can go– to an almost “elemental” level….  They are training us to think pure thoughts so that we can connect them to yield deeper insights than we would otherwise be able to obtain without such seemingly pedantic formalizations….  A side benefit is that they hardly need to spend any time explaining language syntax, but the real point is to teach us a way of viewing programming languages not in terms of “how do I make a GUI” and “how do I talk to a database….”  (Anybody can pick up that sort of stuff from a quick google.)  They’re teaching us a more mathematized way of understanding and discussing programming languages so that we can think about them instead of just thinking in them.

Computer Science is Not About Computers… and tasting a bit of Syntactic Sugar

September 11, 2007

I finally got around to watching the first of the SICP lectures. The first third of it is a real treat for those with a mathematical bent. I summarize Abelson’s argument here:

Computer Science is not a science.

Computer Science is not about computers.

Computer Science is not even “real.”

Geometry is not about Surveying or Surveying instruments.

Geometry is about using axioms and logic to provide a framework for exploring declarative knowledge– identifying and articulating what is true.

Similarly, Computer Science is really about imperative knowledge– articulating how to do things.

Computer Science deals with idealized components. There’s not all that much difference between what you can build and what you can imagine. The constraints imposed in building large software systems are not physical constraints, but rather the constraints are the limitations imposed by your own mind.

Computer Science is therefore an abstract form of engineering.

The second portion of the lecture dealt with an overview of the course. SICP is going to focus on programming techniques for controlling complexity. Sussman made something as simple as Black Box Abstraction sound really interesting: the key to that was in how he highlighted that we’re not just going to build black boxes, but that we’re also going to build black boxes that return other black boxes… and by passing black boxes as parameters, we’ll be able to obtain another order of magnitude of abstraction in our programming style, far beyond the level that non-functional programmers tend to think at. He spent less time selling us on the Conventional Interfaces section, though he did mention that this part would hit on OOP somewhat. The final portion of the course on Making New Languages sounded the most exciting– but his brief description of Metalinguistic Abstraction sounded (at this point, mind you) pointlessly geeky and pedantic. I must as yet lack some sort of illumination, because I suspect that my mind should be blown away by the mere hint of the magic of Eval and Apply….

In the last section of the lecture he went over a bare bones introduction to Lisp. He was using Scheme– and even though I have gotten DrScheme up and running with almost no hassle, I still recoil at the thought of spending any time away from Common Lisp, Slime, and Emacs. Nevertheless, the Scheme approach is very clean and slightly more intuitive than the traditional Common Lisp way of doing things. Just for fun I wrote a macro that implements some of the functionality of Scheme’s define operator:

CL-USER> (macroexpand-1 '(define (sq-a x) (* x x)))
(DEFUN SQ-A (X) (* X X))
T

CL-USER> (macroexpand-1 '(define sq-b (lambda (x) (* x x))))
(DEFUN SQ-B (X) ((LAMBDA (X) (* X X)) X))
T

CL-USER> (define (sq-a x) (* x x))
SQ-A
CL-USER> (sq-a 11)
121
CL-USER> (define sq-b (lambda (x) (* x x)))
SQ-B
CL-USER> (sq-b 12)
144

Note that the second case there was somewhat trickier to implement. I tried several different ways of doing it with setf and defun before getting a version that would work identically to the first case. The key (or possibly the ‘hack’) was to define the function with the arguments from the lambda… and then splice those arguments in with ,@ after the lambda expression. I think I had this same difficulty when I was trying to understand Michael Harrison’s example from the Little Schemer a few weeks ago– Scheme and Common Lisp seem to work in subtly different manners in this case, since Scheme keeps functions and variables in a single namespace while Common Lisp keeps them separate.  But working through this example has helped me understand the differences a little better.
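
For the curious, here’s a sketch of a macro along those lines. It isn’t necessarily the exact code I settled on, but it reproduces both of the expansions shown above:

;; A sketch of a Scheme-style define for Common Lisp. The second clause
;; pulls the argument list out of the lambda and splices it back in with ,@
;; so that the expansion matches the transcript above.
(defmacro define (name &rest body)
  (if (consp name)
      `(defun ,(car name) ,(cdr name) ,@body)       ; (define (sq-a x) ...)
      (let ((args (cadr (car body))))               ; (define sq-b (lambda ...))
        `(defun ,name ,args (,(car body) ,@args)))))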

Anyways, the most shocking thing in the lecture for me was when Sussman blithely tossed out the phrase “syntactic sugar.” (He defined it as “somewhat more convenient surface forms for typing something.”) I usually associate this phrase with smarty pants blogger types that really want to sound cool. Yeah, like all you need to do to sound cool is to criticize Java a little and throw this phrase around!! Now that I’ve seen it turn up here, I guess I’ll have to get down off my high horse the next time my hackles are raised over this….