Archive for the ‘Design Patterns’ Category

Why UML Fails to Add Value to the Design and Development Process

October 29, 2008

While attending the Domain Specific Modeling workshop at OOPSLA 2008, I heard many pointed criticisms of UML. No one went into detail, so I bought a book on DSM by Steven Kelly and Juha-Pekka Tolvanen. The book was not cheap– over ninety bucks when tax was added. (Doh!) So far I’ve read only the first four chapters, but those early sections cover the problems of UML fairly well and also outline the basic tenets of the DSM philosophy. I’ve synthesized the gist of their points below. All the good ideas about modeling architecture below are imperfect summaries of Kelly and Tolvanen’s work. The opinionated ideas about programming languages and tools are my own.

UML is applying an abstraction at the wrong end of the problem. It is primarily used to sketch object models for inferior languages. As such, it tends to explode into incomprehensible patterns of accidental complexity in order to accommodate the various “design patterns” that are used to work around the lack of essential language features. Because the UML models cannot be compiled, executed, or interpreted, they are reduced to the level of mere documentation. As such, they are generally not even worth keeping in sync– the manual round trip from the code to the model and back is just too expensive for something that adds no more value to a project than an elaborate code comment. (Slides of elaborate UML diagrams, on the other hand, are nevertheless great for impressing the uninitiated in a presentation, of course– that goes without saying!) Efforts to “fix” UML tend not to gain traction: the specification is itself too broad and coding environments and applications are themselves too diverse.

A modeling language needs to do three things in order to become useful:

First, it should map directly to domain problem concepts. UML is designed to map to coding architectures– and because of this it fails to raise the level of abstraction. The jump from assembler to C gave an order of magnitude increase in productivity because of its corresponding increase in abstraction. OOP languages and UML have not delivered the productivity gains that they should have– in some cases they may even hurt productivity. Modelers should not even think about implementation details when they develop their language. Instead they need to focus on mapping their ideas directly to the domain concepts. (Note: this is not a new idea. Abelson and Sussman popularized this approach in their SICP lectures at MIT where they demonstrated how Scheme would allow them to define programs that called functions that didn’t even exist yet. They would start with the right solution… and then gradually build up until it would run. Michael Harrison talked about this when he described the ‘wishful thinking’ approach to programming from SICP.)

Second, the modeling language must be formalized. The modeling language must be a first-class citizen of the development process rather than just make-work for architects and project managers. It must be possible to generate useful executable code from the models made within the language. The language needs to add value not just in communicating with domain experts and helping them validate the “business logic”– it needs to add value at all levels of the development process. It needs to raise the level of abstraction for the code maintainers by allowing them to stop thinking about the underlying frameworks and libraries. It needs to contribute to testing efforts by eliminating the need for implementing certain classes of tests and also by providing a basis for generating other classes of tests automatically from the models. It needs to be possible to generate documentation automatically from the models. The models should be useful in and of themselves and should be significantly useful to all development tasks downstream from them.

Finally, the modeling language should have first-class tooling support. The tools should not be thought of as IDE extensions for programmers. These tools are not “wizards” to generate ugly code or partial stubs for developers to flesh out. The tooling should stand alone for the domain experts; if they have to think about the code at all then the tools are developing in the wrong direction. Expert developers aware of the intricacies of the domain problem need to organize their frameworks and libraries in such a way that the models can be “compiled” into fully functioning code. The correct analogy for this is, again, to think in terms of compiling C to machine code. The C compilers are created by machine language experts. C programmers do not modify the compiled machine code– they use the results in practically all cases. The C programming environment allows the programmer to work at a higher level– without having to think in terms of the underlying machine code or hardware. C is, in effect, a modeling language for machine code.

So, to get a modeling language that is actually useful, you have to go “Domain Specific” on both sides of the problem: the modeling language has to map directly to the problem domain and the generated code has to map directly to the target environment. There are two linguistic abstraction barriers that must be implemented in order to make this work: 1) the modeling language between the models and the generated code and 2) the framework between the generated code and the target libraries. You must build up from your core code components to the framework… and you must build down from the models to the generated code. If the code generation process is too complicated, you may need better abstractions at the framework level. If the code generation process is impossible, then the modeling language may not be providing a detailed enough description of the requirements. If there is too much repetition in the models, then the modeling language will need to be extended to cover additional concepts.
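To make the two barriers a little more concrete, here is a minimal Python sketch. Everything in it is hypothetical and invented for illustration (the `MODEL` data, the `Framework` class, the `generate` function); none of it comes from Kelly and Tolvanen's book. The model is plain data in the domain's vocabulary, the generator turns it into code, and the generated code only ever talks to the framework.

```python
# Hypothetical illustration of the two abstraction barriers.
# All names here are assumptions made up for this sketch.

class Framework:
    """Barrier 2: hides the target libraries behind domain-shaped calls."""

    def __init__(self):
        self.log = []

    def open_valve(self, name):
        self.log.append(f"open {name}")

    def wait(self, seconds):
        self.log.append(f"wait {seconds}s")

    def close_valve(self, name):
        self.log.append(f"close {name}")


# The model: domain vocabulary only, no classes, loops, or library calls.
MODEL = [
    ("fill", "tank_a", 30),
    ("fill", "tank_b", 10),
]


def generate(model):
    """Barrier 1: turn a model into source code that only calls the framework."""
    lines = ["def run(fw):", "    pass"]
    for step, target, seconds in model:
        if step == "fill":
            lines.append(f"    fw.open_valve({target!r})")
            lines.append(f"    fw.wait({seconds})")
            lines.append(f"    fw.close_valve({target!r})")
        else:
            # If generation is impossible, the modeling language lacks a concept.
            raise ValueError(f"model uses unknown concept: {step}")
    return "\n".join(lines)


source = generate(MODEL)
namespace = {}
exec(source, namespace)        # "compile" the generated code
fw = Framework()
namespace["run"](fw)           # nobody edits the generated code by hand
print(fw.log)
```

Note how short `generate` stays because `Framework` already speaks in terms of valves and waits. If the generator had to emit dozens of low-level calls per model step, that would be the hint that the framework level needs better abstractions.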

As Steven Kelly said in a 2006 article, “To make model-driven development work this way, you cannot use a general-purpose design language like UML and a modeling tool’s built-in code generator. The people that created UML did not design it for describing applications in your domain, or for generating code other than skeletons. Despite the efforts of many to make it suitable for generation, no one has ever, or will ever, make it happen in a way which is still smart and convenient. An application’s behavior and the rules it has to adhere to are domain-specific. To capture that effectively and completely in a graphical design, you need a Domain-Specific Modeling language.”

Update 10/30/08:

I have discovered some interesting ideas from the opposite point of view. Here are some remarks from Franco Civello– someone who has used UML successfully in model driven development:

“… having produced informal use cases to clarify requirements, and a domain model to get an initial understanding of the subject area, the analyst produces a precise specification model, in UML, in which the system to be developed is represented as an object, belonging to a type (note, not a class, as the system is an abstraction used to define visible behaviour, not a software entity to be directly implemented in e.g. Java).”

Notice the key similarity there with the Kelly/Tolvanen approach. Civello is not using UML to describe the code architecture– he is mapping the UML more toward the problem domain.

“Steps in the use case flows are then formalised as operations on the system type, with a declarative specification of behaviour based on the notion of functional contract, written as pre- and post-conditions expressed on an underlying model (the system type model, derived closely from the domain model).”

Again, this point reflects back to my second point: there must be a formalization of the modeling language at some point. Note also, we have (possibly) an analog to the declarative approaches that I saw in the ModelTalk presentation last week.

“UML static modelling gives you the language to represent the state of the system in abstract and yet precise terms. Just don’t think of your classes as software things with methods and member data, but as specification types that give you the vocabulary to specify the business outcome of a system operation. These types can have attributes, associations, queries, constraints and definitions, all powerful UML concepts, but no state-changing operations. The only state-changing operations are defined at the system level, and are not elaborated into message-based solutions, but just specified declaratively in terms of business logic, steering clear of any design decisions.”
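As a rough illustration of what a declarative functional contract can look like, here is a small Python sketch. It is not Civello's notation or UML/OCL; the library-lending domain and every name in it (`LibraryState`, `lend_pre`, `lend_post`, `satisfies_lend`) are assumptions invented for the example. The specification type has attributes and queries but no state-changing methods, and the system operation is described only by what must hold before and after it.

```python
from dataclasses import dataclass, field


@dataclass
class LibraryState:
    """Specification type: attributes and queries only, no state-changing methods."""
    on_loan: dict = field(default_factory=dict)   # book -> member
    members: set = field(default_factory=set)

    def is_available(self, book):
        return book not in self.on_loan


def lend_pre(state, member, book):
    """Precondition of the hypothetical system operation 'lend'."""
    return member in state.members and state.is_available(book)


def lend_post(before, after, member, book):
    """Postcondition: states the business outcome, not how to achieve it."""
    other_loans = {b: m for b, m in after.on_loan.items() if b != book}
    return (after.on_loan.get(book) == member
            and after.members == before.members
            and other_loans == before.on_loan)


def satisfies_lend(before, after, member, book):
    """Check a before/after pair of states against the contract."""
    return lend_pre(before, member, book) and lend_post(before, after, member, book)


before = LibraryState(on_loan={}, members={"ann"})
good = LibraryState(on_loan={"sicp": "ann"}, members={"ann"})
bad = LibraryState(on_loan={"sicp": "bob"}, members={"ann"})

print(satisfies_lend(before, good, "ann", "sicp"))
print(satisfies_lend(before, bad, "ann", "sicp"))
```

Nothing here says how lending is implemented. Any design that takes `before` to a state accepted by `lend_post` satisfies the specification, which is the sense in which such a contract steers clear of design decisions.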

How Studying SICP Made Me a Better Programmer, part II

October 27, 2007

Okay, I told “Jon” that I would follow this up, so here goes.

The problem I had with the code last time was that I was leaning too much on bad habits I developed staring at too much relational database code. I used a hash table to store everything about the data structures. In all my functions for manipulating the data, I had to tag the arguments with an additional reference to the hash. This was pointlessly cumbersome. What I really needed to do was eliminate my dependency on a data store and just use cons, car, and cdr to construct everything. This allows me to treat my data structures (and pieces of my data structures) as primitives in a linguistic abstraction layer. As a bonus, all of the standard Common Lisp operators work with them without any “crutches” or extraneous code getting in the way.

One thing I learned going down this road is that the Visitor Pattern is hardly anything more than an application of mapcar. I remember trying to learn about Design Patterns a few years ago and I just couldn’t pick many of them up. The code examples for them and the UML diagrams seem to me to do a good job of obfuscating the whole point. Your brain has to “compile” so much information just to get to the point– it’s no wonder so many developers mangle them up in the application of them. After reading SICP, it’s much easier to parse them. In a lot of cases they just do things that are very similar to what I want to do in my more functional language anyway.
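The comparison can be sketched in Python, using the built-in `map` as a rough stand-in for mapcar. The shape classes, `AreaVisitor`, and the `area` function are hypothetical names made up for this illustration, and the areas use a crude pi of 3 to keep the numbers whole.

```python
# Visitor-style version: each element type has accept(), and the visitor
# has one method per element type.
class Circle:
    def __init__(self, r):
        self.r = r

    def accept(self, visitor):
        return visitor.visit_circle(self)


class Square:
    def __init__(self, side):
        self.side = side

    def accept(self, visitor):
        return visitor.visit_square(self)


class AreaVisitor:
    def visit_circle(self, c):
        return 3 * c.r * c.r          # crude pi = 3, an assumption for the sketch

    def visit_square(self, s):
        return s.side * s.side


shapes = [Circle(1), Square(2)]
visitor_areas = [s.accept(AreaVisitor()) for s in shapes]


# Functional version: plain data plus one function handed to map.
def area(shape):
    kind, size = shape
    return 3 * size * size if kind == "circle" else size * size


mapped_areas = list(map(area, [("circle", 1), ("square", 2)]))

print(visitor_areas, mapped_areas)
```

Both versions walk a collection and apply an operation chosen by the kind of each element. The Visitor version just needs a good deal more scaffolding to say so.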

Another thing I picked up in this exercise was that dynamic variables and hash tables store references to the “conses”. This means you have a lot of control over the lists that are stored in them. It also means that I don’t have to work too hard to write code to manipulate or transform them. Life certainly gets easier coding in Lisp once you think in terms of “pairs” so fluently that they are more natural than anything else. The example code for this post demonstrates some implications of this in detail, so folks that are just beginning their studies with Lisp will want to take note of what’s going on there when we start doing the “forbidden” destructive operations….  (The output of the Common Lisp code should look like this.)
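A loose analogy in Python, for readers who do not have a Lisp handy: a dict that holds a list holds a reference to it, not a copy, so a destructive operation through any name is visible through all of them. Python lists are not cons cells, so this only illustrates the reference-sharing point. The variable names are made up for the sketch.

```python
pairs = [1, 2, 3]
table = {"numbers": pairs}     # the table stores a reference, not a copy
alias = table["numbers"]

alias.append(4)                # destructive: mutates the one shared list

print(table["numbers"])        # the change shows up through the table
print(alias is pairs)          # all three names reach the same object

fresh = list(pairs)            # an explicit copy breaks the sharing
fresh.append(5)
print(pairs)                   # unaffected by changes to the copy
```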

Finally, after having slogged through most of the first two chapters of SICP, I can finally understand much better how bad a developer I’ve been. On the one hand, I can comprehend a much wider range of architectural approaches… but as a consequence it’s sinking in more and more just how limited my mental programming vocabulary was up until now. I really had no clue how ignorant I was. I knew I was struggling with things that should have had tidier answers, but I really thought I could only ever get just a little bit better than what I was. Now I can sense all those several orders of magnitude that I have to go and I wish I could consume a dozen more books before going on to the next project– but life and learning don’t work quite like that….

The main thing is that I now see that I never really understood OOP nearly as well as I thought I did. A lot of people objected to the golf ball example that a Perl hacker recently used to criticize typical object oriented “design,” but that guy sure had me pegged. Of course, good OOP design is going to be almost indistinguishable from functional programming approaches in some cases– even if it’s a little more verbose– but I don’t think I would ever have understood that so well without my many hours spent with Lisp and SICP.

The Future of Programming

October 4, 2007

“84 months worth of work was reduced to 2 months, and the results were error free.” It’s stories like this that fire our imaginations. Whether it’s Paul Graham fixing bugs while the client is still on the phone or Peter Seibel’s dad finishing a project with only half a budget, we want to know the secret.

And what were some of the attributes of this latest successful project? “I didn’t have to code the changes for each machine; it would create what was needed from the machine specifications…! I didn’t need a new release, all I needed to do was apply my new business rules to the existing system…! This is what development was supposed to do for us.”

I could see some of this. Configuring systems with tables stored in a database or in XML files is not enough. Each installation is different… and if enough of them are different enough that we’re forced into making changes in the actual code, we’re hosed. We get sucked into an endless treadmill of patching, redeploying, gathering more requirements, putting out fires…. Source Control, Unit Testing, Agile techniques, and good coding style all contribute to making this somewhat manageable. But they don’t address the core issues. They will eventually fail us on the interesting problems.

My own little toy project, though fun, is deficient not only because of its amateur technique, but also because it works against the grain of the language. It did achieve a moderate level of configurability via a human readable configuration language, but it was accomplished in a brute force manner… and extending the new language is not a lot of fun. While we managed to abstract away the essence of the generation definitions, we nevertheless violate the closure principle: we’re frozen at a single “pretty good” level of abstraction. And unlike a true embedded language, my custom language does not benefit so much from the features of the parent language.

How can the success story be recreated? Quoth the hacker, we need “grammars to read and execute specification files….” This, of course, points back to Norvig’s deceptively simple PAIP chapter 2… a theme that sets the tone for his entire book.

To say that we are going to invent custom languages on a problem by problem basis is misleading. We’re going to be extending existing languages in expressive ways– without burning the bridge back to the parent language’s core idiom. As pico explains, “a DSL isn’t really writing a new language, but rather manipulating an existing language to define your problem, or domain, in a more natural form. It’s designing objects and writing methods that isolate the problem and illuminate your business rules.”
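Here is one way that idea might look in Python: a tiny embedded rule language built from ordinary objects, decorators, and lambdas. The `RuleBook` class and the pricing rules are hypothetical, invented purely for illustration. The point is that the rules read close to the business vocabulary while the full host language stays available inside them.

```python
class RuleBook:
    """A tiny embedded rule language: rules are ordinary Python functions."""

    def __init__(self):
        self._rules = []

    def when(self, condition):
        """Register the decorated function as the action for `condition`."""
        def register(action):
            self._rules.append((condition, action))
            return action
        return register

    def apply(self, order):
        """Run every rule whose condition holds, in registration order."""
        for condition, action in self._rules:
            if condition(order):
                action(order)
        return order


pricing = RuleBook()


@pricing.when(lambda order: order["quantity"] >= 100)
def bulk_discount(order):
    order["discount"] = 10


@pricing.when(lambda order: order["country"] != "US")
def export_fee(order):
    order["fee"] = 25


order = pricing.apply({"quantity": 150, "country": "US"})
print(order)
```

Because conditions and actions are plain functions, a new business rule is a few lines added next to the others rather than a change to the engine, and nothing stops a rule from calling any other Python code it needs.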

We are not “true believers” in any single programming language, but we recognize that some languages are friendlier to our creativity than others. As far as is possible, we will not allow any language to limit our imaginations. And we will code solutions to problems that chafe at the constraints imposed by relational and object-oriented assumptions. Still, the question is not whether or not to write DSL’s or embedded languages, but when.

Boilerplate code says a lot about a language…

October 1, 2007

I ran across this recently:

“In fact the problem of ‘boilerplate design pattern code’ says nothing about the quality of the language; it says only that the programmer are working at the wrong level of abstraction and are the victims of that all too common sickness: bad design. Picking another language is not the cure for this disease. (More than likely it’s only going to make things worse!)”

This is just plain wrong– except for the part about the programmer working at the wrong level of abstraction and being a victim of “bad design.” The question is, how much of that has been foisted onto the programmer by the design of the language he’s writing in? And if much of it is due to the language, then it’s certainly not off base to go shopping for a new one!

If you’re not clear on this issue, then you need to watch SICP Lecture 3a: Henderson Escher Example. Early in the lecture (00:05:45) Abelson points out the importance of closure in any means of combination in a programming language. (This isn’t about “closures”… this is a more general mathematical principle.) Anyways, he says, “Closure was the thing that allowed us to start building up complexity…. The things that we make… those things themselves can be combined…. to make more complicated things…. When you look at a means of combination, you should be asking yourself whether things are closed under that means of combination.”

Obviously, it’s much better to be able to make an array of arrays than it is to be restricted to only storing numbers or strings in them. Whenever the principle of closure is violated, you’re going to be limited in your ability to formulate useful abstractions. In the same way, you lose the ability to express certain ideas cleanly (or even at all) when functions are anything other than first class citizens in your programming language. Programming languages really do have significant differences; they’re not all the same.
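A small Python sketch of the closure property, loosely in the spirit of the picture language from the lecture but not a port of it. Here a "picture" is just a list of equal-width text rows, and `beside`, `above`, and `repeated` are names assumed for the illustration. Because combining two pictures gives back a picture, the results can be fed straight into further combinations, and because functions are first class, the combinators themselves can be passed around.

```python
def beside(left, right):
    """Combine two pictures of equal height; the result is again a picture."""
    return [l + r for l, r in zip(left, right)]


def above(top, bottom):
    """Combine two pictures of equal width; the result is again a picture."""
    return top + bottom


def repeated(combine, picture, times):
    """Higher-order helper: apply a combinator to a picture and itself, repeatedly."""
    for _ in range(times):
        picture = combine(picture, picture)
    return picture


tile = ["/\\", "\\/"]              # a 2x2 picture

row = beside(tile, tile)           # a picture made of pictures
block = above(row, row)            # ...combined again
big = beside(block, block)         # ...and again, with no special cases

print("\n".join(big))
print("\n".join(repeated(beside, tile, 2)))
```

If `beside` returned something other than a picture (say, a rendered image), the second and third combinations would not be possible, and that is exactly the kind of limit a violation of closure imposes.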

After going through his example step by step (and incorporating everything covered so far in the course), he (01:02:37) sums up why the ability to create embedded languages is so important: “That’s the important point: the difference between merely implementing something in a language and embedding something in a language so that you don’t lose the original power of the language. Lisp is a lousy language for doing any particular problem; what it’s good for is figuring out the right language that you want and embedding that in Lisp. That’s the real power to this approach to design.”

I think a lot of people get the heebie-jeebies when they hear Lisp programmers talking about creating new languages to solve problems. They’re thinking, “if people start writing their own macros, how will I understand the code when I have to maintain it?!” Well, yeah. That’s only going to be a problem if you don’t know how to use macroexpand to show what’s going on “under the hood.”

But when you look at what’s going on in the example from the video, you can see that the process of implementing languages to describe what’s going on actually makes the code much easier to understand. The languages are implemented at each level of abstraction. Each language doesn’t “care” how the lower level languages are actually implemented– you can think at each level in terms of that level without being confused by unnecessary detail. And each level can be manipulated with and integrated with all of the usual idioms of the Lisp language itself. It’s not like you’re randomly inventing a new Python or Perl depending on the problem you’re solving. You’re making languages that are fully embedded in Lisp… and there’s nothing keeping you from utilizing other Lisp tools, techniques, and elements with these language layers.

So don’t let all the talk about DSL’s and embedded languages scare you. Expressive code can actually be easier to understand, extend and maintain.

Update 10/2/07: “discipline and punish” responds with Programming Languages, DSLs, Static Typing, and the Answer to Life, the Universe and Everything.

Because Java Programmers can Suck, too

July 24, 2007

I started out as a VBA programmer. This means I was looked down upon by the VB6 programmers… who in turn were looked down upon by Java programmers…. Ah, Java was so cool then.

Yeah, it stunk being a VBA guy. You always picked up these projects by some guy who was long gone. You had to add a feature or fix something and you just couldn’t figure out what was going on. Crap was being stored in global variables and all the work was being done on the event triggers of the GUI. What a nightmare! It’s a pain because the state of the program could appear to change randomly… and it was very hard to figure out by trial and error what it was that was breaking things. This led to bug reports from clients that you couldn’t reproduce… and you ended up looking really incompetent. (Now I showcase my incompetence with blog entries like this. There– I said it before a commenter could say it for me!)

Well, Java programmers generally have more class than that. (The language forces them to. Heh.) But there are ways of introducing such CodeMunging in all languages, however fastidious they might be. It’s worse, though, when it’s introduced by a catchy Gang of Four design pattern that makes you sound cool to the uninitiated. But if you thought you could just cut and paste this stuff and automatically be a better coder, then I guess you get what you deserve. Yep, you too can bring all the joys of a crappy VBA debugging nightmare into just about any popular programming language: just get ahold of the “singleton pattern” and start applying it indiscriminately and you can have all the pain of a classic VBA global variable munge-fest– even in a language that outlaws global variables outright!

I used a singleton once. Once. After first being annoyed that it wasn’t completely clear how to implement one in my language, I released one into the wild of a production system. “There should only be one of these objects in existence ever– therefore this MUST be my chance to finally apply this real live design pattern!” A classic hammer-looking-for-a-nail scenario.  After letting it go, I had a distinct sinking feeling in my stomach. Somehow the experience wasn’t all that fulfilling. I moved on to other things and tried to avoid such pedantic NDD (Nerd-driven development) in live systems where other people would surely come after me and hate my guts for such pointless smarty-pantsness.

But now I know why the technique didn’t stay in my tool-kit. Sure, it ultimately didn’t pass my personal smell test… but could I articulate why? “In some cases they can make testing difficult and hide problems with your design,” says Google’s Dion Almaer.  (Google has gone so far as to release a Singleton Detector to help people identify and remove Singletons from production code.)  Yeah… that’s it.  I knew there was a reason that my coder-sense was tingling….
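A minimal Python sketch of the problem, with made-up names (`Settings`, `price_with_tax`, `apply_holiday_promo`). It is only an illustration of how a singleton behaves like a global variable, not a description of what any particular detector tool looks for.

```python
class Settings:
    """A classic singleton: every call to Settings() returns the same object."""
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.tax_percent = 10
        return cls._instance


def price_with_tax(net):
    # Hidden dependency: nothing in the signature says this reads shared state.
    return net * (100 + Settings().tax_percent) // 100


def apply_holiday_promo():
    # A "harmless" tweak somewhere far away in the codebase.
    Settings().tax_percent = 0


first = price_with_tax(200)
apply_holiday_promo()
second = price_with_tax(200)      # same call, different answer
print(first, second)


def price_with_tax_explicit(net, tax_percent):
    # The dependency is passed in, so a test controls it completely.
    return net * (100 + tax_percent) // 100


print(price_with_tax_explicit(200, 10))
```

The two calls to `price_with_tax(200)` disagree because of state changed elsewhere, which is the same "state appears to change randomly" headache as a global variable. A test of `price_with_tax` passes or fails depending on what ran before it, while the explicit version has no such ordering problem.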

In the meantime, if you aren’t an expert on Design Patterns… then watch out for questions about them in job interviews. Talking about them without understanding them is a good way to make yourself look stupid. A lot of those com-sci kids pushed out by the Java Schools nowadays use Singleton to un-OOP their OOP code. Don’t let a potential employer get you confused with them….

Maybe Those “Deliberately Chosen Limitations” Weren’t Such a Good Idea…

June 6, 2007

“Yet, the most popular form of software patterns is exemplified by those found in Design Patterns, by Gamma, Helm, Johnson, and Vlissides, which contains little more than techniques for coding in C++ constructs found in other programming languages– for example, 16 of the 23 patterns represent constructs found in the Common Lisp language…. Down with quality, up with clever hacks. Why worry about what makes a user interface beautiful and usable when you can wonder how to do mapcar in C++.”

— Richard P. Gabriel in “Back to the Future: Is Worse (Still) Better?”, 2000

“This practice is not only common, but institutionalized. For example, in the OO world you hear a good deal about ‘patterns’. I wonder if these patterns are not sometimes evidence of case (c), the human compiler, at work. When I see patterns in my programs, I consider it a sign of trouble. The shape of a program should reflect only the problem it needs to solve. Any other regularity in the code is a sign, to me at least, that I’m using abstractions that aren’t powerful enough– often that I’m generating by hand the expansions of some macro that I need to write.”

— Paul Graham in “Revenge of the Nerds”, May 2002

Peter Norvig (of Google fame) appears to have done the initial work on identifying the 16 Lisp constructs that correspond to the Gang of Four patterns. Here’s his talk where (I think) he first made that point back in March of 1998. He reiterated these points at a presentation in October 1999 and he said that Lisp is the best host for first class patterns.

What blows my mind about all of this is that these discussions are all taking place in the nineties. The nineties! And in spite of that knowledge, we end up today in pretty much the same place even in 2007:

“More than half of the code in every Java enterprise framework exists purely to work around well-known, deliberately chosen limitations at the language level. Smart Java developers have paid a staggering price to prop up the illusion that the Java language is easy.”

Stuart at Relevance, LLC

For the past eight years I’ve struggled to make my programs as dynamic and configurable as possible. Many times I’ve thought to myself, “what I really need to do is write my own language for solving this type of problem.” I’ve literally thought that a half dozen times before reading a word of Paul Graham’s essays…. The idea seems so natural, I always wondered why it felt like I was the only person in my niche that wanted to do that. And it seems that the answer to that is that the development environment I was locked into was specifically engineered to make it difficult to do those things– it even made it difficult to imagine doing those things.

This is pretty disappointing….