Architecting for scalability and audit logs by explicitly modeling state transitions

In his talk, Unleash Your Domain, Greg Young presents a dense discussion of topics about which I am passionate. At its core, the talk is about how to guarantee a correct audit log and architect for scalability.

Before watching this talk, I suggest brushing up on the following terms if you’re not already familiar with them: domain model, aggregate root and ubiquitous language. (In one system I work on, we use the words “session” and “ticket” to describe the same concept. Precise, consistent and accurate language leads to better communication and understanding.)

Click here to watch the talk.

Here are some valuable lessons I see in Greg’s talk:

  • Software runs. It’s about action. Most models I’ve seen focus on nouns, such as customer and order, and make actions, such as place order and cancel order, secondary concepts. A functional design focuses the model on actions, not aggregate roots. Software with a focus on action makes testing and auditing, which are fundamentally about verifying action, natural.

  • Think of the state of an aggregate root as the result of a stream of events. As Greg mentioned, Martin Fowler dubbed this design Event Sourcing. Distributed source control systems focus on changes to a file rather than versions of files. This subtle difference is behind much of the value distributed source control systems offer. Greg talks about how building your state from a stream of events gives you the ability to rewind and play back history. To see how this might be visualized, check out the playback feature of Google Wave, demonstrated at 49 minutes into this video. As a bonus, since a stream is your data, you can easily start writing code in a functional style, even in Java. Sequences are the core interface to data for Clojure and LINQ. Also, Greg mentioned the ability to use a document database. These are the kinds of databases that power Google and Amazon.

  • Look for places to implement eventual consistency. Here’s an example of an opportunity for eventual consistency: you want to provide the ability to mark a blog post as a favorite. Here’s a series of steps to implement it. One application is responsible for rendering the blog post and showing whether it has already been marked as a favorite. The user clicks “mark as favorite.” This triggers a call to a different application, which accepts the command, validates it, inserts it into a queue to update the read-only data stores, and responds to the client that the command has been accepted. The UI then displays that the item has been marked as a favorite. If the user refreshes the page before all read-only data stores have been updated, they may see the blog post as not marked as a favorite. As Greg would say, so what? This is unlikely to happen, and the consequences are small compared to the scalability gained by making this operation asynchronous.

  • Greg talks about designing business processes as if they were to be done with paper: “Paper was awesome because it never gave the impression of global consistency and it is the thinking that went into the optimization of paper processes that can help us optimize our transactional systems.” I thought his restaurant example in the talk might not click with some viewers, so I’m expanding on it here. When you place an order at a restaurant, you’ve inserted a command in a queue. The waitress transfers the message from the queue she’s holding to a queue for the kitchen. The kitchen then prepares your order and notifies the waitress of the event that the order is ready to be served. The kitchen may then decrement the inventory records. The waitress then picks up the order and serves it to you. The reason this is not a fully consistent state is that the cook learns about your order only some time after you’ve placed it, and the inventory is decremented only after the food is cooked. It’s eventually consistent. And it works. As a slight tangent, human processes are often automated because automation is thought to be the solution to poorly executed processes. Strive to standardize a process before automating it, because the process is often ill-defined. Sometimes, simply defining a process cures problems. From The Toyota Way Fieldbook: “Standardization is a key element of the Toyota system. A process that is not standardized is fraught with chaos, variation, and the associated problems of continually “riding the wave.” According to Toyota, standardization is the baseline for continuous improvement, the time when real improvement begins and is measurable.”
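To make the event-stream and eventual-consistency lessons above a little more concrete, here is a minimal Python sketch of the “mark as favorite” flow. It is my own illustration, not code from Greg’s talk: the command and event names, the `FavoritesService` class and the in-memory queue are all assumptions standing in for a real message queue and read-only data store.

```python
from collections import deque
from dataclasses import dataclass
from functools import reduce

# Hypothetical mapping from incoming commands to the events they produce.
COMMAND_TO_EVENT = {"mark_favorite": "favorited", "unmark_favorite": "unfavorited"}


@dataclass(frozen=True)
class Event:
    kind: str      # "favorited" or "unfavorited"
    user: str
    post_id: str


def apply(favorites, event):
    """Pure transition: previous state plus one event gives the next state."""
    key = (event.user, event.post_id)
    if event.kind == "favorited":
        return favorites | {key}
    if event.kind == "unfavorited":
        return favorites - {key}
    return favorites


def replay(events, upto=None):
    """Rebuild state by folding over the stream; `upto` rewinds to an earlier point."""
    return reduce(apply, events[:upto], frozenset())


class FavoritesService:
    def __init__(self):
        self.log = []                  # append-only event stream, doubles as the audit log
        self.pending = deque()         # events not yet applied to the read store
        self.read_store = frozenset()  # what the rendering application queries

    def handle(self, command, user, post_id):
        """Validate the command, record the event, queue the update, reply at once."""
        if command not in COMMAND_TO_EVENT:
            return "rejected"
        event = Event(COMMAND_TO_EVENT[command], user, post_id)
        self.log.append(event)
        self.pending.append(event)
        return "accepted"

    def drain(self):
        """Stand-in for the background worker that updates the read-only store."""
        while self.pending:
            self.read_store = apply(self.read_store, self.pending.popleft())


service = FavoritesService()
status = service.handle("mark_favorite", "ann", "post-1")
stale = ("ann", "post-1") in service.read_store   # read store has not caught up yet
service.drain()
fresh = ("ann", "post-1") in service.read_store   # now consistent with the log
service.handle("unmark_favorite", "ann", "post-1")
service.drain()
# State after 0, 1 and 2 events: the "rewind and play back" idea.
history = [replay(service.log, upto=n) for n in range(len(service.log) + 1)]
```

The command is acknowledged before the read store changes, so a reader can briefly see stale data, which is exactly the “so what?” case. Because the log is never rewritten, it serves as the audit trail, and any earlier state can be rebuilt by replaying a prefix of it.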

“Man acts as though he were the shaper and master of language, while in fact language remains the master of man.” –  Martin Heidegger

Update Sept 2, 2009: I received feedback that such an architecture is only for mission critical systems. In his talk, Buy a Feature: An Adventure in Immutability and Actors, David Pollak, the creator of the Lift web framework for Scala, talks about an application, Buy a Feature, which implements this architecture.

Written on July 18, 2009