DDD – misunderstood

Over the years since finding out about Domain-Driven Design, I’ve gone through several iterations of grokking it. I’ve also seen other people completely miss the point of “tackling the complexity at the heart”, for example:

http://www.infoq.com/news/2008/09/SOADDD and http://blogs.msdn.com/nickmalik/archive/2008/09/03/applying-ddd-to-it-management-first-failure.aspx

I’ve made my own share of these mistakes, and I see others going through the same ones:

  • I’ve dismissed it as nothing but a rehashing of OOD principles
  • I’ve overdone “IDs are impedance artifacts; an instance reference is my ID”
  • I’ve overdone “everything is an object, therefore every class will manage its own behavior”
  • I’ve modeled my entities and called it my domain model

I think Evans’s book is partly to blame – the man just wasn’t a writer. Partly, the reason is that what we seek is like the shadows in Plato’s cave: we may be able to glean the shape of the ultimate form, but it’s only an approximation of the real thing, lacking the detail.

Here’s my current understanding:

  • It’s the principles!
    We use patterns and principles in our solutions all the time. OOD is great, but it is too abstract to be of much value by itself. SOLID is a good starting set, but it doesn’t go into domain modelling. There are lots of other principles suitable in one case or another. DDD is just like that: it’s a way to capture the desired characteristics of your problem domain in your code. Bringing any kind of dependency, especially a large framework, into the domain model deserves a frown – the focus of the modelling and the principles applied come at a cost, and that investment is wasted the moment something is sacrificed for a framework.
  • It’s about the code!
    We have models everywhere: some communicate with the database, some talk to and from external services, and some talk to people. DDD is about the code at the core of the problem you are working to solve. There may be a need for a different model for each of these purposes. In CQRS, for example, you are expected to have two separate models within the same app!
  • It’s about modelling the behavior!
    Which leads me to the ultimate purpose of domain modelling: capturing the behavior as close to the way the business treats it as possible. Have you heard about coding dojos where you have to solve a problem w/o ever using setters? Well, it’s kinda like that, only with a real purpose. Immutability, for example, is of paramount importance when modelling the domain, because it’s one of the few ways to express the meaning and the intent of a particular interaction (see the sketch after this list).
  • It’s about testability!
    Presumably, there’s a significant cost attached to a model that permits wrong behavior. Keep the model small and abstract, and inject all the infrastructure. We shouldn’t need a complex setup to test a theory or modify a behavior, and we should test continuously. The model and its behavior-driven tests are our documentation; they are how the coder captures and understands the requirements.
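
To make the “no setters” point concrete, here’s a minimal sketch of an immutable Value Object (the Money type and its operations are invented for illustration): every “change” produces a new instance, so the intent of an interaction is captured in a named operation rather than in property assignments.

```csharp
using System;

// An immutable Value Object: no setters, every operation returns a new instance.
public sealed class Money
{
    public decimal Amount { get; private set; }
    public string Currency { get; private set; }

    public Money(decimal amount, string currency)
    {
        Amount = amount;
        Currency = currency;
    }

    // The domain operation carries the meaning; nobody can "set" Amount from outside.
    public Money Add(Money other)
    {
        if (Currency != other.Currency)
            throw new InvalidOperationException("Cannot add money in different currencies.");
        return new Money(Amount + other.Amount, Currency);
    }
}
```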

Happy coding!


Arriving at Scalability – part 3

This is part 3 of the series started in part 1 and continued in part 2.

One of the things that becomes obvious when tackling Scalability is that calculating certain things on the fly takes too long to be practical. Another is that the logic that deals with data needs to execute somewhere close to the data.

Denormalization

By structuring the data in such a way that we can hold on to the results of calculations, we can take advantage of the processing capabilities we have on the cloud backend. We end up with copies of many things, but because the data is partitioned into Aggregates, we are free to modify any bit of it w/o locking issues. It also opens the door to further distribution – if you have your own copy, it doesn’t matter where you work on it. The interested parties, such as the UI, will eventually become consistent, being served a cached copy of the data in the meantime.

Event-Driven Services

Introducing copies of data means we need to know when to update them. By communicating via messages that represent domain events taking place, we let our services work within their narrow scope on their own copy of the data. Once a service has modified its little part of the domain, all it has to do is notify the parties that depend on it with the particulars of what was done.
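
As a rough sketch of the idea (IBus, IOrderRepository and OrderShipped are hypothetical names, not a specific library): a service modifies its own copy of the data and then publishes a domain event describing what was done.

```csharp
using System;

// Minimal hypothetical contracts, just enough for the sketch to hang together.
public interface IBus { void Publish(object domainEvent); }
public interface IOrderRepository { Order Get(Guid id); }

public class Order
{
    public Guid Id { get; private set; }
    public bool Shipped { get; private set; }
    public Order(Guid id) { Id = id; }
    public void MarkShipped() { Shipped = true; }
}

// A domain event is a fact, named in the past tense.
public class OrderShipped
{
    public Guid OrderId { get; private set; }
    public DateTime ShippedOn { get; private set; }

    public OrderShipped(Guid orderId, DateTime shippedOn)
    {
        OrderId = orderId;
        ShippedOn = shippedOn;
    }
}

public class ShippingService
{
    private readonly IBus _bus;
    private readonly IOrderRepository _orders; // this service's own copy of the data

    public ShippingService(IBus bus, IOrderRepository orders)
    {
        _bus = bus;
        _orders = orders;
    }

    public void Ship(Guid orderId)
    {
        Order order = _orders.Get(orderId);
        order.MarkShipped(); // modify our little part of the domain...
        _bus.Publish(new OrderShipped(order.Id, DateTime.UtcNow)); // ...then tell the interested parties
    }
}
```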

Push notifications for UI

The UI becomes just another publisher and subscriber of business events, triggering the changes and minimizing the reads. The delay between a change taking place and the UI reflecting it has to be watched, but by computer standards humans are slow. We read and write at a glacial pace; in the time it takes the computers to carry out all this eventing, processing and copying of data, a human will barely have made a click or two.

Batching

Taking a lot of data in and promising to get back with the results asynchronously is another thing made possible once you embrace fire-and-forget communication. By looking at a whole batch, we can employ more intelligent strategies for resource acquisition and aggregate the events, enabling all the parties involved to do their thing more efficiently.
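
A sketch of what aggregating a batch of events can look like (the event and handler names are invented): only the latest change per item survives, so each item is read and written once no matter how many events arrived.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical event: the price of a product changed.
public class PriceChanged
{
    public Guid ProductId { get; private set; }
    public decimal NewPrice { get; private set; }

    public PriceChanged(Guid productId, decimal newPrice)
    {
        ProductId = productId;
        NewPrice = newPrice;
    }
}

public class PriceCatalogUpdater
{
    public void Handle(IEnumerable<PriceChanged> batch)
    {
        // Only the last change per product matters; the rest of the batch collapses away.
        var latest = batch.GroupBy(e => e.ProductId).Select(g => g.Last());

        foreach (var change in latest)
        {
            UpdateCachedPrice(change.ProductId, change.NewPrice); // one write per product, not per event
        }
    }

    private void UpdateCachedPrice(Guid productId, decimal price)
    {
        // the denormalized copy would be persisted here
    }
}
```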

Putting it all together, we can take our scale-out efforts pretty far: if a particular calculation is very demanding, we can put the service carrying it out on a separate machine without affecting anything else. This is very powerful, but eventually we’ll hit a wall again – even within a narrow scope we’ll accumulate too much data. The data we have partitioned “vertically” will have to be partitioned “horizontally”. It’s a big challenge, but also the “holy grail” of scalability; we have some ideas about the approach, and maybe one day I’ll be able to tell that story as well.


Arriving at Scalability – part 2

Several decisions we had made earlier became powerful enablers for our first pass at Scalability.


Queryable Repository

Data reads take time; reading less than everything is always a good idea. Implementing Repositories as queryable allowed us to switch to paged, ordered and generally highly selective views.
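
A minimal sketch of what “queryable” meant in practice, assuming LINQ and an ORM that exposes IQueryable (the interface and view names are ours for illustration):

```csharp
using System.Linq;

// A Repository that exposes IQueryable lets callers compose paging, ordering
// and filtering, and lets the ORM translate the whole expression to SQL.
public interface IRepository<T> where T : class
{
    IQueryable<T> Query();
}

public class Customer
{
    public string Name { get; set; }
    public bool IsActive { get; set; }
}

public static class CustomerViews
{
    // A highly selective view: only one page of active customers is ever read.
    public static Customer[] ActivePage(IRepository<Customer> customers, int page, int pageSize)
    {
        return customers.Query()
                        .Where(c => c.IsActive)
                        .OrderBy(c => c.Name)
                        .Skip(page * pageSize)
                        .Take(pageSize)
                        .ToArray();
    }
}
```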


Unit of Work

Building small UI projections from large datasets efficiently was possible thanks to a request-scoped Unit of Work implementation, which lets the underlying ORM read data across multiple Repositories.
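
Roughly (the names are invented, and a real implementation would wrap the ORM’s session): the Unit of Work is created once per request, and every Repository resolved during that request shares it, so all the reads and writes go through the same session.

```csharp
using System;

// One Unit of Work per request; Repositories created within the request share it.
public interface IUnitOfWork : IDisposable
{
    void Commit();
}

public class OrmUnitOfWork : IUnitOfWork
{
    // In a real implementation this holds the ORM session/context.
    public void Commit() { /* flush all changes in a single transaction */ }
    public void Dispose() { /* close the underlying session */ }
}

public class RequestHandler
{
    public void Handle()
    {
        using (IUnitOfWork uow = new OrmUnitOfWork())
        {
            // Repositories built over the same unit of work let the ORM
            // read across all of them while building a small UI projection:
            // var orders = new OrderRepository(uow);
            // var customers = new CustomerRepository(uow);
            uow.Commit();
        }
    }
}
```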


Aggregate Roots

An Aggregate Root is the entry point to a cluster of associated objects treated as a unit for the purpose of data changes. By implementing one Repository per aggregate, we isolated the data affected by each transaction and avoided deadlocks. With a bit of per-ID synchronization we were able to avoid optimistic concurrency exceptions as well.
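
The per-ID synchronization was along these lines (a simplified sketch, not our production code; a real version would also evict unused locks): writers to the same aggregate queue up behind one lock, while writers to different aggregates proceed in parallel.

```csharp
using System;
using System.Collections.Generic;

// One lock object per aggregate ID: same-aggregate writes are serialized
// (no optimistic concurrency exceptions), different aggregates don't block each other.
public class PerIdLock
{
    private readonly Dictionary<Guid, object> _locks = new Dictionary<Guid, object>();
    private readonly object _gate = new object();

    public void Execute(Guid aggregateId, Action work)
    {
        object idLock;
        lock (_gate) // short critical section: just look up or create the per-ID lock
        {
            if (!_locks.TryGetValue(aggregateId, out idLock))
            {
                idLock = new object();
                _locks[aggregateId] = idLock;
            }
        }

        lock (idLock)
        {
            work(); // load the aggregate, change it, save - serialized per ID
        }
    }
}
```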


Message passing

By integrating message passing early, we were able to move all the writes to backend services and trade distributed locks for eventual consistency. After that, moving the backend services to another machine was trivial.


And that’s how, by scaling out, we got that x100 performance gain on the cheap. We’d do it again a few months later by batching, denormalizing the data and embracing Event-Driven SOA.


Arriving at Scalability

This is going to be a series of posts exploring the revelations from my last couple of years designing a SaaS solution for IT organizations.

Probably like most startups, or even most new product developments, we set out to deliver the features. Scalability was one of the desired quality attributes, but considering how prioritizing one attribute affects all the others, it wasn’t at the top. Moreover, knowing about the experiences of places like MySpace, it was a given that we would end up rewriting parts of the system with every order-of-magnitude increase in user transactions. As our understanding of the domain and our users improves, and as usage metrics become available, we’ll figure out what needs to change to get the next x10.

With this in mind, we set out to work with these priorities for the code:

  • Small and easy to maintain
  • Secure
  • RAD-supported, rich UI with dynamic query capabilities
  • Low friction data access
  • Low coupling between interacting bits

Simple. No Scalability here. Leaving aside Security, this is what it translated into:

  • Lightweight, POCO domain model with Root Aggregates
  • Unit tests and Continuous Integration
  • Silverlight with 3rd party controls and RIA services
  • An ORM and SQL Server backend
  • Message passing with guaranteed delivery

Here is where it gets interesting, with some of the patterns we used in implementation:

  • A Repository capable of carrying out dynamic queries
  • Unit of work
  • Dependency injection
  • Lightweight service bus with publisher/subscriber

At this point we’re in our second or third month of development, with the staging environment getting ready for testing on EC2. Every developer has a complete system running on their machine. We handle the entire projected load while running on a single virtual machine, and we haven’t even had to make any sacrifices or do any optimization work.

These may or may not seem impressive, but it felt good to get some stuff done, and done right. All along, the features are the focus, and we can handle several concurrent users and hundreds of data entities.

Next I’ll talk about our first Scalability endeavour and how relatively small an effort it was for a x100 gain.


Continued in Part 2

And concluded in Part 3


Taking control of your application development – finding correct abstractions

At this point I’m going to go ahead and claim that finding correct abstractions and coming up with interfaces is the most difficult thing in application design. Note that I’m talking about design, not implementation.

Implementation might carry a load of technical difficulties, workarounds and digging through the web, but those are what they call “mechanical” problems. Often this is assumed to be what programmers are meant to do, but I’d say that in mainstream software development most things have already been figured out: the articles have been written and the examples are available for download. That’s the kind of work said to take about 10% of overall development time.

Abstractions, on the other hand, are always domain-specific, and while writing them down, typing them up or diagramming them on a board is easy, coming up with them requires a deep understanding of the problem domain and the applicable use-cases. Usually the process involves more than one person. Depending on the scope, it might require coordinating multiple designers/architects to ensure consistency throughout the system being produced.

Leaving these difficulties aside, you have to come up with abstractions that are easy to use correctly and, more importantly, very hard to misuse. Comments in the code are a lie; the code itself is the ultimate documentation, and first and foremost it should be written for other people to read and make use of.

I’ll admit I’ve made my share of mistakes, but I was fortunate enough to be able to start from a clean slate and try something different (and hopefully better) every time. After having designed half a dozen decent-sized applications from the ground up, I’ve arrived at these observations:

– Separation of concerns is easy to accomplish with a layered architecture;

– Role-based interfaces with very few verbs over a relatively large number of nouns let you build a Chain of Responsibility with ease (HTTP with its four main verbs is the ultimate example of this), cut down on the lines of code, and are easy to understand at a glance – w/o looking at the details of the implementation (see the sketch after this list);

– Practicing ubiquitous language and keeping the domain model rigid with Services, Entities and immutable Value Objects (the Domain-Driven Design concept, not DTOs) simplifies understanding, use and change;

– Thinking about external dependencies (persistence, communications with other systems, etc.) in terms of services for the Business layer to consume establishes solid ground for defining interfaces with the Data Source layer;

– Having separate unit test projects for every layer gives you a sense of accomplishment, confidence and the freedom to change things;

– Once all that is done, wiring it all together with the Dependency Injection framework of your choosing becomes trivial;

– Adding an integration test project to ensure that the various parts of the application talk to each other frees you from having to run the UI to test a change.
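
To illustrate the “few verbs over many nouns” point from the list above, here’s a minimal sketch (the interface and handler names are mine, not from any particular library): a single generic verb, Handle, over any number of message nouns, which makes chaining trivial.

```csharp
using System;

// One verb (Handle) over any number of nouns (message types).
public interface IHandler<T>
{
    void Handle(T message);
}

// Chain of Responsibility falls out naturally: a link does its part,
// then passes the same noun to the next handler of the same verb.
public class LoggingHandler<T> : IHandler<T>
{
    private readonly IHandler<T> _next;

    public LoggingHandler(IHandler<T> next)
    {
        _next = next;
    }

    public void Handle(T message)
    {
        Console.WriteLine("Handling " + typeof(T).Name); // do this link's part...
        _next.Handle(message);                           // ...and pass it along the chain
    }
}
```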


Each one of those practices helps, but taken together they amplify one another. Let me tell you, going through a month of intensive development following these practices feels like a vacation. It keeps the management happy too, because there’s always solid, verifiable progress, and thanks to the extensive testing the definition of “done” is really inclusive.

I think I’ll conclude on this happy note.


Taking control of your application development – managing dependencies

If we look at the Wikipedia article on coupling, we’ll find that it’s synonymous with dependency. Just about everybody knows that coupling is bad, and yet you can still find a dependency graph like this:
 
Presentation->Business->Data Source
(meaning Presentation depends on Business, which depends on Data Source)
 
Assuming the implementation is not superficial and nothing like typeless datasets is being passed directly up to Presentation, let’s look at what this graph allows:
– direct use of Business abstractions in Presentation (good)
– no direct coupling of Presentation with Data Source (good)
– Data Source driving the interface/protocol for communications with Business (bad)
– testing of Data Source in isolation (worthless, because unless our Business abstractions are just DTOs we are not involving any Business rules)
 
And what it doesn’t allow:
– testing Business in isolation (bad)
– testing Presentation in isolation from Data Source (bad)
– substitution of Data Source with another implementation (bad)
– and related to the previous, reuse of Business w/o the reuse of Data Source (bad)
 
Does that look like a Pit of Success to you? I don’t think so.
 
Here’s another graph, suggested by Domain-Driven Design folks like Eric Evans and Jimmy Nilsson:
 
Presentation->Business<-Data Source
 
This graph lets us keep all the good stuff from the previous one and eliminates all the bad.
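
In code, the inverted arrow means the Business layer owns the abstraction and the Data Source layer implements it. A minimal sketch, with illustrative names (ICustomerRepository and SqlCustomerRepository are mine):

```csharp
// Business layer: owns the interface and knows nothing about SQL or ORMs.
public interface ICustomerRepository
{
    Customer GetByEmail(string email);
}

public class Customer
{
    public string Email { get; set; }
    public string Name { get; set; }
}

// Data Source layer: references Business and implements its interface.
// Swap this class for another implementation and Business (and its tests) never notice.
public class SqlCustomerRepository : ICustomerRepository
{
    public Customer GetByEmail(string email)
    {
        // the ORM / ADO.NET plumbing would live here
        return null; // placeholder for the sketch
    }
}
```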
 
I want to elaborate a bit on substituting the Data Source implementation. While this might seem unlikely, recent years have proven that Data Source and Presentation are the most likely targets of change (once the Business functionality has been achieved).
Imagine porting your ASP.NET application to Silverlight and you’ll realize:
1) Silverlight means you’re adding another application (a client for your web-app server!). There are, of course, other clients you might acquire as well. If you’re like me, you’ll want to involve the Business rules as early as possible, and that means… a Business layer, hopefully reusing the implementation you’ve created for the server;
2) the change is not unlikely, given its feasibility;
3) the change is not feasible if you have to drag the Data Source implementation along.
 
It might seem at first that we have a circular dependency here, but that’s not the case.
What are the Business abstractions that would allow us to do it? What is required from the infrastructure to let it happen?
Tune in later for the spectacular conclusion!

Taking control of your application development – pit of success

This series of posts was long in coming and today seems like a good day to start it.
 
Over the last few years I’ve had a few conversations about structuring an application, and found that while there’s a solid body of knowledge and proven practices available, people don’t care much about it or misinterpret it. From what I’ve seen, coming up with the structure as you go, or creating a superficial one, leads to a fast take-off but slow progress – a pace that keeps getting slower.
 
I’m not talking about heavy investment in infrastructure or design upfront. You will make adjustments, but having a good start is like digging a Pit of Success: your developers can’t help but fall into it.
In the most abstract terms, what it comes down to is planning the tiers and… here it comes… the layers, if any, your application will have.
The most obvious practical aspects are:
– What processes you’ll run,
– Where they will run,
– What the dependency graph for each process should look like.
 
Obviously, all of it depends heavily on the requirements; however, through trial and error the software industry seems to have settled on certain ways of dealing with certain classes of problems. Fowler calls those Patterns of Enterprise Application Architecture.
Here’s the default layered architecture according to Fowler:

– Presentation: provision of services, display of information (e.g., in windows or HTML), handling of user requests (mouse clicks, keyboard hits, HTTP requests, command-line invocations, batch API)

– Domain: logic that is the real point of the system

– Data Source: communication with databases, messaging systems, transaction managers, other packages

He makes a distinction between layers and tiers here, because a lot of people confuse the two. Essentially, by Data Source we mean a bunch of our abstractions (classes, interfaces, etc.) that talk to the database and other systems. It does not include the database itself – that’s another tier, one that may or may not happen to be running on the same machine, but which we consider separate.
 
The distinction is important because we’re defining the application boundaries. For example, in a client-server application what we really have is two applications with their own boundaries, and the choice of layering should be made separately for each.
 
There might be a sharp learning curve in picking up the concepts, but the tools to implement the infrastructure for the common set of problems already exist and are available in a number of flavours.
Traditionally the tools drove the approach to the solution, but I believe tools are just a means to an end. Let’s start with “What do we want our solution to look like?” this time around. Read about the quality attributes I quote from Patterns of Software by Richard P. Gabriel – I think everyone would like to see those quality attributes in their application. What can we do from the start?
Tune in later for the second part, where I’ll go over the language (abstractions) the layers might want to speak, and what my preferred dependency graph looks like.