One of the things that becomes obvious when tackling Scalability is that calculating certain things on the fly takes too long to be practical. Another obvious thing is that the logic that deals with data needs to be executed somewhere close to the data.
By structuring the data in such a way that we can hold on to the results of the calculation we can take advantage of the cloud processing capabilities we have on the backend. We end up with copies of many things, but by partitioning the data into Aggregates we are free to modify any bits w/o any locking issues. It also opens the doors to further distribution – if you have your own copy, it doesn’t matter where you work on it. The interested parties, such as UI, will eventually become consistent all along returning a cached copy of data.
Introducing copies of data means we need to know when to update them. By communicating via messages that represent domain events taking place, we let our services work within their narrow scope with their own copy of the data. Once they modify their little part of the domain, all they have to do is notify the parties that depend on it with particulars of what was done.
Push notifications for UI
UI becomes just another publisher and subscriber of business events, triggering the changes and minimizing the reads. The delays between a change taking place and the UI reflecting it has to be kept an eye on, but by computer standards humans are slow. We read and write at glacial pace and while computers carry out all this eventing, processing and copying of the data, a human would barely make a click or two.
Taking a lot of data in and promising to get back with the results via asynchronous means is another thing made possible once you embrace fire-and-forget methods of communication. By looking at a batch, we can employ more intelligent strategies about resource acquisitions and aggregate the events, enabling all the parties involved to do their thing more efficiently.
Putting it all together we can take our scale-out efforts pretty far: If a particular calculation is very demanding, we can put the service carrying it out on a separate machine and it’s not going to affect anything else. This is very powerful, but eventually we’ll hit a wall again – even within a narrow scope we’ll accumulate too much data. The data we have partitioned “vertically” will have to be partitioned “horizontally”. It’s a big challenge, but also the “holy grail” of scalability and we have some ideas as to the approach and maybe one day I’ll be able to tell about it as well.