Real-time analytics with Apache Storm – now in F#

Over the past several months I’ve been prototyping various aspects of an IoT platform – or more specifically, exploring the concerns of “soft” real-time handling of communications with potentially hundreds of thousands of devices.

Up to this point, being in the .NET ecosystem, I’d been building distributed solutions with a most excellent lightweight ESB – MassTransit – but for IoT we wanted to be a little closer to the wire. Starting with a clean slate and having discovered Apache Storm and Nathan Marz’s presentation, I realized that it addresses exactly the challenges we have.

It appears to be the ultimate reactive microservices platform for a lambda architecture: it is fairly simple and fault tolerant overall, yet embraces fire-and-forget and “let it fail” at the component level.

While Storm favours the JDK for development, has extensive component support for Java developers and heavily optimizes for JVM component execution, it also supports “shell” components via its multilang protocol – which, unlike Spark, is what makes it interesting for a .NET developer.
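
For a sense of what the multilang protocol involves (FsStorm handles all of this plumbing for you): every message between Storm and a shell component is a JSON object written over stdin/stdout, terminated by a line containing “end”. Here is a minimal C# sketch of just the framing and the initial handshake – the class and method names are mine, purely illustrative:

    using System;
    using System.IO;
    using System.Text;

    static class Multilang
    {
        // Read one protocol message: JSON lines up to the "end" terminator.
        public static string ReadMessage(TextReader input)
        {
            var json = new StringBuilder();
            string line;
            while ((line = input.ReadLine()) != null && line != "end")
                json.AppendLine(line);
            return json.ToString();
        }

        // Write one protocol message: the JSON payload, then the terminator.
        public static void WriteMessage(TextWriter output, string json)
        {
            output.WriteLine(json);
            output.WriteLine("end");
            output.Flush();
        }

        public static void Main()
        {
            // Handshake: Storm sends the config/context first; the component
            // replies with its pid, then loops reading tuples and emitting/acking.
            var setup = ReadMessage(Console.In);
            var pid = System.Diagnostics.Process.GetCurrentProcess().Id;
            WriteMessage(Console.Out, "{\"pid\": " + pid + "}");
        }
    }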

Looking for a .NET library to implement Storm components, there’s Microsoft’s implementation – unfortunately, components in C# end up looking rather verbose, and it works exclusively with HDInsight/Azure, which is a deal breaker for us: we want our customers to be able to run it anywhere. Fortunately, further search revealed the recently open-sourced FsStorm, announced on Faisal’s blog, and I liked it at first sight: the concise F# syntax for components and the DSL for defining topologies make authoring a simple and enjoyable process.

FsStorm components can be just a couple of lines of F#, are mostly statically verified, and have a clear lifecycle and an easy-to-grasp concurrency story. And with F# enjoying first-class support on Mono, we are able to run Storm components effectively both on dev Windows boxes and on distributed Linux clusters, while capitalizing on the productivity and the wealth of the .NET ecosystem.

It is now available under the FsStorm umbrella as a NuGet package, with CI, a Gitter chatroom and a bit of documentation.

While it is still early days, with significant changes on the horizon – something I want to tackle soon is static schema definitions for streams and pluggable serialization, with Protobuf by default – I believe it is ready for production, so go forth and “fork me on GitHub”!


A problem with resources

One classic trait you will find throughout enterprise software development is that people are literally treated as individual resources – to be shared, allocated, etc.

Of course management has a perfectly reasonable motivation – how else would you maximize gains or lower the costs while solving the multitude of problems that need solving?

There are several underpinning factors that have to be in place for this mentality, but first, let’s examine what it really means:

  • part-time commitments – it’s harder to foresee when any single feature is actually going to be in the customer’s hands
  • limited window of opportunity – there’s little or no chance to address bugs or overlooked features
  • no ownership – things tend to deteriorate when there is no reason to maintain overall consistency by going beyond the immediate feature/scope

If that looks acceptable, we’re done – read no further!

On the other hand, if that causes a familiar sinking feeling and you can totally imagine the problems down the line, let’s see how an organization ends up in this situation:

Well, duh, Waterfall:

Yes, the process calls for a Requirements Phase, and we can’t let anyone sit on their hands – we’ve got to keep everyone busy while that happens. The commitments were made and the corresponding budget was allocated last year, for 12 months ahead! Besides, we need more people, right now – over there, on that project.

Organization structure:

Centralized leadership that’s expected to maintain technical and domain expertise, project management skills and the charismatic personality necessary to solve the problems and effectively manage manpower to implement the solutions. These leaders have to be good at everything! And oh yeah, the leaders will know best how long something is going to take.

Tradition of partitioning the responsibility: 

Sole but partitioned responsibility is a terrible setup: a BA – for the requirements, an architect – for the technical stack and “best practices”, a coder – for the immediate feature code (with a potential separate role for a person responsible for deployment infrastructure) and a tester – for the quality: “Hey, don’t look at me, my part is done!”

Lack of talent:

It’s been said many times, but you don’t hire the top 10%. Nobody does, and while you could bring up the average, see above – “we can’t let anyone sit on their hands”. Besides, they’d just quit and use their newly acquired skills elsewhere, right? Lack of training in broad problem solving leads to skill-set silos.

If you find yourself in an organization with those traits – that’s it: prepare to reap the consequences of treating individuals as resources, indefinitely. Low morale, quality problems, slow and unreliable feature delivery, a rotting code base and employee turnover.

Incidentally, what might the fairy-land of Agile have to offer to address this? Optionally mapped to the Agile Manifesto:

Teams:

Stable units with the broad skill set to take a problem from the original statement to a tested product in the customer’s hands. They own the code and the solution as a whole. Give the team the problem – give them a lot of problems – just line them up and let them finish, one thing at a time.

Figure out the requirements together with the solution (Customer Collaboration):

The requirements are never done; a solution team working with the customer (or the Problem Stater, to use a flavor-independent term) will come up with a better problem statement (or a better problem!) than a BA marinating in his or her own juice. Chances are it will have to be broken down into smaller problems in order to be solvable. In the end you’ll definitely have a better solution.

Decentralized leadership (People and Interactions):

The Problem Stater and the Impediments Remover are the only two assigned roles (and arguably they are not even on the team); the rest is allowed to emerge within the team any way the team feels best (and “best” itself is allowed to be figured out by the team). People tend to stick to, and be happier with, the commitments they take and the estimates they make themselves. We all try to take pride in what we do, and self-commitments allow us to do better. All you have to do is position people for success and let them.

Definition of done (Working Software):

Seriously, you can’t even talk about quality unless you’ve defined what it means to be Done. Refactor and keep the consistency – or don’t, and know that you’ll pay the price next iteration; but since you own it, it’s yours to pay. Pick the stack that works best for your team and the problem, but make it good and deliver.

Iterative delivery (Responding to Change):

Test ideas, change tracks, fail early, develop deeper understanding of your customer (or discover a different customer!). Deliver, make profit, repeat.

Continuous focus on improvement:

Keep asking: “What can we do differently? What can we do better?” Improve skills, tools, processes, artifacts, etc. as a matter of course.

In the end…

If all you want is a cog, then all you’ll get is a cog. But make a person matter – let people take pride in what they do, let them grow and feel the benefits of their accomplishments – and they will stick around to make you a handsome profit.


How to fail while implementing agile

Agile software development – the best process the industry has come up with so far – how could it fail? Easy.

Any of the following will do, even if you hired a professional trainer…


Dismiss the idea of training as “it’s obvious, isn’t it?!”.

About 80% of all attempts at agile result in “scrumfall”, i.e. people going through the motions without understanding where the value is supposed to come from. So, no, it’s not.

Allow the trainer to avoid locking into any specific flavour.

Scrum or Kanban? Leaving the choice to the uninformed, or proceeding with training not tailored to fit the organization, is not training – it’s an information session. People walk away without an understanding of the role they are supposed to play, how to play it, or even why they should bother.

Let people taking the training wander in and out as they please.

There’s an exercise in the class: focusing on one thing at a time improves the team’s productivity. If someone was allowed to skip it, they are still thinking in terms of their silo, and of how much more productive they personally are if they don’t have to deliver in small chunks.

Fail to grasp the product owner and project manager/scrum master roles and their differences.

In a traditional development model your manager and your team lead play both sides – figuring out what to do and how to do it.

To benefit from agile you need to let the product owners concentrate on figuring out the “what” and the “why”, and let the team come up with the “how”. Empower the scrum masters to facilitate and unblock.

Don’t provide additional training to product owners and project managers.

Focusing on the what and the why does not come naturally to many people who end up in the product owner role. It requires a different “hat”, if you will.

Project managers need to learn when to step back and when to put their foot down – in agile these two usually happen for entirely different reasons.

Don’t bother with the definition of done, don’t account for testing in the estimations.

Even if your teams do everything right, without a definition of done you don’t know when or what you can realistically ship. Just trying to come up with the definition will highlight potential problems with your delivery.

Say that testers are uninformed, unavailable or elsewhere, and don’t involve them in the planning and estimations.

Related to the definition of done. No testing? Chances are you’ve built the wrong thing, even if it works as intended.


Don’t write stories, keep writing tasks.

Tasks don’t tell the team what motivates you; people have to reverse-engineer the value you expect to derive from the things you tell them to do. Describe the value instead, and they will find the best way to provide it.

Coming up with a pipeline of prioritized “value adds” is not an easy job; don’t make it harder by trying to figure out the solutions as well.

Help the developers by solving the problems for them, instead of focusing on defining them.

The traditional approach to problem solving has produced generations of developers whose only motivation is not to screw up. Let people come up with the solution and they will “own” it: they’ll take pride in it, they’ll stay up late at night reviewing their decisions.

Swoop in on the team now and then and ask if they are done yet; make an off-hand remark that it shouldn’t take this long.

These are their estimates, not yours. If it takes longer than expected, maybe the problem wasn’t well understood, or maybe the task was scoped too big. Unknowns will pop up; feed the observation back into the next retrospective and the next planning session.

When another pressing thing comes up, tell them to get on it, regardless of any previous priorities and commitments.

Sometimes you have to scrap the sprint, but that should be a red flag – something is wrong with your planning. Making a habit of it will destroy morale, as the prioritized backlog is part of the trust relationship between the “chickens” and the “pigs”.

Talk about refactoring as if it were a separate, optional activity, done at another time.

This is often a developer’s folly. People used to others making the decisions will try to present the choice of “cheap and fast” or “slow, but good” to the product owner, not realizing that continuous delivery and improvement is on them.

This is why having a good backlog is important – for the best results, any refactoring decision has to be made by the developers, with knowledge of the roadmap.

Keep pretending that you can estimate in hours, pad the estimates as your experience tells you.

By now the question “How long do you think it will take you?” should have been exposed for what it is: an invitation to lie. The only relevant question is that of relative complexity; everything else is counterproductive.

Skip the retrospectives… you don’t have any power to change anything anyway.

What is the forum for constructive feedback in your team and organization? How often are you prepared to receive it and act on it? Telling someone “if I’m doing something wrong, just tell me” is not a forum – it’s an invitation for reprisal and hurt feelings.

Keep the cost, time and scope fixed.

The project management triangle has been known for a long time, and agile assumes that quality is a given. Pick two of cost, time and scope, and let the third float – that’s the only way.


void is a bug

Having studied Haskell and F# and done a lot of C# coding in a functional style, I’ve come to realize that the “void” declaration in the C family of languages is a bug in the type system.

It appears where a type should be, but it is in fact an instruction to forget everything you know and hold dear and to treat calls to this function differently.

Enter the Action vs Func&lt;Unit&gt; distinction. What it comes down to is that you have to duplicate all the code that works for Func&lt;&gt; to get the same functionality for methods that don’t have a value to return.

.ForEach() doesn’t have to exist, and yet there are a gazillion implementations floating around of something that could have been implemented (once) as something like a fold().

.NET 3.5 should have included and promoted a Unit type.
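
To illustrate, here’s a minimal sketch of such a Unit type and the single adapter it would take to make Action&lt;T&gt; flow through Func&lt;&gt;-based code – all the names here are hypothetical:

    using System;
    using System.Linq;

    // A type with exactly one value, standing in for "no result".
    public struct Unit
    {
        public static readonly Unit Value = new Unit();
    }

    public static class UnitDemo
    {
        // Adapt a side-effecting Action<T> into a Func<T, Unit> once,
        // instead of duplicating every Func<>-based API for Action.
        public static Func<T, Unit> AsFunc<T>(this Action<T> action) =>
            x => { action(x); return Unit.Value; };

        public static void Main()
        {
            Action<string> print = Console.WriteLine;
            var printU = print.AsFunc();

            // .ForEach() as a fold: thread Unit through Aggregate.
            new[] { "a", "b", "c" }
                .Aggregate(Unit.Value, (_, s) => printU(s));
        }
    }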

/rant


Starting an open-source rewrite of a validation microframework

A few years ago I gave birth to a validation framework that I felt was uniquely capable of addressing the complex validation scenarios I’d come across. The following is a brief insight into how it came to be and where it’s at.

If you’re familiar with DDD, you’ll remember the pattern for expressing business rules via specifications. While simple and powerful, I found it insufficient: what we want to know in real-world scenarios is not only whether something satisfies a given criterion, but also what it means if it does or doesn’t.

Here’s an example: given an entity E, you need to ensure that a field X is of a certain length. You can have a rule, here expressed as a function:

 e => e.X.Length < MAX_LENGTH

Great, you’ve got the logic, nicely encapsulated inside your domain. It can be referenced by name and it’s testable. Now what? What are we going to do with the fact that the rule evaluated to false?

What all of this lacks is context. What is the context of this evaluation? Why are we doing it? To throw an exception? No – we need more information even if we throw an exception, and even having composable rules (named Specifications by Evans) isn’t going to help us tell the user what we think is wrong.

Here’s an example of the context: a user is entering some text into field X on a web form. Joel Spolsky would have a field day ridiculing your UX, but you want to tell the user that what he’s entering is too long.

Another example using the same rule but in a different context: your API is called to import a list of Es, and you want to collect all the errors to respond with once you’re done processing the batch.

We most certainly want to reuse the rule, but what we do with the outcomes of the evaluation is completely different. How about an entirely different UI? Or another application that makes use of your domain internally? They may all want to do something different with the outcome!

I dislike the idea of validation as a cross-cutting concern (addressed with attributes) for the same reason – it makes no effort to incorporate the context.

Here’s another example of the context: say the entity E contains two complex properties of the same type:

class E { public A A1; public A A2; }

which contains property B:

class A { public string B; }

And your rule is to permit the value “b1” for the A1.B property and “b2” for the A2.B property. You see the context (it’s the overall path to the property B), but your attributes don’t. It’s very hard to make sense of things looking from the inside out.

This leads me to my proposition: validation is best performed and described from the outside, looking in.

Everything is easier when you have access to all sorts of contexts: the use-case, the user, dependencies, etc. It’s easier to implement and it’s easier to understand.

So what do we need, other than the rules?

We’ll need to redefine what a Specification is: we’ll use Specifications to associate Rules with outcomes.

We’ll also need to express Interpretations of the rules in a manner that lets us carry out contextual parametrization (such as reporting the offending value, the constraints of the rule itself, localization, etc.).

For maximum flexibility we’ll need factories of Interpretations, and of Actions to carry out given an outcome – enter Bindings.
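
To make that concrete, here’s a minimal sketch of how these pieces could hang together – the type and member names are purely illustrative, not the framework’s actual API:

    using System;

    // A Rule: a named, reusable predicate over the entity.
    public sealed class Rule<T>
    {
        public string Name { get; }
        public Func<T, bool> Predicate { get; }
        public Rule(string name, Func<T, bool> predicate)
        {
            Name = name;
            Predicate = predicate;
        }
    }

    // A Specification associates a Rule with an Interpretation that builds
    // a contextual outcome (offending value, rule constraints, localization...).
    public sealed class Specification<T, TOutcome>
    {
        private readonly Rule<T> rule;
        private readonly Func<T, TOutcome> interpretation;

        public Specification(Rule<T> rule, Func<T, TOutcome> interpretation)
        {
            this.rule = rule;
            this.interpretation = interpretation;
        }

        // The Binding supplies the Action for the outcome: throw, collect
        // batch errors, display a message – whatever the context calls for.
        public void Evaluate(T entity, Action<TOutcome> binding)
        {
            if (!rule.Predicate(entity))
                binding(interpretation(entity));
        }
    }

The same Rule can then back a web-form Specification that surfaces a message to the user and a batch-import Specification that collects errors – only the Interpretations and Bindings differ.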

The framework was very successful – it ended up being used in several Novell products (of PlateSpin fame) – and I’ve decided to do an open-source rewrite. That’s all there is to the microframework I posted on GitHub.

At the moment I’ve got the basics set up: the build, the tests and the major components. The library is .NET portable, so I’d like to have samples in MVC, WPF, Silverlight and WP.


Arriving at Scalability – bandwidth management

No matter how fast and vast your infrastructure, at some point you find that it’s not enough, or that it’s not used efficiently. The messages make sense and the endpoints are configured optimally; the problem becomes the updates themselves – it’s expensive to carry out many little updates to the state without sacrificing the ACID properties.

No matter how fast an endpoint can process messages, it’s possible to get more messages than you can process – the messages start to pile up. That’s OK if you know a reprieve is coming, but what if it’s not?

Batching to the rescue: we have to make effective use of the bandwidth we have on any given endpoint, and sometimes that means repackaging the messages into something that carries the summarized bulk down for the actual updates.

The approach we’ve found simple and effective is to use some kind of persistent event store to keep all the message information, and then, based on a trigger (a timer or otherwise), group and aggregate the events before proceeding with the domain model updates.
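
A minimal sketch of the shape of this, with an in-memory buffer standing in for the event store – the event type, the keying and the apply callback are all illustrative assumptions:

    using System;
    using System.Collections.Concurrent;
    using System.Collections.Generic;
    using System.Linq;
    using System.Threading;

    // Buffer incoming events; on a timer, collapse them into one aggregated
    // update per key before touching the domain model.
    public sealed class BatchingEndpoint : IDisposable
    {
        public sealed record MeterReading(string DeviceId, double Value);

        private readonly ConcurrentQueue<MeterReading> buffer = new();
        private readonly Timer trigger;
        private readonly Action<string, double, int> apply; // deviceId, sum, count

        public BatchingEndpoint(Action<string, double, int> apply, TimeSpan interval)
        {
            this.apply = apply;
            trigger = new Timer(_ => Flush(), null, interval, interval);
        }

        public void Handle(MeterReading reading) => buffer.Enqueue(reading);

        private void Flush()
        {
            var drained = new List<MeterReading>();
            while (buffer.TryDequeue(out var r))
                drained.Add(r);

            // One domain update per device instead of one per message.
            foreach (var g in drained.GroupBy(r => r.DeviceId))
                apply(g.Key, g.Sum(r => r.Value), g.Count());
        }

        public void Dispose() => trigger.Dispose();
    }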

We’ve freed up the endpoint and the storage, and avoided introducing the kind of conflict resolution inherent in eventual-consistency models for dealing with updates at scale.


Event-Driven SOA – a misnomer

Having built a system in the Event-Driven SOA fashion, I’ve come to realize that the moniker applied to this style of architecture actually misses the point.

I’d argue that the focus of such an architecture shouldn’t be the services. As the case may be, only some of the actors involved will actually be Services; you’ll also have Distributors, Workers, Sagas, etc. The important distinguishing characteristic in each case is the type of message these actors are meant to handle.

Should a message be a Notification, a Command or something else?
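
By way of illustration, here’s a sketch of capturing that distinction in the message contracts themselves – the interfaces and records are hypothetical, not from any particular library:

    using System;

    // A Notification states a fact that has already happened: past tense,
    // fire-and-forget, potentially many interested consumers.
    public interface INotification { }

    // A Command requests an action: imperative, addressed to a single handler.
    public interface ICommand { }

    public sealed record OrderPlaced(Guid OrderId) : INotification;
    public sealed record ShipOrder(Guid OrderId) : ICommand;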

When I had to work with FIX to implement a trading system, I didn’t really appreciate how well the protocol captures the ideas of Event-Driven architecture in its message types. Now I do – well-designed messages are the primary outcome of such an architectural approach.
