Tuesday 18 November 2014

Don’t Pass Factories, Pass Workers

A common malaise in the world of software codebases is the over reliance on the use of factories. I came across some code recently that looked something like this:

public class OrderController
{
  public OrderController(IProductFinder products, 
           ICustomerFinder customers, 
           IEventPublisherFactory publisherFactory)
  { . . . }

  public void CreateOrder(Order order)
  {
    var products = _products.Find(. . .);
    var customer = _customers.Find(. . .);
    . . .
    PublishCreateOrderEvent(. . .);
  }

  private void PublishCreateOrderEvent(. . .)
  {
    var event = new Event(. . .);
    . . .
    var publisher = _publisherFactory.GetPublisher();
    publisher.Publish(event);
  }
}

Excessive Coupling

This bit of code highlights the coupling created by this approach. The “business logic” is now burdened with knowing about more types than it really needs to. It has a dependency on the factory, the worker produced by the factory and the types used by the worker to achieve its job. In contrast look at the other two abstractions we have passed in - the one to find the products and other to find the customer. In both these cases we provided the logic with a minimal type that can do exactly the job the logic needs it to - namely to look up data. Behind the scenes the concrete implementations of these two might be looking up something in a local dictionary or making a request across the intranet, we care not either way [1].

Refactoring to a Worker

All the logic really needs is something it can call on to send an event on its behalf. It doesn’t want to be burdened with how that happens, only that it does happen. It also doesn’t care how the resources that it needs to achieve it are managed, only that they are. What the logic needs is another worker, not a factory to give it a worker that it then has to manage as well as all it’s other responsibilities.

The first refactoring I would do would be to create an abstraction that encompasses the sending of the event, and then pass that to the logic instead. Essentially the PublishCreateOrderEvent() should be lifted out into a separate class, and an interface passed to provide access to that behaviour:

public interface IOrderEventPublisher
{
  PublishCreateEvent(. . .);
}

internal class MsgQueueOrderEventPublisher :
                             IOrderEventPublisher
{
  public MsgQueueOrderEventPublisher(
           IEventPublisherFactory publisherFactory)
  { . . . }

  public void PublishCreateEvent(. . .)
  {
    // Implementation moved from
    // OrderController.PublishCreateOrderEvent()
    . . .
  }
}

This simplifies the class with the business logic to just this:

public class OrderController
{
  public OrderController(IProductFinder products, 
           ICustomerFinder customers, 
           IOrderEventPublisher publisher)
  { . . . }

  public void CreateOrder(Order order)
  {
    var products = _products.Find(. . .);
    var customer = _customers.Find(. . .);
    . . .
    _publisher.PublishCreateEvent(. . .);
  }
}

This is a very simple change, but in the process we have reduced the coupling between the business logic and the event publisher quite significantly. The logic no longer knows anything about any factory - only a worker that can perform its bidding - and it doesn’t know anything about what it takes to perform the action either. And we haven’t “robbed Peter to pay Paul” by pushing the responsibility to the publisher, we have put something in between the logic and publisher instead. It’s probably a zero-sum game in terms of the lines of code required to perform the action, but that’s not where the win is, at least not initially.

The call on the factory to GetPublisher() is kind of a violation of the Tell, Don’t Ask principle. Instead of asking (the factory) for an object we can use to publish messages, and then building the messages ourselves, it is preferable to tell someone else (the worker) to do it for us.

Easier Testing/Mocking

A natural by-product of this refactoring is that the logic is then easier to unit test because we have reduced the number of dependencies. Doing that reduces the burden we place on having to mock those dependencies to get the code under test. Hence before our refactoring we would have had to mock the factory, the resultant publisher and any other “complex” dependent types used by it. Now we only need to mock the worker itself.

My suspicion is that getting into this situation in the first place can often be the result of not doing any testing, only doing high-level testing (e.g. acceptance tests/system tests) or doing test-later unit testing; i.e. the production code is always written first. This is where using a test-first approach to low-level testing (e.g. TDD via unit tests) should have driven out the simplified design in the first place instead of needing to refactor it after-the-fact.

One of the benefits of doing a lot of refactoring on code (See “Relentless Refactoring”) is that over time you begin to see these patterns without even needing to write the tests first. Eventually the pain of getting code like this under test accumulates and the natural outcomes start to stick and become design patterns. At this point your first order approximation would likely be to never directly pass a factory to a collaborator, you would always look to raise the level of abstraction and keep the consumer’s logic simple.

That doesn’t mean you never need to pass around a “classic” factory type, but in my experience they should be very far and few between [2].

Use Before Reuse

Once we have hidden the use of the factory away behind a much nicer abstraction we can tackle the second smell - do we even need the factory in the first place? Presumably the reason we have the factory is because we need to “acquire” a more low-level publisher object that talks to the underlying 3rd party API. But do we need to manufacturer that object on-the-fly or could we just create one up front and pass it to our new abstraction directly via the constructor? Obviously the answer is “it depends”, but if you can design away the factory, or at least keep it outside “at the edges” you may find it provides very little value in the end and can get swallowed up as an implementation detail of it’s sole user.

The reason I suggest that factories are a smell is because they hint at premature thoughts about code reuse. Rather than just creating the abstraction we need at the time and then refactoring to generalise when the need arises, if the need ever arises, we immediately start thinking about how to generalise it up front. A factory type always gives the allure of a nice simple pluggable component, which it can be for connection/resource pools, but it has a habit of breeding other factories as an attempt it made to generalise it further and further.

One of the best articles on premature reuse is “Simplicity before generality, use before reuse” by Kevlin Henney in “97 Things Every Software Architect Should Know”.

 

[1] The Network is Evil is a good phrase to keep in mind as hiding a network request is not a good idea, per-se. The point here is that the interface is simple, irrespective of whether we make the call synchronously or asynchronously.

[2] The way Generics work in C# means you can’t create objects as easily as you can with templates in C++ and so passing around delegates, which are essentially a form of anonymous factory method, is not so uncommon. They could just be to support unit testing to allow you to mock what might otherwise be in an internal factory method.

No comments:

Post a Comment