Tuesday, 23 May 2017

Are Refactoring Tools Less Effective Overall?

Prior to the addition of automatic refactoring tools to modern IDEs refactoring was essentially a manual affair. You would make a code change, hit build, and then fix all the compiler errors (at least for statically typed languages). This technique is commonly known as “leaning on the compiler”. Naturally the operation could be fraught with danger if you were far too ambitious about the change, but knowing when you could lean on the compiler was part of the art of refactoring safely back then.

A Hypothesis

Having lived through both eras (manual and automatic) and paired with developers far more skilled with the automatic approach I’ve come up with a totally non-scientific hypothesis that suggests automatic refactoring tools are actually less effective than the manual approach, overall.

I guess the basis of this hypothesis pretty much hinges on what I mean by “effective”. Here I’m suggesting that automatic tools help you easily refactor to a local minima but not to a global minima [1]; consequently the codebase as a whole ends up in a less coherent state.

Shallow vs Deep Refactoring

The goal of an automatic refactoring tool appears to be to not break your code – it will only allow you to use it to perform a simple refactoring that can be done safely, i.e. if the tool can’t fix up all the code it can see [2] it won’t allow you to do it in the first place. The consequence of this is that the tool constantly limits you to taking very small steps. Watching someone refactor with a tool can sometimes seem tortuous as they may need to use so many little refactoring steps to get the code into the desired state because you cannot make the leaps you want in one go unless you switch to manual mode.

This by itself isn’t a bad thing, after all making a safe change is clearly A Good Thing. No, where I see the problem is that by fixing up all the call sites automatically you don’t get to see the wider effects of the refactoring you’re attempting.

For example the reason you’d choose to rename a class or method is because the existing one is no longer appropriate. This is probably because you’re learned something new about the problem domain. However that class or method does not exist in a vacuum, it has dependencies in the guise of variable names and related types. It’s entirely likely that some of these may now be inappropriate too, however you won’t easily see them because the tool has likely hidden them from you.

Hence one of the “benefits” of the old manual refactoring approach was that as you visited each broken call site you got to reflect on your change in the context of where it’s used. This often led to further refactorings as you began to comprehend the full nature of what you had just discovered.

Blue or Red Pill?

Of course what I’ve just described could easily be interpreted as the kind of “black hole” that many, myself included, would see as an unbounded unit of work. It’s one of those nasty rabbit holes where you enter and, before you know it, you’re burrowing close to the Earth’s core and have edited nearly every file in the entire workspace.

Yes, like any change, it takes discipline to stick to the scope of the original problem. Just because you keep unearthing more and more code that no longer appears to fit the new model it does not mean you have to tackle it right now. Noticing the disparity is the first step towards fixing it.

Commit Review

It’s not entirely true that you won’t see the entire outcome of the refactoring – at the very least the impact will be visible when you review the complete change before committing. (For a fairly comprehensive list of the things I go through at the point I commit see my C Vu article “Commit Checklist”.)

This assumes of course that you do a thorough review of your commits before pushing them. However by this point, just as writing tests after the fact are considerably less attractive, so is finishing off any refactoring; perhaps even more so because the code is not broken per-se, it just might not be the best way of representing the solution.

It’s all too easy to justify the reasons why it’s okay to go ahead and push the change as-is because there are more important things to do. Even if you think you’re aware of technical debt it often takes a fresh pair of eyes to see how you’re living in a codebase riddled with inconsistencies that make it hard to see it’s true structure. One is then never quite sure without reviewing the commit logs what is the legacy and what is the new direction.

Blinded by Tools

Clearly this is not the fault of the tool or their vendors. What they offer now is far more favourable than not having them at all. However once again we need to be reminded that we should not be slaves to our tools but that we are the masters. This is a common theme which is regularly echoed in the software development community and something I myself tackled in the past with “Don’t Let Your Tools Pwn You”.

The Boy Scout Rule (popularised by Uncle Bob) says that we should always leave the camp site cleaner than we found it. While picking up a handful of somebody else’s rubbish and putting it in the bin might meet the goal in a literal sense, it’s no good if the site is acquiring rubbish faster than it’s being collected.

Refactoring is a technique for improving the quality of a software design in a piecewise fashion; just be careful you don’t spend so long on your hands and knees cleaning small areas that you fail to spot the resulting detritus building up around you.


[1] I wasn’t sure whether to say minima or maxima but I felt that refactoring was about lowering entropy in some way so went with the reduction metaphor.

[2] Clearly there are limits around published APIs which it just has to ignore.

1 comment:

  1. Regarding local/ global minima/ maxima: I tend to find that, in my case, I settle on a Nash Equilibrium ;-)

    Also, I'm glad you gave us tool vendors a reason to continue towards the end :-)