Program Logic & Refactoring

First off, let me say that refactoring is really nothing more than rewriting code to improve the logic. My personal recommendation is to write it correctly in the first place. That said, I have worked on many projects when I didn't write the code in the first place, but it needs help. Since management tends to frown on rewriting code, instead we refactor it, because that is clearly a better alternative.

OK, so we are going to structure some code into some meaningful arrangement that allows the code to be understood and maintained. Where do we start? What's the best organization? Should we use design patterns?

Organizing

How would you go about writing a recipe? First, you need an idea of what you want to make. Then you need a list of things that will be used to make it. This can include additional tools necessary in the production. Generally you would then break the process up into meaningful steps or stages. Say, for example, we are making a pumpkin pie.

The Idea

We have our idea: Pumpkin Pie.

The List

We'll need:

  • a pie pan
  • pie crust (make or buy?)
  • pumpkin pie filling (again make or buy?)
  • an oven (we often miss the obvious)
  • spoons
  • oven mitt

That's a good start.

To simplify this, we'll just buy the parts we can, such as a pie crust and filling.

Steps

We need to prepare the crust.

We need to prepare the filling.

We need to combine them.

We need to bake it.

We need to eat it. (OK, this isn't really a step, I just want some pie.)

Is This Cooking or Programming?

We are talking about programming so let's convert this pseudo-recipe into a psuedo-program.

function pumpkinPie () { $crust = new PackagedCrust(\'Mama\\\'s Homemade Crusts\'); $filling = new PieFilling(\'Pumpkin Joe\'); $pie = new Pie(); $pie->crust = $crust; $pie->filling = $filling; $pie->bake(350, \':30\'); $result = $pie.serve(); return $result; }

So That's It?

In a word, No. In two words, No way. OK, so what's wrong with what we just did?

It makes one kind of pie. It uses only one set of ingredients. It isn't expandable.

How to Fix it

Think long term, think options, think maintenance. What if we want to use another manufacturer of each product? What about using homemade crust (or hand coded)? How can we change this so we have that kind of flexibility?

Data Driven Design

Let's start by passing in parameters to allow for some customization. Let's see, we might want to specify a different crust or a different filling. We'll add two parameters.

function pumpkinPie ($crustname, $fillingname) { $crust = new PackagedCrust($crustname); $filling = new PieFilling($fillingname); …

So far so good. But what if the new materials require different bake times? Let's add them in. We'll need a new temperature parameter and a new time parameter.

function pumpkinPie ($crustname, $fillingname, $temp, $time) { $crust = new PackagedCrust($crustname); $filling = new PieFilling($fillingname); $pie = new Pie(); $pie->crust = $crust; $pie->filling = $filling; $pie->bake($temp, $time); $result = $pie.serve(); return $result; }

Better? Well, now we've lost the defaults, so let's add them back.

function pumpkinPie ($crustname='Mama\'s Homemade Crusts', $fillingname='Pumpkin Joe', $temp=350, $time=':30') {

OK, now we are back to where we were before, but is this really better? Not really. In some languages this would be OK since you can simply not pass a value and have the default value apply. In PHP, you cannot skip a parameter so if you just want to override the third parameter, you have to pass the first two as well.

So how do we fix this? In languages that force strong typing of variables the previous technique works well enough even with its limitations. In PHP we have another alternative I like. If we create an array of default values and pass a set of override values we can simply pass only the values we want to change and leave the rest as is.

function pumpkinPie ($options=array()) { $defaults = array( 'crust' => 'Mama\'s Homemade Crusts', 'filling' => 'Pumpkin Joe', 'temp' => 350, 'time' => ':30', ); $options = array_merge($defaults, $options); $crust = new PackagedCrust($options['crust']); $filling = new PieFilling($options['filling']); $pie = new Pie(); $pie->crust = $crust; $pie->filling = $filling; $pie->bake($options['temp'], $options['time']); $result = $pie.serve(); return $result; }

Now, is this better? Yes. OK, so why is it better? Well, for one we only need pass values for what we want to change. We can also pass the parameters in any order. We can also add new parameters and defaults without affecting existing code. Cool huh?

Consolidation of Code

I've seen this done very very often and it works. A function needs to be called with a set of parameters say twenty times. So the programmer does just that and we get:

processRequest('AAA', 20, 13.425, false); processRequest('AAB', 20, 13.425, false); processRequest('AAC', 20, 13.425, false); processRequest('AAF', 20, 13.425, false); processRequest('ABA', 20, 13.425, false); processRequest('ADA', 20, 13.425, false); … processRequest('CAA', 20, 13.425, false); processRequest('DAA', 20, 13.425, false); processRequest('EAA', 20, 13.425, false);

This works, so what's wrong? Well, there is a great deal of redundancy. If you need to add new entries there are lots of opportunities to make mistakes. It isn't elegant. So how do we fix this? Data-driven design.

First rewrite

$array = array('AAA', 'AAC', 'AAF', 'ABA', 'ADA', … 'CAA', 'DAA', 'EAA'); foreach($array as $item) { processRequest($item, 20, 13.425, false); }

This does the same thing as before, but now it's no longer twenty lines, but only four. Easier to maintain, easier to read, less likely to make mistakes when adding new entries.

But what if the second parameter also changed? Or even worse, only changed for 3 items? Let's fix our example to deal with that. We aren't going back to the original code for the original reasons. But we will go one step further. If the other two parameters might eventually change as well, let fix it so we can support that in the future.

First we need to restructure the array to allow for multiple parameters.

Second rewrite

$array = array( array('AAA', 17), 'AAC', 'AAF', array('ABA', 19), 'ADA', … 'CAA', 'DAA', array('EAA', 21), ); foreach($array as $item) { if (!is_array($item)) { $item = array($item); } @list($name, $ix, $iy, $rewrite) = $item; if ($ix == NULL) $ix = 20; if ($iy == NULL) $iy = 13.425; if ($rewrite == NULL) $rewrite = false; processRequest($item, $ix, $iy, $rewrite); }

This is better, but it still isn't elegant enough. if the parameter list gets too long, the default replacements get messy. Let's improve it again.

Third rewrite

$array = array( array('name' => 'AAA', 'ix' => 17), 'AAC', 'AAF', array('name' => 'ABA', 'ix' => 19), 'ADA', … 'CAA', 'DAA', array('name' => 'EAA', 'ix' => 21), ); $defaults = array('ix' => 20, 'iy' => 13.425, 'rewrite' => false); foreach($array as $item) { if (!is_array($item)) { $item = array('name' => $item); } $item = array_merge($defaults, $item); processRequest($item['name'], $item['ix'], $item['iy'], $item['rewrite']); }

Now wait, just a second. This code is almost longer than the original. How is this better? We could have done all this in the originally formatted code with less effort, right? So what are we missing here?

Well, for one, we've separated the code from the data. We've also allowed the data to change without impacting the logic. If we later need to do more than just call processRequest we can do so without having to repeat that change 20 or more times. We've decoupled things so they are less fragile. We've allowed the data to become self-descriptive and order-independent. If the parameters to processRequest ever change, we can easily change it in one place.