Enumerations for the Reduction of Complexity

Introduction

One of the ever present problems faced by developers is the concept of complexity. Complexity is everywhere; it makes things difficult to analyze and even more difficult to control. When writing programs, complexity is introduced exponentially every time a conditional path is added. With every ‘IF’ statement, you are introducing two possible paths your program could take. This article will start first with principles of conditions. I will then examine possible use cases for enumerations.

https://www.pexels.com/photo/jellyfish-2615541/

First Principles

In 500 BC a philosopher named Democritus proposed this theory that the universe could be broken down into these tiny discrete geometric units that he called ‘atoms’. This theory during it’s conception was no more founded on science than any of the other competing theories at the time. Of course, now we know that this theory was much closer to the truth than the other theories. For the sake of argument, lets just assume that Democritus’s theory of atoms is correct. What has he just done? He has taken the complexity of the universe and broken it down into it’s principle elements. This process has been, and continues to be attempted by scientists all over the world. In order to fully describe something, you must be able to break it down into its atomic components. This is one of the reasons I enjoy programming so much. It forces you to describe a problem in it’s most fundamental building blocks. After all, at the end of the day the only tool in your tool-box is a ‘0’ and a ‘1’. It’s up to the programmer to describe arbitrary complexity with those two digits. One of the most common ways that this is done is through conditionals. Say you are trying to write the behavior for a car. You would probably have the following idea:

If the traffic light is red then stop

With this simple condition you have defined a branch of execution. The car will either encounter a red light and stop, or the light won’t be red and the car can continue. This is one building block required in describing the behavior of a car. But if you look a little closer at this statement, it doesn’t take long to realize that it is not precise enough. According to the ‘IF’ statement the traffic light can only be ‘red’ or ‘not red’. We of course know that traffic lights can be yellow as well as green. Therefore, there are situations where a simple boolean is not descriptive enough to describe a condition. What does this tell us? It means we haven’t broken the problem down into it’s most atomic parts yet. There is still some complexity hiding in the condition that needs to be defined. So the obvious thing you would do next is change the statement to look something like this:

 If the traffic light is red then stop 
 else if the traffic light is yellow slow down
 else if the traffic light is green continue

This works and is much more precise, but it’s a little verbose and unnecessary. Let’s say the light is green. When the program hits this section of execution it is now keeping track of the following booleans:

  • IsGreen
  • IsYellow
  • IsRed

In some situations this might be required, but not in the traffic light scenario, because the light can never be in more than one of those states. This gives us a chance to use another tactic to reduce the complexity of our behavior. Give a warm welcome to our friend enumerations.

Enumerations

Enumerations1 at their heart are simple integers. Their intended purpose is to describe all possible states that an object could be in. Thinking back to the traffic light problem, we know that a stop light can either be Red, Yellow, Or Green2. So what if we made an agreement that when the light was green we would represent that as a ‘1’ , yellow as a ‘2’ and red as a ‘3’. We could now rewrite the ‘IF’ statement above to say

If traffic light is currently 3 then stop

This is perfectly precise and we are only keeping track of one variable now. Much better! The downside here is that we have lost some readability, because we now have to remember that 3 actually means red. This will get annoying very quickly. Time to take a look at enumerations and kill two birds with one stone. Enumerations map an integer to a name. So if I wanted to create an enumeration to represent our traffic light state it would look like this:

enumeration named TrafficLight{
   Green = 1,
   Yellow = 2,
   Red = 3
}

With this enumeration defined, we could then redefine our ‘IF’ statement:

If traffic light is TrafficLight.Red then stop

Voila! We now have a statement that is precise and easy to read.

The Fountain of Youth
Lucas Cranach

Flags

So are enumerations a magic elixir that cures all our ails of conditional complexity? Well sadly, no. But when used judiciously they can go a long way in managing complexity. Let me leave you with one more way you can use enumerations, which is probably not immediately obvious to you. Consider the following system:

This system contains 3 animal components. Each animal can do different things, but many of the actions have similarities. So imagine that you had to write a controller that would be responsible for receiving and processing events from these three components. Let’s make an enumeration that contains all of the events that the system can contain.

enumeration named AnimalSystem{
    DogEat = 1,
    DogRun = 2,
    DogAttack = 3,
    Bark = 4,
    CatEat = 5,
    CatRun = 6,
    CatAttack = 7,
    LickPaw = 8,
    Meow = 9,
    FishEat = 10,
    FishRun = 11,
    BlowBubble = 12
}

Wow! I am out of breath just writing that! The complexity is killing me and that’s only for three animals! Imagine if we added a rabbit to the system? Surely there has to be way to cut down this complexity even more. Indeed there is. We need to use enumerations in a different way. So far we have used enumerations to represent state. Like in our traffic light example, the state of the traffic light can neatly be described as ‘red’,’yellow’, or ‘green’. We also have just defined the state of the Animal System explicitly. One way to rephrase it is that we have described what the current state “IS”. What if we could ask a different question. What if we could describe the current state in terms of what the current state “HAS”. Well in the traffic light scenario it would make no difference. But it would drastically reduce the complexity of the animal system. If we redesign the enumeration to answer the “HAS” question we would get an enumeration like this:

 enumeration named AnimalSystem{
    Dog = 1,
    Cat = 2,
    Fish = 4,
    Eat = 8,
    Run = 16,
    Attack = 32,
    Speak = 64,
    LickPaw = 128
}

The way this works is probably not immediately obvious unless you’ve seen this type of thing before. The reason this works is due to the numbers we assigned the names. You’ll notice that each of those numbers is a power of 2. As we learned in the post What is Binary, every time you raise a binary number by a power of 2 you are essentially adding a zero to the right of the number. So lets rewrite the enumeration in binary

 enumeration named AnimalSystem{
// NAME  INTEGER       BINARY  Power of 2 
    Dog = 1,          00000001       20
    Cat = 2,          00000010       21
    Fish = 4,         00000100       22
    Eat = 8,          00001000       23 
    Run = 16,         00010000       24
    Attack = 32,      00100000       25
    Speak  = 64,      01000000       26
    LickPaw = 128     10000000       27 
} 

So now that we have cleverly organized numbers, we can use these values to quickly filter Animal System events. We can now know that if the first bit of the event is a ‘1’ then the event is going to be about the dog. If the fourth bit is also ‘1’ we now know that the dog is supposed to eat. If this is sounding similar to Bit Masking, you are absolutely correct. It is the same technique only a little easier to think about, because the numbers now have names. So how would we create a Dog-Eat event for our animal system? Well, we would use the logical “OR” operator to combine bits.

//Dog-Eat Event
// As a reminder the symbol '|' represents a logical OR statement
AnimalSystem.Dog | AnimalSystem.Eat =  0001001
0000001          |  0001000         =  0001001

So now our AnimalSystem sends out the event that has the code ‘0001001’. The people listening for the event then can process that event by asking ‘HAS’ questions about the event. So the dog component in this case would say – does this event have anything to do with dogs? Well the first bit is ‘1’ so the answer is yes! This means the dog is now interested in processing the rest of this event. You can ask this event what it ‘HAS’ by using the reverse of the ‘OR’ operation which is the ‘AND’ operation.

GenericAnimalSystemEvent = 0001001

//Does this AnimalSystemEvent have anything to do with a dog?
// Reminder the '&' symbol represents the logical 'AND' operation

does  GenericAnimalSystemEvent & AnimalSystem.Dog  = AnimalSystem.Dog
                       0001001 & 0000001           = 0000001

As you can see, you can use the ‘AND’ operation to ask an event if it ‘HAS’ a certain flag.

Conclusion

There are always multiple ways to solve problems. Some are better than others at limited complexity and, more importantly, making it easier for us humans to think about. I believe the above example is a great way to encode/decode complex data in a very simple way. Using this new method, if you wanted to add a rabbit component to the animal system, you would only have to add one more entry into the enumeration instead of four entries. On top of that, there are no rules inherent in this data structure about what trait the animal components have to support. This allows a lot of flexibility. For example, right now a Dog-LickPaw event is not supported by our system specifications. But with this design it would be trivial to add support to that event. Again, one of the main goals of a programmer should be to describe problems as specifically as possible. We started out with a boolean that can only be true or false. Then moved to an enumeration which can have ‘N’ number of states. Finishing with a simple AnimalSystem. First using an enum that explicitly defined 12 different states the AnimalSystem could be in. We then used the enumeration as an implicit description of the state of the AnimalSystem that allowed us to represent 5,040 possible states with 7 entries.

Footnotes

1. are often called ‘enums’ in programming languages
2. This of course is assuming the traffic light is powered on and functioning correctly. This is a testament to how often you think you have a problem fully broken down, but you have overlooked certain states and edge conditions.

Leave a Comment