Close your eyes and imagine your program as a function that takes a set of inputs and produces a set of outputs. I know this may seem overly simple, but a set of actions in a GUI can be thought of as a set of inputs, and a set of resulting side effects to a database can be seen as a new state of the world being returned.
Now focus on its input space. This space comprises all possible combinations of all possible inputs. Some members of this set will be well defined for your program and some will not. An example of a not-well-defined input could be as simple as an incorrect database connection string, as straightforward as an incorrect combination of flags on a console application, or as difficult to detect as a date with month and day transposed.
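To see why the transposed date is the hard case, here is a minimal Python sketch. The input string is a made-up example; the point is only that both readings parse cleanly, so no validation step can tell you which one the user meant.

```python
from datetime import datetime

raw = "03/04/2021"  # hypothetical user input

# Both parses succeed, so neither format check can flag the other as wrong.
month_first = datetime.strptime(raw, "%m/%d/%Y")  # read as March 4
day_first = datetime.strptime(raw, "%d/%m/%Y")    # read as April 3

print(month_first.date(), day_first.date())
```

A connection string with a typo fails loudly; this input never fails at all.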
A program thought of in this way is a fractal-like thing: a program made of smaller programs, which are themselves made of smaller programs yet. However, there’s no guarantee that each of these smaller programs will treat a piece of data in exactly the same way as the others. So beyond whatever fails your initial validation, any top-level input that causes a sub-program to receive an input it does not properly handle is also considered not well defined. Consider these three approaches to making your program safer by reducing the size of the incorrect input space:
First, you can increase the size of the blue circle (the set of inputs your program handles explicitly) with explicit input checking. This means numerous validations to ensure the program exits with proper notification when incorrect inputs are given. However, the program is fractal, and so if we want to be safe we’ll need to reproduce many of these checks fractally. A great example of this is handling null values, and we all know how that turns out.
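As a rough illustration of what "reproducing checks fractally" looks like, here is a small Python sketch. The function names and the email example are hypothetical; what matters is that every layer repeats the same null check because none of them can assume its caller already did it.

```python
from typing import Optional


def normalize_email(email: Optional[str]) -> str:
    if email is None:
        raise ValueError("email is required")
    return email.strip().lower()


def email_domain(email: Optional[str]) -> str:
    # Same check again: this function cannot assume its caller validated.
    if email is None:
        raise ValueError("email is required")
    if "@" not in email:
        raise ValueError(f"not an email address: {email!r}")
    return email.rsplit("@", 1)[1]


def describe_user(email: Optional[str]) -> str:
    # And once more at the top level.
    if email is None:
        raise ValueError("email is required")
    clean = normalize_email(email)
    return f"{clean} (domain: {email_domain(clean)})"


print(describe_user(" Ada@Example.com "))
```

Forget the check in any one of these and the failure shows up somewhere far from its cause.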
Another approach is to shrink the size of the red circle (the space of all inputs that can be represented at all). We can do this by making fewer incorrect states representable with types. Because we know that all of the potential states are valid once encoded, we only need to do our checks once, while marshalling our data into a well-typed representation. This eliminates almost all need for repeated validation, limited only by how far your type system will take you. Even better, with newer language features (such as F# type providers) we can eliminate much of this marshalling phase. This is, however, similarly limited by how far the schema of the data will take you.
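Here is a minimal sketch of the "check once while marshalling" idea, written in Python for brevity. The `Mode`, `Port`, and `Config` names are invented for illustration, and Python only enforces the annotations if you run a separate type checker, so a language with a stronger type system would take this further than the sketch does.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Mapping


class Mode(Enum):
    READ = "read"
    WRITE = "write"


@dataclass(frozen=True)
class Port:
    value: int

    def __post_init__(self) -> None:
        # A Port outside the valid range can never be constructed.
        if not 1 <= self.value <= 65535:
            raise ValueError(f"port out of range: {self.value}")


@dataclass(frozen=True)
class Config:
    mode: Mode
    port: Port


def parse_config(raw: Mapping[str, str]) -> Config:
    """The single place where untrusted strings become typed values."""
    return Config(mode=Mode(raw["mode"]), port=Port(int(raw["port"])))


def describe(config: Config) -> str:
    # No re-validation here: a Config that exists is already well formed.
    return f"{config.mode.value} on port {config.port.value}"


print(describe(parse_config({"mode": "read", "port": "8080"})))
```

Everything downstream of `parse_config` takes a `Config`, so the checks in `describe` and its siblings simply disappear.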
A third approach, available only in some situations but which I find extremely fascinating, is to build everything in such a way that the entire program logs the error and resets its state when an incorrect input is found. Paradoxically, in this case the more fragile you make your system, the safer it is (as long as you ensure that external state changes are the very last thing done, and that they’re done transactionally). This seems to be the Erlang philosophy, and the only flaw I can find with it is one it shares with most type systems: you can’t implicitly account for ambiguous inputs, or for state spaces that your type system can’t constrain.
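The shape of this idea can be sketched in a few lines of Python. This is only an analogy, not how Erlang supervisors are actually implemented, and `process`, `supervise`, and the `name=amount` request format are all hypothetical. The worker makes no attempt to recover from bad input; it works on a private copy and the only state change is the final hand-off.

```python
import logging
from typing import Dict, Iterable

logger = logging.getLogger("worker")


def process(request: str, state: Dict[str, int]) -> Dict[str, int]:
    """Apply one request to a private copy; any bad input simply raises."""
    scratch = dict(state)
    for item in request.split(","):
        name, amount = item.split("=")  # malformed item raises ValueError
        scratch[name] = scratch.get(name, 0) + int(amount)
    return scratch


def supervise(requests: Iterable[str], state: Dict[str, int]) -> Dict[str, int]:
    for request in requests:
        try:
            # The "commit" is this single rebinding, done last.
            state = process(request, state)
        except Exception:
            # Log and carry on from the last good state.
            logger.exception("dropping bad request %r", request)
    return state


print(supervise(["a=1,b=2", "a=5,b=oops", "a=3"], {}))
```

The second request fails halfway through, after `a=5` has been applied to the scratch copy, yet none of it leaks into the committed state.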