The Finite State Machine

Introduction

The Finite State Machine is a conceptual model of computation known for its simplicity. Although it has significance in the study of automata and computation theory, it also has practical significance, such as its application in lexical analysis, text parsing, user interfaces, event-driven systems, machine learning, and others. They are found around us in every day code and can usually be used to improve or simplify your every day code. You probably have written a finite state machine without even knowing it!

What is it a Finite State Machine?

A Finite State Machine is a type of model of computation, which is an abstract machine that performs computation and solves problems via algorithms. What makes it stand out to other models of computation is that it can only be in one state at a time, out of a finite (limited) number of pre-defined states. The machine takes an input, reading its elements one-by-one (think characters of a word or string), and then changes its state based on the input element. The machine has a transition function, which tells it how to change its state based on its input. In other words, the machine changes state according to its transition function.

We study abstract machines instead of real ones to simplify our studies. The abstract machines are close enough to abstract real world cases, but forgetting about real world restrictions simplifies our model and helps us reason about the model and arrive to conclusions more easily.

Why limit ourselves to such a basic model when we have such powerful computing machines? Well it turns out that this simple model bear properties that we would not have using a more powerful model, like the Turing Machine or the RAM model found in every day computers.

To understand this abstract talk better, let's take an example.

Example: Detecting Consecutive Errors

We have an e-commerce site that allows customers to place orders. We have received complaints about the site's performance, and we suspect the search end-point is the offending one. We tolerate one-off errors occurring due to network conditions, but two or more errors in a row are not. We want to investigate logs of the search end-point to see if it has had two or more errors in a row.

We have a list of all our application logs over the past month to scan for consecutive errors. We can write an algorithm using a typical programming language to do this. Below is a Python example:

Imperative Code Solution

def are_errors_consecutive(logs):
    consecutive_errors = 0
    for log in logs:
        if is_error_log(log):
            consecutive_errors += 1 # increment by one
        else:
            consecutive_errors = 0 # reset back to 0
        if consecutive_errors >= 2:
            return true
    return false

We first initialize a variable consecutive_errors then start a for-loop through the input. The variable holds our state throughout the loop. For each log, we check if it is an error, and update the state variable by incrementing it if an error was found, or resetting to 0 if non-error log is encountered. If we see that there have already been two consecutive errors, we return early from the function.

We used is_error_log() since we did not specify what our input looks like exactly. Its implementation depends on the form of the input.

The program we just wrote is very similar to a finite state machine! Let us explore the finite state machine implementation.

Finite State Machine Solution

A finite state machine can only be in one state. Thankfully, our function already used one state variable with only a finite number of possibilities: 0, 1, and 2 errors detected.

The state has two values of special interest: the starting state and acceptance state. Our starting state is 0, and acceptance is 2 consecutive errors detected. If execution ends with the machine in the accept state, then the input altogether is accepted. That is the "return" of the finite state machine: whether the input is accepted or not.

Real world implementations of the finite state machine may deviate slightly from these definitions. For example, we may have the final state as the return value rather than a binary accept or reject. Those machines would still closely resemble the abstract finite state machine.

Transition Function

An important part of the state machine is how it moves from state to another. The finite state machine reads the elements of the input one-by-one. For every element, it may change its state based on the input element and its current state. This is called the "transition function", defining how the machine changes state.

In our example, we can describe the transition function as follows:

- If state is 0, and non-error log encountered, new state is 0
- If state is 0, and error log encountered, new state is 1
- state = 1, input = non-error -> state = 0
- state = 1, input = error -> state = 2
- state = 2, input = non-error -> state = 2
- state = 2, input = error -> state = 2

One last difference about the finite state machine is that it does not return early. It will read the input in its entirity until it is done.

Congratulations, we just devised a finite state machine! Let us walk through a sample execution.

Sample Execution of Finite State Machine

Let's take an example input. For simplicity, let us define a non-error log as 0, and an error log as 1. Then, let us take the following input example: 00100110

The machine starts with its starting state as 0 (state = 0)
it reads the first input element as 0. Based on the transition function, the new state remains at 0 (state = 0)
same as above happens for the second input element, as it is also a 0 (state = 0)
the third input = 1. Based on transition function, state moves to 1. (state = 1)
input = 0. Based on transition function, state moves to 0 (state = 0)
input = 0, new state = 0 (state = 0)
input = 1, new state = 1 (state = 1)
input = 1, new state = 2 (state = 2)
input = 0, new state remains at 2 (state = 2)

we Finish execution on an accept state, so our input is accepted!

Notice how it is easier to reason about the algorithm by thinking of it as a bunch of states and transitions between them, rather than an imperative loop. While imperative loops are a common encounter for programmers and make a lot of sense, state machines are even easier to make sense of for non-technical people. All it takes is thinking of states and transitions. The difference in ease of understanding may be more apparent in more complex problems compared to our simple example.

Definition of Finite State Machine

To define a specific finite state machine, we must define the following:

** An Alphabet:** The range of values it accepts as input elements
** States:** The finite list of possible states the machine can hold
** Starting State**
** Accept State**
** Transition Function:** Defines how the machine changes state, given an input element and current state as input to the function

Our example above can be defined as follows:

** An Alphabet:** The set ["error log", "non-error log"]
** States:** [0,1,2]
** Starting State:** 0
** Accept State:** 2
** Transition Function:** We already defined it above in the transition function section

A deterministic finite state machine bearing the above is a machine that holds a single state, and reads an input, changing state based on its transition function as it goes.

What we listed above are all the constraints needed to define a Finite State Machine.

You Have Built Finite State Machines Without Noticing!

Notice how the example we presented was originally solved using a traditional loop in the common imperative style of code. If you're a programmer, you have probably written many Finite State Machines without even noticing that they are! This is one of the important aspects of learning models of computation. It is not only about choosing which model of computation to use, but also about identifying them, which can help with conceptualizing and visualizing, as well as recognizing the properties you can apply to them.

Characteristics of Finite State Machines

Limited Memory and Good Performance

The "finite state" part means that the machine's memory is bounded. This is in contrast to many other models of computation where memory is unbounded. This makes the Finite State Machine very performant in many cases, and useful in certain low level programming such as embedded devices and network devices.

This also means that the Finite State Machine is unsuitable for applications where finite memory is insufficient. For example, consider the problem of verifying whether an input has a certain number of 0's followed by the same exact number of 1's. This is impossible for a Finite State Machine to solve, as it requires unbounded memory, counting an unbounded number of 0's in order to verify that the number of 1's is equal.

Note: nearly all real world problems have some memory bound, even if unrealistically high. So you can still use Finite State Machines there, but it just may not be the best idea to do so.

Simplicity

The simplicity of the Finite State Machine makes it easy to reason about. Modeling problems using it can help you identify superfluous or error states, use reduction methods to simplify the problem, or make use of the other characteristics we discuss below.

Easy to Construct Complex Machines from Simpler Ones

Every Finite State Machine can be expressed as a combination of simpler ones. In fact, Every one of them can be broken down to the most basic Finite State Machines, i.e. Machines that accept if the input is one specific character, and reject otherwise.

If we take our error log example, the machine we devised can be thought of as the combination of two simpler machines: one that accepts 0's only, and one that accepts two or more 1's only (no 0's). Those two machines can in turn be broken down further into elementary machines: one that accepts the single input 0 and rejects everything else, and the same but for 1.

This has important implications, one of which probably affects your every day programming life: Regular Expressions.

Regular Expressions

Regular expressions are actually nothing more than expressing the class of inputs that a Finite State Machine must accept, but using elementary characters. For example, if I have the regular expression /yes/, this can be thought of as the combination of /y/, /e/, and /s/, or the finite state machines that recognize each of those individually. This is called the "concat" operation. The union operation can be expressed as /[abc]/ which would accept an input that is "a", "b", or "c".

Regular expressions are indeed built on top of regular expressions, at least conceptually (modern implementations tend to have extensions).

Equivalence of Deterministic and Non-Deterministic Finite State Machines

Most of what we have seen is about deterministic Finite State Machines, in which transition functions have deterministic outcomes. That is, given a current state and an input, there is always one possible non-null state outcome. Moreover, every state has a transition for every input-state combination.

Non-deterministic Finite State Machines do not have this requirement. In other words, a transition could have one or more output states for the same input. How does this make sense? well, when multiple outputs are encountered, you can imagine as if the process splits, with two new state machine processes born out of it, one for each state. Execution continues as normal, and one of these "processes" ending in an accept state is sufficient for the entire input to accepted. If this does not make sense, I will be writing a separate blog post on Non-Deterministic Finite State Machines with examples, so stay tuned! Feel free to check the Wikipedia article in the meantime.

The key takeaway though: it may seen that Non-Deterministic FSMs (Finite State Machine) are more powerful and capable of solving more problems than their deterministic counterpart. In reality, every Non-Deterministic FSM has an equivalent deterministic FSM. Not only that, there is a specific mechanism for converting a Non-deterministic FSM to a deterministic one.

This is quite important. Some problems are much easier to conceptualize with a non-deterministic model, but can be easily converted to a deterministic model, conserving the simplicity of the latter.

The Pumping Lemma

Although out of scope for this article, regular languages have an important mathematical property that they all share called the Pumping Lemma. The key takeaway is that this makes it possible to mathematically prove whether a language (set of inputs) is regular (solvable by finite state machine) without having to search for a specific implementation.

Applications of Finite State Machines

Although important in theory, Finite State Machines have practical applications. They are commonly used in lexical parsing, network devices, machine learning and others.

Moreover, as we saw earlier, you might write a program that is a Finite State Machine without noticing it! You don't have to go out of your way to write programs as FSMs to gain their benefits. Sometimes, it is enough to recognize them in existing problems and solutions. It'll make it easier to map the program's state to the real-world representation. You might represent state and its transitions in a clearer, more maintainable and readable way. Moreover, being more intentional about state management will help you avoid errors in the long run.

Some examples that you are more likely to run into day-to-day:

Event-Driven Programming: Event driven programming is increasingly common in distributed systems, and they can often be modeled as state machines
Reactive UI State: If you have ever created a React.JS component, you very likely modeled your component state as a state machine
Regular Expressions (Regex): Explained in a previous section