The Finite State Machine

| 8 min read

Introduction

The Finite State Machine is one of many models of computation in computer science. Although it has great significance in the study of automata and computation theory, it also has practical significance, such as its application in text parsing, compilers, and machine learning, and network devices. Not only that, they are found around us in every day code (such as event-driven programs) and can usually be used to improve or simplify your every day code. You probably have written a finite state machine without even knowing it!

What is it?

A Finite State Machine is an abstract machine that performs computation. What makes it stand out to other computing machines is that it can only be in one state at a time, out of a finite (limited) number of pre-defined states. The machine takes an input, reading its elements one-by-one (think characters of a word or string), and then changes its state based on the input element. The machine has a transition function, which tells it how to change its state based on its input. In other words, the machine changes state according to its transition function.

To understand this abstract talk better, let's take an example.

Example: Detecting Consecutive Errors

We have an e-commerce site that allows customers to place orders. We have received complaints about the site's performance, and we suspect the search end-point is the offending one. We tolerate one-off errors occurring due to network conditions, but two or more errors in a row are not. We want to investigate logs of the search end-point to see if it has had two or more errors in a row.

We have a list of all our application logs over the past month to scan for consecutive errors. We can write an algorithm using a typical programming language to do this. Below is a Python example:

Imperative Code Solution

def are_errors_consecutive(logs):
consecutive_errors = 0
for log in logs:
if isErrorLog(log):
consecutive_errors += 1 # increment by one
else:
consecutive_errors = 0 # reset back to 0
if consecutive_errors >= 2:
return true
return false

We first initialize a variable consecutive_errors then start a for-loop through the input. The variable holds our state throughout the loop. For each log, we check if it is an error, and update the state variable by incrementing it if an error was found, or resetting to 0 if non-error log is encountered. If we see that there have already been two consecutive errors, we return early from the function.

We used isErrorLog() since we did not specify what our input looks like exactly. Its implementation depends on the form of the input.

The program we just wrote is very similar to a finite state machine! Let us explore the finite state machine implementation.

Finite State Machine Solution

A finite state machine can only be in one state. Thankfully, our function already used one state variable with only a finite number of possibilities: 0, 1, and 2 errors detected.

The state has two values of special interest: the starting state and acceptance state. Our starting state is 0, and acceptance is 2 consecutive errors detected. If execution ends with the machine in the accept state, then the input altogether is accepted. That is the "return" of the finite state machine: whether the input is accepted or not.

Real world implementations of the finite state machine may deviate slightly from these definitions. For example, we may have the final state as the return value rather than a binary accept or reject. Those machines would still closely resemble the abstract finite state machine.

Transition Function

An important part of the state machine is how it moves from state to another. The finite state machine reads the elements of the input one-by-one. For every element, it may change its state based on the input element and its current state. This is called the "transition function", defining how the machine changes state.

In our example, we can describe the transition function as follows:

- If state is 0, and non-error log encountered, new state is 0
- If state is 0, and error log encountered, new state is 1
- state = 1, input = non-error -> state = 0
- state = 1, input = error -> state = 2
- state = 2, input = non-error -> state = 2
- state = 2, input = error -> state = 2

One last difference about the finite state machine is that it does not return early. It will read the input in its entirity until it is done.

Congratulations, we just devised a finite state machine! Let us walk through a sample execution.

Sample Execution of Finite State Machine

Let's take an example input. For simplicity, let us define a non-error log as 0, and an error log as 1. Then, let us take the following input example: 00100110

  • The machine starts with its starting state as 0 (state = 0)
  • it reads the first input element as 0. Based on the transition function, the new state remains at 0 (state = 0)
  • same as above happens for the second input element, as it is also a 0 (state = 0)
  • the third input = 1. Based on transition function, state moves to 1. (state = 1)
  • input = 0. Based on transition function, state moves to 0 (state = 0)
  • input = 0, new state = 0 (state = 0)
  • input = 1, new state = 1 (state = 1)
  • input = 1, new state = 2 (state = 2)
  • input = 0, new state remains at 2 (state = 2)

we Finish execution on an accept state, so our input is accepted!

Notice how it is easier to reason about the algorithm by thinking of it as a bunch of states and transitions between them, rather than an imperative loop. While imperative loops are a common encounter for programmers and make a lot of sense, state machines are even easier to make sense of for non-technical people. All it takes is thinking of states and transitions. The difference in ease of understanding may be more apparent in more complex problems compared to our simple example.

Definition of Finite State Machine

To define a specific finite state machine, we must define the following:

  • An Alphabet: The range of values it accepts as input elements
  • States: The finite list of possible states the machine can hold
  • Starting State
  • Accept State
  • Transition Function: Defines how the machine changes state, given an input element and current state as input to the function

Our example above can be defined as follows:

  • An Alphabet: The set ["error log", "non-error log"]
  • States: [0,1,2]
  • Starting State: 0
  • Accept State: 2
  • Transition Function: We already defined it above in the transition function section

A deterministic finite state machine bearing the above is a machine that holds a single state, and reads an input, changing state based on its transition function as it goes.

What we listed above are all the constraints needed to define a Finite State Machine.

You Have Built Finite State Machines Without Noticing!

Notice how the example we presented was originally solved using a traditional loop in the common imperative style of code. If you're a programmer, you have probably written many Finite State Machines without even noticing that they are! This is one of the important aspects of learning models of computation. It is not only about choosing which model of computation to use, but also about identifying them, which can help with conceptualizing and visualizing, as well as recognizing the properties you can apply to them.

Characteristics of Finite State Machines

Limited Memory and Good Performance

The "finite state" part means that the machine's memory is bounded. This is in contrast to many other models of computation where memory is unbounded. This makes the Finite State Machine very performant in many cases, and useful in certain low level programming such as embedded devices and network devices.

This also means that the Finite State Machine is unsuitable for applications where finite memory is insufficient. For example, consider the problem of verifying whether an input has a certain number of 0's followed by the same exact number of 1's. This is impossible for a Finite State Machine to solve, as it requires unbounded memory, counting an unbounded number of 0's in order to verify that the number of 1's is equal.

Note: nearly all real world problems have some memory bound, even if unrealistically high. So you can still use Finite State Machines there, but it just may not be the best idea to do so.

Simplicity

The simplicity of the Finite State Machine makes it easy to reason about. Modeling problems using it can help you identify superfluous or error states, use reduction methods to simplify the problem, or make use of the other characteristics we discuss below.

Easy to Construct Complex Machines from Simpler Ones

Suppose you have a complex computation problem. You can break down the problem into simpler ones, solve each of them with a finite state machine, then combine the result.

This is because Regular Languages (see note below) are "closed" under certain operations, such as union or concatenation. This means that languages formed from these operations applied to regular languages are also regular (i.e. solvable by Finite State Machine).

A Regular Language is a set of strings that, for all of its members, there exists a single Finite State Machine that can accept all of them. In other words, if those members are given to said Machine as input, all of them would end in accept state. Conversely, an irregular language is almost the same, except there is not a finite state machine that can solve them all.

If you know two regular languages and their finite state machines, you can find the finite state machine that accepts the language made from the union (or concatenation, for example) of those two languages from the two finite state machines.

This is useful in constructing regular expression (regex) patterns, which were originally built on Finite State Machines!

Equivalence of Deterministic and Non-Deterministic Finite State Machines

Most of what we have seen is about deterministic Finite State Machines, in which transition functions have deterministic outcomes. That is, given a current state and an input, there is always one possible non-null state outcome. Moreover, every state has a transition for every input-state combination.

Non-deterministic Finite State Machines do not have this requirement. In other words, a transition could have one or more output states for the same input. How does this make sense? well, when multiple outputs are encountered, you can imagine as if the process splits, with two new state machine processes born out of it, one for each state. Execution continues as normal, and one of these "processes" ending in an accept state is sufficient for the entire input to accepted. If this does not make sense, I will be writing a separate blog post on Non-Deterministic Finite State Machines with examples, so stay tuned! Feel free to check the Wikipedia article in the meantime.

The key takeaway though: it may seen that Non-Deterministic FSMs (Finite State Machine) are more powerful and capable of solving more problems than their deterministic counterpart. In reality, every Non-Deterministic FSM has an equivalent deterministic FSM. Not only that, there is a specific mechanism for converting a Non-deterministic FSM to a deterministic one.

This is quite important. Some problems are much easier to conceptualize with a non-deterministic model, but can be easily converted to a deterministic model, conserving the simplicity of the latter.

The Pumping Lemma

Although out of scope for this article, regular languages have an important mathematical property that they all share called the Pumping Lemma. The key takeaway is that this makes it possible to mathematically prove whether a language (set of inputs) is regular (solvable by finite state machine) without having to search for a specific implementation.

Applications of Finite State Machines

Although important in theory, Finite State Machines have practical applications. They are commonly used in lexical parsing, network devices, machine learning and others.

Moreover, as we saw earlier, you might write a program that is a Finite State Machine without noticing it! You don't have to go out of your way to write programs as FSMs to gain their benefits. Sometimes, it is enough to recognize them in existing problems and solutions. It'll make it easier to map the program's state to the real-world representation. You might represent state and its transitions in a clearer, more maintainable and readable way. Moreover, being more intentional about state management will help you avoid errors in the long run.

Some examples that you are more likely to run into day-to-day:

  • Event-Driven Programming: Event driven programming is increasingly common in distributed systems, and they can often be modeled as state machines
  • Reactive UI State: If you have ever created a React.JS component, you very likely modeled your component state as a state machine
  • Regular Expressions (Regex): Explained in the next section

Regular Expressions (Regex)

You may be surprised to learn that regular expressions are entirely based on finite state machines, at least in their original theory. If you are not familiar, regular expressions are commonly used in text parsing tools, employing a declarative syntax for defining patterns to be matched.

While regex syntax is annoying to some, you might find it easier when you realize that it is based on set operations (like union and Kleene's star).

Note: Most modern implementations of regular expressions have extensions that extend beyond finite state machines, but the base theory and syntax are still based on FSMs.

Finite State Machines are commonly used in parsing applications, and regular expressions is one example. In fact, regular expressions are equivalent to finite state machines. Regular expression syntax is a syntax for defining regular languages. This will be discussed in a future blog post.