Computational complexity

Before we dive into data structures and algorithms, it's worth discussing computational complexity. As the name suggests, it's both a method of analysis and a whole field of theoretical computer science, used to analyse the complexity of computations of all kinds.

Motivation

Suppose you have a computational problem (a problem that a computer might be able to solve) and you need to devise an algorithm to solve it. Depending on the context, you might need a really fast algorithm. For example, if you're developing an algorithm to place trades in a high-frequency trading platform, speed is critical.

Computational complexity is usually used to analyse how quickly algorithms or operations on data structures run. It's also used to analyse how much computer memory certain operations or data structures might require. The former type of analysis is usually called "time complexity" and the latter is usually called "space complexity".

Computational complexity is more general than just time and space, though. If you're interested in other applications, check out computational complexity theory.

Time complexity

Consider an algorithm that takes in some input and, regardless of what the input is, always executes the same 10 steps and then outputs a result. (If you're not familiar with algorithms, think of them as funnels which take in some input, execute some steps and then output a result. We'll cover them some more in the algorithms overview section.)

No matter how large the input to the algorithm gets, the time the algorithm takes to run will always be the same. We could say that the running time of the algorithm is constant with respect to the size of the input. In computational complexity, this is often referred to as a constant-time algorithm.
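
To make this concrete, here's a minimal sketch in Python (the function name is illustrative, not from the text): a function that does the same fixed amount of work no matter how long its input is.

```python
def first_element(items):
    # One index operation and one return, regardless of whether
    # items has 5 elements or 5 million: constant time.
    return items[0]

print(first_element([10, 20, 30]))        # 10
print(first_element(list(range(10**6))))  # 0 - same number of steps
```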

Now suppose you have an algorithm that takes in a list of n numbers and prints out each of the numbers, one at a time. If we assume that printing a single number takes a constant amount of time, then the running time of this algorithm will grow linearly in proportion to the size of the input list. If the input list has size 5, the algorithm will take 5 steps to run. If the input list has size 500, the algorithm will take 500 steps to run. Such an algorithm is often referred to as a linear-time algorithm.
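
A sketch of that algorithm in Python might look like this (again, the function name is just illustrative) - one constant-time print per element, so n steps for a list of size n:

```python
def print_all(numbers):
    # One print per element: a list of size n takes n steps,
    # so the running time grows linearly with the input size.
    for number in numbers:
        print(number)

print_all([1, 2, 3, 4, 5])  # 5 elements, 5 print steps
```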

Say we modify the previous algorithm slightly. This time, it takes in a sorted list of n consecutive integers starting with 1 - for example, [1, 2, 3, 4, 5]. Instead of simply printing out each number, it prints out the sum of all the positive integers less than or equal to the number.

So, in this case, the algorithm would print: 1, 3, 6, 10, 15.

What's the running time of this algorithm? If we assume that each sum operation takes constant time, then we just need to calculate the number of sum operations and add that to the number of print operations. There'll be n print operations and, for a given number i in the list, there'll be i - 1 sum operations.

Since this is a list of consecutive integers starting at 1, the total number of sum operations will be 0 + 1 + 2 + ... + (n - 1). This is an arithmetic series which sums to n(n - 1)/2. See this triangular number sequence article for a visual derivation.

Adding the n print operations gives n(n - 1)/2 + n = (n² + n)/2 steps in total, so the number of steps the algorithm takes for an input list of size n is quadratic with respect to n. Such an algorithm is referred to as a quadratic-time algorithm.
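
Here's a minimal sketch of that algorithm in Python (the function name is illustrative, and the input is assumed to be [1, 2, ..., n] as described above):

```python
def print_running_sums(numbers):
    # numbers is assumed to be the list [1, 2, ..., n].
    for i in numbers:
        # Recompute 1 + 2 + ... + i from scratch each time:
        # roughly i additions for the number i.
        total = 0
        for j in range(1, i + 1):
            total += j
        print(total)

print_running_sums([1, 2, 3, 4, 5])  # prints 1, 3, 6, 10, 15
```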

There are a whole host of other running times - for example, logarithmic, cubic, exponential and even algorithms which run in roughly n! (n factorial) steps, where n is the input size.
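
As a taste of one of these, here's a sketch of binary search in Python - a classic logarithmic-time algorithm, included purely as an illustration. Each comparison halves the remaining search range, so a sorted list of size n needs only about log₂(n) steps.

```python
def binary_search(sorted_items, target):
    # Each iteration halves the search range, so the loop runs
    # roughly log2(n) times for a list of size n.
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return mid
        elif sorted_items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1  # target not found

print(binary_search([1, 3, 5, 7, 9, 11], 7))  # 3
```

In order to reason about all of these running times more effectively, people who discuss computational complexity often use a particular notation called "big O". Let's look at that next.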

Big O notation

When I was writing this, I suddenly wondered why big O is called "big O". The letter "O" is used because this notation is used to reason about the "order" of various mathematical functions. More specifically, when studying computational complexity, we're often most interested in the growth rates of functions of some input size n, say, and the growth rate is also known as the order of the function.

For example, the time complexity of the constant-time algorithm we discussed earlier is written as O(1) in big O notation. This is pronounced "big O of 1" or "order 1" and it means that, given an input size of n, the algorithm in question takes a constant number of steps in the worst case, with respect to n.

"In the worst case?" Yes, an algorithm might take a different number of steps depending on the type of input it receives, even if the input size is fixed. For example, suppose you have an algorithm that takes in an input list of n numbers and then sorts the numbers in ascending order and outputs the result. If the input list is already sorted in ascending order, then the algorithm doesn't need to do anything. However, if the input list is sorted in descending order, the algorithm will have to do a lot more work to sort the list. This is an example of a worst-case input for a given input size.

The linear-time algorithm from above is denoted as O(n) and pronounced "big O of n" or "order n". An interesting point to note is that, if the algorithm takes 2n steps instead of n, its worst-case running time is still denoted as O(n), not O(2n). Why? The reason we're able to drop the constant is that, when we analyse computational complexity, we're most interested in the rate of growth of the function in question as the input size n gets very large. As n gets very large, the constant 2 in 2n affects the rate of growth much less than the variable n does.
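
For example, here's a sketch of a function (the name and the deliberately wasteful two-pass design are just for illustration) that takes roughly 2n steps yet is still O(n):

```python
def min_and_max(numbers):
    # Two separate passes over the list: about 2n steps in total.
    # The constant 2 is dropped, so this is still O(n).
    smallest = numbers[0]
    for x in numbers:   # first pass: n steps
        if x < smallest:
            smallest = x
    largest = numbers[0]
    for x in numbers:   # second pass: another n steps
        if x > largest:
            largest = x
    return smallest, largest

print(min_and_max([3, 1, 4, 1, 5]))  # (1, 5)
```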

Similarly, if we have an algorithm that takes 2n² + n steps, the rate of growth for very large input sizes is most affected by the highest-order term, n². Therefore, in big O notation, we drop the lower-order term and the constant and say that the running time is O(n²).
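
A quick numerical check shows why the lower-order term barely matters as n grows:

```python
# How much of 2n^2 + n does the lower-order n term account for?
for n in (10, 1_000, 1_000_000):
    quadratic_part = 2 * n * n
    linear_part = n
    share = linear_part / (quadratic_part + linear_part)
    print(f"n = {n}: the n term is {share:.6%} of 2n^2 + n")
```

Even at n = 1,000, the n term contributes less than 0.05% of the total, and its share keeps shrinking as n grows.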

It's also helpful to note that the usage of big O notation in the technology industry often differs from its usage in academia. In academia, big O notation describes an upper bound on the computational complexity of some computation. Therefore, the linear-time algorithm we saw earlier could technically be described as O(n²), even though that's not a tight bound. In industry, however, big O notation is usually used to describe a tight bound (closer to what academics would call big theta, Θ). By that convention, the algorithm would be described as O(n): in the worst case it runs in linear time, and in the best case it also runs in linear time.

If you're like me and love to dig deeper into the details, see Introduction to Algorithms, Third Edition, often referred to as "CLRS", for a more formal treatment of big O notation.

Space complexity

Sometimes, it's important to analyse how much memory a given operation or algorithm will require. This is where space complexity comes into play. As with time complexity, we usually use big O notation to reason about space complexity.

A formal treatment of this topic would require us to specify the computational model - the abstract machine the algorithm runs on - in great detail. Most of the time, however, it's generally assumed that the algorithms or operations are implemented as computer programs and that each step is executed one after the other.

This doesn't account for more complex models, like algorithms that have steps that can be run in parallel, but for the purposes of this manual, it's sufficient. Again, for more formality, check out Introduction to Algorithms, Third Edition.

As a brief example of space complexity, the linear-time algorithm we saw earlier, which simply prints out each of the numbers in the input list, has a space complexity which is also linear, if we assume that the input is an array. Why? Because memory must be allocated for each of the n elements in the input list. (Note that some analyses count only the auxiliary space an algorithm uses beyond its input; by that convention, this algorithm would need only constant, O(1), extra space.)
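
As a contrasting sketch, here are two versions of a summing function (the names are illustrative): one uses only a constant amount of extra memory, while the other allocates a new list of n elements and so uses linear extra memory.

```python
def total_constant_space(numbers):
    # O(1) auxiliary space: a single running-total variable,
    # however long the input is.
    total = 0
    for x in numbers:
        total += x
    return total

def total_linear_space(numbers):
    # O(n) auxiliary space: builds a second list of n partial
    # sums before returning the last one.
    partial_sums = []
    running = 0
    for x in numbers:
        running += x
        partial_sums.append(running)
    return partial_sums[-1]

print(total_constant_space([1, 2, 3]))  # 6
print(total_linear_space([1, 2, 3]))    # 6
```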

A helpful resource

A final note: I came across this big O cheat sheet a while ago. It's a really helpful summary of various time and space complexities, especially if you're prepping for upcoming technical interviews.

Next up are data structures.