Big-O Basics

March 2, 2020 in Computer Science | Big-O, Performance, Complexity, Algorithms, Computer Science

Big-O helps us reason about the relative performance of a function or algorithm. Essentially, it tells us how the runtime grows as the input size grows. Big-O is an abstract tool: it deliberately ignores processor power, memory, and other hardware details, and focuses on the worst-case growth of the logic as written. Let’s dive into some basic examples and learn how to start thinking about performance in software.

I’ll be using ‘pseudocode’ instead of using any particular language. This means that it’s not real code that can be executed - it’s a ‘sketch’ that helps us understand the concepts in a language-agnostic way.

Constant Time: O(1)

Let’s take a very simple function that simply prints the argument.

function printArg(arg) {
  print(arg);
}

How long does this function take to run if the argument is an array with 1 item? What about 100 items? Notice that whether the argument is an array of 1, 100, or 1,000,000 items, the function performs exactly one action - a single print call. The amount of work never depends on the size of the input.

This is known as constant time, represented by O(1). The performance will not drastically differ based on input size. Again, we’re not thinking about how beefy the computer is, just the complexity of the logic.
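The pseudocode above maps directly to real code. Here is a minimal sketch in Python (the pseudocode is intentionally language-agnostic; Python is just one concrete choice):

```python
def first_item(items):
    # Touches exactly one element, no matter how long `items` is:
    # constant time, O(1).
    return items[0]

# The work done is the same for a tiny list and a huge one.
assert first_item([42]) == 42
assert first_item(list(range(1_000_000))) == 0
```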

Linear Time: O(n)

Now, a function that takes an action (printing) on each item in the argument/array. Note: n represents the size of the input in Big-O notation.

function printArgItems(arg) {
  for each item in arg {
    print(item);
  }
}

So, if the array has one item, print is called once. If it has 100 items, print is called 100 times.

This is known as linear time, represented by O(n). The time complexity varies 1:1 with the size of the input.
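A concrete Python sketch of the same loop, returning the number of print calls so the 1:1 relationship is visible:

```python
def print_items(items):
    # One print per item: n items means n prints, so O(n).
    count = 0
    for item in items:
        print(item)
        count += 1
    return count

# The number of operations grows 1:1 with input size.
assert print_items([1, 2, 3]) == 3
```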

Quadratic Time: O(n²)

Now, a function that prints every combination of 2 items in an argument array.

function printEveryCombination(arg) {
  for each item in arg {
    for each itemToCompare in arg {
      print(item, itemToCompare);
    }
  }
}

With n being the number of items, the inner loop runs n times for each of the n iterations of the outer loop, so print is called n × n = n² times. If there are 10 items in the array, print is called 100 times: each of the 10 items is paired with each of the 10 items.

This is known as quadratic time, represented by O(n²).

Polynomial Time: O(nˣ)

Quadratic time, cubic time, etc., fall into this category. Essentially, the exponent represents the depth of nested loops (or similar operations).
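For example, three nested loops give cubic time. A quick Python sketch, counting iterations instead of printing:

```python
def count_triples(items):
    # Three nested loops over the same input: n * n * n = n³ iterations,
    # so cubic time, O(n³).
    count = 0
    for a in items:
        for b in items:
            for c in items:
                count += 1
    return count

assert count_triples(range(5)) == 125  # 5³
```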

A Practical Example

We want to figure out the time complexity of the following operation, which emails each address in the input list.

function emailEachPerson(emailAddresses) {
  for each email in emailAddresses {
    sendEmail(email);
  }

  print('Sent emails to the following addresses: ' + emailAddresses);
}

The Answer

Breaking this loop down and changing the wording a little bit can help us identify the complexity: “We send 1 email for each 1 email address.” The ratio of actions to items is 1:1, so we are dealing with linear time.
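In Python, the same shape might look like the sketch below. Note that `send_email` is a hypothetical stand-in passed as a parameter, not a real API; here I substitute a list append so the linear call count is observable:

```python
def email_each_person(email_addresses, send_email):
    # One send_email call per address: linear time, O(n).
    # `send_email` is a hypothetical callable, injected for illustration.
    for email in email_addresses:
        send_email(email)
    print('Sent emails to the following addresses:', email_addresses)

sent = []
email_each_person(['a@example.com', 'b@example.com'], sent.append)
assert sent == ['a@example.com', 'b@example.com']
```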

What if there are multiple types of operations?

When you have multiple types of operations (a loop and a single ‘print’ action in this case), you should look at the most complex operation to determine the function’s overall time complexity. A loop over each array item is more computationally expensive than a single action (print).

What if there are multiple of the same type of operation?

In the following example, we have two O(n) operations. We don’t need to do anything special here: Big-O drops constant factors, so O(2n) simplifies to O(n). Even if we had 100 different O(n) operations within this function, we would still go with O(n) time complexity.

function processArrays(arrayOfTenItems, arrayOfOneHundredItems) {
  // Print each item of the first arg
  for each item in arrayOfTenItems {
    print(item);
  }

  // Print each item of the second arg
  for each item in arrayOfOneHundredItems {
    print(item);
  }
}
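A Python sketch of the two sequential loops, counting total operations to show the growth stays linear:

```python
def process_arrays(first, second):
    # Two sequential loops: roughly len(first) + len(second) operations.
    # Dropping constants, O(n) + O(n) is still O(n).
    count = 0
    for item in first:
        count += 1
    for item in second:
        count += 1
    return count

# 10 + 100 = 110 operations: the total grows linearly, never quadratically.
assert process_arrays(range(10), range(100)) == 110
```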

In the next example, there is an O(n) operation (printing each item in the first array), as well as an O(n²) operation (printing every combination of items between the two arrays). We simply drop all but the most expensive term, leaving O(n²) time complexity. The reason we do this is that as the input grows, the n² term dwarfs the n term, so the most expensive operation dominates performance. (Strictly speaking, nested loops over two arrays of different sizes are O(n × m), but the nested-loop shape is what drives the cost.)

function processArrays(arrayOfTenItems, arrayOfOneHundredItems) {
  // Print each item of the first arg
  for each item in arrayOfTenItems {
    print(item);
  }

  // Print every combination of two items between the two arrays
  for each item in arrayOfOneHundredItems {
    for each otherItem in arrayOfTenItems {
      print(item, otherItem);
    }
  }
}
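To see why the nested loop dominates, this Python sketch counts the operations from each part separately:

```python
def count_operations(first, second):
    # One linear pass plus one nested pass; as inputs grow,
    # the nested (quadratic-shaped) term dominates the total.
    linear = 0
    for item in first:
        linear += 1
    nested = 0
    for item in second:
        for other in first:
            nested += 1
    return linear, nested

linear, nested = count_operations(range(10), range(100))
assert (linear, nested) == (10, 1000)  # the nested loop dwarfs the single loop
```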

Going Forward

As you grow as a developer, thinking in this way will come more naturally as you work on larger-scale applications. When you’re dealing with arrays of 10 items in personal or toy projects, you can get away with less performant code. As you scale up to thousands, millions, or even larger batches of data, you are naturally forced to think about more performant ways of processing inputs.

At these scales, one step up in complexity can have a major effect on the time that an operation takes, and what infrastructure you need to support it. Being able to put a name to the time complexity makes it just a little bit easier to brainstorm ways to reduce complexity.