# UNIT 1: Big-O Notation

## Table of Contents

## 1 Big-O Notation

### 1.1 What is a Data Structure

Most of you carry around (the same) large black backpack all day.
Think about what's in there, and where it is. Your decisions on what
to put in there and where to put it all were governed by a few
tradeoffs. Are commonly-used things easy to reach? Is the backpack
small and easy to carry? Is it easy to find everything in there? Do
you really *need* all that stuff every day?

Organizing data in a computer is no different; we have to be able to make decisions on how to store data based on understanding these tradeoffs. How much space does this organization take? Is the operation I want to perform fast? Can another programmer understand what the heck is going on here?

As a result, we have to start with intelligent ways of discussing these tradeoffs. Which means, we have to start with…

### 1.2 Algorithm Analysis

In order to measure how well we store our data, we have to measure the
effect that data storage has on our algorithms. So, how do we know our
algorithm is any good? First off, because it works - it computes what we want
it to compute, for all (or nearly all) cases. At some level getting working
algorithms is called *programming*, and at another it's an abstract formal
topic called *program correctness*. But what we mostly will talk about
here is if we have two algorithms, which is better?

So, why should we prefer one algorithm over another? As we talked about,
this could be speed, memory usage, or ease of implementation. All three are
important, but we're going to focus on the first two in this course, because
they are the easiest to measure. We'll start off talking about time because
it is easier to explain - once you get that, the same applies to space. Note
that sometimes (*often*) space is the bigger concern.

We want the algorithm that takes less time. As soon as we talk about "more" or "less", we need to count something. What should we count? Seconds (or minutes) seems an obvious choice, but that's problematic. Real time varies upon conditions (like the computer, or the specific implementation, or what else the computer is working on).

For example, here are two algorithms for computing

public double f_alg1(double x, int n) { // Compute x^8 double x8 = 1.0; for(int i = 0; i < 8; i++) x8 = x8*x; double s = 0.0; for(int k = 1; k <= n; k++) { double f = 1.0; for(int i = 1; i <= k; i++) f = f*i; // Compute k! s = s + x8/f; // Add next term } return s; }

public double f_alg2(double x, int n) { // Compute x^8 double x2 = x*x; double x4 = x2*x2; double x8 = x4*x4; double s = 0.0; double f = 1.0; for(int k = 1; k <= n; k++) { f = f*k; // Compute k! s = s + x8/f; // Add next term } return s; }

We can use Java's `System.currentTimeMillis()`

which returns the
current clock time in milliseconds to time these two functions.
Here are the running times for various values of `n`

:

`n` |
`f_alg1` |
`f_alg2` |
---|---|---|

0 | 0 | 0 |

1000 | 92 | 3 |

2000 | 353 | 6 |

3000 | 788 | 7 |

4000 | 1394 | 10 |

5000 | 2172 | 12 |

Both get slower as n increases, but algorithm 1 gets worse faster. Why does algorithm 1 suck?

What happens if we run one of them on a different computer? Will we get the same times? Probably not– processor speed, operating system, memory, and a host of other things impact how much time it takes to execute a task.

What we need, then, is a machine- and scenario-independent way to describe how much work it takes to run an algorithm. One thing that doesn't change from computer to computer is the number of basic steps in the algorithm (this is a big handwave, but generally true), so lets measure steps.

We are going to vaguely describe a "step" as a primitive operation, such as:

- assignment
- method calls
- arithmetic operation
- comparison
- array index
- dereferencing a pointer
- returning

It is important to note that this has a great degree of flexibility: one person's basic step is another person's multiple steps. To an electrical engineer, taking the not of a boolean is one step because it involves one bit. Adding two 32-bit numbers is 32 (or more!) steps. We won't be quite so anal.

So, when we determine how long an algorithm takes, we count the number
of steps. More steps `= more work =`

slower algorithm, regardless of
the machine.

What does the following code do, and how many steps does it take?

max = a[0]; //2 steps: array index and assignment i = 1; // 1 step: assignment while (i < n) // n comparisons, plus (n-1) times through the loop if (a[i] > max) // 2: array and comparison max = a[i]; //2, but it doesn't always happen! i++; // 2: i=i+1; return max; // 1: we'll count calls and returns as 1.

This gives us: \(2 + 1 + n + (n - 1)\cdot(4 \text{ or } 6) + 1\).

Now, we're not quite done. In the code above, there are circumstances under which it will run faster than others. For example, if the array is in descending order, the code within the if statement is never run at all. If it is in ascending order, it will be run every time through the loop. We call these "best case" and "worst case" run times. We can also consider and talk about the "average" run time.

The best case is usually way too optimistic (most algorithms have best case times that are far away from average). Average case is often hard, because we need probabilistic analysis. In many cases, a worst case *bound*> is a good estimate. That way we can always say, "no matter what, this program will take no more than this many steps." In the formula above, this means the \((4\text{ or }6)\) part just becomes 6.

### 1.3 Comparing algorithms with bounds

In the program above, the number of steps it takes depends on the size
of the array, the `n`

. What we really want is a *function* that takes
the size of the array as input, and returns the number of steps as the
output. Note that is what we developed above.

So when we're comparing algorithms, we don't compare numbers, but functions. But how do we compare functions? Which function below is "bigger"?

Well, sometimes one, and sometimes the other. So what we'll do is
talk about one function *dominating* the other. We'll say that one
function dominates another if it is always bigger than the other for
every x value (for example \(f(x) = x^2\) dominates \(f(x)=-x^2\)).
Neither of the functions in the graph above dominates the other.

One function can dominate another in a particular range as well. \(f(x)=1\) dominates \(f(x) = x^2\) in the range \([0;1]\).

Now, when input sizes are small, everything is fast. Differences
in run time are most relevant when there is a lot of data to process.
So, we colloquially ask **"When n is large, does one dominate the
other?"** Mathematically, we ask if there exists an \(n_0\geq 1\)
such that for all \(n\ge n_0\), \(f(n) \gt g(n)\) (or \(g(n) \gt
f(n)\)).

In this graph, \(f(x)=3x^2-8\) dominates \(f(x)=x^2\) everywhere to
the right of \(x=2\). So \(n_0=2\) - I like to call this the
"*crossover point*".

### 1.4 How can we tell if one will dominate the other for large n?

It turns out it's really easy. It is all about growth rate. Take a look at this graph of \(f(n)= n^2\) vs. \(g(n) = 3n+4\).

What happens if we change the constants in \(g(n)\)? The line moves, but it doesn't change shape, so there's always some point at which f(n) passes it. Here for example is \(f(n)= n^2\) vs. \(g(n) = 30n+40\).

As soon as we start caring about really big n, **it doesn't matter what
the constant coefficients are**, \(f(x)\) will always find a point at
which it will dominate \(g(x)\) everywhere to the right. In the
following table, \(f(n)\) will dominate \(g(n)\) for some large enough
\(n\):

\(f(n)\) | \(g(n)\) |
---|---|

\(n^2\) | \(n\) |

\(n^2-n\) | \(5n\) |

\(\frac{1}{10^{10}}n^2-10^{10}n-10^{30}\) | \(10^{10} n + 10^{300}\) |

### 1.5 Big-O

We talk about the runtime of algorithms by categorizing them into sets of
functions, based **entirely** on their largest growth rate.

The mathematical definition: if two functions \(f(n)\) and \(f'(n)\) are both eventually dominated by the same function \(c\cdot g(n)\), where \(c\) is any positive coefficient, then \(f(n)\) and \(f'(n)\) are both in the same set. We call this set \(O(g(n))\) (pronounced "Big-Oh of \(g(n)\)"). Then we can write \(f(n) \in O(g(n))\) and \((f'(n) \in O(g(n))\).

Since the constant coefficients don't matter we say that \(g(n)\) eventually dominates \(f(n)\) if there is ANY constant we can multiply by \(g(n)\) so that it eventually dominates \(f(n)\). \(f(n)=5n+3\) is obviously \(O(n^2)\). It is also \(O(n)\), since \(f(n)\) is dominated by \(10n\), and 10 is a constant.

The formal definition: For a function \(f(n)\), if there exists a \(c\) and \(n_0\) such that \(c \gt 0\) and \(n_0 \geq 1\), and for all \(n \ge n_0, f(n) \leq c \cdot g(n)\), then \(f(n) \in O(g(n))\).

While \(f(n)=5n+10 \in O(n^2)\), it is also \(O(5n)\). \(O(5n)\) is preferred over \(O(n^2)\) because it is closer to \(f(n)\). \(f(n)\) is ALSO \(\in O(n)\). \((O(n)\) is preferred over \(O(5n)\) because the constant coefficients don't matter, so we leave them off.

## 2 Big-O Applications

### 2.1 Now we know Big-O

We're a week in, and you've already learned the most important part of this class: Big-O notation. You absolutely have to understand this to be a computer scientist. Here are some examples of Big-O notation appearing in academic papers by your faculty:

"Since the program doubles at each step, the size of \(\Psi_n\) is \(O(2^n)\)." – Chris Brown and James Davenport, "The Complexity of Quantifier Elimination and Cylindrical Algebraic Decomposition."

"The initialization in Steps 2–5 and the additions in Steps 10 and 12 all have cost bounded by \(O(n/r)\), and hence do not dominate the complexity."

– Dan Roche, "Chunky and Equal-Spaced Polynomial Multiplication"

"Unfortunately, the expressiveness of the model must be weighed against the \(O(n^3)\) cost of inverting the kernel matrix."

– Gavin Taylor, "Feature Selection for Value Function Approximation."

"Fortunately, efficient dynamic programming techniques exist to solve this exactly in \(O(k^2)\) time."

– Adam Aviv, "Practicality of Accelerometer Side Channels on Smartphones"

"Our P-Ring router, called Hierarchical Ring (HR), is highly fault tolerant, and a rounder of order \(d\) provides guaranteed \(O(log_d P + m)\) range search performance in a stable system with P peers, where m is the number of peers with items in the query range."

– Adina Crainiceanu, et al., "Load Balancing and Range Queries in P2P Systems Using P-Ring."

### 2.2 Comparing Two Algorithms with Big-O

We now have the language and a basic understanding of how to compare two algorithms, let's put it to the test. Suppose we have two sorted arrays of integers, and we need to determine if an integer is in the array. Consider the two search routines over that sorted array.

int find1(int A[], int toFind){ for(int i=0;i<A.length;i++){ if(A[i] == toFind) return i; } return -1; } int find2(int A[], int toFind){ int left = 0; int right = A.length-1; while(true){ if(right-left <= 1){ if(toFind == A[right] && right > 0) return right; else if(toFind == A[left]) return left; else break; //didn't find it } int mid = (left + right)/2; if(toFind < A[mid]) right = mid-1; else if(toFind > A[mid]) left = mid+1; else return mid; // must be equal, so it's found } return -1; }

The question is, **which is faster?** You might say that `find(1)`

should be faster because it has less code. Or, you might say, we can
only know if we run the routines, but we now know a better way with
big-O.

The first thing we need to recognize is that we are concerned with the
*worst-case* performance, which is what Big-O measures. If we find the
items in the array, we can do so really quickly. For example, it could
be the first thing we look at. So the worst case performance is when
we can't find the item in the array, and thus have to search the most
amount for it.

In the worst case analysis, at least for `find1()`

, let's be verbose
at first to identify the total number of steps involved. The would be:

- Initialize the variable
`i`

, one step - Loop for
`A.length`

times:- Compare
`i<A`

, one step - Access
`A[i]`

, one step - Compare
`A[i]`

and`toFind`

, one step

- Compare
- A return (one step) either in the loop or outside

If we say the length of the array is `n`

, then we have 1+3n+1 total
steps, and since we don't care about constants, thats \(O(n)\). You may
notice that the driving force in the anlaysis is the total number of
comparisons that need to be performed. When we say \(O(n)\) for the find
routine, we are really saying it may take *at most* `n`

comparisons to
find what is being searched for.

Turning to `find2()`

, let's now reason about the total
number of comparisons this routine makes. The procedure is different
here because instead of searching from front to back we are leveraging
the fact that the array is sorted and making our first comparison in
the middle. In the worst case, we don't find it, so we then ask: *Is
the item be searched for greater or less than what was just checked?*
If it is greater, we then search in the middle of the right half, and
if it is less than, we search in the middle of the left half. This
repeats until there are only 1 or 2 items left to check.

How many comparisons is that? Well, let's think of it this way, the range of items we are checking in the middle of is decreasing by a factor of 2 each time. As an example, if we have 100 items, we check once in the middle, then in the middle of 50 items, then in the middle of 25 items, then in the middle of 12 items (with rounding), then in the middle of 6 items, then in the middle of 3 items, then in the middle of 1 items. The total number of comparisons is the same as the total number of times we can divide 100 by 2 because we make one comparison per dividing the list in half.

Generalizing the problem a bit, if we have \(n\) items in the array, the total number of comparisons is the same as the number of times 2 divides into \(n\). Put another way, that is the same as solving the following for \(x\) in the following formula:

\begin{equation*} {2}^x \ge n\\ \end{equation*}Or, \(x\) is the number of times we can multiple 2 together until we get over \(n\). To solve for \(x\), we can take the \(log_2\) of both sides to get:

\[ log_2(2^x) \ge log_2(n)\\ x \ge log_2(n)\\ \]

The total number of comparisons requires is at least \(O(log_2(n))\)
which is the same as \(O(log(n)/log(2))\). Considering that \(1/log(2)\) is
a constant, we can remove that and we are left with \(O(log(n))\) as
the worst case performance of `find2()`

. For reference, another name
for this routine is **binary search** and it will come up again later in
the notes.

With all of that anlaysis done, we have `find1()`

is \(O(n)\) and
`find2()`

is \(O(log(n))\), so we can now say that `find2()`

should be
faster, and it is!

### 2.3 Our Main Big-O Buckets and Terminology

As we just saw, big-O is extremely useful because it allows us to put algorithms into buckets, where everything in that bucket requires about the same amount of work. It also allows us to arrange these buckets in terms of performance.

First, though, it's important that we use the correct terminology for these buckets, so we know what we're talking about. In increasing order…

*Constant time*: The run time of these algorithms does not depend on the input size at all. \(O(1)\).*Logarithmic time*: \(O(\log(n))\). Remember, changing the base of the log is exactly the same as multiplying by a coefficient, so we don't tend to care too much if it's natural logarithm, base-2 logarithm, base-10 logarithm, or whatever.*Linear time*: \(O(n)\).*n-log-n time*: \(O(n \log(n))\).*Quadratic time*: \(O(n^2)\).*Cubic time*: \(O(n^3)\).*Polynomial time*: \(O(n^k)\), for some constant \(k\gt 0\).*Exponential time*: \(O(b^n)\), for some constant \(b \gt 1\). Note this is different than polynomial.

Fuzzily speaking, things that are polynomial time or faster, are called "fast," while things that are exponential, are "slow." There is a class of problems that are believed to be slow called the NP-complete problems. If you succeed in either (a) finding a fast algorithm to compute an NP-complete problem, or (b) prove that no such algorithm exists, congratulations! You've just solved a Millenium Problem get a million dollars, and are super-famous. If you take the algorithms class, you'll learn how to do that.

The following table compares each of these buckets with each other, for increasing values of \(n\).

\(n\) | \(\log(n)\) | \(n\) | \(n\log(n)\) | \(n^2\) | \(n^3\) | \(2^n\) |
---|---|---|---|---|---|---|

4 | 2 | 4 | 8 | 16 | 64 | 16 |

8 | 3 | 8 | 24 | 64 | 512 | 256 |

16 | 4 | 16 | 64 | 256 | 4096 | 65536 |

32 | 5 | 32 | 160 | 1024 | 32768 | 4294967296 |

64 | 6 | 64 | 384 | 4096 | 262144 | 18446744073709551616 |

128 | 7 | 128 | 896 | 16384 | 2097152 | 340282366920938463463374607431768211456 |

256 | 8 | 256 | 2048 | 65536 | 16777216 | 115792089237316195423570985008687907853269984665640564039457584007913129639936 |

512 | 9 | 512 | 4608 | 262144 | 134217728 | 13407807929942597099574024998205846127479365820592393377723561443721764030073546976801874298166903427690031858186486050853753882811946569946433649006084096 |

1024 | 10 | 1024 | 10240 | 1048576 | 1073741824 | 179769313486231590772930519078902473361797697894230657273430081157732675805500963132708477322407536021120113879871393357658789768814416622492847430639474124377767893424865485276302219601246094119453082952085005768838150682342462881473913110540827237163350510684586298239947245938479716304835356329624224137216 |

To understand just how painful exponential time algorithms are, note that the number of seconds in the age of the universe has about 18 digits, and the number of atoms in the universe is a number with about 81 digits. That last number in the table has 309 digits. If we could somehow turn every atom in the universe into a little computer, and run all those computers in parallel from the beginning of time, an algorithm that takes \(2^n\) steps would still not complete a problem of size \(n=500\).

### 2.4 Other Big-Stuff

Through this entire section, we are still talking about worst-case run times.

Remember, if \(f(n)\in O(g(n))\), then we can come up with a \(c\) such that for all \(n \ge n_0\), \(f(n)\leq c\cdot g(n)\).

If we replace the second part of that definition with a greater-than
like \(f(n) \ge c \cdot g(n)\), then \(f(n)\in \Omega (g(n))\). That is
read, "big-Omega" of \(g(n)\) and we can think of this as a *lower
bound*, meaning, for an algorithm, it can't essentially be faster than
the \(g(n)\).

Generally, we use Big-Omega when describing the best case performance of an algorithm because we are concerned with the lower-bound on how fast a routine can complete. When we are performing a worst case analysis, we are concerned with bounding how slow a routine can complete, and thus we use Big-O.

As an example of \(\Omega\) analysis, return to the two find routines
above. In that analysis, we said the worst case is \(O(n)\) for
`find1()`

and \(O(log(n))\) for `find2()`

, but what about the best case
analysis? As identified, we could find the searched item on our first
comparison, so both routines are \(\Omega(1)\), or constant time best
case.

Sometimes, when performing an analysis you'll find that the Big-O upper-bound and the Big-Omega lower-bound are the same. When this happens, we call this Big-Theta, and it is a special case where \(f(n)\in O(g(n))\), and \(f(n)\in\Omega(g(n))\). Obviously, big-Θ is most precise "Big" notation. In this class, we'll mostly talk big-O. If you take algorithms, that will mostly be big-Θ.

### 2.5 Analyzing Selection Sort

Let's consider one more routine for analysis, say selection sort, something you should remember from your intro to computing class.

void selectionSort(int A[]){ for(int i=0;i<A.length-1;i++) for(int j=i+1,j<A.length;j++) if(A[i] > A[j]) swap(A,i,j); //swap routine }

Starting with the worst-case analysis, using our big-O analysis tools, we need to identify the dominate step, which in this case is going to be how many comparisons must be made to ensure that all the right elements are swapped to get a sorted ordering. To make the math easier, without generalization, lets say that the array has \(n+1\) elements in it. Then the number of comparisons we must make

\[ \sum^{n}_{i=0} n-i \]

That is, the first time through the inner loop, we do n comparison, then n-1, then n-2, and so on until we do 2, and then finally 1. If we reverse the order, of the summation, say add 1, 2, …, n-1, n, then we can write the summation more simply like so:

\[ \sum^{n}_{i=1} i = \frac{n(n+1)}{2} \]

The solution to the sum of the numbers between 1 and n is known to be the common formula, \(n(n+1)/2)\). To quickly give you the intuition why this is so, consider doubling the items in the summation and then reversing the order of one series of sums and aligning the two series such that the first and the last number in each summation series align. Visually, you'd get something that looked like this:

\begin{array} \quad & 1 &+& 2 &+& 3 &+& \ldots &+& n-2 &+& n-1 &+& n\\ + & n &+& n-1 &+& n-2 &+& \ldots &+& 3 &+& 2 &+& 1\\ \hline \quad & n+1 &+& n+1 &+& n+1 &+& \ldots &+& n+1 &+& n+1 &+& n+1 \quad = \quad n(n+1)\\ \end{array}And since we doubled the list, dividing by 2 gives the desired result of \(n(n+1)/2\) or \(1/2(n^2+n)\). As the \(n^2\) dominates, selection sort is \(O(n^2)\)

Let's consider now a best case analysis of selection sort. How fast could selection sort possibly run? It turns out that it is \(\Omega(n^2)\). No matter the input length \(n\), it will take at least \(\Omega(n^2)\) comparison steps because the routine can't be sure if it is sorted ahead of time. Thus \(n^2\) is a lower bound and an upper bound on performance for selection sort for a best-case and worst-case scenario.

As the the upper and lower bound are the same for the best-case and
worst-case scenario, then we can say that in the average-case that
selection sort is Θ(n^{2}).

## 3 Big-O of Arrays and Linked Lists

### 3.1 Analyzing Data Structures we already know

We are already quite comfortable with two types of data structures: arrays, and linked lists (we also know array lists, but those require some more complicated analysis). Today we're going to apply our analysis skills to the operations we commonly do with these data structures.

### 3.2 Access

"Access" means getting at data stored in that data structure. For example, getting the third element.

In an array, this is really easy! `theArray[3]`

Does the time this
takes depend upon the size of the array? No! So, it's constant time.
\(O(1)\).

In a Linked List, this is harder. Now, if we were looking for the
first element, or the third element, or the hundredth element, then
that does NOT depend upon the size of the list, and that would be
\(O(1)\). However, remember that we are interested in *worst case*, and
if we think about this as a game where an adversary gets to choose
which element to make our algorithm the slowest, then what would he
choose?

You might think that the adversary should choose the last element, but that could be O(1) because many linked list are doubly linked with a reference to the head and tail of the list and pointers between nodes pointing forward and backwards. In fact, the worst case element to reference is right in the middle, which is furtherest from the head and tail references requiring at least \(n/2\) pointer follows to be reached. Thus, access for Linked Lists is \(O(n)\).

(As an aside, you might be saying, why not just store a reference to the middle item too to remove the worst case? Then, the worst case shifts to the middle of the two halfs, requring \(n/4\) pointer follows, and we are back at \(O(n)\). Taking this one step further, you might just say, why not just store two more references to the two node in the middle of the two halves, but then the worst case shifts again to be the middle of any of the four segments leaving us with \(n/8\) and still \(O(n)\). You can keep doing this, adding references ad infinitum, but then you need to store and manage \(n\) references to each of the \(n\) nodes: what structure are you going to store them in?! Another Linked Lists an Array? And, then you're back where you started, and the logic falls a part.)

### 3.3 Adding an Element to the Front or Back

In an array, this is hard! Because we now have an additional element to store, we have to make our array bigger, requiring us to build a new array of size \(n+1\), move every element into it, and add the new one. That "move every element into it," will take \(O(n)\) time.

In a linked list, this is really easy! If we have a pointer to the first and last link in the chain, this takes the same amount of time, no matter how long the list is! \(O(1)\).

It's worth noting that removing an element requires a similar process.

### 3.4 Adding an Element to a Sorted List or Array

The scenario: assume the array or list is in sorted order, and we have a new element to add. How long will it take (as always, in the worst case)?

This is a two-step process: (1) find where the element goes, and (2) add the element.

In arrays, we recently saw two ways to do (1); one where we cycled
through the entire array looking for the proper location (\(O(n)\),
and one where we repeatedly halved the array until we found the spot
(this is called *binary search*, and takes \(O(log(n))\)). Doing
step (2) is very similar to what we had to do for adding an element to
the front or back, so that step is \(O(n)\). So, in total, this takes
us \(O(log(n)) + O(n)\), which is \(O(n)\).

Alternatively, you could skip binary search, and do a sequential search (go from element to element, front to back until you find it), moving elements to the new array as you go. Once you find the location to add your new element, add it to the new array, and continue moving elements. This gives you two loops, each of which move roughly half the array. Again, in total, this gives us \(O(n)\).

In a linked list, doing step (1) is \(O(n)\) (see why?). Adding the element at that point is \(O(1)\). So, in total, \(O(n)\).

In summary, it's \(O(n)\) for either an array or a linked list. But there's a big difference in how this breaks down. For arrays, the \(O(n)\) really comes from the "insertion" part, whereas for linked list the insertion part is easy and the \(O(n)\) comes from the "search" part. This difference will come into play in some applications later on.

### 3.5 Traversal

Traversal is our word for "visiting every element." It should be clear that for both arrays and linked lists, this is \(O(n)\).

### 3.6 Why did we do this?

The executive summary is as follows:

Array | Linked List | |
---|---|---|

Access | \(O(1)\) | \(O(n)\) |

Insertion/Remove from front/back | \(O(n)\) | \(O(1)\) |

Sorted insertion/remove | \(O(n)\) | \(O(n)\) |

Traversal | \(O(n)\) | \(O(n)\) |

We'll be doing this all semester, for each of our data structures. As you can see, the choice between arrays and linked lists, comes down to what you're doing with them. Doing lots of access? Use an array. Adding and removing elements often? Use a linked list. This give-and-take is at the center of Data Structures.

Credit: Ric Crabbe, Gavin Taylor and Dan Roche for the original versions of these notes.