Powered by repeated innovations in chip manufacturing, computers have
grown exponentially more powerful over the last several decades. As a
result, we have access to unparalleled computational resources and data.
For example, a single
NASA satellite collects 20 terabytes of images, more than
8 billion searches
are made on Google, and estimates
suggest the internet creates more than 300 million terabytes of data
*every single day*. Simultaneously, we are quickly approaching
the physical limit of how many transistors can be packed on a single
chip. In order to learn from the data we have and continue expanding our
computational abilities into the future, fast and efficient algorithms
are more important than ever.

At first glance, an algorithm that performs only a few operations per
item in our data set seems efficient. However, even these algorithms can
be too slow when the data set is very large. Instead, we turn to
randomized algorithms that can run even faster. Randomized algorithms
typically exploit some source of randomness to run on only a small part
of the data set (or use only a small amount of space) while still
returning an *approximately* correct result.
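As a concrete illustration of this idea (a minimal sketch of our own, not an algorithm from the text; the function name `approx_mean` is hypothetical), we can estimate the mean of a huge data set by averaging a small random sample. The running time depends on the sample size, not on the size of the data set:

```python
import random

def approx_mean(data, sample_size=1000):
    """Estimate the mean of `data` from a small random sample.

    Runs in time proportional to sample_size, not len(data),
    and returns an approximately correct answer.
    """
    # Sample uniformly at random, with replacement.
    sample = random.choices(data, k=sample_size)
    return sum(sample) / len(sample)

random.seed(0)  # for reproducibility of this sketch
data = list(range(1_000_000))  # true mean is 499999.5
estimate = approx_mean(data)
```

By standard concentration arguments (of the kind developed below), the estimate is close to the true mean with high probability, even though we looked at only 1,000 of the 1,000,000 items.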

We can run randomized algorithms in practice to see how well they
work. But we also want to *prove* that they work and understand
why. Today, we will solve two problems using randomized algorithms.
Before we get to the problems and algorithms, we'll build some helpful
probability tools.

Consider a random variable \(X\). For example, \(X\) could be the outcome of a fair die roll and be equal to \(1,2,3,4,5\) or \(6\), each with probability \(\frac{1}{6}\). Formally, we use \(\Pr(X=x)\) to represent the probability that the random variable \(X\) is equal to the outcome \(x\). The expectation of a discrete random variable is \[ \mathbb{E}[X] = \sum_{x} x \Pr(X=x). \] For example, the expected outcome of a fair die roll is \(\mathbb{E}[X] = 1 \times \frac{1}{6} + 2 \times \frac{1}{6} + 3 \times \frac{1}{6} + 4 \times \frac{1}{6} + 5 \times \frac{1}{6} + 6 \times \frac{1}{6} = \frac{21}{6} = 3.5\). Note: If the random variable is continuous, we can similarly define its expected value using an integral.
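The die-roll expectation above can be checked directly from the definition \(\mathbb{E}[X] = \sum_{x} x \Pr(X=x)\). A short sketch, using exact fractions to avoid floating-point rounding:

```python
from fractions import Fraction

# Outcomes of a fair six-sided die, each with probability 1/6.
outcomes = [1, 2, 3, 4, 5, 6]
prob = Fraction(1, 6)

# E[X] = sum over x of x * Pr(X = x)
expectation = sum(x * prob for x in outcomes)
print(expectation)  # 7/2
```

This matches the hand computation: \(\frac{21}{6} = \frac{7}{2} = 3.5\).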

The expected value tells us where the random variable is on average but we're also interested in how closely the random variable concentrates around its expectation. The variance of a random variable is \[ \textrm{Var}[X] = \mathbb{E}\left[(X - \mathbb{E}[X])^2\right]. \] Notice that the variance is larger when the random variable is often far from its expectation. In the figure below, can you identify the expected value for each of the three distributions? Which distribution has the largest variance? Which has the smallest?
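Continuing the die-roll example, the variance can also be computed straight from the definition \(\textrm{Var}[X] = \mathbb{E}[(X - \mathbb{E}[X])^2]\). A short sketch with exact fractions:

```python
from fractions import Fraction

# A fair six-sided die: each outcome has probability 1/6.
outcomes = [1, 2, 3, 4, 5, 6]
prob = Fraction(1, 6)

mean = sum(x * prob for x in outcomes)  # E[X] = 7/2
# Var[X] = E[(X - E[X])^2] = sum over x of (x - E[X])^2 * Pr(X = x)
variance = sum((x - mean) ** 2 * prob for x in outcomes)
print(variance)  # 35/12
```

The value \(\frac{35}{12} \approx 2.92\) measures how spread out the die roll is around its mean of \(3.5\).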