Randomness

Today we talk about how to generate random numbers in Python. Why? We often need a random number generator to simulate random processes in the world. Rolling dice probably comes to mind, and if we are creating a game in Python, randomness is essential. What about data? Data scientists frequently need to sample from their data to run experiments, sometimes because the dataset is so huge that computing on all of it is too expensive. Choosing a random subset of your data helps you avoid bias. The same idea applies when we collect data in the first place: we should collect evenly across all data sources, and randomness helps us do that. You don't want to poll political leanings in only one town, but rather poll a few people in each town to get a more representative sample of the population.

We won't get into statistics in this class, but today will show you the basic randomness tools at your disposal.

How does a computer create a random number?

The gory details are beyond the scope of this class, but it's helpful to understand where these numbers come from. Random number generators use an algorithm (an equation) to pick a pseudorandom number. We call it pseudorandom because the sequence only approximates the properties you would see in sequences of truly random numbers (such as those from natural processes). In a computer program, these pseudorandom numbers start from a seed number, and the generator computes the first random number from that seed. Each subsequent random number is computed from the prior random number, and so forth. This also means that the same seed always produces the same sequence. Here's an example of one such equation (a linear congruential generator) to generate the next pseudorandom number:

next_random = (a * prior_random + c) % m

The variables in this equation (a, c, m) are hard-coded ahead of time. For instance, m determines the range of numbers it generates! Think about it: because of the % m, each number will be between 0 and m-1.
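To make the equation concrete, here is a minimal sketch of a linear congruential generator. The function name lcg and the tiny constants (a=5, c=3, m=16) are made up for illustration only; real generators use much larger values.

```python
def lcg(seed, a=5, c=3, m=16, count=5):
    """Return `count` pseudorandom numbers from a toy linear congruential generator.

    The constants a, c, m are illustrative assumptions, not values from any real library.
    """
    values = []
    prior_random = seed
    for i in range(count):
        # Same equation as above: the next number depends only on the prior one.
        next_random = (a * prior_random + c) % m
        values.append(next_random)
        prior_random = next_random
    return values

print(lcg(seed=7))  # [6, 1, 8, 11, 10]
print(lcg(seed=7))  # [6, 1, 8, 11, 10]  (same seed, same sequence)
print(lcg(seed=8))  # [11, 10, 5, 12, 15]
```

Notice that every value is between 0 and 15 because m is 16, and that the sequence is completely determined by the seed.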

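The above is an older algorithm that isn't used so much anymore because of some less desirable properties of the random sequences it generates. Python's random module uses a different algorithm (the Mersenne Twister), but the idea is the same. Suffice it to say, when you call a random library's functions, it's using this type of algorithm: start from a seed and compute each new number from the generator's prior state.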
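You can see the seed at work in Python itself. The sketch below uses random.seed() to set the seed by hand; the seed value 42 and the helper name five_rolls are arbitrary choices for this example.

```python
import random

def five_rolls():
    """Return a list of five simulated dice rolls."""
    rolls = []
    for i in range(5):
        rolls.append(random.randrange(1, 7))
    return rolls

random.seed(42)          # start the generator from a known seed
first_run = five_rolls()

random.seed(42)          # reset to the same seed
second_run = five_rolls()

print(first_run)
print(second_run)
print(first_run == second_run)  # True
```

If you never call random.seed(), Python picks a seed for you, which is why your programs normally produce different numbers on each run.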

The random library

Import the random module to access Python's core random functionality.

import random

Common functions for generating one random number

  1. random.random() # float [0,1)
  2. random.uniform(5,10) # float [5,10] (the endpoint 10 may or may not be included, depending on floating-point rounding)
  3. random.randrange(5,10) # int [5,10)
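Here is a quick sketch that calls each of these, plus random.randint(), which appears in the dice program later in these notes. The variable names are just for illustration, and your output will differ on each run.

```python
import random

x = random.random()          # float, 0 <= x < 1
y = random.uniform(5, 10)    # float between 5 and 10
z = random.randrange(5, 10)  # int from 5 to 9 (10 is excluded)
w = random.randint(5, 10)    # int from 5 to 10 (both ends included)

print(x, y, z, w)
```

The difference between randrange and randint is easy to trip over: randrange follows the same rule as range and leaves out the upper bound, while randint includes it.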

Common functions for manipulating lists of values

  1. random.choice(mylist) # returns one value randomly selected from mylist
  2. random.sample(mylist,k) # returns a new list of k values chosen from mylist without replacement
  3. random.shuffle(mylist) # shuffles the given list in place (destructive), returns None
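A small sketch of all three list functions follows. The list of colors is made up for the example.

```python
import random

mylist = ['red', 'green', 'blue', 'yellow']

one = random.choice(mylist)     # a single value from the list
two = random.sample(mylist, 2)  # a new list with 2 values, no position picked twice
print(one)
print(two)
print(mylist)                   # choice and sample leave mylist unchanged

result = random.shuffle(mylist) # reorders mylist itself
print(result)                   # None
print(mylist)                   # same four values, rearranged
```

Because shuffle changes the list you give it and returns None, writing mylist = random.shuffle(mylist) is a common mistake that throws your list away.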

You should be able to understand this program.

import random

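def powerball():
    # Five white balls, each from 1 to 69 (randrange excludes the upper bound).
    # Note: duplicates are possible here, unlike in a real drawing.
    nums = []
    for i in range(5):
        nums.append(random.randrange(1,70))
    # One red Powerball from 1 to 26.
    nums.append(random.randrange(1,27))
    return nums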

# Generate 100 powerball tickets!
tickets = []
for i in range(100):
    tickets.append(powerball())

# Pick 2 as winners (not actually how powerball works)
winners = random.sample(tickets,2)

for win in winners:
    print("WINNER!", win)

# Roll dice and compute the average.
# Try this for small and large numbers. What should the average be?
import random

N = int(input('How many dice rolls? '))

total = 0
for i in range(N):
    roll = random.randint(1,6)
    total += roll

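# Avoid dividing by zero if the user asks for 0 rolls.
if N > 0:
    print('Average is', total/N)
else:
    print('No rolls, so there is no average.')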
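If you would rather not retype the number of rolls each time, here is one possible variation that wraps the experiment in a function and tries several sizes in a row. The function name average_roll and the particular sizes are arbitrary choices for this sketch.

```python
import random

def average_roll(n):
    """Roll a six-sided die n times and return the average roll."""
    total = 0
    for i in range(n):
        total += random.randint(1, 6)
    return total / n

# Compare a small, a medium, and a large number of rolls.
for n in [10, 1000, 100000]:
    print(n, 'rolls, average is', average_roll(n))
```

Run it a few times. The average for 10 rolls jumps around from run to run, while the average for 100000 rolls barely moves.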