Christophe Pallier

Sept 2013

The aim is to assess how well the proportion in a sample estimates the proportion in a population.

- Create a population of N=1 million voters out of which 52% support a candidate A, and the rest support candidate B. Use a numpy array.

```
import numpy as np
N = 1000000
p = .52
pop = np.zeros(N)
pop[0:int(N*p)] = 1
np.mean(pop)
```

- Extract a random sample of s=100 voters and compute the proportion of A supporters.

```
s = 100
import random
samp = random.sample(pop, s)
np.mean(samp)
```

- Rerun the preceding process 1000 times and plot an histogram of the estimated proportions (using matplotlib.pyplot.hist).

```
sampmeans = [ np.mean(random.sample(pop, s)) for i in range(1000) ]
import matplotlib.pyplot as plt
plt.hist(sampmeans)
```

Exercice: Modify s, the size of the sample, and compute the standard deviation of the sampled means as a function of s.