# UC Berkeley Stat2.2x Probability (Introduction to Probability) Study Notes: Final

Stat2.2x Probability was taught by the University of California, Berkeley on the edX platform in 2014.

PROBLEM 1

A box contains 8 dark chocolates, 8 milk chocolates, and 8 white chocolates. (It’s amazing how this box keeps replenishing itself and reappearing. It’s like the Magic Pudding. Australians will know what I mean, and the rest of you might enjoy finding out. It’s one of the classics of children’s literature.) A simple random sample of 6 chocolates is drawn. Find:

a) the expected number of dark chocolates

b) the SE of the number of dark chocolates

c) the chance that there are fewer than 2 dark chocolates

d) the chance that the second and third chocolates drawn are dark, given that the first and fourth chocolates drawn are not dark

e) the expected number of dark chocolates among the last four draws

Solution

The number of dark chocolates follows a hypergeometric distribution (zeros and ones: the sum of a sample drawn without replacement), with $n=6$, $N=24$, $G=8$.

1a) $$E(\text{dark chocolates})=n\cdot\frac{G}{N}=6\times\frac{8}{24}=2$$

1b) $$SE(\text{dark chocolates})=\sqrt{n\cdot\frac{G}{N}\cdot\frac{N-G}{N}}\cdot\sqrt{\frac{N-n}{N-1}}$$ $$=\sqrt{6\times\frac{8}{24}\times\frac{16}{24}}\times\sqrt{\frac{24-6}{24-1}}\doteq1.021508$$

1c) $$P(\text{fewer than 2 dark chocolates})=\sum_{x=0}^{1}\frac{C_{G}^{x}\cdot C_{N-G}^{n-x}}{C_{N}^{n}}$$ $$=\sum_{x=0}^{1}\frac{C_{8}^{x}\times C_{16}^{6-x}}{C_{24}^{6}}\doteq0.319118$$ R code:

    sum(dhyper(0:1, 8, 16, 6))
    # [1] 0.319118

1d) $$P(\text{2nd and 3rd are dark | 1st and 4th are not dark})$$ $$=\frac{8}{22}\times\frac{7}{21}\doteq0.1212121$$
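One way to see why the conditional chance in 1d) reduces to $\frac{8}{22}\times\frac{7}{21}$ is to go back to the definition: divide the chance of the full ordered pattern by the chance of the conditioning event. The sketch below does that; `p_joint` and `p_cond` are illustrative names, not part of the original solution.

```r
# Chance of the ordered pattern: not dark, dark, dark, not dark
p_joint <- (16 / 24) * (8 / 23) * (7 / 22) * (15 / 21)

# Chance that the 1st and 4th draws are not dark; by symmetry this equals
# the chance that the first two draws are not dark
p_cond <- (16 / 24) * (15 / 23)

p_joint / p_cond      # conditional chance from the definition
(8 / 22) * (7 / 21)   # shortcut used in 1d)
```

Both expressions give the same value, which is why the solution can treat the two known draws as simply removed from the box.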

1e) Given no information about any other draw, the last four draws are probabilistically the same as any other four, say the first four. $$E(\text{dark chocolates among the last four draws})=4\times\frac{8}{24}\doteq1.333333$$

PROBLEM 2

The casino is offering a “house special” at roulette: there are 8 chances in 38 to win, and the bet pays 3 to 1. Suppose you bet $\$1$ on the house special, 200 times, independently. Find:

a) your expected average net gain per bet (and then pledge that you will never play this game)

b) the chance that you come out ahead

c) the chance that you lose more than $\$20$

Solution

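2a) The bets are independent, so this is a sample mean with replacement. Each bet gains $3$ with chance $\frac{8}{38}$ and loses $1$ with chance $\frac{30}{38}$: $$E=3\times\frac{8}{38}+(-1)\times\frac{30}{38}\doteq-0.1578947$$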
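The expected value in 2a) can be reproduced in R. This is a small sketch with illustrative variable names (`p_win`, `payoff`, `loss`, `ev_per_bet`); the last line scales the result to the 200 bets in the problem.

```r
p_win  <- 8 / 38     # chance of winning a single bet
payoff <- 3          # net gain on a win (the bet pays 3 to 1)
loss   <- -1         # net gain on a loss

ev_per_bet <- payoff * p_win + loss * (1 - p_win)
ev_per_bet           # about -0.1578947
200 * ev_per_bet     # expected total net gain over 200 bets, about -31.58
```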

2b) Let $x$ be the number of bets won. Coming out ahead means a positive net gain: $$3x+(-1)\cdot(200-x) > 0\Rightarrow x > 50\Rightarrow x\geq51$$ The number of wins is binomial with $n=200$ and $p=\frac{8}{38}$, so sum over $k=51,\dots,200$: $$P(\text{come out ahead})=\sum_{k=51}^{200}C_{200}^{k}\times\left(\frac{8}{38}\right)^k\times\left(\frac{30}{38}\right)^{200-k}\doteq0.0750046$$ R code:

    sum(dbinom(51:200, 200, 8/38))
    # [1] 0.0750046

2c) Losing more than $\$20$ means a net gain below $-20$: $$3x+(-1)\cdot(200-x) < -20\Rightarrow x < 45\Rightarrow x\leq44$$ $$P(\text{lose more than 20})=\sum_{k=0}^{44}C_{200}^{k}\times\left(\frac{8}{38}\right)^k\times\left(\frac{30}{38}\right)^{200-k}\doteq0.6660572$$ R code:

    sum(dbinom(0:44, 200, 8/38))
    # [1] 0.6660572
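The sums in 2b) and 2c) can also be written with the binomial cumulative distribution function `pbinom`, which avoids spelling out the range of $k$. The sketch below is an alternative to the `dbinom` sums, not the method used in the original notes; `p_ahead` and `p_lose20` are illustrative names.

```r
n <- 200; p <- 8 / 38

# 2b) P(X >= 51) is the upper tail beyond 50
p_ahead <- pbinom(50, n, p, lower.tail = FALSE)

# 2c) P(X <= 44) is the lower tail up to 44
p_lose20 <- pbinom(44, n, p)

p_ahead    # agrees with sum(dbinom(51:200, 200, 8/38))
p_lose20   # agrees with sum(dbinom(0:44, 200, 8/38))
```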

PROBLEM 3

Households in a large city contain an average of 2.2 people, with an $SD$ of 1.2 people. A simple random sample of 625 households is taken.

a) Approximately what is the chance that there are more than 1400 people in the sampled households?

b) How would your answer to a) have been different had the sample been drawn with replacement?

Solution

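3a) This is a sample sum drawn without replacement, but the city is large, so the finite population correction factor is very close to 1 and can be ignored. With $\mu=2.2, \sigma=1.2, n=625$, the expected sum is $n\cdot\mu=1375$ and $$SE=\sqrt{n}\cdot\sigma=\sqrt{625}\times1.2=30$$ $$Z=\frac{1400.5-n\cdot\mu}{SE}=\frac{1400.5-1375}{30}=0.85$$ Calculating in R: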

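    n = 625; mu = 2.2             # sample size and mean household size
    z = (1400.5 - n * mu) / 30    # standard units, with continuity correction; SE = 30
    1 - pnorm(z)                  # upper-tail area under the normal curve
    # [1] 0.1976625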

Thus the chance is around $19.77\%$.

3b) It would be essentially the same. Because the city is large, the correction factor is very close to 1, so the chance is practically the same whether the sample is drawn with or without replacement.
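To illustrate how close to 1 the correction factor gets, the sketch below evaluates it for a few city sizes. The values in `N` are hypothetical numbers of households chosen only for illustration; the problem does not state the actual size of the city.

```r
n <- 625

# Hypothetical numbers of households in the city (assumptions, not given in the problem)
N <- c(1e4, 1e5, 1e6)

fpc <- sqrt((N - n) / (N - 1))   # finite population correction factor
fpc                              # moves toward 1 as N grows
30 * fpc                         # SE of the sample sum when drawing without replacement
```

For any of these sizes the corrected SE stays below the with-replacement SE of 30, and the gap shrinks as the city grows.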

PROBLEM 4

There are three boxes. Box I contains one gold coin and one silver coin. Box II contains two silver coins. Box III contains two gold coins. A box is selected at random, and then one coin is selected at random from that box. Given that the coin is gold, what is the chance that the other coin in the box is gold? [No, the answer is not 1/2.]

Solution

The other coin is gold exactly when the selected box is Box III, so by Bayes' rule: $$P(\text{Box III | the first coin is gold})=\frac{P(\text{the first coin is gold and it is from Box III})}{P(\text{the first coin is gold})}$$ $$=\frac{\frac{1}{3}\times1}{\frac{1}{3}\times\frac{1}{2}+\frac{1}{3}\times0+\frac{1}{3}\times1}=\frac{2}{3}$$
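The same Bayes calculation can be laid out as prior times likelihood in R. This is a minimal sketch; the vector names `prior`, `likelihood` and `posterior` are introduced here for illustration.

```r
# Each box is equally likely to be selected
prior <- c(I = 1/3, II = 1/3, III = 1/3)

# Chance of drawing a gold coin from each box
likelihood <- c(I = 1/2, II = 0, III = 1)

# Posterior chance of each box given that the drawn coin is gold
posterior <- prior * likelihood / sum(prior * likelihood)
posterior["III"]   # 2/3
```

The posterior for Box I is $\frac{1}{3}$, which shows where the tempting answer of $\frac{1}{2}$ goes wrong: a gold coin is twice as likely to come from Box III as from Box I.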

PROBLEM 5

A coin is tossed $n$ times. There is about $95\%$ chance that the proportion of heads is in the range $.49$ to $.51$. The number of tosses $n$ is closest to:

a) 1,000

b) 5,000

c) 10,000

d) 50,000

Solution

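This is a sample proportion of ones with $p=0.5$. By the normal approximation, about $95\%$ of the probability lies within 2 SEs of the expected value, so the interval $.49$ to $.51$ must be $0.5\pm2SE$: $$2SE=0.01\Rightarrow SE=0.005$$ On the other hand, $$SE=\sqrt{\frac{p\cdot(1-p)}{n}}=\sqrt{\frac{\frac{1}{4}}{n}}=0.005\Rightarrow n=10000$$ Thus c) is correct.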
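As a cross-check, the SE and the resulting $\pm2SE$ interval can be tabulated for each of the four answer choices. This is a sketch with illustrative names (`se`, `lower`, `upper`); it uses the same 2-SE rule of thumb as the solution.

```r
p <- 0.5
n <- c(1000, 5000, 10000, 50000)   # the four answer choices

se <- sqrt(p * (1 - p) / n)        # SE of the sample proportion
data.frame(n = n, se = se, lower = p - 2 * se, upper = p + 2 * se)
```

Only $n=10000$ gives an SE of $0.005$, and hence the interval $.49$ to $.51$.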

FINAL EXAM

PROBLEM 1

Suppose you are trying to estimate the percent of women in a city. Other things being equal, a simple random sample of 0.1% of the population of a city that has 2,000,000 people is ________ as a simple random sample of 0.1% of the population of a city that has 500,000 people. Fill in the blank with the best of the following choices.

a) about 1/4 times as accurate

b) about 1/2 times as accurate

d) about 2 times as accurate

e) about 4 times as accurate

Solution

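Square root law: when the sample is a small fraction of the population, accuracy depends on the sample size rather than the population size, and the SE is inversely proportional to the square root of the sample size. The two sample sizes are $$2\times10^6\times0.1\%=2000,\ 5\times10^5\times0.1\%=500$$ $$\Rightarrow\sqrt{\frac{2000}{500}}=2$$ Thus the former is about 2 times as accurate as the latter, and d) is correct.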
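The ratio can be checked numerically by comparing the SEs of the two sample proportions. In the sketch below, `p = 0.5` is an assumed proportion of women used only for illustration (the ratio does not depend on it), and the finite population correction is ignored because each sample is only 0.1% of its city.

```r
p <- 0.5                   # assumed proportion of women, for illustration only

n_big   <- 2e6 * 0.001     # sample size in the city of 2,000,000
n_small <- 5e5 * 0.001     # sample size in the city of 500,000

se_big   <- sqrt(p * (1 - p) / n_big)
se_small <- sqrt(p * (1 - p) / n_small)

se_small / se_big          # 2: the larger sample has half the SE
```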

PROBLEM 2

A group of 30 people consists of 15 children, 10 men, and 5 women. Tom and Jerry are two of the men in the group. Five people are picked at random without replacement.

2A Find the chance the first person picked is a man, given that the fourth and fifth people picked are children.

2B Find the chance that more than two women are picked.

2C Find the chance that Tom and Jerry both get picked.

Solution

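2A) Given that the fourth and fifth picks are children, the first pick is equally likely to be any of the other 28 people, 10 of whom are men: $$P(\text{1st person is a man | 4th and 5th are children})=\frac{10}{28}\doteq0.3571429$$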

2B) Hypergeometric distribution $$P(\text{more than 2 women})=\sum_{x=3}^{5}\frac{C_{5}^{x}\cdot C_{25}^{5-x}}{C_{30}^{5}}\doteq0.02193592$$ R code:

    sum(dhyper(3:5, 5, 25, 5))
    # [1] 0.02193592

2C) If both Tom and Jerry are picked, the other 3 people must be chosen from the remaining 28: $$P(\text{both Tom and Jerry get picked})=\frac{C_{28}^{3}}{C_{30}^{5}}\doteq0.02298851$$ R code:

    choose(28, 3) / choose(30, 5)
    # [1] 0.02298851
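The answer to 2C can be cross-checked in two other ways. The sketch below is an alternative view rather than the method used in the solution: it treats Tom and Jerry as the 2 "good" elements in a hypergeometric draw, and it also uses a direct argument about which of the 30 positions they occupy. The names `p_hyper` and `p_direct` are illustrative.

```r
# Hypergeometric view: 2 "good" people (Tom and Jerry), 28 others, 5 picked,
# and both good people must be in the sample
p_hyper <- dhyper(2, 2, 28, 5)

# Direct view: Tom lands in one of the 5 picked positions out of 30,
# then Jerry lands in one of the remaining 4 picked positions out of 29
p_direct <- (5 / 30) * (4 / 29)

p_hyper    # same value as choose(28, 3) / choose(30, 5)
p_direct
```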

PROBLEM 3