## Math 203 McGill Assignment

MATH 203: Principles of Statistics 1

Assignment 4 Solutions

1) Problem 4.30

Let $X$ denote the number of points awarded. Then we have:

| Outcome of Appeal | Number of Cases | $X$ |
|---|---|---|
| Plaintiff trial win - reversed | 71 | $-1$ |
| Plaintiff trial win - affirmed/dismissed | 240 | $5$ |
| Defendant trial win - reversed | 68 | $-3$ |
| Defendant trial win - affirmed/dismissed | 299 | $5$ |
| Total | 678 | – |

The probability distribution for $X$ is therefore:

$$P(X = -3) = \frac{68}{678} = 0.100$$

$$P(X = -1) = \frac{71}{678} = 0.105$$

$$P(X = 5) = \frac{240 + 299}{678} = 0.795$$

[Figure: "Probability Distribution for X", a plot of $P(X = x)$ against $x$ for $x = -3, \dots, 5$.]

2) Problem 4.44

From the probability distribution for $X$ we have:

$$\mu = E(X) = \sum_{x \in R_X} x\,P(X = x) = 1(0.40) + 2(0.54) + 3(0.02) + 4(0.04) = 1.70$$

$$E(X^2) = \sum_{x \in R_X} x^2\,P(X = x) = 1^2(0.40) + 2^2(0.54) + 3^2(0.02) + 4^2(0.04) = 3.38$$

$$\sigma^2 = E(X^2) - \mu^2 = 3.38 - 1.70^2 = 0.49$$

$$\sigma = \sqrt{\sigma^2} = \sqrt{0.49} = 0.7$$

Thus, the mean is $\mu = 1.70$ and the standard deviation is $\sigma = 0.7$. We can interpret the mean as follows: if we take a very large sample of blades of water hyacinth and observe the number of insect eggs on each blade, the sample mean of the number of insect eggs will be approximately 1.70, and it approaches 1.70 as the sample size grows to infinity.

3) Problem 4.46

Let $X$ = winnings in the Florida lottery. The probability distribution for $X$ is then:

| $x$ | $P(X = x)$ |
|---|---|
| $-\$1$ | 22,999,999/23,000,000 |
| $\$6{,}999{,}999$ | 1/23,000,000 |

The expected net winnings are thus:

$$\mu = E(X) = (-1)\left(\frac{22{,}999{,}999}{23{,}000{,}000}\right) + (6{,}999{,}999)\left(\frac{1}{23{,}000{,}000}\right) = -\$0.70$$

So the average winnings of all those who play the lottery is $-\$0.70$. That is, if we take a very large sample of individuals playing the Florida lottery and observe their net winnings, the sample mean will be approximately $-0.70$. More precisely, if we independently sample $n$ lottery tickets and gain amounts $x_1, \dots, x_n$, the sample average of $x_1, \dots, x_n$ will converge to $\mu$ as $n$ grows to infinity.
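The expected-value calculation for the lottery can be reproduced exactly with rational arithmetic. The sketch below is only an illustration: the helper name `expected_value` and the dictionary layout are assumptions made here, not part of the original solution.

```python
from fractions import Fraction

# Distribution from the table in Problem 4.46: net winnings -> probability.
p_win = Fraction(1, 23_000_000)
lottery = {
    -1: 1 - p_win,       # ticket loses: the player is out the $1 price
    6_999_999: p_win,    # ticket wins: net jackpot
}

def expected_value(dist):
    """Return sum of x * P(X = x) for a dict mapping x to P(X = x).

    Hypothetical helper written for this illustration.
    """
    return sum(x * p for x, p in dist.items())

mu = expected_value(lottery)
print(mu, float(mu))  # exact fraction, then its decimal value
```

Using `Fraction` avoids any floating-point rounding in the intermediate products, so the rounding to two decimals happens only at the very end.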
As a practical interpretation: on average, if you buy a lottery ticket you will lose money.

For interest and as an exercise in critical thinking: note that this interpretation doesn't quite work in this case. According to Wikipedia, in the Florida Pick-6 Lotto, 6 balls are drawn each Wednesday and Saturday without replacement from 53 numbers. Contestants pick numbers in advance and are required to match all 6 numbers, but the order doesn't matter. Note that there are $\binom{53}{6} = 22{,}957{,}480$ different combinations of numbers, confirming the figure given of approximately 1 in 23 million.

So, we can consider that our Lotto player perhaps chooses the same set of 6 numbers and plays them at each Lotto drawing. The winning numbers at each drawing are independent (unless the contestant is rigging the lottery somehow!). So, here we can easily consider a sequence of independent realizations $x_1, \dots, x_n$ as $n$ goes to infinity: our lotto player simply keeps playing!

The tricky part comes in with regard to the grand prize: it changes as time goes on! Once the grand prize is won, it usually resets to a lower number. So, we cannot really consider a sequence of realizations $x_1, \dots, x_n$ where $n$ increases to infinity, because once the jackpot is won, the sequence breaks down (the realizations are no longer from the same probability distribution). Additionally, when nobody wins at a drawing, the grand prize amount generally increases.

A more technically correct interpretation would be this: if the lotto player were to purchase a ticket with each possible set of 6 numbers out of the 53, the average winnings per ticket would be approximately $-\$0.70$.

4) Problem 4.64

Let $X$ = the number of brands that use tap water.

(a) We will check the characteristics of a binomial random variable:

1 - This experiment consists of $n = 5$ identical trials.

2 - There are only 2 possible outcomes for each trial: a brand of bottled water either uses tap water ($S$) or not ($F$).
3 - The probability of $S$ remains the same from trial to trial. In this case, $p = P(S) \approx 0.25$ for each trial.

4 - The trials are independent. Since the number of bottles of water from which to sample is large compared to the sample size of 5, the trials are close enough to being independent.

5 - $X$ = the number of brands of bottled water using tap water in 5 trials.

Thus, $X$ is approximately $\text{Binomial}(n = 5, p = 0.25)$.

(b) The formula for finding the binomial probabilities in this case is:

$$P(X = x) = \binom{5}{x}(0.25)^x(0.75)^{5-x} \quad \text{for } x = 0, 1, 2, 3, 4, 5$$

(c)

$$P(X = 2) = \binom{5}{2}(0.25)^2(0.75)^3 \approx 0.264$$

(d)

$$P(X \le 1) = P(X = 0) + P(X = 1) = \binom{5}{0}(0.25)^0(0.75)^5 + \binom{5}{1}(0.25)^1(0.75)^4 \approx 0.633$$

5) Problem 4.136

Let $X$ = the number of beech trees damaged by fungi in 20 trials. Then $X$ is a binomial random variable with $n = 20$ and $p = 0.25$.

(a)

$$
\begin{aligned}
P(X < 10) &= \sum_{x=0}^{9} P(X = x) = P(X = 0) + P(X = 1) + \dots + P(X = 9) \\
&= \binom{20}{0}(0.25)^0(0.75)^{20} + \binom{20}{1}(0.25)^1(0.75)^{19} + \dots + \binom{20}{9}(0.25)^9(0.75)^{11}
\end{aligned}
$$

(b)

$$
\begin{aligned}
P(X > 15) &= \sum_{x=16}^{20} P(X = x) = P(X = 16) + P(X = 17) + P(X = 18) + P(X = 19) + P(X = 20) \\
&= \binom{20}{16}(0.25)^{16}(0.75)^{4} + \binom{20}{17}(0.25)^{17}(0.75)^{3} + \binom{20}{18}(0.25)^{18}(0.75)^{2} \\
&\quad + \binom{20}{19}(0.25)^{19}(0.75)^{1} + \binom{20}{20}(0.25)^{20}(0.75)^{0}
\end{aligned}
$$

(c) $\mu = E(X) = np = (20)(0.25) = 5$

6) Problem 5.38

Let $X$ denote the amount of miraculin produced. We are told that $X$ is approximately $\text{Normal}(\mu = 105.3, \sigma = 8)$.
Then using the standard normal table, we have:

(a)

$$
\begin{aligned}
P(X > 120) &= P\left(Z > \frac{120 - 105.3}{8}\right) = P(Z > 1.84) \\
&= P(Z > 0) - P(0 < Z < 1.84) = 0.5 - 0.4671 = 0.0329
\end{aligned}
$$

(b)

$$
\begin{aligned}
P(100 < X < 110) &= P\left(\frac{100 - 105.3}{8} < Z < \frac{110 - 105.3}{8}\right) = P(-0.66 < Z < 0.59) \\
&= P(-0.66 < Z < 0) + P(0 < Z < 0.59) \\
&= P(0 < Z < 0.66) + P(0 < Z < 0.59) \\
&= 0.2454 + 0.2224 = 0.4678
\end{aligned}
$$

(c)

$$
\begin{aligned}
P(X < a) = 0.25 &\Rightarrow P\left(Z < \frac{a - 105.3}{8}\right) = P(Z < z_0) = 0.25 \\
&\Rightarrow P(Z < z_0) = P(Z < 0) - P(z_0 < Z < 0) = P(Z > 0) - P(0 < Z < -z_0) \\
&\Rightarrow P(0 < Z < -z_0) = P(Z > 0) - P(Z < z_0) = 0.5 - 0.25 = 0.25
\end{aligned}
$$

Looking up the area 0.25 in the table, we have that $-z_0 = 0.67$, thus

$$z_0 = -0.67 = \frac{a - 105.3}{8} \Rightarrow a = 8(-0.67) + 105.3 = 99.94$$

7) Problem 5.42

Let $X$ denote the carapace length of green sea turtles. We are told that $X$ has a normal distribution with $\mu = 55.7$ and $\sigma = 11.5$. Then:

(a) The probability of catching a sea turtle that is considered legal is

$$
\begin{aligned}
P(40 < X < 60) &= P\left(\frac{40 - 55.7}{11.5} < Z < \frac{60 - 55.7}{11.5}\right) = P(-1.37 < Z < 0.37) \\
&= P(-1.37 < Z < 0) + P(0 < Z < 0.37) \\
&= P(0 < Z < 1.37) + P(0 < Z < 0.37) \\
&= 0.4147 + 0.1443 = 0.5590
\end{aligned}
$$

Then, the probability of capturing a sea turtle that is considered illegal is

$$P(X < 40) + P(X > 60) = 1 - P(40 < X < 60) = 1 - 0.5590 = 0.4410$$

(b) We want to find the maximum limit, $L$, such that only 10% of turtles captured have shell lengths greater than $L$; thus we want $P(X > L) = 0.10$. So we have:

$$P(X > L) = P\left(Z > \frac{L - 55.7}{11.5}\right) = P(Z > z_0) = 0.10$$

We also have that:

$$
\begin{aligned}
P(Z > z_0) = P(Z > 0) - P(0 < Z < z_0) &\Rightarrow 0.10 = 0.5 - P(0 < Z < z_0) \\
&\Rightarrow P(0 < Z < z_0) = 0.50 - 0.10 = 0.40
\end{aligned}
$$

From the normal table, we can find that $z_0 \approx 1.28$, so:

$$z_0 = 1.28 = \frac{L - 55.7}{11.5} \Rightarrow L = 1.28 \times 11.5 + 55.7 = 70.42$$

8) Problem 5.90

(a) Let $X$ denote the percentage of body fat in men. We are told that $X$ is approximately $\text{Normal}(\mu = 15, \sigma = 2)$.
Then for any particular man, the probability of being obese is:

$$
\begin{aligned}
P(\text{Obese}) = P(X \ge 20) &= P\left(Z \ge \frac{20 - 15}{2}\right) = P(Z \ge 2.5) \\
&= P(Z \ge 0) - P(0 \le Z \le 2.5) = 0.5 - 0.4938 = 0.0062
\end{aligned}
$$

Now let $Y$ be the number of men in the U.S. Army who are obese in a sample of 10,000. The random variable $Y$ is Binomial with $n = 10000$ and $p = 0.0062$. Then we have that:

$$E(Y) = \mu_Y = np = 10000(0.0062) = 62$$

$$\sigma_Y = \sqrt{npq} = \sqrt{10000(0.0062)(1 - 0.0062)} = 7.85$$

The interval $\mu_Y \pm 3\sigma_Y$ is:

$$10000(0.0062) \pm 3\sqrt{10000(0.0062)(1 - 0.0062)} = 62 \pm 3(7.85) \Rightarrow (38.45,\ 85.55)$$

Since the interval lies in the range $(0, 10000)$, we can use the normal approximation. Taking into account the continuity correction, we have:

$$
\begin{aligned}
P(Y < 50) = P(Y \le 49) &\approx P\left(Z \le \frac{49 + 0.5 - 62}{7.85}\right) = P\left(Z \le \frac{-12.5}{7.85}\right) = P(Z \le -1.59) \\
&= P(Z \le 0) - P(-1.59 \le Z \le 0) \\
&= P(Z > 0) - P(0 < Z < 1.59) \\
&= 0.5 - 0.4441 = 0.0559
\end{aligned}
$$

If we do not make the continuity correction, this approximation becomes:

$$
\begin{aligned}
P(Y < 50) = P(Y \le 49) &\approx P\left(Z \le \frac{49 - 62}{7.85}\right) = P(Z \le -1.66) \\
&= P(Z \le 0) - P(-1.66 \le Z \le 0) \\
&= P(Z > 0) - P(0 \le Z \le 1.66) \\
&= 0.5 - 0.4515 = 0.0485
\end{aligned}
$$

Note that to validate the use of the normal approximation, we could also check that $n > 30$, $np = 62 > 5$ and $nq = 9938 > 5$.

(b) In part (a) we found that the probability of being obese was $P(X \ge 20) = 0.0062$, so that 0.62% of the general population of American men are obese. In the sample of 10,000 army men, there are 30 obese men, so that $\hat{p} = 0.003$, i.e. 0.3% of men in the American army are obese (since the sample size is quite large, $\hat{p}$ is a good approximation for the true percentage in the population of American army men). Since $0.003 < 0.0062$, we can indeed conclude that the army was successful in reducing the percentage of obese men below the percentage in the general population of American men.

9) Problem 6.47

Let $X_i$ denote the bacterial count of the $i$th health care worker who washes their hands. We are given the mean ($\mu = 69$) and standard deviation ($\sigma = 106$).
Then the probability that the sum of the bacterial counts from 50 hand washers is greater than 1510 is equivalent to:

$$P\left(\sum_{i=1}^{50} X_i > 1510\right) = P\left(\frac{\sum_{i=1}^{50} X_i}{50} > \frac{1510}{50}\right) = P(\bar{X} > 30.2)$$

As the sample size is large ($n > 30$), by the Central Limit Theorem, $\bar{X}$ is approximately normally distributed with mean $\mu = 69$ and standard deviation $\sigma/\sqrt{50} = 14.99$, so we have:

$$
\begin{aligned}
P(\bar{X} > 30.2) &= P\left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} > \frac{30.2 - 69}{14.99}\right) = P(Z > -2.59) \\
&= P(-2.59 < Z < 0) + P(Z > 0) \\
&= P(0 < Z < 2.59) + P(Z > 0) \\
&= 0.4952 + 0.5 = 0.9952
\end{aligned}
$$

The approximate probability that the total bacterial count for the sample of 50 hand washers is greater than 1510 is 99.5%.

Alternatively, if we consider the bacterial count per hand, we obtain the following. Let $X_i$ denote the bacterial count on one hand of the $i$th health care worker who washes their hands. We are given the mean ($\mu = 69$) and standard deviation ($\sigma = 106$) of the bacterial count per hand. In the random sample, there are 50 health care workers, so a total of 100 hands. Then the probability that the sum of the bacterial counts from the 50 hand washers is greater than 1510 is equivalent to:

$$P\left(\sum_{i=1}^{100} X_i > 1510\right) = P\left(\frac{\sum_{i=1}^{100} X_i}{100} > \frac{1510}{100}\right) = P(\bar{X} > 15.1)$$

As the sample size is large ($n > 30$), by the Central Limit Theorem, $\bar{X}$ is approximately normally distributed with mean $\mu = 69$ and standard deviation $\sigma/\sqrt{100} = 10.6$, so we have:

$$
\begin{aligned}
P(\bar{X} > 15.1) &= P\left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} > \frac{15.1 - 69}{10.6}\right) = P(Z > -5.08) \\
&= P(-5.08 < Z < 0) + P(Z > 0) \\
&= P(0 < Z < 5.08) + P(Z > 0) \\
&\approx 0.5 + 0.5 = 1
\end{aligned}
$$
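The binomial sums and normal-table lookups used throughout this assignment can be cross-checked numerically. The following is a minimal sketch using only the Python standard library (`math.comb` and `statistics.NormalDist`, available from Python 3.8); the helper `binom_pmf` and all variable names are assumptions introduced for this illustration. Because the table method rounds $z$ to two decimals, the computed values agree with the table answers only to about three decimal places.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p). Hypothetical helper for this sketch."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Problem 4.64: n = 5, p = 0.25
p_464c = binom_pmf(2, 5, 0.25)                          # P(X = 2)
p_464d = binom_pmf(0, 5, 0.25) + binom_pmf(1, 5, 0.25)  # P(X <= 1)

# Problem 4.136: n = 20, p = 0.25
p_136a = sum(binom_pmf(x, 20, 0.25) for x in range(10))      # P(X < 10)
p_136b = sum(binom_pmf(x, 20, 0.25) for x in range(16, 21))  # P(X > 15)

# Problem 5.38: X ~ Normal(105.3, 8)
miraculin = NormalDist(mu=105.3, sigma=8)
p_538a = 1 - miraculin.cdf(120)                     # P(X > 120)
p_538b = miraculin.cdf(110) - miraculin.cdf(100)    # P(100 < X < 110)
a_538c = miraculin.inv_cdf(0.25)                    # a with P(X < a) = 0.25

# Problem 6.47: sample mean of n = 50 counts, via the Central Limit Theorem
xbar = NormalDist(mu=69, sigma=106 / sqrt(50))
p_647 = 1 - xbar.cdf(1510 / 50)                     # P(Xbar > 30.2)

for name, value in [("4.64(c)", p_464c), ("4.64(d)", p_464d),
                    ("4.136(a)", p_136a), ("4.136(b)", p_136b),
                    ("5.38(a)", p_538a), ("5.38(b)", p_538b),
                    ("5.38(c)", a_538c), ("6.47", p_647)]:
    print(f"{name}: {value:.6g}")
```

Small differences from the worked answers (for example in the cutoff $a$ of Problem 5.38(c)) come from the table's two-decimal $z$ values, not from a different method.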


MATH 203: Principles of Statistics 1

Assignment 3 Solutions

1) Problem 3.78

Define the following events:

- $A$: {Student chose stated option} (i.e. repair the car)
- $B$: {Student did not choose stated option}
- $C$: {Emotion state is guilt}
- $D$: {Emotion state is anger}
- $E$: {Emotion state is neutral}

a) $P(A \mid C) = \dfrac{45}{57} = 0.789$

b) $P(D \mid B) = \dfrac{50}{111} = 0.450$

c) $P(A) = \dfrac{60}{171} = 0.351$ and $P(A \mid C) = \dfrac{45}{57} = 0.789$. Since $P(A) \ne P(A \mid C)$, the events $A$ and $C$ are not independent.

Alternatively, since $P(A) = \dfrac{60}{171}$, $P(C) = \dfrac{57}{171}$ and $P(A \cap C) = \dfrac{45}{171}$, we see that

$$P(A)P(C) = \frac{60}{171} \cdot \frac{57}{171} = 0.117 \ne P(A \cap C) = \frac{45}{171} = 0.263$$

2) Problem 3.80

Define the following events:

- $A$: {internet user has wireless connection}
- $B$: {internet user uses Twitter}

From the exercise, we are given that $P(A) = 0.54$ and $P(B \mid A) = 0.25$. Thus,

$$P(A \cap B) = P(B \mid A)P(A) = (0.25)(0.54) = 0.135$$

3) Problem 3.85

Define the following event:

- $R$: {Fish is red snapper}

From the exercise, we are given that $P(R^C) = 0.77$.

a) $P(R) = 1 - P(R^C) = 1 - 0.77 = 0.23$

b) Let $R_i$ denote the event that customer $i$ is served red snapper. Then

$$
\begin{aligned}
P(\text{At least one customer is served red snapper}) &= 1 - P(\text{No customers are served red snapper}) \\
&= 1 - P(R_1^C \cap R_2^C \cap R_3^C \cap R_4^C \cap R_5^C)
\end{aligned}
$$

Now assuming that the customers are independent and that each has the same probability $P(R_i^C) = 0.77$, this simplifies to:

$$1 - P(R_1^C)P(R_2^C)P(R_3^C)P(R_4^C)P(R_5^C) = 1 - (0.77)^5 = 1 - 0.271 = 0.729$$

Remark: The assumption we make above is unreasonable, since any given restaurant will get its fish from the same vendor, and so the restaurant will either serve red snapper to its customers or not. It is unreasonable to assume that whether or not each customer in the same restaurant is served red snapper are independent events (i.e. that $R_1, R_2, R_3, R_4, R_5$ are independent).

4) Problem 3.96

Define the following events:

- $JF$: {jet fire occurs}
- $FF$: {flash fire occurs}

We are given that $P(JF) = 0.01$ and $P(FF \mid JF^C) = 0.01$.
We want to find the probability that either a jet fire or a flash fire will occur, that is $P(JF \cup FF)$. First, we assume that jet fires and flash fires are mutually exclusive events, that is $P(JF \cap FF) = 0$. We then have that:

$$P(JF \cup FF) = P(JF) + P(FF)$$

However, we still need to find $P(FF)$. To do so, we need to use the law of total probability:

$$
\begin{aligned}
P(FF) &= P(FF \cap JF^C) + P(FF \cap JF) \\
&= P(FF \mid JF^C)P(JF^C) + P(FF \mid JF)P(JF)
\end{aligned}
$$

But we assumed that $FF$ and $JF$ are mutually exclusive, so $P(JF \cap FF) = 0$ and thus $P(FF \mid JF) = 0$. So,

$$P(FF) = P(FF \mid JF^C)P(JF^C) = 0.01 \times 0.99 = 0.0099$$

So, finally, we can find the probability of the union:

$$P(JF \cup FF) = P(JF) + P(FF) = 0.01 + 0.0099 = 0.0199$$

5) Problem 3.148

Define the following events:

- $H$: {person has HIV}
- $P$: {positive test}
- $N$: {negative test}

We are given that $P(H) = 0.008$, $P(P \mid H) = 0.99$, $P(N \mid H^C) = 0.99$.

a) Using Bayes' Rule we have:

$$
\begin{aligned}
P(H \mid P) = \frac{P(H \cap P)}{P(P)} &= \frac{P(H)P(P \mid H)}{P(H)P(P \mid H) + P(H^C)P(P \mid H^C)} \\
&= \frac{(0.008)(0.99)}{(0.008)(0.99) + (0.992)(0.01)} = \frac{0.00792}{0.00792 + 0.00992} \\
&= 0.44395
\end{aligned}
$$

b) In East Asia, $P(H) = 0.001$.
Using Bayes' Rule, the probability is:

$$
\begin{aligned}
P(H \mid P) = \frac{P(H \cap P)}{P(P)} &= \frac{P(H)P(P \mid H)}{P(H)P(P \mid H) + P(H^C)P(P \mid H^C)} \\
&= \frac{(0.001)(0.99)}{(0.001)(0.99) + (0.999)(0.01)} = \frac{0.00099}{0.00099 + 0.00999} \\
&= 0.09016
\end{aligned}
$$

c) We are interested in the probability $P(H \mid \text{P on first} \cap \text{P on second})$. Since the tests are independent (given the person's HIV status), we have that:

$$P(\text{P on first} \cap \text{P on second} \mid H) = P(\text{P on first} \mid H)\,P(\text{P on second} \mid H)$$

and likewise given $H^C$. Then, using Bayes' Rule, we have:

$$
\begin{aligned}
&P(H \mid \text{P on first} \cap \text{P on second}) = \frac{P(H \cap \text{P on first} \cap \text{P on second})}{P(\text{P on first} \cap \text{P on second})} \\
&\quad= \frac{P(\text{P on first} \cap \text{P on second} \mid H)P(H)}{P(\text{P on first} \cap \text{P on second} \mid H)P(H) + P(\text{P on first} \cap \text{P on second} \mid H^C)P(H^C)} \\
&\quad= \frac{P(\text{P on first} \mid H)P(\text{P on second} \mid H)P(H)}{P(\text{P on first} \mid H)P(\text{P on second} \mid H)P(H) + P(\text{P on first} \mid H^C)P(\text{P on second} \mid H^C)P(H^C)} \\
&\quad= \frac{(0.99)(0.99)(0.008)}{(0.99)(0.99)(0.008) + (0.01)(0.01)(0.992)} \\
&\quad= 0.987506297
\end{aligned}
$$

d) In East Asia, the probability is:

$$
\begin{aligned}
&P(H \mid \text{P on first} \cap \text{P on second}) = \frac{P(H \cap \text{P on first} \cap \text{P on second})}{P(\text{P on first} \cap \text{P on second})} \\
&\quad= \frac{P(\text{P on first} \cap \text{P on second} \mid H)P(H)}{P(\text{P on first} \cap \text{P on second} \mid H)P(H) + P(\text{P on first} \cap \text{P on second} \mid H^C)P(H^C)} \\
&\quad= \frac{P(\text{P on first} \mid H)P(\text{P on second} \mid H)P(H)}{P(\text{P on first} \mid H)P(\text{P on second} \mid H)P(H) + P(\text{P on first} \mid H^C)P(\text{P on second} \mid H^C)P(H^C)} \\
&\quad= \frac{(0.99)(0.99)(0.001)}{(0.99)(0.99)(0.001) + (0.01)(0.01)(0.999)} \\
&\quad= 0.9075
\end{aligned}
$$

6) Problem 3.150

Define the following events:

- $H$: {NDE detects a hit}
- $D$: {Defect exists}

We are given that $P(H \mid D) = 0.97$, $P(H \mid D^C) = 0.005$, $P(D) = 1/100 = 0.01$.
In order to find $P(D \mid H)$, we must first find $P(H)$ using the law of total probability:

$$
\begin{aligned}
P(H) &= P(H \cap D) + P(H \cap D^C) \\
&= P(H \mid D)P(D) + P(H \mid D^C)P(D^C) \\
&= (0.97)(0.01) + (0.005)(0.99) = 0.01465
\end{aligned}
$$

Then, we have

$$P(D \mid H) = \frac{P(D \cap H)}{P(H)} = \frac{P(H \mid D)P(D)}{P(H)} = \frac{(0.97)(0.01)}{0.01465} = 0.6621$$

Note that we could have directly used Bayes' Rule to obtain this probability:

$$
\begin{aligned}
P(D \mid H) = \frac{P(D \cap H)}{P(H)} &= \frac{P(H \mid D)P(D)}{P(H \mid D)P(D) + P(H \mid D^C)P(D^C)} \\
&= \frac{(0.97)(0.01)}{(0.97)(0.01) + (0.005)(0.99)} = 0.6621
\end{aligned}
$$

7) Problem 3.151

From exercise 3.90, we defined the following events:

- $I$: {Intruder}
- $N$: {No Intruder}
- $A$: {System A sounds alarm}
- $B$: {System B sounds alarm}
- $C$: $A \cap B$

Also from exercise 3.90, we computed $P(A \cap B \mid I) = P(C \mid I) = 0.855$ and $P(A \cap B \mid N) = P(C \mid N) = 0.02$. Then, we have:

$$P(C) = P(C \mid I)P(I) + P(C \mid N)P(N) = (0.855)(0.4) + (0.02)(0.6) = 0.354$$

$$P(I \cap C) = P(C \mid I)P(I) = (0.855)(0.4) = 0.342$$

Thus

$$P(I \mid C) = \frac{P(I \cap C)}{P(C)} = \frac{0.342}{0.354} = 0.966$$

8) Problem 4.20

a) Since $X$ is either $-2$, $-1$, $0$, $1$, or $2$, we have

$$
\begin{aligned}
P(X \le 0) &= P(\{X = -2\} \cup \{X = -1\} \cup \{X = 0\}) \\
&= P(X = -2) + P(X = -1) + P(X = 0) \\
&= 0.10 + 0.15 + 0.40 = 0.65
\end{aligned}
$$

In the same way, we have:

b) $P(X > -1) = P(X = 0) + P(X = 1) + P(X = 2) = 0.40 + 0.30 + 0.05 = 0.75$

c) $P(-1 \le X \le 1) = P(X = -1) + P(X = 0) + P(X = 1) = 0.15 + 0.40 + 0.30 = 0.85$

d) $P(X < 2) = 1 - P(X = 2) = 1 - 0.05 = 0.95$

e) $P(-1 < X < 2) = P(X = 0) + P(X = 1) = 0.40 + 0.30 = 0.70$

f) $P(X < 1) = 1 - P(X \ge 1) = 1 - (P(X = 1) + P(X = 2)) = 1 - (0.30 + 0.05) = 0.65$

9) Problem 4.24

Let $X$ be the number of delphacid eggs on the blade.

(a) Using the percentages given in the table, we have that the probability distribution for $X$ is:

| $x$ | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| $P(X = x)$ | 0.40 | 0.54 | 0.02 | 0.04 |

(b) $P(X \ge 3) = P(X = 3) + P(X = 4) = 0.02 + 0.04 = 0.06$

10) Problem 4.27

(a) $P(X = 1) = (0.23)(0.77)^{1-1} = 0.23$

(b) $P(X = 5) = (0.23)(0.77)^{5-1} = (0.23)(0.77)^4 = 0.081$

(c) $P(X \ge 2) = 1 - P(X < 2) = 1 - P(X = 1) = 1 - 0.23 = 0.77$
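The three answers to Problem 4.27 all come from the same formula, $P(X = x) = (0.77)^{x-1}(0.23)$, so they are easy to check numerically. The sketch below is only an illustration; the function name `first_success_pmf` and its default argument are assumptions made here, not notation from the assignment.

```python
def first_success_pmf(x, p=0.23):
    """P(X = x): the first success occurs on trial x, with success probability p.

    Hypothetical helper; p = 0.23 is the contamination probability
    used in Problem 4.27.
    """
    if x < 1:
        raise ValueError("x must be a positive integer")
    return (1 - p) ** (x - 1) * p

part_a = first_success_pmf(1)        # P(X = 1)
part_b = first_success_pmf(5)        # P(X = 5)
part_c = 1 - first_success_pmf(1)    # P(X >= 2) = 1 - P(X = 1)

# Sanity check: the probabilities over x = 1, 2, 3, ... should sum to 1.
partial_sum = sum(first_success_pmf(x) for x in range(1, 200))

print(part_a, round(part_b, 3), part_c, partial_sum)
```

The partial sum over the first 199 values is already indistinguishable from 1, which is a useful check that the formula defines a valid probability distribution.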
Remark: Justification of formula for probability distribution.

Let $X$ denote the number of cartridges sampled until a contaminated one is found. Then clearly $X$ can take on values $1, 2, 3, \dots$ (i.e. we can draw a contaminated cartridge on our first draw, in which case $x = 1$, or we can draw a contaminated cartridge on our second draw, in which case $x = 2$, etc.).

Let $S$ represent a "success" (picking a contaminated cartridge) and $F$ a "failure" (picking a clean cartridge). Then if $X = x$, it means we pick a contaminated cartridge on the $x$th draw, and thus the $x - 1$ draws before this one were all clean cartridges. It follows that our sequence of failures/successes in this case is of the form $FF \dots FS$, where there are $x - 1$ consecutive failures followed by one success.

Assuming each trial or draw is independent, to find $P(X = x)$ we must find the probability of having $x - 1$ consecutive failures followed by one success. Since we are given that $P(S) = 0.23$ and $P(F) = 0.77$, we then have that

$$P(X = x) = P(FF \dots FS) = P(F)P(F) \cdots P(F)P(S) = \{P(F)\}^{x-1}P(S) = (0.77)^{x-1}(0.23)$$
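Looking back over this assignment, the Bayes' Rule calculations in Problems 3.148, 3.150 and 3.151 all share one pattern: a prior probability, the probability of the evidence when the condition holds, and the probability of the evidence when it does not. A minimal sketch of that pattern follows; the function `posterior` and its parameter names are assumptions made for this illustration, and the `n_tests` argument relies on the same assumption as Problem 3.148(c), namely that repeated tests are independent given the true status.

```python
def posterior(prior, p_pos_given_true, p_pos_given_false, n_tests=1):
    """Bayes' Rule for P(condition | n_tests positive results).

    Hypothetical helper. Repeated tests are assumed independent given the
    true status, so each likelihood is raised to the power n_tests.
    """
    numerator = prior * p_pos_given_true ** n_tests
    denominator = numerator + (1 - prior) * p_pos_given_false ** n_tests
    return numerator / denominator

# Problem 3.148: P(P | H) = 0.99 and P(P | H^C) = 1 - 0.99 = 0.01
hiv_a = posterior(0.008, 0.99, 0.01)              # part a)
hiv_b = posterior(0.001, 0.99, 0.01)              # part b), East Asia
hiv_c = posterior(0.008, 0.99, 0.01, n_tests=2)   # part c), two positives
hiv_d = posterior(0.001, 0.99, 0.01, n_tests=2)   # part d)

# Problem 3.150: P(H | D) = 0.97, P(H | D^C) = 0.005, P(D) = 0.01
defect = posterior(0.01, 0.97, 0.005)

# Problem 3.151: P(C | I) = 0.855, P(C | N) = 0.02, P(I) = 0.4
intruder = posterior(0.4, 0.855, 0.02)

print(hiv_a, hiv_b, hiv_c, hiv_d, defect, intruder)
```

Comparing `hiv_a` with `hiv_b` shows how strongly the posterior depends on the prior: the same test yields a much lower probability of infection in a population where the condition is rarer.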

