Topic 16 | Statistics & Probability

Conditional probability

Year 10 core: conditional probability P(A|B), 'given that' language, independent vs dependent events, sampling without replacement, two-way tables, and tree diagrams with conditional branches.



Worked example 0 Real-world example: medical testing

A screening test for a rare disease has the following properties: the disease affects $1\%$ of the population. If a person has the disease, the test is positive $95\%$ of the time. If a person does not have the disease, the test is positive $3\%$ of the time (false positive). What is the probability that a person who tests positive actually has the disease?

  1. Let $D$ = has disease, $T^+$ = tests positive.
  2. $P(D) = 0.01$, $P(T^+ \mid D) = 0.95$, $P(T^+ \mid \overline{D}) = 0.03$.
  3. $P(T^+) = P(T^+ \mid D) \cdot P(D) + P(T^+ \mid \overline{D}) \cdot P(\overline{D}) = 0.95 \times 0.01 + 0.03 \times 0.99 = 0.0095 + 0.0297 = 0.0392$.
  4. $P(D \mid T^+) = \dfrac{P(T^+ \mid D) \cdot P(D)}{P(T^+)} = \dfrac{0.0095}{0.0392} \approx 0.242$.

Key idea: even with an accurate test, only about $24\%$ of positive results are true positives when the disease is rare. This is why confirmatory testing exists.
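The four steps above can be checked numerically. The following is a minimal Python sketch, not part of the original lesson; the function name `posterior_given_positive` and its parameter names are illustrative assumptions.

```python
def posterior_given_positive(prevalence, sensitivity, false_positive_rate):
    """Return P(D | T+) for a screening test.

    prevalence          = P(D)
    sensitivity         = P(T+ | D)
    false_positive_rate = P(T+ | not D)
    """
    # Step 3: total probability of a positive result.
    p_positive = (sensitivity * prevalence
                  + false_positive_rate * (1 - prevalence))
    # Step 4: share of positive results that are true positives.
    return sensitivity * prevalence / p_positive


p = posterior_given_positive(0.01, 0.95, 0.03)
print(round(p, 3))  # 0.242
```

Changing `prevalence` while keeping the other two inputs fixed shows how strongly the answer depends on how rare the disease is.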

1. Conditional probability

The conditional probability of event $A$, given that event $B$ has occurred, is:

Definition

$P(A \mid B) = \dfrac{P(A \text{ and } B)}{P(B)}, \quad P(B) > 0$

Read $P(A \mid B)$ as “the probability of $A$ given $B$.”

The vertical bar $\mid$ means “given that” or “knowing that.” Watch for these phrases in word problems.

Worked example 1 Using the formula

In a group of $40$ students, $25$ study maths, $18$ study science, and $10$ study both. Find $P(\text{science} \mid \text{maths})$.

  1. $P(\text{maths}) = \dfrac{25}{40}$.
  2. $P(\text{science and maths}) = \dfrac{10}{40}$.
  3. $P(\text{science} \mid \text{maths}) = \dfrac{P(\text{science and maths})}{P(\text{maths})} = \dfrac{10/40}{25/40} = \dfrac{10}{25} = \dfrac{2}{5} = 0.4$.

Interpretation: of the $25$ maths students, $10$ also study science, which is $40\%$.
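Because the group size cancels, the formula reduces to a ratio of counts. Here is a small Python sketch of that shortcut using exact fractions; the helper name `conditional_from_counts` is an illustrative assumption, not something defined in the lesson.

```python
from fractions import Fraction


def conditional_from_counts(count_both, count_given):
    """P(A | B) as n(A and B) / n(B), valid when all outcomes are equally likely."""
    if count_given == 0:
        raise ValueError("P(B) must be greater than 0")
    return Fraction(count_both, count_given)


# 10 students study both subjects; 25 study maths.
p_science_given_maths = conditional_from_counts(10, 25)
print(p_science_given_maths, float(p_science_given_maths))  # 2/5 0.4
```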

2. Independent and dependent events

Two events are independent if knowing one occurred does not change the probability of the other:

$P(A \mid B) = P(A) \quad \text{(independence test)}$

Equivalently, for independent events: $P(A \text{ and } B) = P(A) \times P(B)$.

If $P(A \mid B) \neq P(A)$, the events are dependent.

Worked example 2 Testing for independence

From the example above: $P(\text{science}) = \dfrac{18}{40} = 0.45$ and $P(\text{science} \mid \text{maths}) = 0.4$.

Since $0.4 \neq 0.45$, the events “studies science” and “studies maths” are dependent. Knowing a student studies maths slightly decreases the probability they study science.
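The same check can be done with the product rule $P(A \text{ and } B) = P(A) \times P(B)$. The sketch below is an illustration only: the function name `are_independent` is an assumption, and exact fractions are used so that the equality test is not affected by floating-point rounding.

```python
from fractions import Fraction


def are_independent(p_a, p_b, p_a_and_b):
    """Product-rule test: independent exactly when P(A and B) = P(A) * P(B)."""
    return p_a_and_b == p_a * p_b


total = 40
p_maths = Fraction(25, total)
p_science = Fraction(18, total)
p_both = Fraction(10, total)

print(are_independent(p_science, p_maths, p_both))  # False

# Values from Tier 1 Q4: P(A) = 0.3, P(B) = 0.5, P(A and B) = 0.15.
print(are_independent(Fraction(3, 10), Fraction(1, 2), Fraction(3, 20)))  # True
```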

3. Two-way tables for conditional probability

A two-way table provides all the information needed to calculate conditional probabilities directly from counts.

Worked example 3 Conditional probability from a two-way table

A survey of $200$ adults records exercise habits and health ratings:

|                     | Good health | Poor health | Total |
|---------------------|-------------|-------------|-------|
| Exercises regularly | 90          | 30          | 120   |
| Does not exercise   | 40          | 40          | 80    |
| Total               | 130         | 70          | 200   |

(a) $P(\text{good health} \mid \text{exercises}) = \dfrac{90}{120} = \dfrac{3}{4} = 0.75$.

(b) $P(\text{good health} \mid \text{does not exercise}) = \dfrac{40}{80} = \dfrac{1}{2} = 0.5$.

(c) $P(\text{exercises} \mid \text{good health}) = \dfrac{90}{130} \approx 0.692$.

Note: $P(A \mid B) \neq P(B \mid A)$ in general. Part (a) and part (c) have different values.
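Conditioning on a row or a column simply changes which total goes in the denominator. The following Python sketch stores the survey counts in a dictionary and computes parts (a) to (c); the data layout and function names are illustrative assumptions.

```python
from fractions import Fraction

# Counts from the survey table, keyed by (exercise habit, health rating).
table = {
    ("exercises", "good"): 90,
    ("exercises", "poor"): 30,
    ("does not exercise", "good"): 40,
    ("does not exercise", "poor"): 40,
}


def p_health_given_exercise(health, exercise):
    """Condition on a row: divide by the row total."""
    row_total = sum(n for (e, _), n in table.items() if e == exercise)
    return Fraction(table[(exercise, health)], row_total)


def p_exercise_given_health(exercise, health):
    """Condition on a column: divide by the column total."""
    col_total = sum(n for (_, h), n in table.items() if h == health)
    return Fraction(table[(exercise, health)], col_total)


print(p_health_given_exercise("good", "exercises"))          # 3/4
print(p_health_given_exercise("good", "does not exercise"))  # 1/2
print(p_exercise_given_health("exercises", "good"))          # 9/13
```

The first and last results use the same cell count ($90$) but different totals, which is why $P(A \mid B)$ and $P(B \mid A)$ differ.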

4. Tree diagrams with conditional branches

When events are dependent, the branches of a tree diagram show conditional probabilities. This is especially useful for sampling without replacement.

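1st card: R with probability 3/5, B with probability 2/5.
2nd card (without replacement), given 1st card is R: R with probability 2/4, B with probability 2/4.
2nd card (without replacement), given 1st card is B: R with probability 3/4, B with probability 1/4.
Outcomes: RR: 3/5 x 2/4 = 6/20; RB: 3/5 x 2/4 = 6/20; BR: 2/5 x 3/4 = 6/20; BB: 2/5 x 1/4 = 2/20.
Tree diagram: drawing 2 cards from a hand of 3 red and 2 black cards without replacement.

The second-draw probabilities are conditional on the first draw’s result. After drawing a red card first, only $2$ red and $2$ black remain out of $4$ total.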

Worked example 4 Using the tree diagram

From the tree diagram above, find:

(a) $P(\text{at least one red}) = P(RR) + P(RB) + P(BR) = \dfrac{6}{20} + \dfrac{6}{20} + \dfrac{6}{20} = \dfrac{18}{20} = \dfrac{9}{10}$.

(b) $P(\text{2nd card is red} \mid \text{1st card is black}) = \dfrac{3}{4}$.

This can be read directly from the tree: on the lower branch (1st card black), the “R” branch has probability $\dfrac{3}{4}$.
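The tree can also be verified by listing every equally likely ordered pair of cards. This is a hedged Python sketch that uses brute-force enumeration rather than the tree itself; the names `draws` and `prob` are illustrative assumptions.

```python
from fractions import Fraction
from itertools import permutations

# A hand of 3 red and 2 black cards; each physical card counts separately.
cards = ["R", "R", "R", "B", "B"]

# All ordered ways to draw 2 cards without replacement: 5 * 4 = 20 pairs.
draws = list(permutations(cards, 2))


def prob(event):
    """Probability of an event as (favourable pairs) / (all pairs)."""
    return Fraction(sum(1 for d in draws if event(d)), len(draws))


p_at_least_one_red = prob(lambda d: "R" in d)
p_first_black = prob(lambda d: d[0] == "B")
p_black_then_red = prob(lambda d: d == ("B", "R"))

# Definition of conditional probability: P(2nd R | 1st B).
p_red_given_first_black = p_black_then_red / p_first_black

print(p_at_least_one_red)       # 9/10
print(p_red_given_first_black)  # 3/4
```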


Practice

Fluency

Tier 1: basic skills

    1. In a class of $30$ students, $12$ play basketball. What is $P(\text{basketball})$?
    2. Of the $12$ basketball players, $5$ are also in the swim team. What is $P(\text{swim} \mid \text{basketball})$?
    3. A bag has $6$ red and $4$ blue marbles. One marble is drawn and not replaced. If the first marble was red, what is $P(\text{red on 2nd draw})$?
    4. Events $A$ and $B$ satisfy $P(A) = 0.3$, $P(B) = 0.5$, $P(A \text{ and } B) = 0.15$. Find $P(A \mid B)$.
    5. Are $A$ and $B$ in Q4 independent? Justify.
    6. A two-way table shows: $20$ males own a pet, $15$ males do not, $25$ females own a pet, $10$ females do not. Find $P(\text{pet} \mid \text{male})$.
    7. Using the same table, find $P(\text{male} \mid \text{pet})$.
    8. A coin is tossed and a die is rolled. Are the events “heads” and “rolling a 6” independent? Explain.
    9. Two cards are drawn without replacement from a deck of $52$. Find $P(\text{2nd card is heart} \mid \text{1st card is heart})$.
    10. State the formula for $P(A \mid B)$.
Reasoning

Tier 2: mixed practice

    1. A box contains $5$ green and $3$ yellow balls. Two balls are drawn without replacement. Draw a tree diagram with conditional probabilities on each branch, and find $P(\text{both green})$.

    2. In a school of $400$ students, $240$ study French, $180$ study German, and $80$ study both. Find: (a) $P(\text{German} \mid \text{French})$, (b) $P(\text{French} \mid \text{German})$.

    3. A survey finds:

      |             | Supports policy | Opposes policy | Total |
      |-------------|-----------------|----------------|-------|
      | Under 30    | 45              | 30             | 75    |
      | 30 and over | 35              | 40             | 75    |
      | Total       | 80              | 70             | 150   |

      (a) Find $P(\text{supports} \mid \text{under 30})$ and $P(\text{supports} \mid \text{30 and over})$. (b) Is there an association between age group and opinion? Justify.

    4. Events $A$ and $B$ are such that $P(A) = 0.6$, $P(B) = 0.4$, and $P(A \mid B) = 0.75$. Find $P(A \text{ and } B)$ and determine whether $A$ and $B$ are independent.

    5. Three machines produce items. Machine X makes $50\%$ of items with a $2\%$ defect rate. Machine Y makes $30\%$ with a $3\%$ defect rate. Machine Z makes $20\%$ with a $5\%$ defect rate. An item is selected at random. Find the probability it is defective.

Reasoning

Tier 3: explain and apply

    1. Using the machine data from Tier 2 Q5, an item is found to be defective. Find the probability it came from Machine Z.
    2. Explain, with a numerical example, why $P(A \mid B) \neq P(B \mid A)$ in general. Why is confusing these two a common and dangerous error in medical or legal contexts?
    3. A jar contains $4$ red and $6$ blue marbles. Three marbles are drawn without replacement. Find $P(\text{all three are blue})$ using a chain of conditional probabilities.
    4. Two events satisfy $P(A \mid B) = 0.5$ and $P(B \mid A) = 0.25$. If $P(B) = 0.4$, find $P(A)$.

Challenge

Reasoning

Harder reasoning

    1. A game show has three doors. Behind one door is a prize; behind the other two, nothing. You pick a door. The host, who knows what is behind each door, opens a different door to reveal no prize. You are offered the chance to switch. Using conditional probability, show that switching gives you a $\dfrac{2}{3}$ chance of winning.
    2. In a population, $0.5\%$ use a certain drug. A drug test has a $99\%$ true positive rate and a $2\%$ false positive rate. Find the probability that a person who tests positive actually uses the drug. Comment on the usefulness of the test.
    3. Prove that if $A$ and $B$ are independent, then $A$ and $\overline{B}$ (the complement of $B$) are also independent.
    4. Five cards are dealt from a standard deck of $52$. Find the probability that all five are spades, using a chain of conditional probabilities.
Answers

Answer key

Attempt the practice first, then check your work against the answers below.

Tier 1

    1. $P(\text{basketball}) = \dfrac{12}{30} = \dfrac{2}{5} = 0.4$.
    2. $P(\text{swim} \mid \text{basketball}) = \dfrac{5}{12} \approx 0.417$.
    3. After removing a red marble: $5$ red and $4$ blue remain out of $9$. $P(\text{red on 2nd draw}) = \dfrac{5}{9}$.
    4. $P(A \mid B) = \dfrac{P(A \text{ and } B)}{P(B)} = \dfrac{0.15}{0.5} = 0.3$.
    5. Yes, $A$ and $B$ are independent because $P(A \mid B) = 0.3 = P(A)$. Knowing $B$ occurred does not change the probability of $A$.
    6. $P(\text{pet} \mid \text{male}) = \dfrac{20}{20 + 15} = \dfrac{20}{35} = \dfrac{4}{7} \approx 0.571$.
    7. $P(\text{male} \mid \text{pet}) = \dfrac{20}{20 + 25} = \dfrac{20}{45} = \dfrac{4}{9} \approx 0.444$.
    8. Yes, they are independent. The outcome of the coin does not affect the die, and vice versa. $P(\text{heads and 6}) = \dfrac{1}{2} \times \dfrac{1}{6} = \dfrac{1}{12} = P(\text{heads}) \times P(\text{6})$.
    9. After removing one heart, $12$ hearts remain out of $51$ cards. $P(\text{2nd heart} \mid \text{1st heart}) = \dfrac{12}{51} = \dfrac{4}{17}$.
    10. $P(A \mid B) = \dfrac{P(A \text{ and } B)}{P(B)}$, where $P(B) > 0$.

Tier 2

    1. First draw: $P(G) = \dfrac{5}{8}$, $P(Y) = \dfrac{3}{8}$. If 1st is green: $P(G) = \dfrac{4}{7}$, $P(Y) = \dfrac{3}{7}$. If 1st is yellow: $P(G) = \dfrac{5}{7}$, $P(Y) = \dfrac{2}{7}$. $P(\text{both green}) = \dfrac{5}{8} \times \dfrac{4}{7} = \dfrac{20}{56} = \dfrac{5}{14}$.
    2. (a) $P(\text{German} \mid \text{French}) = \dfrac{80}{240} = \dfrac{1}{3} \approx 0.333$. (b) $P(\text{French} \mid \text{German}) = \dfrac{80}{180} = \dfrac{4}{9} \approx 0.444$.
    3. (a) $P(\text{supports} \mid \text{under 30}) = \dfrac{45}{75} = 0.6$. $P(\text{supports} \mid \text{30 and over}) = \dfrac{35}{75} \approx 0.467$. (b) Yes, there is an association. The conditional probabilities differ: younger respondents are more likely to support the policy ($60\%$ vs $47\%$). If there were no association, both groups would have the same support rate of $\dfrac{80}{150} \approx 0.533$.
    4. $P(A \text{ and } B) = P(A \mid B) \times P(B) = 0.75 \times 0.4 = 0.3$. For independence: $P(A) \times P(B) = 0.6 \times 0.4 = 0.24$. Since $0.3 \neq 0.24$, $A$ and $B$ are not independent.
    5. $P(\text{defective}) = 0.50 \times 0.02 + 0.30 \times 0.03 + 0.20 \times 0.05 = 0.010 + 0.009 + 0.010 = 0.029$. The probability that a randomly selected item is defective is $2.9\%$.

Tier 3

    1. $P(\text{Z and defective}) = 0.20 \times 0.05 = 0.010$. $P(\text{Z} \mid \text{defective}) = \dfrac{P(\text{Z and defective})}{P(\text{defective})} = \dfrac{0.010}{0.029} \approx 0.345$. There is about a $34.5\%$ chance the defective item came from Machine Z.
    2. Example: $P(\text{disease} \mid \text{positive test}) \neq P(\text{positive test} \mid \text{disease})$. A test might detect $95\%$ of sick people ($P(T^+ \mid D) = 0.95$), but if the disease is rare, $P(D \mid T^+)$ could be much lower (e.g. $0.16$). Confusing the two (called the “prosecutor’s fallacy” in legal contexts) leads to wildly wrong conclusions. In medicine, it means overestimating how likely a patient is to have the disease after a positive test.
    3. $P(\text{1st blue}) = \dfrac{6}{10}$. $P(\text{2nd blue} \mid \text{1st blue}) = \dfrac{5}{9}$. $P(\text{3rd blue} \mid \text{first two blue}) = \dfrac{4}{8} = \dfrac{1}{2}$. $P(\text{all three blue}) = \dfrac{6}{10} \times \dfrac{5}{9} \times \dfrac{1}{2} = \dfrac{30}{180} = \dfrac{1}{6}$.
    4. $P(A \text{ and } B) = P(A \mid B) \times P(B) = 0.5 \times 0.4 = 0.2$. Also $P(A \text{ and } B) = P(B \mid A) \times P(A)$, so $0.2 = 0.25 \times P(A)$, giving $P(A) = \dfrac{0.2}{0.25} = 0.8$.
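The machine calculations (Tier 2 Q5 and Tier 3 Q1) follow one pattern: add up share times defect rate, then divide one term by the total. Below is a minimal Python sketch of that pattern; the dictionary layout and variable names are illustrative assumptions.

```python
from fractions import Fraction

# machine -> (share of production, defect rate), from Tier 2 Q5.
machines = {
    "X": (Fraction("0.50"), Fraction("0.02")),
    "Y": (Fraction("0.30"), Fraction("0.03")),
    "Z": (Fraction("0.20"), Fraction("0.05")),
}

# Law of total probability: P(defective) = sum of P(machine) * P(defective | machine).
p_defective = sum(share * rate for share, rate in machines.values())

# P(Z | defective) = P(Z and defective) / P(defective).
share_z, rate_z = machines["Z"]
p_z_given_defective = share_z * rate_z / p_defective

print(float(p_defective))                    # 0.029
print(round(float(p_z_given_defective), 3))  # 0.345
```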

Challenge

    1. Label the doors $1, 2, 3$. Suppose the prize is behind door $1$ (by symmetry, the argument is the same for any door). You pick door $1$: host opens door $2$ or $3$; switching loses. You pick door $2$: host must open door $3$; switching wins. You pick door $3$: host must open door $2$; switching wins. So switching wins $2$ out of $3$ times. Formally: let $W$ = prize behind your door. $P(W) = \dfrac{1}{3}$. The host can always reveal a losing door, so the reveal does not change $P(W)$, and the prize is behind the other unopened door with probability $1 - \dfrac{1}{3}$. Given the host reveals a losing door, $P(\text{prize behind other door} \mid \text{host reveals}) = \dfrac{2}{3}$.
    2. $P(\text{uses drug}) = 0.005$. $P(T^+ \mid \text{user}) = 0.99$. $P(T^+ \mid \text{non-user}) = 0.02$. $P(T^+) = 0.005 \times 0.99 + 0.995 \times 0.02 = 0.00495 + 0.0199 = 0.02485$. $P(\text{user} \mid T^+) = \dfrac{0.00495}{0.02485} \approx 0.199$. Only about $20\%$ of positive results are true positives. The test produces many false positives because the user base is so small. A confirmatory (more specific) test is essential.
    3. $P(A \text{ and } \overline{B}) = P(A) - P(A \text{ and } B) = P(A) - P(A) \cdot P(B)$ (using independence) $= P(A)(1 - P(B)) = P(A) \cdot P(\overline{B})$. Since $P(A \text{ and } \overline{B}) = P(A) \cdot P(\overline{B})$, events $A$ and $\overline{B}$ are independent.
    4. $P(\text{all 5 spades}) = \dfrac{13}{52} \times \dfrac{12}{51} \times \dfrac{11}{50} \times \dfrac{10}{49} \times \dfrac{9}{48} = \dfrac{13 \times 12 \times 11 \times 10 \times 9}{52 \times 51 \times 50 \times 49 \times 48} = \dfrac{154440}{311875200} = \dfrac{33}{66640} \approx 0.000495$.
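The “chain of conditional probabilities” used for the three blue marbles and the five spades is the same loop with different numbers. Here is a short Python sketch of that loop; the function name `chain_without_replacement` is an illustrative assumption.

```python
from fractions import Fraction


def chain_without_replacement(favourable, total, draws):
    """P(every draw is favourable) when drawing without replacement.

    Each factor is a conditional probability: after k favourable draws,
    (favourable - k) favourable items remain out of (total - k).
    """
    p = Fraction(1)
    for k in range(draws):
        p *= Fraction(favourable - k, total - k)
    return p


print(chain_without_replacement(6, 10, 3))   # 1/6      (three blue marbles)
print(chain_without_replacement(13, 52, 5))  # 33/66640 (five spades)
```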
