Scientific inquiry: variables, validity, argument

Year 9 answers

Fluency

E.g. “Does increasing water temperature reduce the time for $5$ g of salt to dissolve in $100$ mL of water?”
“If water temperature is increased, then the time for salt to dissolve will decrease, because particles move faster at higher temperature, giving more frequent collisions with the solvent.”
IV: mass of fertiliser per plant. DV: tomato yield (mass or count of fruit). Controlled: variety of tomato, size of pot, amount of water, light exposure, soil type, duration of experiment.
The IV is the variable the experimenter changes; the DV is what is measured and is expected to respond to changes in the IV.
A control group is a comparison group that does not receive the treatment/change, showing what happens without the IV. Example: placebo group in a drug trial.

Reasoning

Validity: the experiment measures what it claims to measure. Reliability: repeated measurements give consistent results. Accuracy: measurements are close to the true value.
Systematic error: every reading is off by a consistent $+2^{\circ}\text{C}$.
Reliable (very consistent), reasonably accurate if the true time is near $12.40$ s. Validity depends on whether timing actually measures what we want.
E.g. a cheap bathroom scale that always reads $2$ kg too low gives repeatable (reliable) but inaccurate readings, so it is not valid for a “true weight” study.
Repeating averages out random fluctuations, improving reliability. It does not fix systematic errors, which push every reading the same way.
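The difference between the two kinds of error can be checked with a small sketch. The true temperature, the $2^{\circ}\text{C}$ offset and the scatter values below are invented for illustration only; they are not measurements from the task.

```python
from statistics import mean

TRUE_TEMP = 20.0                      # hypothetical true temperature, in °C
OFFSET = 2.0                          # assumed systematic error: reads 2 °C high
NOISE = [0.3, -0.2, 0.1, -0.4, 0.2]   # made-up random scatter, in °C


def simulate_readings(offset):
    """Simulated readings: true value + systematic offset + random scatter."""
    return [TRUE_TEMP + offset + n for n in NOISE]


def mean_error(values):
    """How far the average of the readings is from the true value."""
    return mean(values) - TRUE_TEMP


random_only = simulate_readings(offset=0.0)      # random error only
with_offset = simulate_readings(offset=OFFSET)   # random plus systematic error

print(f"worst single reading (random only): {random_only[3] - TRUE_TEMP:+.1f} °C")
print(f"error of the mean (random only):    {mean_error(random_only):+.2f} °C")
print(f"error of the mean (with offset):    {mean_error(with_offset):+.2f} °C")
```

In this sketch the scatter cancels when the readings are averaged, but the offset is carried straight through to the mean.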

Problem solving

IV: drop height (e.g. 25, 50, 75, 100, 125 cm). DV: bounce height (cm) from the floor to top of first bounce. Controlled: same ball, same surface, same ball release technique (no push), same temperature, same measurer. Data: measure bounce height 3 times per drop height; average. Analysis: plot bounce height (y) vs drop height (x); look for a linear trend and comment on outliers.
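A minimal sketch of the data step, assuming three trials per drop height. The bounce heights are invented for illustration and are not real results. A roughly constant bounce/drop ratio is one simple sign of a straight-line trend through the origin.

```python
from statistics import mean

# Invented trial data for illustration only:
# drop height (cm) -> three measured bounce heights (cm)
trials = {
    25: [15, 16, 14],
    50: [30, 31, 29],
    75: [44, 46, 45],
    100: [59, 61, 60],
    125: [73, 74, 75],
}

# Average the three trials at each drop height
averages = {drop: mean(bounces) for drop, bounces in trials.items()}

# Ratio of mean bounce height to drop height
ratios = {drop: averages[drop] / drop for drop in trials}

for drop in trials:
    print(f"{drop:>3} cm drop: mean bounce {averages[drop]:.1f} cm, "
          f"ratio {ratios[drop]:.2f}")
```

A drop height whose ratio sits well away from the others would be worth re-measuring before drawing the graph.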
Issues: (i) $n = 1$, so no replication; (ii) no control group (different plants and different conditions are uncontrolled confounds); (iii) a single outcome (taller) doesn’t prove the fertiliser is responsible; (iv) no randomisation; (v) no measure of variability.
Bias sources: (i) selective reporting of favourable results; (ii) study design choices favouring the sponsor (short duration, specific group). Address by independent replication, pre-registering the study, and full public data access.
Investigate first: were the two outliers from a procedural mistake (e.g. different method)? If yes, exclude and state this. If there’s no mistake, keep them but report them — they may reflect real variation. Use a consistent rule (e.g. outlier test) rather than discarding data to make the result look cleaner.
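One possible consistent rule is the $1.5 \times \text{IQR}$ test, sketched below. The choice of this rule, the helper name `flag_outliers` and the readings are all assumptions made for illustration. The rule only flags values to investigate; it does not decide whether to discard them.

```python
from statistics import quantiles


def flag_outliers(values, k=1.5):
    """Return the values lying more than k * IQR beyond the quartiles."""
    q1, _, q3 = quantiles(values, n=4)   # lower quartile, median, upper quartile
    iqr = q3 - q1                        # interquartile range
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Invented readings for illustration only
readings = [12.1, 12.3, 9.0, 12.4, 12.4, 12.5, 15.9, 12.6]
print(flag_outliers(readings))
```

Applying the same threshold to every data set, and reporting which points were flagged, avoids discarding data just because it looks inconvenient.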

Reasoning

Correlation does not imply causation. Both ice-cream sales and drownings rise in summer because of higher temperatures and more swimming; the common cause is hot weather, not ice cream.
Reaction time falls from 280 ms (0 mg) to 245 ms (150 mg), showing caffeine shortens reaction time up to a point. At 200 mg it rises again (260 ms), suggesting a “too much” effect (jitteriness, over-arousal). Plausible interpretation: moderate doses improve alertness; high doses may impair.
Claim: LED bulbs are more efficient than incandescents. Evidence: typical LED outputs $\sim 60\%$ of input as visible light vs $\sim 5\%$ for incandescents; LEDs use $10$ W to produce about the same light as a $60$ W incandescent. Reasoning: both convert electrical input into light and heat; LEDs use semiconductor electroluminescence, which diverts little energy to heat, while incandescents rely on a heated filament where most energy becomes heat. Therefore, for the same useful light, LEDs use far less electrical input, which is the definition of higher efficiency.
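As a rough check using only the wattage figures above (the percentage values are approximate), efficiency can be written as useful light output divided by electrical input. The symbols $\eta$ (efficiency) and $L$ (light output) are introduced here for illustration. For the same light output $L$:

$$\frac{\eta_{\text{LED}}}{\eta_{\text{inc}}} = \frac{L / 10\ \text{W}}{L / 60\ \text{W}} = \frac{60}{10} = 6$$

On these figures the LED needs about one sixth of the electrical input, roughly $83\%$ less, for the same light.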

Reasoning

In a double-blind trial, neither the patients nor the doctors know who receives the drug and who receives the placebo, so neither side can consciously or unconsciously influence outcomes. If doctors knew, they might treat the drug group differently (more attentive care, interpret symptoms differently); if patients knew, the placebo effect and reporting would differ. Double-blinding removes both channels of bias.
Study B deserves more weight: much larger $n$ (better reliability), researcher-measured weight (more accurate and valid than self-report), longer duration (captures real effects). Study A’s small sample and self-reported DV make both reliability and validity weaker.
“Proof” in everyday use means certainty. In science, no finite set of observations can establish certainty — there may always be a future experiment that contradicts a theory. Science “supports” hypotheses tentatively and is open to revision. Paradoxically, this is a strength: self-correction is why science advances, while dogma that claims proof cannot be improved.
Claim: higher hours of study are associated with higher test scores. Evidence: positive trend on the graph. Reasoning: more time on content could plausibly improve retention and skill. But correlation is not causation; confounding variables (e.g. motivation, sleep, subject aptitude, prior knowledge) might cause both more study and better scores. A controlled experiment or statistical control is needed to distinguish the effect of study time itself.