"I can't get through my day without coffee" is a common
statement from many students. Assumed benefits include keeping
students awake during boring lectures and making them more alert
for exams and tests. But this is purely anecdotal evidence, and
firmer data is needed. Students in an introductory statistics
class designed an experiment to measure memory retention with
and without drinking a cup of coffee 1 hour prior to a test. Because
of disagreements between members of the class, two different experimental
designs were used.
Design 1:
Thirty students were randomized to one of three groups: no coffee, 1 cup of coffee, or 2 cups of coffee, to be consumed 1 hour before the test. The test consists of a series of words flashed on a screen, after which the student writes down as many of the words as can be remembered. Here are summary statistics about the test:

(a) Why were the students randomized to the three groups?
Randomization ensures that the influence of all other, uncontrollable,
factors is roughly equal in all groups. Randomization does not
remove the influence of other factors; it only makes their effect
roughly equal in all groups.
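As an illustration of the randomization in Design 1, here is a minimal sketch that shuffles 30 student numbers and deals them out evenly to the three groups. The student numbering (1 to 30), the function name and the seed are assumptions made for the example; the handout does not say how the class carried out the randomization.

```python
import random

def randomize_to_groups(student_ids, groups, seed=None):
    """Shuffle the students, then split them evenly across the groups."""
    rng = random.Random(seed)
    shuffled = list(student_ids)
    if len(shuffled) % len(groups) != 0:
        raise ValueError("number of students must divide evenly among groups")
    rng.shuffle(shuffled)
    size = len(shuffled) // len(groups)
    return {
        group: sorted(shuffled[i * size:(i + 1) * size])
        for i, group in enumerate(groups)
    }

# ASSUMPTION: students are simply numbered 1..30; the seed is arbitrary.
assignment = randomize_to_groups(
    range(1, 31), ["no coffee", "1 cup", "2 cups"], seed=1
)
for group, members in assignment.items():
    print(group, members)
```

Because chance alone decides the assignment, factors such as sleep, study habits or usual caffeine intake are spread roughly evenly over the three groups.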
(b) Explain the difference between the standard deviation
of 3.156 for the no coffee group and the standard error of .998
for the same group.
The sample standard deviation of 3.156 refers to the variation
of individual readings around the sample mean. The estimated standard
error of .998 refers to the variation of the sample mean in repeated
samples of the same size from the same population.
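The two numbers are linked by the usual formula SE = SD / sqrt(n). A short check, assuming the 30 students were split equally so that the no coffee group has n = 10 (the group size is not stated explicitly in the question):

```python
import math

sd = 3.156  # sample standard deviation of the no coffee group (from the question)
n = 10      # ASSUMPTION: 30 students split equally among 3 groups

se = sd / math.sqrt(n)  # estimated standard error of the sample mean
print(round(se, 3))     # 0.998
```

The result matches the reported standard error of .998, which supports the assumed group size of 10.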
The next 3 questions refer to the following.
The students started by doing a statistical test comparing the "no coffee" and "1 cup" groups. Here is the output from the comparison.

(c) Interpret the 95% confidence interval given in the above
output.
We are 95% confident that the difference in the mean number of
words remembered (1 cup - no coffee) is in this interval. That is,
we are 95% confident that the mean number of words remembered by
the 1 cup group is between 1.89 fewer and 2.69 more than that of
the no coffee group. Because the interval includes 0, the data are
consistent with no difference between the two groups.
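The interval endpoints also carry the point estimate and the margin of error. The sketch below assumes the interval is the usual symmetric t-interval (estimate plus or minus a margin), which the missing output would confirm; the variable names are illustrative only.

```python
lower, upper = -1.89, 2.69  # 95% CI for (1 cup - no coffee), from the answer to (c)

# ASSUMPTION: the interval is symmetric about the point estimate.
estimate = (lower + upper) / 2  # estimated difference in means
margin = (upper - lower) / 2    # margin of error (half-width of the interval)
contains_zero = lower <= 0 <= upper

print(round(estimate, 2), round(margin, 2), contains_zero)
```

Under that assumption the estimated difference is about 0.40 words in favor of the 1 cup group, with a margin of error of about 2.29 words, so a difference of zero is well inside the interval.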
(d) What are the null and alternate hypotheses, in symbols and
words?
H: µ(1 cup) - µ(no coffee) = 0. The mean number of words remembered is the same for the two groups.
A: µ(1 cup) - µ(no coffee) > 0. The mean number of words remembered by the 1 cup group is greater than
that of the no coffee group.
Note that this is a one-sided alternative.
(e) What is the p-value, and what do you conclude?
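The reported p-value above is two-sided. Because the confidence
interval in (c) is centered on a positive difference (1 cup - no
coffee), the estimate lies in the direction of the alternative and
the one-sided p-value is .7183/2 = .359.
We fail to reject the null hypothesis, and conclude that there
is no evidence that the mean number of words remembered by the 1 cup
group is larger than that of the no coffee group.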
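The halving rule only applies when the observed difference points in the direction of the alternative; otherwise the one-sided p-value is 1 minus half the two-sided value. A minimal sketch of that rule for a symmetric test such as the t-test follows. The helper name `one_sided_p` is hypothetical, and the estimate of 0.40 is an assumption taken from the midpoint of the confidence interval in (c).

```python
def one_sided_p(two_sided_p, estimate, alternative="greater"):
    """Convert a two-sided p-value from a symmetric test (e.g. a t-test)
    into a one-sided p-value, using the sign of the observed estimate."""
    if alternative == "greater":
        in_direction = estimate > 0
    else:
        in_direction = estimate < 0
    return two_sided_p / 2 if in_direction else 1 - two_sided_p / 2

# ASSUMPTION: estimate = 0.40, the midpoint of the CI (-1.89, 2.69).
p = one_sided_p(0.7183, estimate=0.40, alternative="greater")
print(round(p, 3))  # 0.359
```

Had the 1 cup group scored lower than the no coffee group on average, the same two-sided p-value would have given a one-sided p-value of about .641 instead.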
The next 3 questions refer to the following:
One student pointed out that comparison of all three groups simultaneously can be done using another method. Here is the output:
(f) What are the null and alternate hypotheses for this test?
H: µ(no coffee) = µ(1 cup) = µ(2 cups)
A: not all of the means are equal.
(g) What is the p-value, and what do you conclude from the
p-value?
The p-value is <.0001.
We reject the null hypothesis and conclude that there is strong
evidence that not all the means are equal.
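The method that compares all three groups at once is a single-factor analysis of variance, which contrasts the variation between the group means with the variation within the groups. The sketch below computes the F statistic by hand. The scores are HYPOTHETICAL, invented only to show the calculation; they are not the class data, and the function name is an assumption.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual scores around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# HYPOTHETICAL scores for illustration only (not the data from the class).
no_coffee = [10, 12, 9, 11, 13]
one_cup = [11, 12, 10, 13, 12]
two_cups = [17, 18, 16, 19, 20]

f_stat, df_b, df_w = one_way_anova_f([no_coffee, one_cup, two_cups])
print(round(f_stat, 2), df_b, df_w)
```

A large F statistic means the group means differ by much more than the within-group scatter would explain, which is what a very small p-value reflects.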
(h) After looking at the results of (g), a multiple-comparison
procedure was performed. Why is this necessary, and what do you
conclude?
A multiple comparison procedure is needed because rejection of
the null hypothesis does not give any information as to which
means differ from the rest.
Based on the multiple comparison circles, it appears that the 2 cup group has a larger mean than the other two groups, but the 1 cup and no coffee groups cannot be distinguished from each other.
The next 2 questions refer to the following:
Another group of students said that a "paired" or a
"blocked" design was preferable.
(i) Carefully explain how to conduct such an experiment.
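Each person must be tested three times, once under each amount of
coffee (no coffee, 1 cup, 2 cups). For each person, randomly assign
the three amounts of coffee to three different days and test the
person on each day.
The randomization should be done separately for each person.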
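A minimal sketch of this within-person randomization is given below. The number of participants (10), the person numbering and the seed are assumptions made for the example; the point is only that each person gets an independent random order of the three treatments.

```python
import random

TREATMENTS = ["no coffee", "1 cup", "2 cups"]

def randomize_within_person(person_ids, seed=None):
    """Give each person an independent random order of the three treatments."""
    rng = random.Random(seed)
    schedule = {}
    for person in person_ids:
        order = TREATMENTS[:]      # copy so the master list is not modified
        rng.shuffle(order)         # separate shuffle for every person
        schedule[person] = order   # order[0] is day 1, order[1] is day 2, ...
    return schedule

# ASSUMPTION: 10 participants numbered 1..10; the seed is arbitrary.
schedule = randomize_within_person(range(1, 11), seed=1)
for person, order in schedule.items():
    print(person, order)
```

Each person acts as a block: all three treatments appear once within every person, and only the order in which they occur is left to chance.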
(j) What are the advantages of such an experiment over the
completely randomized design used by the first group and under
what conditions would it be preferred?
The advantage is that person-to-person variation can be removed
from the comparison of the group means.
An RCB (randomized complete block) design should be used whenever
you suspect that there is substantial heterogeneity in the experimental
units and the experimenter can recognize one of the sources of
heterogeneity in advance.