Overview

This lesson shows the basic procedure of the one-sample hypothesis test.

Objectives

After completing this module, students should be able to:

  1. Formulate a null and research hypothesis.
  2. Select and explain the test type, p-value threshold, and number of tails.
  3. Calculate a test statistic, p value, and 95% CI.
  4. Determine whether to reject or fail to reject the null based on 1-3.

Reading

Lander, Chapter 15.3.1. Schumacker, Ch. 10

1 The testing procedure summarized

Having discussed the statistical testing procedure in a rough way, let’s now make our procedure and notation a bit more precise.

Here is the basic procedure:

  1. Formulate null hypothesis and research hypothesis
  2. Select test type (one- or two-tailed) and critical p-value
  3. Gather data and calculate test statistic
  4. Reject the null or not, and write up results: “we reject the null hypothesis in favor of the research hypothesis” or “we cannot reject the null hypothesis”

1.1 1. Hypotheses

We begin with our hypotheses. We usually denote our research or “alternative” hypothesis \(H_{a}\) and our null hypothesis \(H_{0}\).

Research hypothesis: \(H_{a}: \mu > 0\). (One-tailed.)

Research hypothesis: \(H_{a}: \mu \neq 0\). (Two-tailed.)

Null hypothesis: \(H_{0}: \mu = 0\).

Note that these are still just examples. The null doesn’t have to be \(0\). It can be \(3\) if that’s the prevailing view, or something more complicated. We will discuss some of these more complex tests in the lessons to come.

1.2 2. Choose the test type and critical p-value

Now we select our test type and critical p-value. A two-tailed test is usually the better choice because it is more conservative. It has two rejection regions and two rejection thresholds, one high and one low. The size of these regions is determined by the p-value threshold you select, which is generally 0.05 (this threshold is often denoted \(\alpha\), or alpha). If you choose 0.05 and a two-tailed test, remember that each region now covers 0.025 (that is, 2.5%) of the total.
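To make the tail areas concrete, here is a small sketch of the rejection thresholds for a two-tailed and a one-tailed test. The values alpha = 0.05 and df = 99 (that is, \(n = 100\)) are illustrative assumptions rather than part of any particular data set.

```r
alpha <- 0.05   # assumed p-value threshold
df    <- 99     # assumed degrees of freedom (n - 1 with n = 100)

# Two-tailed: alpha is split evenly, 0.025 in each tail
two_tailed <- qt(c(alpha / 2, 1 - alpha / 2), df)

# One-tailed (upper): all of alpha sits in the right tail
one_tailed <- qt(1 - alpha, df)

two_tailed   # lower and upper thresholds, about -1.98 and 1.98
one_tailed   # about 1.66
```

The upper two-tailed threshold sits further from zero than the one-tailed threshold, which is the sense in which the two-tailed test is more conservative.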

1.3 3-4 Data, test statistic, and rejecting the null

Our data can usually be summarized with three numbers: the sample mean \(\bar{x}\), the sample standard deviation \(s\), and the sample size \(n\).

As we have seen before, there are a number of equivalent ways to conduct our test.

  1. If you are doing a two-tailed test with a p-value threshold of, e.g., 0.05, you can construct a 95% CI around the sample mean as we saw in the previous Module (using the t distribution), and ask whether it contains the null hypothesis value \(\mu_{0}\); if not, you reject the null. If your p-value threshold (\(\alpha\)) is other than 0.05, you just construct the appropriate \(1-\alpha\) confidence interval.

  2. Calculate the test statistic and see whether it falls into the rejection region. The test statistic is just the usual:

\[\textrm{Test statistic } = \frac{\bar{x} - \mu_{0}}{se}\]

I.e., how many standard errors the sample mean is from the null hypothesis value \(\mu_{0}\), where \(se = s/\sqrt{n}\). If your critical p-value is 0.05 and your test is two-tailed, then the rejection region is any test statistic larger than qt(.975,99) or less than qt(.025,99) (assuming your \(n\) is 100 in this example). If your test is one-tailed and your critical p-value is 0.01, then the rejection region is any test statistic larger than qt(.99,99) if your research hypothesis is that the mean is greater than the null value, and any test statistic less than qt(.01,99) if your research hypothesis is that the mean is less than the null value. If the test statistic falls into the rejection region, you reject the null.

  3. Calculate the test statistic, and from that, calculate the p-value for your data: that is, the chance of getting something as extreme as that or more if the null were true. If you’re doing a one-tailed test and your research hypothesis is that the mean is greater than the null value, the p-value is 1-pt(tstatistic,99) (assuming your \(n\) is 100); if your research hypothesis is that the mean is less than the null value, the p-value is just pt(tstatistic,99). If it’s a two-tailed test, your p-value is twice the tail area beyond your statistic, i.e. 2*(1-pt(abs(tstatistic),99)). Finally, once again, if your p-value here is less than your threshold value (e.g., 0.05), you reject the null.

Of these three approaches, the second is the most standard. Generally one does a two-tailed test, with an \(\alpha\) of 0.05, and then constructs the t statistic from the data and rejects the null if that statistic is greater than the upper critical value or less than the lower critical value (i.e., if it falls in the rejection region).
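The three approaches above can be sketched side by side for a two-tailed test. The helper name one_sample_t and the input numbers below are hypothetical, chosen only for illustration; the point is that the three decisions always agree.

```r
# Hypothetical helper: two-tailed one-sample t test from summary statistics
one_sample_t <- function(xbar, s, n, mu0, alpha = 0.05) {
  se    <- s / sqrt(n)                    # standard error of the mean
  df    <- n - 1                          # degrees of freedom
  tstat <- (xbar - mu0) / se              # test statistic
  crit  <- qt(1 - alpha / 2, df)          # upper critical value
  ci    <- xbar + c(-1, 1) * crit * se    # (1 - alpha) confidence interval
  p     <- 2 * pt(-abs(tstat), df)        # two-tailed p-value
  list(
    t = tstat, crit = crit, ci = ci, p = p,
    reject_by_ci = mu0 < ci[1] || mu0 > ci[2],  # approach 1
    reject_by_t  = abs(tstat) > crit,           # approach 2
    reject_by_p  = p < alpha                    # approach 3
  )
}

# Made-up summary statistics, for illustration only
res <- one_sample_t(xbar = 10.4, s = 2, n = 25, mu0 = 10)
res$t              # 1: the mean is one standard error above the null value
res$reject_by_ci   # FALSE
res$reject_by_t    # FALSE
res$reject_by_p    # FALSE
```

With these numbers the sample mean is only one standard error from the null value, so none of the three approaches rejects the null.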

1.4 Example

  1. Research hypothesis: \(H_{a}: \mu > 1\).

Null hypothesis: \(H_{0}: \mu = 1\).

  2. What is our rejection region or p-value threshold? Our p-value threshold or \(\alpha\) is usually 0.05, but we still must decide whether it is a one-tailed or two-tailed test. The research hypothesis above would suggest a one-tailed test, but it is nevertheless better practice in most cases to go with the more stringent two-tailed test (which is technically: \(H_{a}: \mu \neq 1\)). Thus we have two rejection regions, one in each tail, each of size 0.025.

  3. Data: \(\bar{x} = 3\), \(s = 5\), \(n = 100\).

\[\textrm{Test statistic } = \frac{\bar{x} - \mu_{0}}{se}\]

Our standard error is \(se = s/\sqrt{n} = 5/\sqrt{100} = 0.5\) and our t statistic is therefore \((3 - 1)/0.5 = 4\).

Our rejection region is any t statistic greater than the critical value qt(.975,99) = 1.98 or less than qt(.025,99) = -1.98.

  4. \(4\) is clearly in the rejection region so we can reject the null hypothesis in favor of the research hypothesis. (Note that because the statistic falls in the upper tail, the direction our research hypothesis predicted, passing the two-tailed test means it would have passed the one-tailed test as well.) Thus we reject our existing theory that the population mean was \(1\), and provisionally accept the research hypothesis, with the sample mean of \(3\) as our best guess.

Equivalently, we could have directly calculated our p-value, which would be the area in the right tail greater than 4 plus the symmetrical area in the left tail, or

2*(1-pt(4,99))
[1] 0.0001222515

(Make sure you understand this calculation!) This is obviously much less than 0.05, so again we would reject the null.
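The same example can also be checked with the confidence-interval approach. This is a minimal sketch using the summary statistics given above; the variable names are just illustrative.

```r
# Summary statistics from the example
xbar <- 3; s <- 5; n <- 100; mu0 <- 1

se   <- s / sqrt(n)               # standard error: 0.5
crit <- qt(0.975, n - 1)          # critical value, about 1.98
ci   <- xbar + c(-1, 1) * crit * se
ci                                # roughly 2.01 to 3.99

# Reject the null if the null value falls outside the interval
mu0 < ci[1] || mu0 > ci[2]        # TRUE
```

The null value of 1 lies below the lower end of the 95% CI, so this approach also rejects the null.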

2 Calculating in R

It is important to be able to calculate and understand this test by hand, but of course R also has functions for this built in. For instance, say we want to test whether the mean stopping distance in the cars data set is different from 60 feet.

t.test(cars$dist,alternative="two.sided",mu=60)

    One Sample t-test

data:  cars$dist
t = -4.6703, df = 49, p-value = 2.372e-05
alternative hypothesis: true mean is not equal to 60
95 percent confidence interval:
 35.65642 50.30358
sample estimates:
mean of x 
    42.98 

R calculates the t statistic, degrees of freedom, and the p-value, although it leaves it to you to interpret this result. One can also do a one-sided test using alternative="greater" or alternative="less".
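Since the interpretation is left to you, it can help to store the result and pull out its pieces. The sketch below assumes a threshold of alpha = 0.05 and also checks the t statistic by hand; the variable names are illustrative.

```r
res <- t.test(cars$dist, alternative = "two.sided", mu = 60)

res$statistic   # t statistic
res$parameter   # degrees of freedom
res$p.value     # two-tailed p-value
res$conf.int    # 95% confidence interval

# Decision at an assumed threshold of alpha = 0.05
alpha <- 0.05
res$p.value < alpha   # TRUE, so we reject the null that the mean is 60

# The same t statistic calculated by hand
x <- cars$dist
t_by_hand <- (mean(x) - 60) / (sd(x) / sqrt(length(x)))

# One-sided version: research hypothesis that the mean is less than 60
res_less <- t.test(cars$dist, alternative = "less", mu = 60)
res_less$p.value      # half the two-tailed p-value, since t is negative
```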