INTRODUCTION TO HYPOTHESIS TESTING Chapters 9 and 11
Dorit Nevo, 2013
INFERENTIAL STATISTICS
¢ We often use statistics to test theories.
— Theory: a prediction, or a group of predictions, about how people, physical entities, and built devices behave.
¢ Theories begin as predictions, which are then repeatedly tested in various settings to either strengthen or refute them.
— Such testing often involves statistical inference, defined as the drawing of conclusions about a population of interest based on findings from samples obtained from that population.
¢ We will cover:
— Hypothesis testing
— Analysis of Variance (ANOVA)
— Regression Analysis
HYPOTHESIS TESTING
¢ Statistics is very much about expectations. We aim to test specific expectations that we have about the population’s parameters using sample statistics. We call these expectations hypotheses.
— A hypothesis is some specific claim that we wish to test.
¢ We study the probability of our sample’s outcome given the hypothesized distribution of the population:

We believe that | Our sample mean is | Possible?
X~N(10,2) | 12 | Probably yes
X~N(10,2) | 20 | Probably not
HYPOTHESES
¢ We differentiate between the research hypothesis and the null hypothesis.
— Example: A bank manager argues that, on average, people carry $50 or more in their wallet. This claim is the null hypothesis. The research hypothesis contains the other side of this claim: that people carry less than $50. We can also write it as:
¢ H0: Average amount of money ≥ $50
¢ H1: Average amount of money < $50
— where H0 denotes the null hypothesis and H1 (sometimes denoted HA) denotes the research hypothesis, commonly referred to as the alternative hypothesis.
HYPOTHESES
¢ Researchers tell you that, on average, people have 200 or fewer friends on Facebook. However, you believe that Facebook users, in fact, have more than 200 friends. You can set your hypotheses as:
— H0: Average number of friends ≤ 200
— H1: Average number of friends > 200
¢ A statistics professor wants to know if her section’s grade average is different from that of other sections of the course. The average for all other sections is ‘75’. To help the professor, we need to set up the following hypotheses:
— H0: Section’s grade average = 75
— H1: Section’s grade average ≠ 75
THREE FORMS OF HYPOTHESES
Lower-Tail Test | Two-Tail Test | Upper-Tail Test
H0: Average amount of money ≥ $50 | H0: Section average = 75 | H0: Average number of friends ≤ 200
H1: Average amount of money < $50 | H1: Section average ≠ 75 | H1: Average number of friends > 200

[Figure: standard normal (Z) curve for each test, horizontal axis from -4 to 4]

The ‘equal’ sign (=, ≥, ≤) always goes in the null hypothesis.
A FINAL NOTE ON HYPOTHESES
¢ Hypotheses must always be (1) exhaustive and (2) mutually exclusive.
— Exhaustive means that the hypotheses should cover every possible option.

Not exhaustive | Exhaustive
H0: Average weeks of job seeking = 20 | H0: Average weeks of job seeking ≤ 20
H1: Average weeks of job seeking > 20 | H1: Average weeks of job seeking > 20

— Mutually exclusive means there is no overlap between hypotheses. The following hypotheses are, therefore, incorrect because the ‘= 20’ option is in both the null and alternative hypotheses:
¢ H0: Average weeks of job seeking ≥ 20
¢ H1: Average weeks of job seeking ≤ 20
WRITE THE HYPOTHESES
¢ Angry Birds is one of the most popular games on various platforms. According to one source, people collectively spend 300 million minutes per day playing Angry Birds and it supposedly costs the US economy billions of dollars in lost work time. Suppose you believe this number (300 million minutes) is too high. You conduct some more research to find out that there are roughly 30 million daily active users of Angry Birds, which means that on average each player spends (300 million minutes/30 million users) 10 minutes per day playing Angry Birds, according to the original claim. You plan to use a sample of students at your school to test this claim and need to set up your hypotheses first.
HYPOTHESES
H0: Average time playing (µ) ≥ 10 H1: Average time playing (µ) < 10
MORE PRACTICE
¢ Using the table as your guide for the three types of tests (lower-tail, upper-tail, and two-tail), write the hypotheses for the following tests:
— A two-tail test for the average number of sleep hours
— An upper-tail test for the average hours spent grooming
— A lower-tail test for the average hours spent on educational activities
— A two-tail test for the average hours spent eating and drinking
— A two-tail test for the average hours spent on leisure and sports
— An upper-tail test for the average hours spent working
Activity Hours
Sleeping 8.4
Leisure and Sports 3.6
Educational Activities 3.4
Working 3.0
Other 2.2
Traveling 1.5
Eating and Drinking 1.1
Grooming 0.8
Total 24.0
EXAMPLE
¢ Suppose that we are considering an investment in a used textbook store and are interested in determining the amount of money that customers are likely to spend when shopping at the store.
¢ Based on historical data provided by the current owner, the average customer making purchases at the store spent $100 with a standard deviation of $35.
¢ We are concerned that sales may have decreased over the last year. We hypothesize that: — H0: Average spending (µ) ≥ 100 — H1: Average spending (µ) < 100
¢ With µ being the hypothesized average spending (the parameter of interest) for the population (all customers who have made purchases at the store).
WE TAKE A SAMPLE
¢ Data from twenty random customers shows that the mean spending in our sample is ‘$93.27’.
— Our sample mean is indeed lower than the hypothesized mean, but the difference could be due to sampling error.
— Should we reject the null hypothesis based on this one sample?
DECISION RULE
¢ The question we need to ask ourselves is: — how possible is it for us to take a random sample of
size n=20 out of a population with a mean of $100 and a standard deviation of $35 and obtain a sample mean of ‘$93.27’?
— If it is reasonably possible, then we have no reason to reject the null hypothesis.
— If the chances of finding this sample mean are extremely low, then we should conclude that the null hypothesis is likely false and we should reject it, believing that the true average spending is, in fact, lower than $100.
¢ So, just how possible is it that we could obtain a sample mean of ‘$93.27’ given a population mean of $100 and standard deviation of $35?
To answer this question we need to know:
THE SAMPLING DISTRIBUTION OF THE MEAN
SAMPLING DISTRIBUTIONS
¢ A sampling distribution refers to the probability distribution of any sample statistic. The sampling distribution of the mean describes the probabilities attached to all values of the mean of samples that are repeatedly taken from the same population.
¢ Let us continue with our used textbook store example to better understand this concept.
SAMPLING DISTRIBUTION
¢ Recall that we took a random sample of twenty customers (i.e., n=20), and measured how much each of the customers spent. We then computed the mean spending for that sample to be ‘$93.27’.
¢ Let us repeat this sampling procedure 200 times, each time taking a different sample of twenty customers and computing the mean spending for each sample.
Sample (n=20) | Sample Mean (X̄)
1 | $93.27
2 | $104.34
3 | $100.49
… | …
199 | $96.65
200 | $111.65

Average over 200 samples: $99.04
Standard deviation: $11.14
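The repeated-sampling procedure can be sketched in a few lines of Python. This is only an illustration: it assumes spending is normally distributed with a mean of 100 and a standard deviation of 35, and the function name `simulate_sample_means` and the fixed seed are hypothetical choices. The simulated numbers will not match the table above, which comes from the slides' own samples.

```python
import random
import statistics

def simulate_sample_means(mu, sigma, n, num_samples, seed=0):
    """Draw num_samples samples of size n from N(mu, sigma); return each sample's mean."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    means = []
    for _ in range(num_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means

# 200 samples of twenty customers each (assumed population: N(100, 35))
means = simulate_sample_means(mu=100, sigma=35, n=20, num_samples=200)
print("Average of sample means:", round(statistics.fmean(means), 2))
print("Standard deviation of sample means:", round(statistics.stdev(means), 2))
```

Raising `num_samples` (for example to 1,000) gives a smoother picture of the same distribution.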
GRAPHICALLY
[Figure: two histograms of sample means (vertical axis: frequency), one for 200 samples of size 20 and one for 1,000 samples of size 20]
CENTRAL LIMIT THEOREM (PART I)
¢ The central limit theorem states that the sampling distribution of the mean of a random sample of any size (n) drawn from a normally distributed population also follows a normal distribution with a mean of µ and a standard deviation of σ/√n.
¢ This standard deviation (σ/√n) is called the standard error of the mean.
— In our example we thus know that:
\[ \bar{X} \sim N\left(100, \frac{35}{\sqrt{20}}\right) \]
BUT WHAT ABOUT NON-NORMAL POPULATIONS?
¢ Consider this dice-rolling example:
— When we roll a single six-sided die, we can obtain any one of the values from ‘1’ to ‘6’, each with a probability of ‘1/6’. Defining a random variable as ‘the outcome of a single die roll’, this variable follows a discrete uniform distribution.
[Figure: bar chart of the probability of each outcome (1 to 6) of a single die roll]
¢ Instead of rolling a single die, we will roll two dice. And, instead of looking at the outcome on the individual die, we will define our random variable as ‘the average of the values obtained from two rolled dice’.
— For example, if one die lands on ‘3’ and the other on ‘5’, then we will record the outcome of this roll as ‘4’.
¢ Exercise:
— List all possible values that the random variable can obtain along with their probabilities
GRAPHICALLY
[Figure: bar chart of the probability distribution of the average of two dice, with values from 1 to 6 in steps of 0.5]
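The distribution of the average of two dice can be checked by listing every equally likely roll. The sketch below does this by brute-force enumeration; the function name `mean_of_dice_distribution` is a hypothetical helper, and exact fractions are used so that no rounding enters the probabilities.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def mean_of_dice_distribution(num_dice):
    """Exact distribution of the average of num_dice fair six-sided dice."""
    # Count how many of the 6**num_dice equally likely rolls give each average.
    counts = Counter(
        Fraction(sum(roll), num_dice)
        for roll in product(range(1, 7), repeat=num_dice)
    )
    total = 6 ** num_dice
    return {value: Fraction(count, total) for value, count in sorted(counts.items())}

dist = mean_of_dice_distribution(2)
for value, prob in dist.items():
    print(float(value), prob)
```

Enumeration grows as 6 to the power of the number of dice, so this approach is only practical for a handful of dice; a simulation would be a more reasonable choice for larger numbers.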
AS WE INCREASE THE NUMBER OF DICE ROLLED (10 IN THE GRAPH BELOW)…
[Figure: bar chart of the probability distribution of the average of 10 dice, horizontal axis from 1 to 6]
CENTRAL LIMIT THEOREM (PART II)
¢ Even when the population is not normally distributed, the sample mean will follow an approximately normal distribution for large enough samples (n≥30), with a mean of µ and a standard deviation of σ/√n.
PUTTING IT ALL TOGETHER
The mean of a sample taken from a normally distributed population (e.g., the spending example) will follow a normal distribution, with the mean equal to the population mean and a standard deviation equal to the population standard deviation divided by the square root of the sample size (hence the distribution is narrower).
The mean of a sufficiently large sample taken from a non-normally distributed population (e.g., the dice example) will follow an approximately normal distribution, with the mean equal to the population mean and a standard deviation equal to the population standard deviation divided by the square root of the sample size.
Using statistical notation:
\[ \bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right) \]
EXAMPLE 9.1(A)
¢ The foreman of a bottling plant has observed that the amount of soda in each “32-ounce” bottle is a normally distributed random variable, with a mean of 32.2 ounces and a standard deviation of 0.3 ounce.
— If a customer buys one bottle, what is the probability that the bottle will contain more than 32 ounces?
¢ Answer:
— We know that X~N(32.2,0.3)
— Using the standard normal table:
\[ P(X > 32) = P\left(\frac{X - \mu}{\sigma} > \frac{32 - 32.2}{0.3}\right) = P(Z > -0.67) = 1 - 0.2514 = 0.7486 \]
EXAMPLE 9.1(B)
¢ If a customer buys a carton of four bottles, what is the probability that the mean amount of the four bottles will be greater than 32 ounces?
— We are no longer asked about X. Instead we are asked about the mean of X:
\[ P(\bar{X} > 32) = ? \]
INTERPRET THE QUESTION
¢ We know that
— X ~ N(32.2, 0.3)
— n=4 (a carton of four bottles)
— The central limit theorem →
\[ \bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right) \]
¢ Therefore:
\[ \bar{X} \sim N\left(32.2, \frac{0.3}{\sqrt{4}}\right) \]
SOLUTION
¢ Answer: There is about a 91% chance that the mean of the four bottles contains more than 32oz.
\[ Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_X / \sqrt{n}} = \frac{32 - 32.2}{0.3 / 2} = -1.33 \]
\[ P(\bar{X} > 32) = P(Z > -1.33) = 0.9082 \]
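Both parts of Example 9.1 can be checked without a table, assuming Python 3.8 or later for `statistics.NormalDist`. The variable names below are illustrative. Because the code does not round Z to two decimals, its results differ slightly from the table-based answers (roughly 0.7475 versus 0.7486 for one bottle, and 0.9088 versus 0.9082 for the carton).

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 32.2, 0.3  # population mean and standard deviation (ounces)

# Part (a): a single bottle, X ~ N(32.2, 0.3)
single_bottle = NormalDist(mu, sigma)
p_single = 1 - single_bottle.cdf(32)

# Part (b): the mean of n = 4 bottles, X-bar ~ N(32.2, 0.3 / sqrt(4))
n = 4
carton_mean = NormalDist(mu, sigma / sqrt(n))
p_carton = 1 - carton_mean.cdf(32)

print(f"P(X > 32)     = {p_single:.4f}")
print(f"P(X-bar > 32) = {p_carton:.4f}")
```

The only difference between the two parts is the standard deviation: the sample mean uses the standard error σ/√n, which is why the carton probability is higher.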
BACK TO HYPOTHESIS TESTING
EXAMPLE
¢ The Freshman 15 refers to the weight gained (in lbs) during a student's first year at college. A researcher set out to challenge this belief claiming that the weight gain is in fact much lower. Based on a sample of 100 students the researcher found an average weight gain of 13lbs. Assuming the standard deviation of weight gain is known to be 10lbs, what can the researcher conclude?
H0: µ ≥ 15 lbs [or µ = 15]
HA: µ < 15 lbs
TEST STATISTIC
¢ The first thing we need to do is to convert our sample mean into a Z-score. This Z-score is known as the test statistic:
\[ Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \]
¢ In our example:
\[ Z = \frac{13 - 15}{10 / \sqrt{100}} = -2 \]
P-VALUE
¢ The probability of finding a sample mean at least as extreme as the one we have found, assuming the null hypothesis is true, is called the p-value of the test.
¢ Our decision rule for whether or not to reject the null hypothesis will be based on this probability.
\[ P(\bar{X} < 13) \rightarrow P\left(Z < \frac{13 - 15}{10 / \sqrt{100}}\right) = P(Z < -2) = 0.0228 \]
THE SIGNIFICANCE OF THE TEST
¢ We compare the p-value to a desired significance level known as α (alpha).
¢ In statistics, when we say that a finding is ‘statistically significant’, it means that the finding is unlikely to have occurred by chance. Our level of significance is the maximum chance probability we are willing to tolerate.
¢ Hence, we form the following decision rule for our hypothesis testing context:
— For any p-value smaller than α we reject the null hypothesis.
IN OUR EXAMPLE
¢ If we choose α=0.05 then: P-value = 0.0228 < 0.05 = α
¢ Therefore we reject the null hypothesis and conclude that the average weight gain is likely lower than 15lbs.
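The Freshman 15 calculation can be reproduced with a small helper. This is a sketch under the slide's assumptions (known σ, lower-tail alternative); `lower_tail_z_test` is a hypothetical function name, not part of any library.

```python
from math import sqrt
from statistics import NormalDist

def lower_tail_z_test(sample_mean, mu0, sigma, n):
    """Return (z, p-value) for H0: mu >= mu0 versus H1: mu < mu0 with known sigma."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    p_value = NormalDist().cdf(z)  # area to the left of z under the standard normal
    return z, p_value

z, p_value = lower_tail_z_test(sample_mean=13, mu0=15, sigma=10, n=100)
alpha = 0.05
print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {p_value < alpha}")
```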
ANOTHER EXAMPLE (11.34)
¢ Spam email has become a serious and costly nuisance. An office manager believes that the average amount of time spent by office workers reading and deleting spam exceeds 25 minutes per day. To test this belief, he takes a random sample of 18 workers and measures the amount of time each spends reading and deleting spam. The results are listed below. If the population of times is normal with a standard deviation of 12 minutes, can the manager infer at the 1% significance level that he is correct?
35 48 29 44 17 21 32 28 34
23 13 9 11 30 42 37 43 48
INTERPRETATION
¢ “An office manager believes that the average amount of time spent by office workers reading and deleting spam exceeds 25 minutes per day”
— H0: µ ≤ 25
— H1: µ > 25
¢ “a random sample of 18 workers”
— n=18
¢ “measures the amount of time each spends reading and deleting spam”
— X is a random variable defined as the time spent reading and deleting spam
¢ “The results are listed below.”
— We can compute the sample mean as 30.22
¢ “If the population of times is normal with a standard deviation of 12 minutes”
— The hypothesized distribution is X~N(25,12)
¢ “can the manager infer at the 1% significance level that he is correct”
— α=0.01
STEP 1: COMPUTE THE TEST STATISTIC
¢ The test statistic is the Z-score corresponding to our sample mean
¢ We know that X~N(25,12) and therefore, based on the central limit theorem, we know that:
\[ \bar{X} \sim N\left(25, \frac{12}{\sqrt{18}}\right) \]
¢ Test statistic:
\[ Z = \frac{30.22 - 25}{12 / \sqrt{18}} = 1.845 \]
STEP 2: FIND THE P-VALUE
¢ This is an upper-tail test
— We are looking for the probability of obtaining a sample mean at least as high as the one we have found:
— P(Z > 1.85) = 1 – 0.9678 = 0.0322
¢ I round Z to 1.85 to be able to look it up in the table
STEP 3: DECISION RULE AND CONCLUSION
¢ We will reject the null hypothesis if the p-value is less than α.
— P-value = 0.0322 and the question states that α is 0.01
— Since p-value > α we do not reject the null hypothesis.
¢ By not rejecting the null hypothesis we conclude that there is not enough evidence, at the 1% level, that the time spent on spam exceeds 25 minutes.
¢ But what if we chose α to be 5%?
— In this case p-value < α and we reverse our conclusion and reject the null hypothesis
¢ It is not uncommon to change our conclusion as α changes.
— The p-value is the smallest significance level at which we would reject the null hypothesis.
— In the above example, our test would be significant at the 5% level but not at the 1% level.
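The three steps for the spam example can be run directly on the 18 recorded times. This is a minimal sketch assuming, as the question does, a normal population with a known standard deviation of 12; the variable names are illustrative. The p-value comes out slightly above the table-based 0.0322 because Z is not rounded to 1.85.

```python
from math import sqrt
from statistics import NormalDist, fmean

# Minutes spent on spam by the 18 sampled workers
times = [35, 48, 29, 44, 17, 21, 32, 28, 34,
         23, 13, 9, 11, 30, 42, 37, 43, 48]
mu0, sigma = 25, 12  # hypothesized mean and known population standard deviation

x_bar = fmean(times)
z = (x_bar - mu0) / (sigma / sqrt(len(times)))
p_value = 1 - NormalDist().cdf(z)  # upper-tail test: area to the right of z

print(f"sample mean = {x_bar:.2f}, z = {z:.3f}, p-value = {p_value:.4f}")
for alpha in (0.01, 0.05):
    decision = "reject H0" if p_value < alpha else "do not reject H0"
    print(f"alpha = {alpha}: {decision}")
```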
IN RESEARCH PAPERS
A TWO-TAIL TEST EXAMPLE
¢ A statistics professor wishes to determine if her section’s grade average is different than that of other sections of the course. The average for all other sections is 75 with a standard deviation of 4. The professor collected test scores from 25 students and found the sample average was 72. She wishes to conduct a hypothesis test at the 5% level of significance (that is, α=0.05).
STEP 1: HYPOTHESES
¢ This is a two tail test in which we would reject the null hypothesis if the sample mean is significantly smaller or significantly larger than the hypothesized mean.
H0: µ = 75
HA: µ ≠ 75
STEP 2: COMPUTE THE TEST STATISTIC
¢ Let X represent statistics grades.
— Assuming the null hypothesis to be true, we believe that X is normally distributed with a mean of 75 and a standard deviation of 4.
— We also know that the sampling distribution of X-bar (the average grade) is:
\[ \bar{X} \sim N\left(75, \frac{4}{\sqrt{25}}\right) \]
¢ To obtain the test statistic, we convert our sample mean of 72 to a Z score using the hypothesized distribution:
\[ Z = \frac{72 - 75}{4 / \sqrt{25}} = -3.75 \]
STEP 3: P-VALUE
¢ What is the probability of finding a test statistic as extreme as the one we have found?
— P(Z < -3.75) = 0.00009
¢ Because this is a two-tail test, this probability needs to be doubled before we compare it to α
— p-value = 2*P(Z ≥ |test statistic|) = 2*0.00009 = 0.00018
STEP 4: DECISION RULE AND CONCLUSION
¢ Because our p-value of ‘0.00018’ is less than α (0.05) we reject the null hypothesis
¢ At α=0.05 we can reject the null hypothesis and conclude that the average grade is different than 75.
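For a two-tail test the only change is that the tail area is doubled. The sketch below applies this to the grades example, using the standard normal distribution from Python's standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

sample_mean, mu0, sigma, n = 72, 75, 4, 25
alpha = 0.05

z = (sample_mean - mu0) / (sigma / sqrt(n))
# Two-tail p-value: twice the area beyond |z| in one tail
p_value = 2 * NormalDist().cdf(-abs(z))

print(f"z = {z:.2f}, p-value = {p_value:.5f}, reject H0: {p_value < alpha}")
```

Using `-abs(z)` means the same line works whether the sample mean falls below or above the hypothesized mean.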
KEY TERMS
¢ A hypothesis is a claim we wish to test
¢ The sampling distribution of the mean is created by taking repeated samples of size n from a given population, computing the mean for each sample, and charting the distribution of these means.
— We rely on the central limit theorem to know the sampling distribution of the mean in most cases (see chapter 9.1 for a review)
¢ The test statistic is the value we use to conduct our test. It is a Z-score calculated based on the sample statistic and the hypothesized distribution
¢ The p-value is the probability of finding a test statistic at least as extreme (either as high or as low) as the one we have found
ANOTHER APPROACH (SIMILAR BUT NOT EXACTLY THE SAME)
¢ A baseball player claims that the average speed of his fastball pitch is greater than that of his rival, who averages 90 mph. He collects data on the speed of 100 fastballs and finds his average speed is 90.8 mph. Assuming that we know the standard deviation of a fastball pitch to be 3.85 mph, what can we conclude about the pitcher’s claim?
STEP 1: HYPOTHESES
¢ The pitcher would like to prove that his fastball has a higher speed than that of his rival. Therefore, the null hypothesis is that it is not higher (i.e. it is the same speed or lower) and the alternative is that it is higher: — H0: µ ≤ 90 — H1: µ > 90
¢ with µ being the hypothesized average pitch speed.
STEP 2: COMPUTE THE TEST STATISTIC
¢ Let X represent pitching speed (in mph).
— Assuming the null hypothesis to be true, we believe that X is normally distributed with a mean of 90mph and a standard deviation of 3.85mph.
— We also know that the sampling distribution of X-bar (the average pitching speed) is:
\[ \bar{X} \sim N\left(90, \frac{3.85}{\sqrt{100}}\right) \]
¢ To obtain the test statistic, we convert our sample mean of 90.8 to a Z score using the hypothesized distribution:
\[ Z = \frac{90.8 - 90}{3.85 / \sqrt{100}} = 2.078 \]
STEP 3: FIND THE CRITICAL VALUE
¢ To determine the critical value, we need to know the desired significance level (α).
— Let us assume that we would like to conduct the test at the 1% level of significance: α=0.01.
— The critical value is computed based on α, such that P(Z ≥ |critical value|) = α.
— In our example: use the normal distribution table to find the critical value Z*, such that: P(Z ≥ Z*) = 0.01.
¢ Note that we are using the ‘greater than’ relationship here since this is an upper-tail test
— our critical value should be at the upper tail of the distribution.
¢ You’ll see that our critical value Z* is 2.326.
GRAPHICALLY
¢ The shaded blue area is called the rejection region.
— If our test statistic falls within this region we would reject the null hypothesis
STEP 4: COMPARE THE TWO VALUES
¢ Our test statistic does not fall within the rejection region and therefore we do not reject the null hypothesis.
— Test statistic 2.078 < 2.326 critical value
— Do not reject the null hypothesis
STEP 5: CONCLUSION
¢ We were not able to reject the null hypothesis. At the 1% level of significance, there is not enough evidence to support the claim that the pitcher’s fastballs are faster than his rival’s.
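The critical value approach can be sketched as follows, with `inv_cdf` playing the role of the reverse table lookup. The helper `critical_value_test` and its `tail` argument are hypothetical names introduced for illustration; the same helper also handles lower-tail and two-tail tests.

```python
from math import sqrt
from statistics import NormalDist

def critical_value_test(z, alpha, tail):
    """Return (critical value, reject H0?) for a Z test with known sigma."""
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "upper":
        critical = std_normal.inv_cdf(1 - alpha)
        return critical, z > critical
    if tail == "lower":
        critical = std_normal.inv_cdf(alpha)
        return critical, z < critical
    if tail == "two":
        critical = std_normal.inv_cdf(1 - alpha / 2)
        return critical, abs(z) > critical
    raise ValueError("tail must be 'upper', 'lower', or 'two'")

# Pitcher example: upper-tail test at the 1% level
z = (90.8 - 90) / (3.85 / sqrt(100))
critical, reject = critical_value_test(z, alpha=0.01, tail="upper")
print(f"z = {z:.3f}, critical value = {critical:.3f}, reject H0: {reject}")
```

For a two-tail test the helper returns the positive critical value and compares it with the absolute value of the test statistic, which is equivalent to checking both rejection regions.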
A LOWER TAIL TEST EXAMPLE
¢ The director of a state agency claims that the average starting salary for clerical employees in the state is less than $30,000 per year. To test this claim, she has collected a random sample of 100 starting salaries of clerks from across the state and found that the sample mean is $29,570.
— State the null and alternative hypotheses
— Conduct the test, assuming the population standard deviation is known to be $2,500 and the significance level for the test is 0.05
STEP 1: HYPOTHESES
¢ This is a lower tail test in which we would only reject the null hypothesis if the sample mean is significantly smaller than the hypothesized mean.
H0: µ ≥ $30,000
HA: µ < $30,000
[Figure: one-tail test (lower tail): standard normal curve, Z axis from -4 to 4, with the lower tail rejection region marked]
STEP 2: COMPUTE THE TEST STATISTIC
¢ Let X represent starting salaries ($).
— Assuming the null hypothesis to be true, we believe that X is normally distributed with a mean of $30,000 and a standard deviation of $2,500.
— We also know that the sampling distribution of X-bar (the average starting salary) is:
\[ \bar{X} \sim N\left(30000, \frac{2500}{\sqrt{100}}\right) \]
¢ To obtain the test statistic, we convert our sample mean of $29,570 to a Z score using the hypothesized distribution:
\[ Z = \frac{29570 - 30000}{2500 / \sqrt{100}} = -1.72 \]
STEP 3: CRITICAL VALUE
¢ The question defines the desired significance level for the test as 0.05
¢ Looking for the critical value Z* such that P(Z < Z*) = 0.05 (lower rejection region only)
¢ From the table we can find that Z* = -1.645
STEP 4: COMPARE THE TWO VALUES
\[ Z_{statistic} = -1.72 < -1.645 = Z_{critical} \]
STEP 5: CONCLUSION
¢ The computed Z statistic is more extreme than our critical Z value and therefore falls in the rejection region. At α=0.05 we can reject the null hypothesis and conclude that the average salary is less than $30,000.
A TWO-TAIL TEST EXAMPLE
¢ A statistics professor wishes to determine if her section’s grade average is different than that of other sections of the course. The average for all other sections is 75 with a standard deviation of 4. The professor collected test scores from 25 students and found the sample average was 72. She wishes to conduct a hypothesis test at the 5% level of significance (that is, α=0.05).
STEP 1: HYPOTHESES
¢ This is a two tail test in which we would reject the null hypothesis if the sample mean is significantly smaller or significantly larger than the hypothesized mean.
H0: µ = 75
HA: µ ≠ 75
STEP 2: COMPUTE THE TEST STATISTIC
¢ Let X represent statistics grades.
— Assuming the null hypothesis to be true, we believe that X is normally distributed with a mean of 75 and a standard deviation of 4.
— We also know that the sampling distribution of X-bar (the average grade) is:
\[ \bar{X} \sim N\left(75, \frac{4}{\sqrt{25}}\right) \]
¢ To obtain the test statistic, we convert our sample mean of 72 to a Z score using the hypothesized distribution:
\[ Z = \frac{72 - 75}{4 / \sqrt{25}} = -3.75 \]
STEP 3: CRITICAL VALUE
¢ Because this is a two tail test we should have two rejection regions
— At both tails of the distribution
¢ This means that α should be split in half for each of the tails:
— The question defines the desired significance level at 0.05
— Looking for the critical value Z* such that P(Z < Z*) = 0.025 (this is the lower tail)
— From the table we can find that Z* = -1.96 (and, by symmetry, +1.96 for the upper tail)
STEP 4: COMPARE THE TWO VALUES
\[ Z_{statistic} = -3.75 < -1.96 = Z_{critical} \]
STEP 5: CONCLUSION
¢ The computed Z statistic is more extreme than our critical Z value at the lower tail, and therefore falls in the rejection region.
¢ At α=0.05 we can reject the null hypothesis and conclude that the average grade is different than 75.
TYPE I AND TYPE II ERRORS Still Chapter 11
STARTING WITH AN EXAMPLE
¢ It has been claimed the average wireless phone customer receives no more than 8 spam text messages each day, with a standard deviation of 0.8. To test this claim (at the 5% significance level), you collect data from 60 wireless customers and record the number of spam messages received during a single day. Your sample average is 8.2.
¢ Step 1: Hypotheses
— H0: µ ≤ 8
— H1: µ > 8
¢ Step 2: critical value
— This is an upper tail test, α is 0.05, therefore our critical value is 1.645
CONT’D
¢ Step 3: test statistic
\[ Z = \frac{8.2 - 8}{0.8 / \sqrt{60}} = 1.94 \]
¢ Steps 4 and 5:
— Testing at the 5% level of significance, we compare the test statistic to our critical value of 1.645 and conclude that the null hypothesis should be rejected, because 1.94>1.645. In other words, we conclude that the average customer receives more than 8 spam text messages per day
TYPE I AND TYPE II ERRORS
¢ When testing hypotheses we can make two kinds of errors:
— Reject a true null hypothesis
¢ Concluding that the mean number of spam messages is more than 8 when in fact it is not
¢ This is called a type I error
— Not reject a false null hypothesis
¢ Concluding that the mean number of spam messages is not more than 8 when in fact it is
¢ This is called a type II error
ERROR SUMMARY
Hypothesis testing indicates that you should | In actuality: H0 is true (“do not reject”) | In actuality: H0 is false (“reject”)
Not reject H0 | Correct decision | Type II error (prob. β)
Reject H0 | Type I error (prob. α) | Correct decision
TYPE I ERROR IN OUR EXAMPLE
¢ We assumed the null hypothesis to be true and computed the probability of finding the sample mean that we observed.
— In our example the probability of finding a sample of size 60 with a mean of at least 8.2 spam messages is: P(Z ≥ 1.94)=0.026, which is quite low.
— Based on our significance level, α, we concluded that we should reject the null hypothesis: we believe that it is false.
¢ Assume that the null hypothesis is, in fact, true.
— By rejecting a true null hypothesis we have committed a type I error.
— Because our rejection rule is based on α (the level of significance), α is also the probability of committing a type I error.
THE TRADE-OFF
¢ If we wish to reduce the chance of committing a type I error, we need to use smaller values of α.
¢ Why not use the lowest α possible?
— Because we also have to consider the probability of committing a type II error (not rejecting a false null hypothesis).
— This probability is called β, and it is inversely related to α: for a fixed sample size, lowering α raises β
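The trade-off can be made concrete with the spam text example (µ0 = 8, σ = 0.8, n = 60). Computing β requires a specific true mean, which the slides do not give, so the value 8.3 below is purely an assumption for illustration; `type_ii_error` is likewise a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(alpha, mu0, mu_true, sigma, n):
    """Beta for an upper-tail Z test of H0: mu <= mu0 when the true mean is mu_true."""
    se = sigma / sqrt(n)  # standard error of the mean
    # Largest sample mean for which we still do NOT reject H0
    cutoff = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # Probability that the sample mean falls below the cutoff under the true mean
    return NormalDist(mu_true, se).cdf(cutoff)

# Assumed true mean of 8.3 messages per day (not from the slides)
betas = {
    alpha: type_ii_error(alpha, mu0=8, mu_true=8.3, sigma=0.8, n=60)
    for alpha in (0.10, 0.05, 0.01)
}
for alpha, beta in betas.items():
    print(f"alpha = {alpha:.2f} -> beta = {beta:.3f}")
```

Under this assumed true mean, β grows as α shrinks, which is the trade-off the slide describes.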
SUMMARY
¢ We covered Chapter 11 as well as section 9.1
— Logic and process of hypothesis testing
¢ Next class – Quiz
¢ Next topics: chapter 10 (confidence intervals) and 12 (other single population tests)
For the project you should identify three research questions that can be addressed through different hypothesis testing procedures. Your submission should include the following components.
1. Introduction: Briefly describe (in words) each of the three research questions and your motivation for studying them (not more than two pages). For each test you should state your null hypothesis and alternative hypothesis.
2. Hypothesis tests: for each of the three tests clearly discuss: (1) the data used to conduct the test, (2) any assumption that you need to make in order to conduct the test including support for making these assumptions, (3) the test and its results, (4) the statistical significance of your findings.
3. Summary: summarize your findings from the analysis. What do you conclude about the hypotheses you raised? Identify key limitations of your analysis (including any limitations of the data set), contributions that you believe your analysis makes to the readers, and any interesting directions for future analysis (not more than three pages).
A sample project is shown below:
Introduction: The Doll Computer Company makes its own computers and delivers them directly to customers who order them via the Internet.
To achieve its objective of speed, Doll makes each of its five most popular computers and transports them to warehouses from which it generally takes 1 day to deliver a computer to the customer.
This strategy requires high levels of inventory that add considerably to the cost.
To lower these costs the operations manager wants to use an inventory model. He notes demand during lead time is normally distributed and he needs to know the mean to compute the optimum inventory level.
He observes 25 lead time periods and records the demand during each period.
The manager would like a 95% confidence interval estimate of the mean demand during lead time. Assume that the manager knows that the standard deviation is 75 computers
Hypothesis Tests:
Demand during the 25 observed lead time periods is shown below:
235 374 309 499 253
421 361 514 462 369
394 439 348 344 330
261 374 302 466 535
386 316 296 332 334
Interpret the question
· X represents demand
a. X~N (µ,75)
· σx=75 (standard deviation)
· n=25
· We would like to create a 95% confidence interval for the mean demand based on our sample of 25 lead time periods
· What do we need to know?
\[ \bar{x} \pm Z_{\alpha/2} \frac{\sigma_x}{\sqrt{n}} \]
· Compute the sample mean (X-bar) = 370.16
· Finding Zα/2
a. Looking for Zα/2 such that:
b. P(-Zα/2<Z< Zα/2)=0.95
c. Easiest:
Look for the lower tail probability in the normal table
P(Z<-Z α/2)=0.025
d. Therefore:
Zα/2=1.96
The computation is not completed here, but your write-up should follow the format shown above.
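One way to finish the interval is sketched below, using the 25 demand values and the known standard deviation of 75 from the sample project. The variable names are illustrative, and `inv_cdf` replaces the table lookup for Zα/2.

```python
from math import sqrt
from statistics import NormalDist, fmean

# Demand during each of the 25 observed lead time periods
demand = [235, 374, 309, 499, 253,
          421, 361, 514, 462, 369,
          394, 439, 348, 344, 330,
          261, 374, 302, 466, 535,
          386, 316, 296, 332, 334]
sigma = 75          # known population standard deviation
confidence = 0.95

x_bar = fmean(demand)
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # Z_{alpha/2}, about 1.96
margin = z * sigma / sqrt(len(demand))
lower, upper = x_bar - margin, x_bar + margin

print(f"sample mean = {x_bar:.2f}")
print(f"95% confidence interval: ({lower:.2f}, {upper:.2f})")
```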
