INTRODUCTION TO HYPOTHESIS TESTING Chapters 9 and 11

Dorit Nevo, 2013

INFERENTIAL STATISTICS

¢  We often use statistics to test theories.

—  Theory: a prediction, or a group of predictions, about how people, physical entities, and built devices behave.

¢  Theories begin as predictions, which are then repeatedly tested in various settings to either strengthen or refute them.

—  Such testing often involves statistical inference, defined as the drawing of conclusions about a population of interest based on findings from samples obtained from that population.

¢  We will cover:

—  Hypothesis testing

—  Analysis of Variance (ANOVA)

—  Regression Analysis

HYPOTHESIS TESTING

¢ Statistics is very much about expectations. We aim to test specific expectations that we have about the population’s parameters using sample statistics. We call these expectations hypotheses.

—  A hypothesis is some specific claim that we wish to test.

¢ We study the probability of our sample’s outcome given the hypothesized distribution of the population.

We believe that     Our sample mean is     Possible?
X~N(10,2)           12                     Probably yes
X~N(10,2)           20                     Probably not
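The "probably yes" and "probably not" judgments in the table can be made concrete with a short calculation. The sketch below is only an illustration: it treats the observed value as a single draw from the hypothesized N(10, 2) distribution (the slide does not give a sample size, so this is an assumption), and the helper name `upper_tail` is made up for this example.

```python
from statistics import NormalDist

# Hypothesized population from the table: mean 10, standard deviation 2.
population = NormalDist(mu=10, sigma=2)

def upper_tail(value):
    """P(X >= value) under the hypothesized distribution."""
    return 1 - population.cdf(value)

p_12 = upper_tail(12)  # 12 is one standard deviation above the mean
p_20 = upper_tail(20)  # 20 is five standard deviations above the mean

print(f"P(X >= 12) = {p_12:.4f}")
print(f"P(X >= 20) = {p_20:.2e}")
```

A value of 12 is unremarkable under this distribution, while a value of 20 would be extremely rare, which is the intuition behind the table.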

HYPOTHESES

¢ We differentiate between the research hypothesis and the null hypothesis.

—  Example: A bank manager argues that, on average, people carry $50 or more in their wallet. This claim is the null hypothesis. The research hypothesis contains the other side of this claim, that is, that people carry less than $50. We can also write it as:

¢  H0: Average amount of money ≥ $50

¢  H1: Average amount of money < $50

—  where H0 denotes the null hypothesis and H1 (sometimes written HA) denotes the research hypothesis, commonly referred to as the alternative hypothesis.

HYPOTHESES

¢  Researchers tell you that, on average, people have 200 or fewer friends on Facebook. However, you believe that Facebook users, in fact, have more than 200 friends. You can set your hypotheses as: —  H0: Average number of friends ≤ 200 —  H1: Average number of friends > 200

¢  A statistics professor wants to know if her section’s grade average is different than that of other sections of the course. The average for all other sections is ‘75’. To help the professor learn if her section’s grade average is different than that of other sections, we need to set up the following hypotheses: —  H0: Section’s grade average = 75 —  H1: Section’s grade average ≠ 75

THREE FORMS OF HYPOTHESES

Lower-Tail Test

H0: Average amount of money ≥ $50

H1: Average amount of money < $50

Two-Tail Test

H0: Section average = 75

H1: Section average ≠ 75

Upper-Tail Test

H0: Average number of friends ≤ 200

H1: Average number of friends > 200

[Figure: normal curve, horizontal axis from -4 to 4]

The ‘equal’ sign (=, ≥, ≤) always goes in the null hypothesis.

A FINAL NOTE ON HYPOTHESES

¢  Hypotheses must always be (1) exhaustive and (2) mutually exclusive.

—  Exhaustive means that the hypotheses should cover every possible option.

—  Mutually exclusive means there is no overlap between hypotheses. The following hypotheses are, therefore, incorrect because the ‘= 20’ option is in both the null and alternative hypotheses:

¢  H0: Average weeks of job seeking ≥ 20

¢  H1: Average weeks of job seeking ≤ 20

Not Exhaustive

H0: Average weeks of job seeking = 20

H1: Average weeks of job seeking > 20

Exhaustive

H0: Average weeks of job seeking ≤ 20

H1: Average weeks of job seeking > 20

WRITE THE HYPOTHESES

¢  Angry Birds is one of the most popular games on various platforms. According to one source, people collectively spend 300 million minutes per day playing Angry Birds and it supposedly costs the US economy billions of dollars in lost work time. Suppose you believe this number (300 million minutes) is too high. You conduct some more research to find out that there are roughly 30 million daily active users of Angry Birds, which means that on average each player spends (300 million minutes/30 million users) 10 minutes per day playing Angry Birds, according to the original claim. You plan to use a sample of students at your school to test this claim and need to set up your hypotheses first.


HYPOTHESES

H0: Average time playing (µ) ≥ 10

H1: Average time playing (µ) < 10

MORE PRACTICE


¢  Using the table as your guide for the three types of tests (lower-tail, upper-tail, and two-tail), write the hypotheses for the following tests:

—  A two-tail test for the average number of sleep hours

—  An upper-tail test for the average hours spent grooming

—  A lower-tail test for the average hours spent on educational activities

—  A two-tail test for the average hours spent eating and drinking

—  A two-tail test for the average hours spent on leisure and sports

—  An upper-tail test for the average hours spent working

Activity                   Hours
Sleeping                   8.4
Leisure and Sports         3.6
Educational Activities     3.4
Working                    3.0
Other                      2.2
Traveling                  1.5
Eating and Drinking        1.1
Grooming                   0.8
Total                      24.0

EXAMPLE

¢  Suppose that we are considering an investment in a used textbook store and are interested in determining the amount of money that customers are likely to spend when shopping at the store.

¢  Based on historical data provided by the current owner, the average customer making purchases at the store spent $100 with a standard deviation of $35.

¢  We are concerned that sales may have decreased over the last year. We hypothesize that: —  H0: Average spending (µ) ≥ 100 —  H1: Average spending (µ) < 100

¢  With µ being the hypothesized average spending (the parameter of interest) for the population (all customers who have made purchases at the store).


WE TAKE A SAMPLE

¢ Data from twenty random customers shows that the mean spending in our sample is ‘$93.27’.

—  Our sample mean is indeed lower than the hypothesized mean, but the difference could simply be due to sampling error.

—  Should we reject the null hypothesis based on this one sample?

DECISION RULE

¢ The question we need to ask ourselves is:

—  How likely is it that a random sample of size n=20, taken from a population with a mean of $100 and a standard deviation of $35, yields a sample mean of ‘$93.27’?

—  If it is reasonably likely, then we have no reason to reject the null hypothesis.

—  If the chances of finding this sample mean are extremely low, then we should conclude that the null hypothesis is likely false and we should reject it, believing that the true average spending is, in fact, lower than $100.

¢ So, just how likely is it that we could obtain a sample mean of ‘$93.27’ given a population mean of $100 and standard deviation of $35?

To answer this question we need to know:

THE SAMPLING DISTRIBUTION OF THE MEAN

SAMPLING DISTRIBUTIONS

¢ A sampling distribution refers to the probability distribution of any sample statistic. The sampling distribution of the mean describes the probabilities attached to all values of the mean of samples that are repeatedly taken from the same population.

¢ Let us continue with our used textbook store example to better understand this concept.


SAMPLING DISTRIBUTION

¢ Recall that we took a random sample of twenty customers (i.e., n=20), and measured how much each of the customers spent. We then computed the mean spending for that sample to be ‘$93.27’.

¢ Let us repeat this sampling procedure 200 times, each time taking a different sample of twenty customers and computing the mean spending for each sample.

Sample (n=20)     Sample Mean (X-bar)
1                 $93.27
2                 $104.34
3                 $100.49
…                 …
199               $96.65
200               $111.65

Average over 200 samples: $99.04

Standard Deviation: $11.14
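The repeated-sampling procedure can be imitated with a small simulation. This is a sketch on simulated data, not the store's actual records: it assumes spending is normally distributed with mean $100 and standard deviation $35, and the seed and helper name `sample_mean` are arbitrary choices for the illustration. The printed numbers will therefore differ from the table above.

```python
import random
from statistics import mean, stdev

random.seed(2013)  # fixed seed so the run is repeatable

MU, SIGMA, N, REPEATS = 100, 35, 20, 200

def sample_mean():
    """Draw N simulated customers and return their mean spending."""
    return mean(random.gauss(MU, SIGMA) for _ in range(N))

# Repeat the sampling procedure and keep the mean of each sample.
sample_means = [sample_mean() for _ in range(REPEATS)]

avg_of_means = mean(sample_means)
sd_of_means = stdev(sample_means)

print(f"Average over {REPEATS} samples: {avg_of_means:.2f}")
print(f"Standard deviation of the sample means: {sd_of_means:.2f}")
```

The sample means cluster around the population mean, and they vary much less than individual purchases do.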

GRAPHICALLY

[Figure: two histograms of sample means (vertical axis: Frequency), one for 200 samples of size 20 and one for 1,000 samples of size 20]

CENTRAL LIMIT THEOREM (PART I)

¢ The central limit theorem states that the sampling distribution of the mean of a random sample of any size (n) drawn from a normally distributed population also follows a normal distribution with a mean of µ and a standard deviation of $\sigma/\sqrt{n}$.

¢ This standard deviation ($\sigma/\sqrt{n}$) is called the standard error of the mean.

—  In our example we thus know that:

$\bar{X} \sim N\left(100,\ \frac{35}{\sqrt{20}}\right)$
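With the sampling distribution in hand, the earlier question (how likely is a sample mean of $93.27 or lower?) can be answered numerically. The following is a minimal sketch using Python's standard library in place of a normal table; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 100, 35, 20

# Standard error of the mean: sigma / sqrt(n)
standard_error = sigma / sqrt(n)

# Sampling distribution of the sample mean under the null hypothesis
sampling_dist = NormalDist(mu, standard_error)

# Probability of a sample mean of $93.27 or lower
p_at_most_observed = sampling_dist.cdf(93.27)

print(f"Standard error: {standard_error:.2f}")
print(f"P(sample mean <= 93.27) = {p_at_most_observed:.3f}")
```

A probability of roughly one in five is not unusually small, so a sample mean of $93.27 is quite plausible under the null hypothesis.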

BUT WHAT ABOUT NON-NORMAL POPULATIONS?

¢ Consider this dice-rolling example:

—  When we roll a single six-sided die, we can obtain any one of the values from ‘1’ to ‘6’, each with a probability of ‘1/6’. Defining a random variable as ‘the outcome of a single die roll’, this variable follows a discrete uniform distribution.

[Figure: probability distribution of a single die roll; horizontal axis: outcomes 1 to 6; vertical axis: probability]

¢  Instead of rolling a single die, we will roll two dice. And, instead of looking at the outcome on the individual die, we will define our random variable as ‘the average of the values obtained from two rolled dice’.

—  For example, if one die shows ‘3’ and the other shows ‘5’, then we will record the outcome of this roll as ‘4’.

¢ Exercise:

—  List all possible values that the random variable can obtain along with their probabilities
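One way to check an answer to this exercise is to enumerate all 36 equally likely outcomes of two dice. The sketch below does that with exact fractions; it is offered as a checking aid under the usual fair-dice assumption, and the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely (die1, die2) pairs give each average.
counts = Counter(
    Fraction(a + b, 2) for a, b in product(range(1, 7), repeat=2)
)

# Convert counts to exact probabilities, ordered by the value of the average.
distribution = {avg: Fraction(c, 36) for avg, c in sorted(counts.items())}

for avg, prob in distribution.items():
    print(f"{float(avg):>4}: {prob}")
```

The probabilities rise toward the middle value of 3.5 and fall off symmetrically, which already looks less flat than the single-die distribution.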


GRAPHICALLY

[Figure: probability distribution of the average of two dice; horizontal axis: averages from 1 to 6 in steps of 0.5; vertical axis: probability]

AS WE INCREASE THE NUMBER OF DICE ROLLED (10 IN THE GRAPH BELOW)…

[Figure: probability distribution of the average of ten dice; horizontal axis: averages from 1 to 6; vertical axis: probability]

CENTRAL LIMIT THEOREM (PART II)

¢  Even when the population is not normally distributed, the mean will follow an approximately normal distribution for large enough samples (n≥30), with a mean of µ and a standard deviation of $\sigma/\sqrt{n}$.

PUTTING IT ALL TOGETHER


The mean of a sample taken from a normally distributed population (e.g., the spending example) will follow a normal distribution, with the mean equal to the population mean and a standard deviation equal to the population standard deviation divided by the square root of the sample size (hence the distribution is narrower).

The mean of a sufficiently large sample taken from a non-normally distributed population (e.g., the dice example) will follow an approximately normal distribution, with the mean equal to the population mean and a standard deviation equal to the population standard deviation divided by the square root of the sample size.

Using statistical notation:

$\bar{X} \sim N\left(\mu,\ \frac{\sigma}{\sqrt{n}}\right)$

EXAMPLE 9.1(A)

¢ The foreman of a bottling plant has observed that the amount of soda in each “32-ounce” bottle is a normally distributed random variable, with a mean of 32.2 ounces and a standard deviation of 0.3 ounce.

—  If a customer buys one bottle, what is the probability that the bottle will contain more than 32 ounces?

¢ Answer:

—  We know that X~N(32.2,0.3)

—  Using the standard normal table:

$P(X > 32) = P\left(\frac{X-\mu}{\sigma} > \frac{32-32.2}{0.3}\right) = P(Z > -0.67) = 1 - 0.2514 = 0.7486$

EXAMPLE 9.1(B)

¢  If a customer buys a carton of four bottles, what is the probability that the mean amount of the four bottles will be greater than 32 ounces?

—  We are no longer asked about X. Instead we are asked about the mean of X:

$P(\bar{X} > 32) = ?$

INTERPRET THE QUESTION

¢ We know that

—  X ~ N(32.2 , 0.3)

—  n=4 (a carton of four bottles)

—  The central limit theorem gives: $\bar{X} \sim N\left(\mu,\ \frac{\sigma}{\sqrt{n}}\right)$

¢ Therefore:

$\bar{X} \sim N\left(32.2,\ \frac{0.3}{\sqrt{4}}\right)$

SOLUTION

$Z = \frac{\bar{X} - \mu_X}{\sigma_X/\sqrt{n}} = \frac{32 - 32.2}{0.3/2} = -1.33$

$P(\bar{X} > 32) = P(Z > -1.33) = 0.9082$

¢ Answer: There is about a 91% chance that the mean amount in the four bottles is more than 32 oz.
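Both parts of Example 9.1 can be reproduced without a normal table. The sketch below uses the standard library's `NormalDist`; the variable names are illustrative. Because it does not round Z to two decimals, its results differ slightly from the table-based answers (0.7486 and 0.9082).

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 32.2, 0.3

# Part (a): a single bottle, X ~ N(32.2, 0.3)
single_bottle = NormalDist(mu, sigma)
p_single = 1 - single_bottle.cdf(32)

# Part (b): mean of a carton of four, X-bar ~ N(32.2, 0.3 / sqrt(4))
carton_mean = NormalDist(mu, sigma / sqrt(4))
p_carton = 1 - carton_mean.cdf(32)

print(f"P(X > 32)     = {p_single:.4f}")
print(f"P(X-bar > 32) = {p_carton:.4f}")
```

The carton mean is more likely to exceed 32 ounces than a single bottle because the sampling distribution of the mean is narrower.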

BACK TO HYPOTHESIS TESTING


EXAMPLE

¢ The Freshman 15 refers to the weight gained (in lbs) during a student's first year at college. A researcher set out to challenge this belief claiming that the weight gain is in fact much lower. Based on a sample of 100 students the researcher found an average weight gain of 13lbs. Assuming the standard deviation of weight gain is known to be 10lbs, what can the researcher conclude?

H0: µ ≥ 15 lbs [or µ = 15]

HA: µ < 15 lbs

TEST STATISTIC

¢ The first thing we need to do is to convert our sample mean into a Z-score. This Z-score is known as the test statistic:

$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$

¢  In our example:

$Z = \frac{13 - 15}{10/\sqrt{100}} = -2$

P-VALUE

¢ The probability of finding a sample mean at least as extreme as the one we have found, assuming the null hypothesis is true, is called the p-value of the test.

¢ Our decision rule for whether or not to reject the null hypothesis will be based on this probability.

$P(\bar{X} < 13) \rightarrow P\left(Z < \frac{13 - 15}{10/\sqrt{100}}\right) = P(Z < -2) = 0.0228$

THE SIGNIFICANCE OF THE TEST

¢ We compare the p-value to a desired significance level known as α (alpha).

¢  In statistics, when we say that a finding is ‘statistically significant’, it means that the finding is unlikely to have occurred by chance. Our level of significance is the maximum chance probability we are willing to tolerate.

¢ Hence, we form the following decision rule for our hypothesis testing context:

—  For any p-value smaller than α we reject the null hypothesis.

IN OUR EXAMPLE

¢  If we choose α=0.05 then: P-value = 0.0228 < 0.05 = α

¢ Therefore we reject the null hypothesis and conclude that the average weight gain is likely lower than 15lbs.
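The Freshman 15 test can be retraced in a few lines. This is a minimal sketch of a lower-tail Z test with a known population standard deviation; the function name `z_test_lower_tail` is made up for this illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_test_lower_tail(sample_mean, mu0, sigma, n):
    """Return (test statistic, p-value) for H0: mu >= mu0 vs. HA: mu < mu0."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    p_value = NormalDist().cdf(z)  # area to the left of z
    return z, p_value

z, p_value = z_test_lower_tail(sample_mean=13, mu0=15, sigma=10, n=100)

alpha = 0.05
reject = p_value < alpha

print(f"Z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {reject}")
```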


ANOTHER EXAMPLE (11.34)

¢ Spam email has become a serious and costly nuisance. An office manager believes that the average amount of time spent by office workers reading and deleting spam exceeds 25 minutes per day. To test this belief, he takes a random sample of 18 workers and measures the amount of time each spends reading and deleting spam. The results are listed below. If the population of times is normal with a standard deviation of 12 minutes, can the manager infer at the 1% significance level that he is correct?

35 48 29 44 17 21 32 28 34

23 13 9 11 30 42 37 43 48

INTERPRETATION

¢  “An office manager believes that the average amount of time spent by office workers reading and deleting spam exceeds 25 minutes per day”

—  H0: µ ≤ 25

—  H1: µ > 25

¢  “a random sample of 18 workers”

—  n=18

¢  “measures the amount of time each spends reading and deleting spam”

—  X is a random variable defined as the time spent reading and deleting spam

¢  “The results are listed below.”

—  We can compute the sample mean as 30.22

¢  “If the population of times is normal with a standard deviation of 12 minutes”

—  The hypothesized distribution is X~N(25,12)

¢  “can the manager infer at the 1% significance level that he is correct”

—  α=0.01

STEP 1: COMPUTE THE TEST STATISTIC

¢ The test statistic is the Z-score corresponding to our sample mean

¢ We know that X~N(25,12) and therefore, based on the central limit theorem, we know that:

$\bar{X} \sim N\left(25,\ \frac{12}{\sqrt{18}}\right)$

¢ Test statistic:

$Z = \frac{30.22 - 25}{12/\sqrt{18}} = 1.845$

STEP 2: FIND THE P-VALUE

¢ This is an upper-tail test

—  We are looking for the probability of obtaining a sample mean at least as high as the one we have found:

—  P(Z > 1.85) = 1 – 0.9678 = 0.0322

¢  I round Z to 1.85 to be able to look it up in the table

STEP 3: DECISION RULE AND CONCLUSION

¢  We will reject the null hypothesis if the p-value is less than α.

—  P-value = 0.0322 and the question states that α is 0.01

—  Since p-value > α we do not reject the null hypothesis.

¢  By not rejecting the null hypothesis we conclude that, at the 1% significance level, there is not enough evidence that the average time spent on spam exceeds 25 minutes.

¢  But what if we chose α to be 5%?

—  In this case p-value < α and we reverse our conclusion and reject the null hypothesis

¢  It is not uncommon for the conclusion to change as α changes.

—  The p-value is the smallest significance level at which the null hypothesis would be rejected.

—  In the above example, our test would be significant at the 5% level but not at the 1% level.
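The three steps of the spam example can be run end to end from the 18 recorded times. The sketch below is an illustration using the standard library; variable names are assumptions of the example. It does not round Z before looking up the probability, so its p-value differs slightly from the table-based 0.0322.

```python
from math import sqrt
from statistics import NormalDist, mean

# The 18 recorded times (minutes per day) from the problem statement
times = [35, 48, 29, 44, 17, 21, 32, 28, 34,
         23, 13, 9, 11, 30, 42, 37, 43, 48]

mu0, sigma = 25, 12          # hypothesized mean and known standard deviation
n = len(times)
x_bar = mean(times)

# Step 1: test statistic
z = (x_bar - mu0) / (sigma / sqrt(n))

# Step 2: p-value for an upper-tail test
p_value = 1 - NormalDist().cdf(z)

# Step 3: decision at two significance levels
decisions = {alpha: p_value < alpha for alpha in (0.01, 0.05)}

print(f"mean = {x_bar:.2f}, Z = {z:.3f}, p-value = {p_value:.4f}")
for alpha, reject in decisions.items():
    print(f"alpha = {alpha}: {'reject' if reject else 'do not reject'} H0")
```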


IN RESEARCH PAPERS


A TWO-TAIL TEST EXAMPLE

¢ A statistics professor wishes to determine if her section’s grade average is different than that of other sections of the course. The average for all other sections is 75 with a standard deviation of 4. The professor collected test scores from 25 students and found the sample average was 72. She wishes to conduct a hypothesis test at the 5% level of significance (that is, α=0.05).


STEP 1: HYPOTHESES

¢ This is a two tail test in which we would reject the null hypothesis if the sample mean is significantly smaller or significantly larger than the hypothesized mean.

H0: µ = 75

HA: µ ≠ 75

STEP 2: COMPUTE THE TEST STATISTIC

¢ Let X represent statistics grades.

—  Assuming the null hypothesis to be true, we believe that X is normally distributed with a mean of 75 and a standard deviation of 4.

—  We also know that the sampling distribution of X-bar (the average grade) is:

$\bar{X} \sim N\left(75,\ \frac{4}{\sqrt{25}}\right)$

¢ To obtain the test statistic, we convert our sample mean of 72 to a Z score using the hypothesized distribution:

$Z = \frac{72 - 75}{4/\sqrt{25}} = -3.75$

STEP 3: P-VALUE

¢ What is the probability of finding a test statistic as extreme as the one we have found?

—  P(Z < -3.75) = 0.00009

¢ Because this is a two-tail test, this probability needs to be doubled before we compare it to α

—  p-value = 2*P(Z ≥ |test statistic|) = 2*0.00009 = 0.00018

STEP 4: DECISION RULE AND CONCLUSION

¢  Because our p-value of ‘0.00018’ is less than α (0.05) we reject the null hypothesis

¢  At α=0.05 we can reject the null hypothesis and conclude that the average grade is different than 75.
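The doubling step in the two-tail test is easy to get wrong, so a short numerical check may help. The sketch below computes the two-tail p-value for the grades example; the function name `two_tail_p_value` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def two_tail_p_value(z):
    """Two-tail p-value: twice the area beyond |z| in one tail."""
    return 2 * NormalDist().cdf(-abs(z))

# Grades example: sample mean 72, hypothesized mean 75, sigma 4, n 25
z = (72 - 75) / (4 / sqrt(25))
p_value = two_tail_p_value(z)

alpha = 0.05
reject = p_value < alpha

print(f"Z = {z:.2f}, two-tail p-value = {p_value:.5f}, reject H0: {reject}")
```

Using `-abs(z)` makes the function work the same way whether the test statistic falls in the lower or the upper tail.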


KEY TERMS

¢  A hypothesis is a claim we wish to test

¢  The sampling distribution of the mean is created by taking repeated samples of size n from a given population, computing the mean for each sample, and charting the distribution of these means.

—  We rely on the central limit theorem to know the sampling distribution of the mean in most cases (see chapter 9.1 for a review)

¢  Test statistic is the value we use to conduct our test. It is a Z-score calculated based on the sample statistic and the hypothesized distribution

¢  P-value is the probability, assuming the null hypothesis is true, of finding a test statistic at least as extreme (either as high or as low) as the one we have found

ANOTHER APPROACH (SIMILAR BUT NOT EXACTLY THE SAME)

¢ A baseball player claims that the average speed of his fastball pitch is greater than that of his rival, who averages 90 mph. He collects data on the speed of 100 fastballs and finds his average speed is 90.8 mph. Assuming that we know the standard deviation of a fastball pitch to be 3.85 mph, what can we conclude about the pitcher’s claim?


STEP 1: HYPOTHESES

¢ The pitcher would like to prove that his fastball has a higher speed than that of his rival. Therefore, the null hypothesis is that it is not higher (i.e. it is the same speed or lower) and the alternative is that it is higher:

—  H0: µ ≤ 90

—  H1: µ > 90

¢ with µ being the hypothesized average pitch speed.

STEP 2: COMPUTE THE TEST STATISTIC

¢ Let X represent pitching speed (in mph).

—  Assuming the null hypothesis to be true, we believe that X is normally distributed with a mean of 90 mph and a standard deviation of 3.85 mph.

—  We also know that the sampling distribution of X-bar (the average pitching speed) is:

$\bar{X} \sim N\left(90,\ \frac{3.85}{\sqrt{100}}\right)$

¢ To obtain the test statistic, we convert our sample mean of 90.8 to a Z score using the hypothesized distribution:

$Z = \frac{90.8 - 90}{3.85/\sqrt{100}} = 2.078$

STEP 3: FIND THE CRITICAL VALUE

¢ To determine the critical value, we need to know the desired significance level (α).

—  Let us assume that we would like to conduct the test at the 1% level of significance: α=0.01.

—  The critical value is computed based on α, such that P(Z ≥ |critical value|) = α.

—  In our example: use the normal distribution table to find the critical value Z*, such that: P(Z ≥ Z*) = 0.01.

¢ Note that we are using the ‘greater than’ relationship here since this is an upper-tail test

—  our critical value should be at the upper tail of the distribution.

¢ You’ll see that our critical value Z* is 2.326.

GRAPHICALLY

¢ The shaded blue area is called the rejection region.

—  If our test statistic falls within this region we would reject the null hypothesis

STEP 4: COMPARE THE TWO VALUES

¢ Our test statistic does not fall within the rejection region and therefore we do not reject the null hypothesis.

—  Test statistic 2.078 < 2.326 critical value

—  Do not reject the null hypothesis

STEP 5: CONCLUSION

¢ We were not able to reject the null hypothesis. At the 1% significance level, there is not enough evidence to support the claim that the pitcher’s fastballs are faster than his rival’s.
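The critical-value approach for the pitcher example can be sketched as follows. Instead of reading Z* from a table, this illustration obtains it from the inverse of the standard normal cumulative distribution (`NormalDist().inv_cdf`); the variable names are assumptions of the example.

```python
from math import sqrt
from statistics import NormalDist

alpha = 0.01

# Upper-tail critical value Z*: P(Z >= Z*) = alpha
z_critical = NormalDist().inv_cdf(1 - alpha)

# Test statistic for the pitcher: sample mean 90.8, mu0 90, sigma 3.85, n 100
z_stat = (90.8 - 90) / (3.85 / sqrt(100))

# Reject only if the test statistic falls in the upper-tail rejection region
reject = z_stat > z_critical

print(f"Z* = {z_critical:.3f}, Z = {z_stat:.3f}, reject H0: {reject}")
```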


A LOWER TAIL TEST EXAMPLE

¢ The director of a state agency claims that the average starting salary for clerical employees in the state is less than $30,000 per year. To test this claim, she has collected a random sample of 100 starting salaries of clerks from across the state and found that the sample mean is $29,570.

—  State the null and alternative hypotheses

—  Conduct the test, assuming the population standard deviation is known to be $2,500 and the significance level for the test is 0.05

STEP 1: HYPOTHESES

¢ This is a lower tail test in which we would only reject the null hypothesis if the sample mean is significantly smaller than the hypothesized mean.

H0: µ ≥ $30,000

HA: µ < $30,000

[Figure: one-tail test (lower tail); standard normal curve with the Z axis running from -4 to 4 and the lower-tail rejection region marked]

STEP 2: COMPUTE THE TEST STATISTIC

¢ Let X represent starting salaries ($).

—  Assuming the null hypothesis to be true, we believe that X is normally distributed with a mean of $30,000 and a standard deviation of $2,500.

—  We also know that the sampling distribution of X-bar (the average starting salary) is:

$\bar{X} \sim N\left(30000,\ \frac{2500}{\sqrt{100}}\right)$

¢ To obtain the test statistic, we convert our sample mean of $29,570 to a Z score using the hypothesized distribution:

$Z = \frac{29570 - 30000}{2500/\sqrt{100}} = -1.72$

STEP 3: CRITICAL VALUE

¢ The question defines the desired significance level for the test as 0.05

¢ Looking for the critical value Z* such that P(Z < Z*) = 0.05 (lower rejection region only)

¢ From the table we can find that Z* = -1.645

STEP 4: COMPARE THE TWO VALUES

$Z_{statistic} = -1.72 < -1.645 = Z_{critical}$

STEP 5: CONCLUSION

¢  The computed Z statistic is more extreme than our critical Z value and therefore falls in the rejection region. At α=0.05 we can reject the null hypothesis and conclude that the average salary is less than $30,000.


A TWO-TAIL TEST EXAMPLE


STEP 1: HYPOTHESES

¢ This is a two tail test in which we would reject the null hypothesis if the sample mean is significantly smaller or significantly larger than the hypothesized mean.

H0: µ = 75

HA: µ ≠ 75

STEP 2: COMPUTE THE TEST STATISTIC

¢ Let X represent statistics grades.

—  Assuming the null hypothesis to be true, we believe that X is normally distributed with a mean of 75 and a standard deviation of 4.

—  We also know that the sampling distribution of X-bar (the average grade) is:

$\bar{X} \sim N\left(75,\ \frac{4}{\sqrt{25}}\right)$

¢ To obtain the test statistic, we convert our sample mean of 72 to a Z score using the hypothesized distribution:

$Z = \frac{72 - 75}{4/\sqrt{25}} = -3.75$

STEP 3: CRITICAL VALUE

¢  Because this is a two tail test we should have two rejection regions

—  At both tails of the distribution

¢  This means that α should be split in half for each of the tails:

—  The question defines the desired significance level at 0.05

—  Looking for Z* such that P(Z < -Z*) = 0.025 (this is the lower tail)

—  From the table we can find that Z* = 1.96, so the critical values are -1.96 and 1.96

STEP 4: COMPARE THE TWO VALUES

$Z_{statistic} = -3.75 < -1.96 = Z_{critical}$

STEP 5: CONCLUSION

¢  The computed Z statistic is more extreme than our critical Z value at the lower tail, and therefore falls in the rejection region.

¢  At α=0.05 we can reject the null hypothesis and conclude that the average grade is different than 75.
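The lower-tail, upper-tail, and two-tail rejection rules differ only in where the critical value sits and in whether α is split. The helper below is a hypothetical sketch (the name `in_rejection_region` and its `tail` argument are inventions of this example) that applies the appropriate rule to the salary and grades test statistics.

```python
from math import sqrt
from statistics import NormalDist

def in_rejection_region(z, alpha, tail):
    """Return True if the test statistic z falls in the rejection region."""
    std_normal = NormalDist()
    if tail == "lower":
        return z < std_normal.inv_cdf(alpha)           # e.g. -1.645 at 5%
    if tail == "upper":
        return z > std_normal.inv_cdf(1 - alpha)       # e.g.  1.645 at 5%
    if tail == "two":
        # alpha is split in half between the two tails (+/-1.96 at 5%)
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("tail must be 'lower', 'upper', or 'two'")

salary_z = (29570 - 30000) / (2500 / sqrt(100))   # -1.72
grades_z = (72 - 75) / (4 / sqrt(25))             # -3.75

print("Salary (lower tail):", in_rejection_region(salary_z, 0.05, "lower"))
print("Grades (two tail):  ", in_rejection_region(grades_z, 0.05, "two"))
```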


TYPE I AND TYPE II ERRORS Still Chapter 11


STARTING WITH AN EXAMPLE

¢  It has been claimed that the average wireless phone customer receives no more than 8 spam text messages each day, with a standard deviation of 0.8. To test this claim (at the 5% significance level), you collect data from 60 wireless customers and record the number of spam messages received during a single day. Your sample average is 8.2.

¢  Step 1: Hypotheses

—  H0: µ ≤ 8

—  H1: µ > 8

¢  Step 2: critical value

—  This is an upper tail test, α is 0.05, therefore our critical value is 1.645

CONT’D

¢ Step 3: test statistic

$Z = \frac{8.2 - 8}{0.8/\sqrt{60}} = 1.94$

¢ Steps 4 and 5:

—  Testing at the 5% level of significance, we compare the test statistic to our critical value of 1.645 and conclude that the null hypothesis should be rejected, because 1.94>1.645. In other words, we conclude that the average customer receives more than 8 spam text messages per day

TYPE I AND TYPE II ERRORS

¢ When testing hypotheses we can make two kinds of errors:

—  Reject a true null hypothesis

¢  Concluding that the mean number of spam messages is more than 8 when in fact it is not (it is 8 or fewer)

¢  This is called a type I error

—  Not reject a false null hypothesis

¢  Concluding that the mean number of spam messages is not more than 8 when in fact it is.

¢  This is called a type II error

ERROR SUMMARY

                                   In actuality
Hypothesis testing indicates       H0 is true                H0 is false
that you should                    (“do not reject”)         (“reject”)

Not reject H0                      Correct decision          Type II error (prob. β)
Reject H0                          Type I error (prob. α)    Correct decision

TYPE I ERROR IN OUR EXAMPLE

¢  We assumed the null hypothesis to be true and computed the probability of finding the sample mean that we observed.

—  In our example the probability of finding a sample of size 60 with a mean of at least 8.2 spam messages is: P(Z ≥ 1.94)=0.026, which is quite low.

—  Based on our significance level, α, we concluded that we should reject the null hypothesis: we believe that it is false.

¢  Assume that the null hypothesis is, in fact, true.

—  By rejecting a true null hypothesis we have committed a type I error.

—  Because our rejection rule is based on α (the level of significance), α is also the probability of committing a type I error.

THE TRADE-OFF

¢  If we wish to reduce the chance of committing a type I error, we need to use smaller values of α.

¢ Why not use the lowest α possible?

—  Because we also have to consider the probability of committing a type II error (not rejecting a false null hypothesis).

—  This probability is called β, and it is inversely related to α
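The trade-off between α and β can be seen numerically in the spam text message example. Computing β requires a specific true mean, which the slides do not give; the value 8.3 used below is purely hypothetical, as is the function name `type_ii_error`. This is a sketch of the idea, not part of the original example.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(alpha, mu0, true_mu, sigma, n):
    """P(not rejecting H0: mu <= mu0) when the true mean is true_mu (> mu0)."""
    se = sigma / sqrt(n)
    # Upper-tail test: reject H0 when the sample mean exceeds this cutoff
    cutoff = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # Beta is the chance the sample mean stays at or below the cutoff
    return NormalDist(true_mu, se).cdf(cutoff)

# Spam text example: mu0 = 8, sigma = 0.8, n = 60; true mean 8.3 is hypothetical
beta_at_05 = type_ii_error(0.05, mu0=8, true_mu=8.3, sigma=0.8, n=60)
beta_at_01 = type_ii_error(0.01, mu0=8, true_mu=8.3, sigma=0.8, n=60)

print(f"alpha = 0.05 -> beta = {beta_at_05:.3f}")
print(f"alpha = 0.01 -> beta = {beta_at_01:.3f}")
```

Under this assumed true mean, lowering α from 5% to 1% raises β, which is why the smallest possible α is not automatically the best choice.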


SUMMARY

¢ We covered Chapter 11 as well as section 9.1

—  Logic and process of hypothesis testing

¢ Next class – Quiz

¢ Next topics: chapter 10 (confidence intervals) and 12 (other single population tests)

For the project you should identify three research questions that can be addressed through different hypothesis testing procedures. Your submission should include the following components.

1. Introduction: Briefly describe (in words) each of the three research questions and your motivation for studying them (not more than two pages). For each test you should state your null hypothesis and alternative hypothesis.

2. Hypothesis tests: for each of the three tests clearly discuss: (1) the data used to conduct the test, (2) any assumption that you need to make in order to conduct the test including support for making these assumptions, (3) the test and its results, (4) the statistical significance of your findings.

3. Summary: summarize your findings from the analysis. What do you conclude about the hypotheses you raised? Identify key limitations of your analysis (including any limitations of the data set), contributions that you believe your analysis makes to the readers, and any interesting directions for future analysis (not more than three pages).

A sample project is shown below:

Introduction: The Doll Computer Company makes its own computers and delivers them directly to customers who order them via the Internet.

To achieve its objective of speed, Doll makes each of its five most popular computers and transports them to warehouses from which it generally takes 1 day to deliver a computer to the customer.

This strategy requires high levels of inventory that add considerably to the cost.

To lower these costs the operations manager wants to use an inventory model. He notes demand during lead time is normally distributed and he needs to know the mean to compute the optimum inventory level.

He observes 25 lead time periods and records the demand during each period.

The manager would like a 95% confidence interval estimate of the mean demand during lead time. Assume that the manager knows that the standard deviation is 75 computers.

Hypothesis Tests:

25 observed lead time shown below

235 374 309 499 253

421 361 514 462 369

394 439 348 344 330

261 374 302 466 535

386 316 296 332 334

Interpret the question

· X represents demand

a. X~N (µ,75)

· σ = 75 (standard deviation)

· n=25

· We would like to create a 95% confidence interval for the mean demand based on our sample of 25 lead time periods

· What do we need to know?

$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$

· Compute the sample mean (X-bar) = 370.16

· Finding Zα/2

a. Looking for Zα/2 such that:

b. P(-Zα/2<Z< Zα/2)=0.95

c. Easiest:

Look for the lower tail probability in the normal table

P(Z<-Z α/2)=0.025

d. Therefore:

Zα/2=1.96

The computation is not completed here, but a submission should follow the format shown above.
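For readers who want to see where the sample leads, the remaining arithmetic could be sketched as follows. This is an illustration of the standard known-σ interval (sample mean plus or minus Zα/2 times σ/√n), using the 25 observations and σ = 75 from the sample; the variable names are assumptions of the example.

```python
from math import sqrt
from statistics import NormalDist, mean

# Demand observed in the 25 lead time periods
demand = [235, 374, 309, 499, 253,
          421, 361, 514, 462, 369,
          394, 439, 348, 344, 330,
          261, 374, 302, 466, 535,
          386, 316, 296, 332, 334]

sigma = 75                      # known population standard deviation
n = len(demand)
x_bar = mean(demand)

# Z value that leaves 2.5% in each tail (about 1.96)
z_half_alpha = NormalDist().inv_cdf(0.975)

margin = z_half_alpha * sigma / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"Sample mean: {x_bar:.2f}")
print(f"95% confidence interval: ({lower:.2f}, {upper:.2f})")
```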
