Chi_Sq_DATA

Employee_Ref Promoted Management_Training_Program
1 No No
2 Yes Yes
3 No No
4 Yes Yes
5 No No
6 No No
7 Yes Yes
8 Yes Yes
9 Yes Yes
10 Yes Yes
11 No No
12 No No
13 Yes Yes
14 No No
15 Yes No
16 Yes Yes
17 No Yes
18 Yes Yes
19 Yes Yes
20 Yes Yes
21 Yes Yes
22 Yes Yes
23 No No
24 Yes Yes
25 No No
26 Yes Yes
27 No No
28 Yes Yes
29 Yes Yes
30 No No
31 No No
32 Yes Yes
33 Yes Yes
34 No No
35 Yes Yes
36 No No
37 Yes Yes
38 No No
39 Yes Yes
40 No No
41 No No
42 Yes Yes
43 Yes Yes
44 Yes Yes
45 Yes Yes
46 No No
47 No No
48 No No
49 Yes No
50 Yes Yes
51 Yes Yes
52 Yes Yes
53 Yes Yes
54 Yes Yes
55 No No
56 Yes Yes
57 No No
58 Yes Yes
59 No No
60 No No
61 Yes Yes
62 Yes Yes
63 Yes Yes
64 Yes Yes
65 No No
66 Yes Yes
67 No No
68 No No
69 No No
70 No No
71 No No
72 No No
73 Yes Yes
74 No Yes
75 No No
76 No No
77 No No
78 Yes Yes
79 No Yes
80 Yes Yes
81 Yes Yes
82 No Yes
83 No No
84 No No
85 No No
86 No No
87 No No
88 No No
89 No No
90 Yes Yes
91 No No
92 Yes Yes
93 No Yes
94 Yes Yes
95 Yes No
96 Yes Yes
97 Yes Yes
98 Yes Yes
99 No No
100 No No
101 No No
102 Yes Yes
103 Yes Yes
104 Yes Yes
105 Yes Yes
106 Yes Yes
107 Yes Yes
108 Yes Yes
109 No No
110 Yes Yes
111 No No
112 Yes Yes
113 No No
114 Yes Yes
115 Yes No
116 No No
117 Yes Yes
118 No No
119 No Yes
120 Yes Yes
121 No No
122 Yes Yes
123 Yes Yes
124 No No
125 Yes Yes
126 No No
127 Yes Yes
128 Yes Yes
129 Yes Yes
130 Yes Yes
131 Yes No
132 No No
133 Yes Yes
134 Yes Yes
135 No No
136 Yes Yes
137 Yes Yes
138 Yes Yes
139 No No
140 No No Here is where the "HIDE" command was used
141 No No
142 No No
143 No No
144 No No
145 Yes Yes
146 No No We are testing the idea that Promotion = function (Going to the training program).
147 Yes Yes
148 Yes Yes
149 No No
150 No Yes Chi square test with Excel formula
Formula =COUNTIF(B2:B151,$A$154)
# Promoted Training Program Formula =CHISQ.TEST(C154:C155,D154:D155)
Yes
No p-value =
Total
In Per Cent
# Promoted Training Program
Yes
No
Total

ANOVA_DATA

Item_# Week_Code Satisfaction_Level Office_Code Office
1 Week 2 38 1 Boston Week # Boston Atlanta Los Angeles Portland
2 Week 4 40 1 Boston 1 38 42 44 31
3 Week 6 30 1 Boston 2 40 45 43 43
4 Week 8 42 1 Boston 3 30 44 38 44
5 Week 10 36 1 Boston 4 42 43 40 43
6 Week 12 41 1 Boston 5 36 46 37 36
7 Week 14 40 1 Boston 6 41 42 35 43
8 Week 16 42 1 Boston 7 40 35 37 46
9 Week 18 36 1 Boston 8 42 43 35 45
10 Week 20 41 1 Boston 9 36 46 40 46
11 Week 22 40 1 Boston 10 41 43 37 36
12 Week 24 35 1 Boston 11 40 42 36 40
13 Week 26 34 1 Boston 12 35 45 39 41
14 Week 28 41 1 Boston 13 34 39 40 40
15 Week 30 39 1 Boston 14 41 39 36 39
16 Week 32 42 1 Boston 15 39 44 44 38
17 Week 2 42 2 Atlanta 16 42 43 39 42
18 Week 4 45 2 Atlanta Average 38.6 42.6 38.8 40.8
19 Week 6 44 2 Atlanta
20 Week 8 43 2 Atlanta
21 Week 10 46 2 Atlanta
22 Week 12 42 2 Atlanta
23 Week 14 35 2 Atlanta
24 Week 16 43 2 Atlanta
25 Week 18 46 2 Atlanta
26 Week 20 43 2 Atlanta
27 Week 22 42 2 Atlanta
28 Week 24 45 2 Atlanta
29 Week 26 39 2 Atlanta
30 Week 28 39 2 Atlanta
31 Week 30 44 2 Atlanta
32 Week 32 43 2 Atlanta
33 Week 2 44 3 Los Angeles
34 Week 4 43 3 Los Angeles
35 Week 6 38 3 Los Angeles
36 Week 8 40 3 Los Angeles
37 Week 10 37 3 Los Angeles
38 Week 12 35 3 Los Angeles
39 Week 14 37 3 Los Angeles
40 Week 16 35 3 Los Angeles
41 Week 18 40 3 Los Angeles
42 Week 20 37 3 Los Angeles
43 Week 22 36 3 Los Angeles
44 Week 24 39 3 Los Angeles
45 Week 26 40 3 Los Angeles
46 Week 28 36 3 Los Angeles
47 Week 30 44 3 Los Angeles
48 Week 32 39 3 Los Angeles
49 Week 2 31 4 Portland
50 Week 4 43 4 Portland
51 Week 6 44 4 Portland
52 Week 8 43 4 Portland
53 Week 10 36 4 Portland
54 Week 12 43 4 Portland
55 Week 14 46 4 Portland
56 Week 16 45 4 Portland
57 Week 18 46 4 Portland
58 Week 20 36 4 Portland
59 Week 22 40 4 Portland
60 Week 24 41 4 Portland
61 Week 26 40 4 Portland
62 Week 28 39 4 Portland
63 Week 30 38 4 Portland
64 Week 32 42 4 Portland
40.2
40
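The office averages and the 40.2 grand mean above are the ingredients of a one-way ANOVA. As a minimal sketch (the helper name `one_way_f` is an assumption for illustration and is not part of the worksheet), the F statistic for the four office columns can be computed with the Python standard library. Converting F to a p-value still requires an F table or a statistics package.

```python
from statistics import mean

# Satisfaction levels per office, copied from the table above (16 weeks each).
offices = {
    "Boston": [38, 40, 30, 42, 36, 41, 40, 42, 36, 41, 40, 35, 34, 41, 39, 42],
    "Atlanta": [42, 45, 44, 43, 46, 42, 35, 43, 46, 43, 42, 45, 39, 39, 44, 43],
    "Los Angeles": [44, 43, 38, 40, 37, 35, 37, 35, 40, 37, 36, 39, 40, 36, 44, 39],
    "Portland": [31, 43, 44, 43, 36, 43, 46, 45, 46, 36, 40, 41, 40, 39, 38, 42],
}


def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of each observation around its own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


office_means = {name: mean(values) for name, values in offices.items()}
f_stat, df_b, df_w = one_way_f(list(offices.values()))
print(office_means)
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

A large F means the office averages differ by more than the week-to-week variation within each office would explain.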

As mentioned earlier, the mid-term will have conceptual and quantitative multiple-choice questions. You need to read all 4 chapters and you need to be able to solve problems in all 4 chapters in order to do well in this test.

The following are for review and learning purposes only. I am not indicating that identical or similar problems will be in the test. As I have indicated in the class syllabus, all the exams in this course will have multiple-choice questions and problems.

Suggestion: treat this review set as you would an actual test. Sit down with your one page of notes and your calculator, and give it a try. That way you will know what areas you still need to study.

ADMN 210

Answers to Review for Midterm #1

1) Classify each of the following as nominal, ordinal, interval, or ratio data.

a. The time required to produce each tire on an assembly line – ratio since it is numeric with a valid 0 point meaning “lack of”

b. The number of quarts of milk a family drinks in a month - ratio since it is numeric with a valid 0 point meaning “lack of”

c. The ranking of four machines in your plant after they have been designated as excellent, good, satisfactory, and poor – ordinal since it is ranking data only

d. The telephone area code of clients in the United States – nominal since it is a label

e. The age of each of your employees - ratio since it is numeric with a valid 0 point meaning “lack of”

f. The dollar sales at the local pizza house each month - ratio since it is numeric with a valid 0 point meaning “lack of”

g. An employee’s identification number – nominal since it is a label

h. The response time of an emergency unit - ratio since it is numeric with a valid 0 point meaning “lack of”

2) True or False: The highest level of data measurement is the ratio-level measurement.

True (you can do the most powerful analysis with this kind of data)

3) True or False: Interval- and ratio-level data are also referred to as categorical data.

False (Interval and ratio level data are numeric and therefore quantitative, NOT qualitative….Nominal is qualitative)

4) A small portion or a subset of the population on which data is collected for conducting statistical analysis is called __________.

A sample! A population is the total group, a census IS the population, and a data set can be either a sample or a population.

5) One of the advantages for taking a sample instead of conducting a census is this:

a sample is more accurate than census

a sample is difficult to take

a sample cannot be trusted

a sample can save money when data collection process is destructive

6) Selection of the winning numbers in a lottery is an example of __________.

convenience sampling

random sampling

nonrandom sampling

regulatory sampling

7) A type of random sampling in which the population is divided into non-overlapping subpopulations is called __________.

stratified random sampling

cluster sampling

systematic random sampling

regulatory sampling

8) A type of random sampling in which every kth item (where k is some number) in the population is selected for inclusion in the sample is called __________.

stratified random sampling

cluster sampling

systematic sampling

regulatory sampling

9) Judgment sampling is an example of __________.

convenience sampling

random sampling

nonrandom (non-probabilistic) sampling

justice department sampling

10) For the following data, construct a frequency distribution with six classes.

57 23 35 18 21

26 51 47 29 21

46 43 29 23 39

50 41 19 36 28

31 42 52 29 18

28 46 33 28 20

Class width = (high – low)/6 = (57 – 18)/6 = 6.5. Let’s round up to 7 for convenience. NOTE: each student will have something slightly different!

Class Interval Frequency

18 - under 25 8 just count up how many observations are 18 through 24

25 - under 32 8

32 - under 39 3

39 - under 46 4

46 - under 53 6

53 - under 60 1

TOTAL 30
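The counting step can be checked with a short script. This is only a sketch: the function name `frequency_distribution` is hypothetical, and it assumes the same choices made above (start at 18, class width 7, six classes).

```python
data = [57, 23, 35, 18, 21, 26, 51, 47, 29, 21, 46, 43, 29, 23, 39,
        50, 41, 19, 36, 28, 31, 42, 52, 29, 18, 28, 46, 33, 28, 20]


def frequency_distribution(values, start, width, classes):
    """Count values into classes of the form 'low - under high'."""
    counts = [0] * classes
    for v in values:
        counts[(v - start) // width] += 1
    return [(start + i * width, start + (i + 1) * width, counts[i])
            for i in range(classes)]


table = frequency_distribution(data, start=18, width=7, classes=6)
for low, high, count in table:
    print(f"{low} - under {high}: {count}")
```

A different starting point or class width gives a different, equally valid table.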

11) What type of graph would be most appropriate for the frequency distribution above?

Pie chart

Bar chart

Pareto diagram

Histogram

12) For the following frequency distribution, determine the relative frequency, percent, and the cumulative frequency.

*Round your answer to 3 decimal places, the tolerance is +/-0.001.

Class Interval Frequency Relative Frequency Percent Cumulative Frequency

20–under 25 17 17/82 = .207* 20.7% 17

25–under 30 20 20/82 = .244* 24.4% 17 + 20 = 37

30–under 35 16 .195* 19.5% 37 + 16 = 53

35–under 40 15 .183* 18.3% 53 + 15 = 68

40–under 45 8 .098* 9.8% 68 + 8 = 76

45–under 50 6 .073* 7.3% 76 + 6 = 82

TOTAL 82 1.000 100.0%
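The three derived columns follow mechanically from the frequencies. A small sketch using only the standard library (variable names are illustrative):

```python
from itertools import accumulate

frequencies = [17, 20, 16, 15, 8, 6]
total = sum(frequencies)

# Relative frequency = class frequency / total, rounded to 3 decimals.
relative = [round(f / total, 3) for f in frequencies]
# Percent is the relative frequency expressed out of 100.
percent = [round(100 * f / total, 1) for f in frequencies]
# Cumulative frequency is a running total of the frequencies.
cumulative = list(accumulate(frequencies))

for row in zip(frequencies, relative, percent, cumulative):
    print(row)
```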

13) True or False: Frequency distribution is a summary of data presented in the form of class intervals and frequencies.

True – that’s the definition of a frequency distribution!

14) True or False: The range of a data set is defined as the difference between the mean and the median.

False – Range is the difference between the highest and lowest numbers in the data!

15) True or False: The sum of the relative frequencies of a grouped data set is always equal to one.

True – don’t forget, relative frequencies are just decimal versions of percentages, and percentages have to add up to 100%.

16) The U.S. Department of the Interior releases figures on mineral production. Following are the values (in billions of dollars) of the 15 leading states in nonfuel mineral production in the United States in 2008.

1.68, 1.81, 1.85, 1.89, 2.05, 2.05, 2.08, 2.74, 3.21, 3.30, 4.00, 4.17, 4.20, 6.48, 7.84

a. Calculate the mean, median, and mode.

Mean = sum of all data/15 = $3.29 billion

Median: the position = 2*(15+1)/4 = 8th location = $2.74 billion

Mode: 2.05 since it is the only value that appears more than once

b. Calculate the range, interquartile range, sample variance, and sample standard deviation.

Range = 7.84 – 1.68 = 6.16

Interquartile range = Q3 – Q1.

Q1 is at the following location: (15+1)/4 = 4th location = $1.89 billion

Q3 is at the following location: 3*(15+1)/4 = 12th location = $4.17 billion

So Interquartile range = 4.17 – 1.89 = 2.28

NOTE: make sure you understand what quartiles mean!

Sample variance = 3.3321 (See below)

Sample standard deviation = 1.8254 (see below)

Value ($ billions)   X - mean   Squared
1.68                 -1.61       2.5921
1.81                 -1.48       2.1904
1.85                 -1.44       2.0736
1.89                 -1.40       1.9600
2.05                 -1.24       1.5376
2.05                 -1.24       1.5376
2.08                 -1.21       1.4641
2.74                 -0.55       0.3025
3.21                 -0.08       0.0064
3.30                  0.01       0.0001
4.00                  0.71       0.5041
4.17                  0.88       0.7744
4.20                  0.91       0.8281
6.48                  3.19      10.1761
7.84                  4.55      20.7025
TOTAL 49.35                     46.6496

MEAN = 49.35/15 = 3.29

Variance = 46.6496/(15-1) = 3.3321

SD = sqrt(3.3321) = 1.8254
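The hand calculations in parts a and b can be cross-checked with Python's `statistics` module. This is a sketch under one stated assumption: `method="exclusive"` is chosen because it places quartiles at the (n + 1) * p positions used above, while other software may use a different quartile convention and report slightly different values.

```python
import statistics

values = [1.68, 1.81, 1.85, 1.89, 2.05, 2.05, 2.08, 2.74,
          3.21, 3.30, 4.00, 4.17, 4.20, 6.48, 7.84]

mean = statistics.mean(values)
median = statistics.median(values)
mode = statistics.mode(values)

# Quartiles at positions (n + 1) / 4, 2(n + 1) / 4 and 3(n + 1) / 4.
q1, q2, q3 = statistics.quantiles(values, n=4, method="exclusive")

value_range = max(values) - min(values)
iqr = q3 - q1
variance = statistics.variance(values)  # sample variance, divides by n - 1
std_dev = statistics.stdev(values)      # square root of the sample variance

print(round(mean, 2), median, mode)
print(round(value_range, 2), round(iqr, 2), round(variance, 4), round(std_dev, 4))
```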

c. Compute the coefficient of skewness for these data and interpret. [Ignore]

Just use the Data Analysis portion of Excel and interpret. It is 1.48, so there is a right skew of the data (slightly long right hand tail)

17) The following graphic of residential housing data (selling price and size in square feet) indicates:

a correlation close to -1

a correlation close to 0 (no relation between the two variables)

a correlation close to 1

a negative relationship between the two variables

18) The Polk Company reported that the average age of a car on U.S. roads in a recent year was 7.5 years.

a) Suppose the distribution of ages of cars on U.S. roads is approximately bell-shaped. If 99.7% of the ages are between 1 year and 14 years, what is the standard deviation of car age?

We know that 99.7% of the data are within 3 standard deviations of the mean = 6.5 years (I found that from 14 – 7.5 or 7.5 – 1). So 6.5/3 = 2.167.

b) Suppose the standard deviation is 1.7 years and the mean is 7.5 years. Between what two values would 95% of the car ages fall?

95% of the data falls within 2 standard deviations of the mean.

So 7.5 + 2 * 1.7 = 10.9, and 7.5 – 2 * 1.7 = 4.1.

19) A large manufacturing firm tests job applicants who recently graduated from college. The test scores are bell shaped with a mean of 500 and a standard deviation of 50.

a) What proportion of people get scores between 400 and 600?

Points are 2 standard deviations away, so 95%

b) What proportion of people get scores higher than 450?

Point is 1 standard deviation away, so 68/2 + 50 = 84%

c) Management is considering placing a new hire in an upper level management position if the person scores in the upper 0.15% of the distribution. What is the lowest score a college graduate can earn to qualify for the position? 

(X – 500)/50 = 3 SDs, so X = 500 + 3 * 50 = 650
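The empirical-rule answers to questions 18 and 19 are approximations of exact normal-curve areas. As a hedged check, the sketch below repeats the arithmetic and compares the question 19 answers against `statistics.NormalDist`, assuming the scores are exactly normal (the problem only says "bell shaped").

```python
from statistics import NormalDist

# Question 18: 99.7% of ages lie within 3 standard deviations of the mean.
sd_cars = (14 - 7.5) / 3
# 95% of ages lie within 2 standard deviations of the mean.
low_95, high_95 = 7.5 - 2 * 1.7, 7.5 + 2 * 1.7

# Question 19: test scores with mean 500 and standard deviation 50.
scores = NormalDist(mu=500, sigma=50)
between_400_600 = scores.cdf(600) - scores.cdf(400)  # empirical rule: 95%
above_450 = 1 - scores.cdf(450)                      # empirical rule: 84%
cutoff = 500 + 3 * 50                                # 3 SDs above the mean
upper_tail = 1 - scores.cdf(cutoff)                  # empirical rule: 0.15%

print(round(sd_cars, 3), round(low_95, 1), round(high_95, 1))
print(round(between_400_600, 4), round(above_450, 4), cutoff, round(upper_tail, 5))
```

The exact areas differ from the rule-of-thumb percentages only in the later decimal places, which is why the empirical rule is adequate for these problems.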

20) According to the Bureau of Labor Statistics, the average annual salary of a worker in Detroit, Michigan, is $35,748. Suppose the median annual salary for a worker in this group is $31,369 and the mode is $29,500.

a) Is the distribution of salaries for this group skewed? If so, how and why?

Since these three measures are not equal, the distribution is skewed. The distribution is skewed to the right because the mean is greater than the median.

b) Which of these measures of central tendency would you use to describe these data? Why?

Often, the median is preferred in reporting income data because it yields information about the middle of the data while ignoring extremes.

21) True or False: The median is the most frequently occurring value in a set of data.

False – the MODE is the most frequently occurring, not the median

22) True or False: A disadvantage of the mean as the measure of central tendency is that it is affected by extremely large or extremely small values in the data set.

True – that’s why you use the median for data sets with outliers!

23) True or False: The variance is the average of the squared deviations about the arithmetic mean for a set of numbers.

True

24) What is the median for the following five numbers? 223, 264, 216, 218, 229

Put the data in order: 216, 218, 223, 229, 264

The center number is the median = 223

25) The second quartile of a data set is always equal to its ________.

Median (by definition)

26) The sum of deviations from the mean for a data set is equal to __________.

Zero…that’s why we have to square the deviations to find the variance and standard deviation!

27) Scores obtained by students in an advanced placement test have a symmetric mound shaped (bell shaped) distribution with a mean of 70 and a standard deviation of 10. What is the proportion of students who received between 60 and 80 points?

60 is 1 standard deviation to the left of center and 80 is 1 standard deviation to the right, so by the empirical rule the answer is about 68%

28) For the previous problem, what is the proportion of students who received less than 50 points?

Find the Z point for 50: (50 – 70)/10 = -2. The area between -2 and +2 is 95%, so the area “less than 50” is (100% - 95%)/2 = 2.5%

29) The following joint probability table contains a breakdown on the age and gender of U.S. physicians in a recent year, as reported by the American Medical Association.

Age of U.S. Physicians

         < 35   35 - 44   45 - 54   55 - 64   > 65   TOTAL
Male     0.11   0.20      0.19      0.12      0.16   0.78
Female   0.07   0.08      0.04      0.02      0.01   0.22
TOTAL    0.18   0.28      0.23      0.14      0.17   1.00

a) What is the probability that one randomly selected physician is 35–44 years old?

P(35 – 44) = .28/1.00 = .28

NOTE: in a probability table (as opposed to a frequency table like the one in example #31), you don’t really have to be dividing by the total since the total is 1.00. I write it in to remind you that you MUST divide by something when you are finding probabilities!

b) What is the probability that one randomly selected physician is both a woman and 45–54 years old?

P(woman and 45 – 54) = intersection = 0.04/1.00 = .04

c) What is the probability that one randomly selected physician is a man or is 35–44 years old?

P(man or 35 – 44) = .78 + .28 - .20 = .86/1.00 = .86

d) What is the probability that one randomly selected physician is less than 35 years old or 55–64 years old?

P(< 35 or 55 – 64) = .18/1.00 + .14/1.00 = .32 (NOTE: no need to subtract anything since there are no “common points”…that is, those two categories are mutually exclusive)

e) What is the probability that one randomly selected physician is a woman if she is 45–54 years old?

P(woman | 45 – 54) = .04/.23 = 0.1739

f) What is the probability that a randomly selected physician is neither a woman nor 55–64 years old?

P(not woman and not 55 – 64) = P(man and age not 55 – 64)

= (.11 + .20 + .19 + .16)/1.00 = .66
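Parts a through f all read values off the same joint probability table, so they can be reproduced with a small lookup structure. This is an illustrative sketch; the dictionary layout and the helper names `p_gender` and `p_age` are assumptions, not part of the review sheet.

```python
ages = ["<35", "35-44", "45-54", "55-64", ">65"]
joint = {
    "Male": dict(zip(ages, [0.11, 0.20, 0.19, 0.12, 0.16])),
    "Female": dict(zip(ages, [0.07, 0.08, 0.04, 0.02, 0.01])),
}


def p_gender(gender):
    """Marginal probability of a gender (row total)."""
    return sum(joint[gender].values())


def p_age(age):
    """Marginal probability of an age group (column total)."""
    return sum(row[age] for row in joint.values())


answers = {
    "a": p_age("35-44"),
    "b": joint["Female"]["45-54"],
    # Union: add the marginals, subtract the overlap counted twice.
    "c": p_gender("Male") + p_age("35-44") - joint["Male"]["35-44"],
    # Mutually exclusive age groups: nothing to subtract.
    "d": p_age("<35") + p_age("55-64"),
    # Conditional: joint probability divided by the probability of the condition.
    "e": joint["Female"]["45-54"] / p_age("45-54"),
    # Neither a woman nor 55-64: men in every other age group.
    "f": sum(p for age, p in joint["Male"].items() if age != "55-64"),
}
for part, value in answers.items():
    print(part, round(value, 4))
```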

30) Purchasing Survey asked purchasing professionals what sales traits impressed them most in a sales representative. Seventy-eight percent selected "thoroughness." Forty percent responded "knowledge of your own product." The purchasing professionals were allowed to list more than one trait. Suppose 27% of the purchasing professionals listed both "thoroughness" and "knowledge of your own product" as sales traits that impressed them most. A purchasing professional is randomly sampled.

a) Make a probability table including the above information.

b) What is the probability that the professional selected "thoroughness" or "knowledge of your own product"?

                              Mentioned knowledge   Didn’t mention knowledge   TOTAL
Mentioned thoroughness        .27                   .78 - .27 = .51            .78
Didn’t mention thoroughness   .40 - .27 = .13       .60 - .51 = .09            1 - .78 = .22
TOTAL                         .40                   1 - .40 = .60              1.00

So P(thorough or knowledge) = (.78 + .40 - .27)/1.00 = .91

c) What is the probability that the professional selected neither "thoroughness" nor "knowledge of your own product"?

P(neither thorough nor knowledge) = P(not thorough and not knowledge)

= intersection = 0.09/1.00 = 0.09

d) If it is known that the professional selected "thoroughness," what is the probability that the professional selected "knowledge of your own product"?

P(knowledge | thorough) = .27/.78 = 0.346

e) What is the probability that the professional did not select "thoroughness" and did select "knowledge of your own product"?

P(didn’t mention thoroughness and did mention knowledge) = intersection

= 0.13/1.00 = 0.13
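Every cell of the probability table in question 30 follows from the three given numbers. A minimal sketch of that bookkeeping (variable names are illustrative):

```python
# Given in the problem statement.
p_thorough, p_knowledge, p_both = 0.78, 0.40, 0.27

# Remaining cells of the 2 x 2 probability table.
p_thorough_only = p_thorough - p_both            # thoroughness, not knowledge
p_knowledge_only = p_knowledge - p_both          # knowledge, not thoroughness
p_either = p_thorough + p_knowledge - p_both     # union of the two events
p_neither = 1 - p_either                         # complement of the union
p_knowledge_given_thorough = p_both / p_thorough  # conditional probability

print(round(p_thorough_only, 2), round(p_knowledge_only, 2))
print(round(p_either, 2), round(p_neither, 2), round(p_knowledge_given_thorough, 3))
```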

31) The table below contains data from a sample of 200 people regarding opinion about the latest congressional plan to eliminate anti-trust exemptions for professional baseball (broken down by gender).

OPINION ABOUT THE PLAN

         For   Neutral   Against   Totals
Female   38    54        12        104
Male     12    36        48        96
Totals   50    90        60        200

Please show your work for parts "a" through "e" or no credit will be given!

a) What is the probability that a person selected at random is for the plan?

P(for) = 50/200 = .25

b) If we know that the person is a female, what is the probability that the person is for the plan?

P(for | female) = 38/104 = .365

c) What is the probability that the person is male and against the plan?

P(male and against) = 48/200 = .24

d) What is the probability that the person is male or is neutral about the plan?

P(male or neutral) = (96+90-36)/200 = .75

e) Is opinion about the plan related to gender, or are opinion and gender independent? Please use statistical concepts and numerical calculations in your answer, or no credit will be given.

Check to see if P(A) = P(A|B) = P(A|C) etc.

Is P(for the plan) = P(for | female)? .25 ≠ .365 so NOT independent
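The independence check in part e compares each conditional probability with the matching marginal probability. The sketch below applies that comparison to every cell of the frequency table; the helper names and the tolerance are assumptions for illustration.

```python
counts = {
    "Female": {"For": 38, "Neutral": 54, "Against": 12},
    "Male": {"For": 12, "Neutral": 36, "Against": 48},
}
total = sum(sum(row.values()) for row in counts.values())


def p_opinion(opinion):
    """Marginal probability of an opinion, P(opinion)."""
    return sum(row[opinion] for row in counts.values()) / total


def p_opinion_given(opinion, gender):
    """Conditional probability P(opinion | gender)."""
    return counts[gender][opinion] / sum(counts[gender].values())


def independent(tolerance=1e-9):
    """True only if P(opinion | gender) equals P(opinion) in every cell."""
    return all(abs(p_opinion_given(o, g) - p_opinion(o)) <= tolerance
               for g in counts for o in counts[g])


print(p_opinion("For"), round(p_opinion_given("For", "Female"), 3), independent())
```

With sample data the probabilities will rarely match exactly, so a formal decision about independence is usually made with a chi-square test rather than an exact comparison.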

32) True or False: If two events are independent, the joint probability of the two events is always equal to the product of the marginal probabilities of two events.

True – Think about it…P(A and B) = P(A) * P(B | A). But if A and B are independent, then P(B | A) is the same as P(B). In other words, if A and B are independent, the P(A and B) = P(A) * P(B). We will use that in chapter 5 and more!

33) True or False: If the conditional probability of an event A given another event B is same as the marginal probability of the event A, then events A and B are mutually exclusive.

False – as I just said, if P(A | B) = P(A), that means that A and B are independent…that doesn’t mean that A and B are mutually exclusive. Remember: if A and B are mutually exclusive, then if one happens, the other can’t…in other words, P(A and B) = 0.

34) If the occurrence or non-occurrence of one event does not affect the occurrence or non-occurrence of another event, the two events are ________________________.

Independent (by definition)

35) A listing of all elementary outcomes (i.e. the outcomes which cannot be broken down into other events) of an experiment (i.e. a decision making situation under uncertainty) is called a __________.

sample space

36) How many different combinations of a 3-member debating team can be formed from a group of 16 qualified students?

16C3 = 16!/(3!(16-3)!) = (16 * 15 * 14)/(3 * 2 * 1) = 560
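The same count can be confirmed in Python, where `math.comb` (available from Python 3.8) evaluates the combinations formula directly:

```python
import math

# Number of 3-member teams from 16 students; order does not matter.
teams = math.comb(16, 3)

# The same value from the factorial formula n! / (r! (n - r)!).
by_formula = math.factorial(16) // (math.factorial(3) * math.factorial(16 - 3))

print(teams, by_formula)
```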

[Figure for question 17: Selling Price ($1,000), 70 to 130, plotted against Square Feet, 1400 to 2400]

ADMN 210

Review for Midterm #1

As mentioned earlier, the mid-term will have conceptual and quantitative multiple-choice questions. You need to read all 4 chapters and you need to be able to solve problems in all 4 chapters in order to do well in this test.

The following are for review and learning purposes only. I am not indicating that identical or similar problems will be in the test. As I have indicated many times, all the exams in this course will have multiple-choice questions and problems.

Suggestion: treat this review set as you would an actual test. Sit down with your one page of notes and your calculator, and give it a try. That way you will know what areas you still need to study.

1) Classify each of the following as nominal, ordinal, interval, or ratio data.

a. The time required to produce each tire on an assembly line

b. The number of quarts of milk a family drinks in a month

c. The ranking of four machines in your plant after they have been designated as excellent, good, satisfactory, and poor

d. The telephone area code of clients in the United States

e. The age of each of your employees

f. The dollar sales at the local pizza house each month

g. An employee’s identification number

h. The response time of an emergency unit

2) True or False: The highest level of data measurement is the ratio-level measurement.

3) True or False: Interval- and ratio-level data are also referred to as categorical data.

4) A small portion or a subset of the population on which data is collected for conducting statistical analysis is called __________.

5) One of the advantages for taking a sample instead of conducting a census is this:

a sample is more accurate than census

a sample is difficult to take

a sample cannot be trusted

a sample can save money when data collection process is destructive

6) Selection of the winning numbers in a lottery is an example of __________.

convenience sampling

random sampling

nonrandom sampling

regulatory sampling

7) A type of random sampling in which the population is divided into non-overlapping subpopulations is called __________.

stratified random sampling

cluster sampling

systematic random sampling

regulatory sampling

8) A type of random sampling in which every kth item (where k is some number) in the population is selected for inclusion in the sample is called __________.

stratified random sampling

cluster sampling

systematic sampling

regulatory sampling

9) Judgment sampling is an example of __________.

convenience sampling

random sampling

nonrandom (non-probabilistic) sampling

justice department sampling

10) For the following data, construct a frequency distribution with six classes.

57 23 35 18 21

26 51 47 29 21

46 43 29 23 39

50 41 19 36 28

31 42 52 29 18

28 46 33 28 20

11) What type of graph would be most appropriate for the frequency distribution above?

Pie chart

Bar chart

Pareto diagram

Histogram

12) For the following frequency distribution, determine the relative frequency, percent, and the cumulative frequency.

*Round your answer to 3 decimal places, the tolerance is +/-0.001.

Class Interval Frequency

20–under 25 17

25–under 30 20

30–under 35 16

35–under 40 15

40–under 45 8

45–under 50 6

TOTAL 82

13) True or False: Frequency distribution is a summary of data presented in the form of class intervals and frequencies.

14) True or False: The range of a data set is defined as the difference between the mean and the median.

15) True or False: The sum of the relative frequencies of a grouped data set is always equal to one.

16) The U.S. Department of the Interior releases figures on mineral production. Following are the values (in billions of dollars) of the 15 leading states in nonfuel mineral production in the United States in 2008.

1.68, 1.81, 1.85, 1.89, 2.05, 2.05, 2.08, 2.74, 3.21, 3.30, 4.00, 4.17, 4.20, 6.48, 7.84

a. Calculate the mean, median, and mode.

b. Calculate the range, interquartile range, sample variance, and sample standard deviation.

c. Compute the coefficient of skewness for these data and interpret.

17) The following graphic of residential housing data (selling price and size in square feet) indicates:

a correlation close to -1

a correlation close to 0 (no relation between the two variables)

a correlation close to 1

a negative relationship between the two variables

18) The Polk Company reported that the average age of a car on U.S. roads in a recent year was 7.5 years.

a) Suppose the distribution of ages of cars on U.S. roads is approximately bell-shaped. If 99.7% of the ages are between 1 year and 14 years, what is the standard deviation of car age?

b) Suppose the standard deviation is 1.7 years and the mean is 7.5 years. Between what two values would 95% of the car ages fall?

19) A large manufacturing firm tests job applicants who recently graduated from college. The test scores are bell shaped with a mean of 500 and a standard deviation of 50.

a) What proportion of people get scores between 400 and 600?

b) What proportion of people get scores higher than 450?

c) Management is considering placing a new hire in an upper level management position if the person scores in the upper 0.15% of the distribution. What is the lowest score a college graduate can earn to qualify for the position? 

20) According to the Bureau of Labor Statistics, the average annual salary of a worker in Detroit, Michigan, is $35,748. Suppose the median annual salary for a worker in this group is $31,369 and the mode is $29,500.

a) Is the distribution of salaries for this group skewed? If so, how and why?

b) Which of these measures of central tendency would you use to describe these data? Why?

21) True or False: The median is the most frequently occurring value in a set of data.

22) True or False: A disadvantage of the mean as the measure of central tendency is that it is affected by extremely large or extremely small values in the data set.

23) True or False: The variance is the average of the squared deviations about the arithmetic mean for a set of numbers.

24) What is the median for the following five numbers? 223, 264, 216, 218, 229

25) The second quartile of a data set is always equal to its ________.

26) The sum of deviations from the mean for a data set is equal to __________.

27) Scores obtained by students in an advanced placement test have a symmetric mound shaped (bell shaped) distribution with a mean of 70 and a standard deviation of 10. What is the proportion of students who received between 60 and 80 points?

28) For the previous problem, what is the proportion of students who received less than 50 points?

29) The following joint probability table contains a breakdown on the age and gender of U.S. physicians in a recent year, as reported by the American Medical Association.

Age of U.S. Physicians

         < 35   35 - 44   45 - 54   55 - 64   > 65   TOTAL
Male     0.11   0.20      0.19      0.12      0.16   0.78
Female   0.07   0.08      0.04      0.02      0.01   0.22
TOTAL    0.18   0.28      0.23      0.14      0.17   1.00

a) What is the probability that one randomly selected physician is 35–44 years old?

b) What is the probability that one randomly selected physician is both a woman and 45–54 years old?

c) What is the probability that one randomly selected physician is a man or is 35–44 years old?

d) What is the probability that one randomly selected physician is less than 35 years old or 55–64 years old?

e) What is the probability that one randomly selected physician is a woman if she is 45–54 years old?

f) What is the probability that a randomly selected physician is neither a woman nor 55–64 years old?

30) Purchasing Survey asked purchasing professionals what sales traits impressed them most in a sales representative. Seventy-eight percent selected "thoroughness." Forty percent responded "knowledge of your own product." The purchasing professionals were allowed to list more than one trait. Suppose 27% of the purchasing professionals listed both "thoroughness" and "knowledge of your own product" as sales traits that impressed them most. A purchasing professional is randomly sampled.

a) Make a probability table including the above information.

b) What is the probability that the professional selected "thoroughness" or "knowledge of your own product"?

c) What is the probability that the professional selected neither "thoroughness" nor "knowledge of your own product"?

d) If it is known that the professional selected "thoroughness," what is the probability that the professional selected "knowledge of your own product"?

e) What is the probability that the professional did not select "thoroughness" and did select "knowledge of your own product"?

31) From a previous midterm: The table below contains data from a sample of 200 people regarding opinion about the latest congressional plan to eliminate anti-trust exemptions for professional baseball (broken down by gender).

OPINION ABOUT THE PLAN

         For   Neutral   Against   Totals
Female   38    54        12        104
Male     12    36        48        96
Totals   50    90        60        200

Please show your work for parts "a" through "e" or no credit will be given!

a) What is the probability that a person selected at random is for the plan?

b) If we know that the person is a female, what is the probability that the person is for the plan?

c) What is the probability that the person is male and is against the plan?

d) What is the probability that the person is male or is neutral about the plan?

e) Is opinion about the plan related to gender, or are opinion and gender independent? Please use statistical concepts and numerical calculations in your answer.

32) True or False: If two events are independent, the joint probability of the two events is always equal to the product of the marginal probabilities of two events.

33) True or False: If the conditional probability of an event A given another event B is same as the marginal probability of the event A, then events A and B are mutually exclusive.

34) If the occurrence or non-occurrence of one event does not affect the occurrence or non-occurrence of another event, the two events are ________________________.

35) A listing of all elementary outcomes (i.e. the outcomes which cannot be broken down into other events) of an experiment (i.e. a decision making situation under uncertainty) is called a __________.

36) How many different combinations of a 3-member debating team can be formed from a group of 16 qualified students?

[Figure for question 17: Selling Price ($1,000), 70 to 130, plotted against Square Feet, 1400 to 2400]

Name: XXXXXX

Analysis Assignment

6/6/2017

In this analysis assignment, we will use the Chi-Square to analyze whether a management

training program is related to the promotion of managers and use the ANOVA to analyze

whether the satisfaction rate of employees in four different offices is the same.

1. Chi-Square

In the Chi-square testing, the total number of employees who didn’t promote is 40

employees. And among the 40 employees, it is expected that 17.6 employees didn’t participate

in the training program and 22.4 employees participated in the program. However, the real

outcome is that among 40 employees, 27 employees, which is 67.5% of the total employees

who didn’t promote, didn’t participate in the training program, while 13 employees, which is

32.5% of the total employees who didn’t promote, participated in the program.

In addition, 60 employees in total were promoted. Among these 60 employees, 26.4 were expected not to have participated in the program and 33.6 were expected to have participated. However, the actual outcome is that 17 of them (28.3% of the employees who were promoted) did not participate in the program, while 43 (71.7%) did participate.

Pearson Chi-Square significance < 0.05, so chance is not the only factor that causes the differences.

Therefore, more participants in the management training program were promoted than would be expected if training and promotion were unrelated. The management training program is clearly related to the promotion of employees.
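The expected counts and row percentages quoted in this section can be checked by hand. The following is a minimal sketch in Python (not part of the original SPSS workflow); the variable names are illustrative, and it assumes the usual rule that an expected count under independence is row total times column total divided by the grand total.

```python
# Observed counts from the crosstab: rows = Promoted (No, Yes),
# columns = Management_Training_Program (No, Yes).
observed = [[27, 13],
            [17, 43]]

row_totals = [sum(row) for row in observed]          # promoted No / Yes
col_totals = [sum(col) for col in zip(*observed)]    # training No / Yes
n = sum(row_totals)                                  # all employees

# Expected count under independence: row total * column total / N.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Percentage of each "Promoted" row that falls in each training column.
row_pct = [[100 * cell / r for cell in row]
           for row, r in zip(observed, row_totals)]

for label, exp_row, pct_row in zip(("Not promoted", "Promoted"), expected, row_pct):
    print(label,
          [round(e, 1) for e in exp_row],
          [round(p, 1) for p in pct_row])
```

The printed values should agree with the 17.6, 22.4, 26.4, and 33.6 expected counts and the 67.5%, 32.5%, 28.3%, and 71.7% shares discussed above.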

2. ANOVA

In the ANOVA test, we analyze the satisfaction rate data from the different office locations to see whether the satisfaction rate is the same in all four offices. According to the charts, the p-value is 0.034, which is less than 0.05. Therefore, we reject the null hypothesis that the data come from populations with the same mean, so the satisfaction rates in the four offices are not all the same. Moreover, from the scatter gram below, we can see that office 3 has the smallest variance (the data within the group differ the least) and contains the smallest data value, while office 2 has the largest variance (the data within the group differ the most) and contains the biggest data value.

The p-value in the first subset is > 0.05, so its offices do not differ distinctly; the p-value in the second subset is < 0.05, so its offices differ distinctly.

Scatter Gram

[Figure: monthly satisfaction rating of the different offices]

XXXXXXXXXXX ARE 112

Analysis Assignment A

For this analysis assignment, we were introduced to the program SPSS, a statistical tool that helps us compute statistics for a given data set. For Data Set 1, we used the Chi-Square test to see if there was a relation between a management training program and promotion. The most important statistical values for this data are the “count” and the “expected count.” The “count” shows how many employees actually fall into each combination of promoted or not promoted and trained or not trained. The “expected count” shows how many employees would fall into each combination if promotion and training were unrelated. The difference between the actual count and the expected count is the “residual.” All four cells have a residual of the same size, 9.4, but half are negative and the other half are positive. The cells with a positive residual are the employees who were not promoted and did not have management training and the employees who were promoted and did have management training. The cells with a negative residual are the employees who were not promoted and had management training and the employees who were promoted and did not have management training.
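As a rough illustration of the residuals described above, the sketch below subtracts the expected count from the observed count in each cell. It is written in Python rather than SPSS, and the cell labels are illustrative names, not SPSS output.

```python
# Observed and expected counts as reported in the SPSS crosstabulation.
# Key = (Promoted, Management_Training_Program).
observed = {("No", "No"): 27, ("No", "Yes"): 13,
            ("Yes", "No"): 17, ("Yes", "Yes"): 43}
expected = {("No", "No"): 17.6, ("No", "Yes"): 22.4,
            ("Yes", "No"): 26.4, ("Yes", "Yes"): 33.6}

# Residual = observed count minus expected count, rounded as in the table.
residuals = {cell: round(observed[cell] - expected[cell], 1) for cell in observed}

# Cells where more employees were observed than independence would predict.
positive_cells = [cell for cell, r in residuals.items() if r > 0]

print(residuals)
print(positive_cells)  # [('No', 'No'), ('Yes', 'Yes')]
```

In a 2x2 table the row and column totals are fixed, which is why all four residuals share the same magnitude and differ only in sign.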

For Data Set 2, we used a different test, the ANOVA test, to compare monthly employee satisfaction rates at offices in different locations. The test results tell us that the p-value equals 0.034, which is below 0.05, so the offices do not all have the same employee satisfaction. The four office locations are placed into two “subsets.” Each subset is classified as homogeneous, meaning that the offices within it are alike. From these subsets, we were able to identify which offices had homogeneous employee satisfaction rates and which had differing ones. Subset 1 contained offices 1, 3, and 4, while subset 2 contained offices 2, 3, and 4. This tells us that the only offices that differed in employee satisfaction were offices 1 and 2.

XXXXXXXXXX ARE 112 Analysis 6/7/17

Promoted * Management_Training_Program Crosstabulation

                                                     Management_Training_Program
Promoted   Statistic                                 No        Yes       Total
No         Count                                     27        13        40
           Expected Count                            17.6      22.4      40.0
           % within Promoted                         67.5%     32.5%     100.0%
           % within Management_Training_Program      61.4%     23.2%     40.0%
           % of Total                                27.0%     13.0%     40.0%
           Residual                                  9.4       -9.4
Yes        Count                                     17        43        60
           Expected Count                            26.4      33.6      60.0
           % within Promoted                         28.3%     71.7%     100.0%
           % within Management_Training_Program      38.6%     76.8%     60.0%
           % of Total                                17.0%     43.0%     60.0%
           Residual                                  -9.4      9.4
Total      Count                                     44        56        100
           Expected Count                            44.0      56.0      100.0
           % within Promoted                         44.0%     56.0%     100.0%
           % within Management_Training_Program      100.0%    100.0%    100.0%
           % of Total                                44.0%     56.0%     100.0%

The Chi-squared analysis and the management training program crosstabulation show a statistically significant difference between the expected and observed promotions. The crosstabulation showed that an expected 17.6 employees who did not complete the training program would not be promoted. The actual number of employees who were not trained and did not get promoted was 27. This difference is significant because the Chi-squared analysis gives a less than one in a thousand chance of data like this occurring if training and promotion were unrelated. Among those who received the management training, more were promoted than expected (43 actual against 33.6 expected), and the reverse holds for those who did not get the training: fewer were promoted (17 actual against 26.4 expected) and more were not promoted. These differences in the data are statistically significant; going to the management training would increase the chances of getting a promotion.

Chi-Square Tests

                         Value      df   Asymptotic Sig. (2-sided)   Exact Sig. (2-sided)   Exact Sig. (1-sided)
Pearson Chi-Square       14.942a    1    .000
Continuity Correctionb   13.395     1    .000
Likelihood Ratio         15.211     1    .000
Fisher's Exact Test                                                  .000                   .000
N of Valid Cases         100

a. 0 cells (0.0%) have expected count less than 5. The minimum expected count is 17.60.
b. Computed only for a 2x2 table
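The statistics in the Chi-Square Tests output can be approximated from the crosstab counts. The sketch below is an illustration in plain Python, not SPSS's own implementation: it assumes the standard textbook formulas for the Pearson statistic, the Yates continuity correction, and the likelihood ratio, and the helper name `p_value_df1` is hypothetical.

```python
import math

# Observed and expected counts for the four cells of the 2x2 crosstab.
observed = [27, 13, 17, 43]
expected = [17.6, 22.4, 26.4, 33.6]
pairs = list(zip(observed, expected))

# Pearson chi-square: sum of (O - E)^2 / E.
pearson = sum((o - e) ** 2 / e for o, e in pairs)

# Yates continuity correction: shrink each |O - E| by 0.5 before squaring.
continuity = sum((abs(o - e) - 0.5) ** 2 / e for o, e in pairs)

# Likelihood ratio: 2 * sum of O * ln(O / E).
likelihood_ratio = 2 * sum(o * math.log(o / e) for o, e in pairs)


def p_value_df1(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))


for name, stat in [("Pearson", pearson),
                   ("Continuity correction", continuity),
                   ("Likelihood ratio", likelihood_ratio)]:
    print(f"{name}: {stat:.3f}, p = {p_value_df1(stat):.6f}")
```

The three statistics should come out close to the 14.942, 13.395, and 15.211 in the table, and each p-value is below 0.001, which SPSS displays as .000.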


The ANOVA analysis is used to determine whether the level of employee satisfaction is the same in the four offices. About the same level of satisfaction is seen in offices 1, 3, and 4. Offices 2, 3, and 4 also have very similar levels of satisfaction. There is a noticeable difference between the satisfaction of offices 1 and 2: office 1 had a satisfaction of 39.25 and office 2 had a satisfaction of 45.88. The ANOVA analysis gives the manager insight into this difference and calls the manager’s attention to the need to address these offices and determine why the satisfaction varies.

Monthly_Satisfaction_Rating
Tukey HSDa

                     Subset for alpha = 0.05
Office Code    N     1         2
1              8     39.25
3              8     39.38     39.38
4              8     41.38     41.38
2              8               45.88
Sig.                 .813      .053

Means for groups in homogeneous subsets are displayed.
a. Uses Harmonic Mean Sample Size = 8.000.
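One way to read the homogeneous subsets is that two offices differ only if no subset contains both of them. The sketch below applies that reading to the subset memberships in the Tukey HSD output; it is written in Python for illustration, and the dictionary layout is an assumption, not an SPSS export format.

```python
from itertools import combinations

# Homogeneous subsets from the Tukey HSD output: subset number -> office codes.
subsets = {1: {1, 3, 4},
           2: {2, 3, 4}}

offices = sorted(set().union(*subsets.values()))

# A pair of offices is treated as differing when no subset holds both offices.
differing_pairs = [
    (a, b)
    for a, b in combinations(offices, 2)
    if not any(a in members and b in members for members in subsets.values())
]

print(differing_pairs)  # [(1, 2)]
```

Offices 3 and 4 appear in both subsets, so they are not separated from either office 1 or office 2.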


Employee_Ref   Promoted   Management_Training_Program
1              No         Yes
2              No         Yes
3              No         Yes
4              No         Yes
5              No         No
6              No         No
7              No         No
8              No         No
9              No         No
10             No         No
140            No         No         Here is where the "HIDE" command was used
141            No         No
142            No         No
143            No         No
144            Yes        No
145            Yes        No
146            Yes        No
147            Yes        Yes
148            Yes        Yes
149            Yes        Yes
150            Yes        Yes

We are testing the idea that Promotion = function (Going to the training program).

Formula: =COUNTIF(B2:B151,$A$154)
Chi square test with Excel formula: =CHISQ.TEST(C154:C155,B154:B155)

#        Promoted   Training Program
Yes      96         83
No       54         67
Total    150        150

p-value = 0.02701

In Per Cent
#        Promoted   Training Program
Yes      64.0%      55.3%
No       36.0%      44.7%
Total    100.0%     100.0%

ARE 122 - Spring 2020 Chi Square Test
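The p-value in the worksheet can be approximated outside Excel. The sketch below is in Python and rests on two assumptions: that the first CHISQ.TEST argument (the Training Program counts) is the actual range and the second (the Promoted counts) is the expected range, and that a single-column range of two rows is tested with 1 degree of freedom. The dictionary names are illustrative.

```python
import math

# Yes/No tallies produced by the COUNTIF formulas in the worksheet.
promoted = {"Yes": 96, "No": 54}   # treated as the expected range
training = {"Yes": 83, "No": 67}   # treated as the actual range

# Chi-square statistic: sum of (actual - expected)^2 / expected.
chi_sq = sum((training[k] - promoted[k]) ** 2 / promoted[k] for k in promoted)

# Upper-tail probability for 1 degree of freedom.
p_value = math.erfc(math.sqrt(chi_sq / 2))

# Column percentages, as in the "In Per Cent" block.
total = sum(promoted.values())
promoted_pct = {k: round(100 * v / total, 1) for k, v in promoted.items()}
training_pct = {k: round(100 * v / total, 1) for k, v in training.items()}

print(round(chi_sq, 3), round(p_value, 5))
print(promoted_pct, training_pct)
```

Note that this compares the two sets of Yes/No totals with each other; it is a different calculation from a chi-square test on the full promoted-by-training crosstab.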
