Determine the standard error of the estimate


Assignment:

Q1: A consumer organization wants to develop a regression model to predict mileage (as measured by miles per gallon) based on the horsepower of the car's engine and the weight of the car (in pounds). Data were collected from a sample of 50 recent car models, and the results are organized and stored in Auto.

A. State the multiple regression equation

B. Interpret the meaning of the slopes, b₁ and b₂, in this problem

C. Explain why the regression coefficient, b₀, has no practical meaning in the context of this problem

D. Predict the miles per gallon for cars that have 60 horsepower and weigh 2,000 pounds

E. Construct a 95% confidence interval estimate for the mean miles per gallon for cars that have 60 horsepower and weigh 2,000 pounds

F. Construct a 95% prediction interval for the miles per gallon for an individual car that has 60 horsepower and weighs 2,000 pounds

Q2: The business problem facing the director of broadcasting operations for a television station was the issue of standby hours (i.e. hours in which unionized graphic artists at the station are paid but are not actually involved in any activity) and what factors were related to standby hours. The study included the following variables:

Standby hours (Y)-Total number of standby hours in a week

Total staff present (X₁)-Weekly total of people-days

Remote hours (X₂)-Total number of hours worked by employees at locations away from the central plant

Data were collected for 26 weeks; these data are organized and stored in Standby.

A. State the multiple regression equation

B. Interpret the meaning of the slopes, b₁ and b₂, in this problem

C. Explain why the regression coefficient, b₀, has no practical meaning in the context of this problem

D. Predict the standby hours for a week in which the total staff present have 310 people-days and the remote hours are 400

E. Construct a 95% confidence interval estimate for the mean standby hours for weeks in which the total staff present have 310 people-days and the remote hours are 400

F. Construct a 95% prediction interval for the standby hours for a single week in which the total staff present have 310 people-days and the remote hours are 400

Q3: Use the following results:

| Variable | Coefficient | Standard Error | t Statistic | p Value |
|---|---|---|---|---|
| INTERCEPT | -0.02686 | 0.06905 | -0.39 | 0.7034 |
| FOREIMP | 0.79116 | 0.06295 | 12.57 | 0.0000 |
| MIDSOLE | 0.60484 | 0.07174 | 8.43 | 0.0000 |

  1. Construct a 95% confidence interval estimate of the population slope between durability and forefoot shock-absorbing capability
  2. At the 0.05 level of significance, determine whether each independent variable makes a significant contribution to the regression model. On the basis of these results, indicate the independent variables to include in this model.
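The two parts of Q3 both come down to arithmetic on the coefficient table: each t statistic is the coefficient divided by its standard error, and each confidence interval is the coefficient plus or minus a critical t value times the standard error. The following is a minimal sketch of that arithmetic, not a worked solution. The sample size is not stated in Q3, so the critical value `T_CRITICAL` is an assumption (the two-tailed 0.05 value for a hypothetical 12 error degrees of freedom), and the function names are illustrative only.

```python
# Coefficient table from Q3: name -> (coefficient, standard error)
coefficients = {
    "INTERCEPT": (-0.02686, 0.06905),
    "FOREIMP": (0.79116, 0.06295),
    "MIDSOLE": (0.60484, 0.07174),
}

# ASSUMPTION: Q3 does not state the sample size, so the error degrees of
# freedom are unknown. 2.1788 is the two-tailed 0.05 critical t value for a
# hypothetical 12 degrees of freedom; replace it with the value for n - 3.
T_CRITICAL = 2.1788


def t_statistic(coef, std_err):
    """Test statistic for H0: the population coefficient equals zero."""
    return coef / std_err


def confidence_interval(coef, std_err, t_crit=T_CRITICAL):
    """Return (lower, upper) as coefficient +/- t_crit * standard error."""
    margin = t_crit * std_err
    return coef - margin, coef + margin


def summarize(table, t_crit=T_CRITICAL):
    """Return {name: (t statistic, interval, significant at this t_crit?)}."""
    results = {}
    for name, (coef, std_err) in table.items():
        t_stat = t_statistic(coef, std_err)
        interval = confidence_interval(coef, std_err, t_crit)
        results[name] = (t_stat, interval, abs(t_stat) > t_crit)
    return results


summary = summarize(coefficients)
for name, (t_stat, (low, high), significant) in summary.items():
    print(f"{name}: t = {t_stat:.2f}, CI = ({low:.4f}, {high:.4f}), "
          f"significant = {significant}")
```

The computed t statistics can be compared against the t Statistic column of the table as a sanity check; the interval endpoints are only as good as the assumed critical value.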

Q4: The marketing manager of a large supermarket chain faced the business problem of determining the effect on the sales of pet food of shelf space and whether the product was placed at the front (=1) or back (=0) of the aisle. Data are collected from a random sample of equal-sized stores. The results are shown in the following table (and organized and stored in Petfood):

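| Store | Shelf Space (Feet) | Location | Weekly Sales (Dollars) |
|---|---|---|---|
| 1 | 5 | Back | 160 |
| 2 | 5 | Back | 220 |
| 3 | 5 | Back | 140 |
| 4 | 10 | Back | 190 |
| 5 | 10 | Back | 240 |
| 6 | 10 | Front | 260 |
| 7 | 15 | Back | 230 |
| 8 | 15 | Back | 270 |
| 9 | 15 | Front | 280 |
| 10 | 20 | Back | 260 |
| 11 | 20 | Back | 290 |
| 12 | 20 | Front | 310 |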

For (a) through (m), do not include an interaction term.

  1. State the multiple regression equation that predicts sales based on shelf space and location.
  2. Interpret the regression coefficients in (a).
  3. Predict the weekly sales of pet food for a store with 8 feet of shelf space situated at the back of the aisle. Construct a 95% confidence interval estimate and a 95% prediction interval.
  4. Perform a residual analysis on the results and determine whether the regression assumptions are valid.
  5. Is there a significant relationship between sales and the two independent variables (shelf space and aisle position) at the 0.05 level of significance?
  6. At the 0.05 level of significance, determine whether each independent variable makes a contribution to the regression model. Indicate the most appropriate regression model for this set of data.
  7. Construct and interpret 95% confidence interval estimates of the population slope for the relationship between sales and shelf space and between sales and aisle location.
  8. Compare the slope in (b) with the slope for the simple linear regression model of Problem 13.4 on page 481. Explain the difference in the results.
  9. Compute and interpret the meaning of the coefficient of multiple determination, r².
  10. Compute and interpret the adjusted r².
  11. Compare r² with the value computed
  12. Compute the coefficients of partial determination and interpret their meaning
  13. What assumption about the slope of shelf space with sales do you need to make in this problem?
  14. Add an interaction term to the model and, at the 0.05 level of significance, determine whether it makes a significant contribution to the model
  15. On the basis of the results of (f) and (n), which model is most appropriate? Explain
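For the model without an interaction term, the regression equation has the form sales = b₀ + b₁(shelf space) + b₂(location), where location is coded Front = 1 and Back = 0 as stated in the problem. A minimal sketch of fitting it with the standard library only is shown here; it solves the normal equations by Gaussian elimination, and the helper names `solve_linear_system` and `fit_ols` are illustrative, not part of any package. A statistics package would normally be used instead and would also report the standard errors needed for the intervals and tests.

```python
# Petfood data from the Q4 table: (shelf space in feet, location, weekly sales)
petfood = [
    (5, "Back", 160), (5, "Back", 220), (5, "Back", 140),
    (10, "Back", 190), (10, "Back", 240), (10, "Front", 260),
    (15, "Back", 230), (15, "Back", 270), (15, "Front", 280),
    (20, "Back", 260), (20, "Back", 290), (20, "Front", 310),
]


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Design matrix columns: intercept, shelf space, dummy (Front = 1, Back = 0).
X = [[1.0, float(space), 1.0 if loc == "Front" else 0.0]
     for space, loc, _ in petfood]
y = [float(sales) for _, _, sales in petfood]

b0, b1, b2 = fit_ols(X, y)
predicted_back_8ft = b0 + b1 * 8 + b2 * 0  # 8 feet of shelf space, back of aisle
print(f"sales = {b0:.3f} + {b1:.3f} * space + {b2:.3f} * front")
print(f"predicted sales, 8 ft at the back: {predicted_back_8ft:.2f}")
```

Adding the interaction term asked for later in the question would mean appending one more column, shelf space times the dummy, to each row of `X`.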

Q5: The owner of a moving company typically has his most experienced manager predict the total number of labor hours that will be required to complete an upcoming move. This approach has proved useful in the past, but the owner has the business objective of developing a more accurate method of predicting labor hours. In a preliminary effort to provide a more accurate method, the owner decided to use the number of cubic feet moved and whether there is an elevator in the apartment building as the independent variables and has collected data for 36 moves in which the origin and destination were within the borough of Manhattan in New York City and the travel time was an insignificant portion of the hours worked. The data are organized and stored in Moving. For (a) through (k), do not include an interaction term.

  1. State the multiple regression equation for predicting labor hours, using the number of cubic feet moved and whether there is an elevator
  2. Interpret the regression coefficients in (a)
  3. Predict the labor hours for moving 500 cubic feet in an apartment building that has an elevator and construct a 95% confidence interval estimate and a 95% prediction interval
  4. Perform a residual analysis on the results and determine whether the regression assumptions are valid
  5. Is there a significant relationship between labor hours and the two independent variables (cubic feet moved and whether there is an elevator in the apartment building) at the 0.05 level of significance?
  6. At the 0.05 level of significance, determine whether each independent variable makes a contribution to the regression model. Indicate the most appropriate regression model for this set of data.
  7. Construct a 95% confidence interval estimate of the population slope for the relationship between labor hours and cubic feet moved
  8. Construct a 95% confidence interval estimate of the population slope for the relationship between labor hours and the presence of an elevator
  9. Compute and interpret the adjusted r²
  10. Compute the coefficients of partial determination and interpret their meaning
  11. What assumption do you need to make about the slope of labor hours with cubic feet moved?
  12. Add an interaction term to the model and, at the 0.05 level of significance, determine whether it makes a significant contribution to the model
  13. On the basis of the results of (f) and (l), which model is most appropriate? Explain

Q6: The director of a training program for a large insurance company has the business objective of determining which training method is best for training underwriters. The three methods to be evaluated are traditional, CD-ROM based, and Web based. The 30 trainees are divided into three randomly assigned groups of 10. Before the start of the training, each trainee is given a proficiency exam that measures mathematics and computer skills. At the end of the training, all students take the same end-of-training exam. The results are organized and stored in Underwriting. Develop a multiple regression model to predict the score on the end-of-training exam, based on the score on the proficiency exam and the method of training used. For (a) through (k), do not include an interaction term.

  1. State the multiple regression equation
  2. Interpret the regression coefficients in (a).
  3. Predict the end-of-training exam score for a student with a proficiency exam score of 100 who had Web-based training
  4. Perform a residual analysis on your results and determine whether the regression assumptions are valid
  5. Is there a significant relationship between the end-of-training exam score and the independent variables (proficiency score and training method) at the 0.05 level of significance?
  6. At the 0.05 level of significance, determine whether each independent variable makes a contribution to the regression model for this set of data.
  7. Construct and interpret a 95% confidence interval estimate of the population slope for the relationship between end-of-training exam score and proficiency exam score.
  8. Construct and interpret a 95% confidence interval estimate of the population slope for the relationship between end-of-training exam score and type of training method.
  9. Compute and interpret the adjusted r²
  10. Compute the coefficients of partial determination and interpret their meaning
  11. What assumption about the slope of proficiency score with end-of-training exam score do you need to make in this problem?
  12. Add interaction terms to the model and, at the 0.05 level of significance, determine whether any interaction terms make a significant contribution to the model
  13. On the basis of the results of (f) and (l), which model is most appropriate? Explain

Q7: The following data (stored in Treasury) represent the three-month Treasury bill rates in the United States from 1991 to 2008:

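| Year | Rate | Year | Rate |
|---|---|---|---|
| 1991 | 5.38 | 2000 | 5.82 |
| 1992 | 3.43 | 2001 | 3.40 |
| 1993 | 3.00 | 2002 | 1.61 |
| 1994 | 4.25 | 2003 | 1.01 |
| 1995 | 5.49 | 2004 | 1.37 |
| 1996 | 5.01 | 2005 | 3.15 |
| 1997 | 5.06 | 2006 | 4.73 |
| 1998 | 4.78 | 2007 | 4.36 |
| 1999 | 4.64 | 2008 | 1.37 |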

  1. Plot the data
  2. Fit a three-year moving average to the data and plot the results
  3. Using a smoothing coefficient of W = 0.50, exponentially smooth the series and plot the results
  4. What is your exponentially smoothed forecast for 2009?
  5. Repeat (c) and (d), using a smoothing coefficient of W = 0.25
  6. Compare the results of (d) and (e)
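The smoothing steps in Q7 can be sketched in a few lines of standard-library Python. The sketch assumes the usual textbook definitions: a centered three-year moving average, and exponential smoothing that starts at the first observed value and then applies E_i = W * Y_i + (1 - W) * E_(i-1), with the last smoothed value serving as the forecast for the next year. Plotting is left out; the function names are illustrative.

```python
# Three-month Treasury bill rates, 1991-2008, from the Q7 table.
years = list(range(1991, 2009))
rates = [5.38, 3.43, 3.00, 4.25, 5.49, 5.01, 5.06, 4.78, 4.64,
         5.82, 3.40, 1.61, 1.01, 1.37, 3.15, 4.73, 4.36, 1.37]


def moving_average(values, window=3):
    """Centered moving average for an odd window; edges are None."""
    half = window // 2
    smoothed = [None] * len(values)
    for i in range(half, len(values) - half):
        smoothed[i] = sum(values[i - half:i + half + 1]) / window
    return smoothed


def exponential_smoothing(values, w):
    """E_1 = Y_1, then E_i = w * Y_i + (1 - w) * E_(i-1)."""
    smoothed = [values[0]]
    for value in values[1:]:
        smoothed.append(w * value + (1 - w) * smoothed[-1])
    return smoothed


ma3 = moving_average(rates, window=3)
es_050 = exponential_smoothing(rates, w=0.50)
es_025 = exponential_smoothing(rates, w=0.25)

# The forecast for 2009 is the last smoothed value (the one for 2008).
forecast_2009_w050 = es_050[-1]
forecast_2009_w025 = es_025[-1]

for year, rate, ma, e50, e25 in zip(years, rates, ma3, es_050, es_025):
    ma_text = "  n/a" if ma is None else f"{ma:5.2f}"
    print(f"{year}  {rate:5.2f}  {ma_text}  {e50:5.2f}  {e25:5.2f}")
print(f"2009 forecast: W=0.50 -> {forecast_2009_w050:.2f}, "
      f"W=0.25 -> {forecast_2009_w025:.2f}")
```

A smaller W gives more weight to the history and so produces a smoother series that reacts more slowly to the sharp drop in 2008, which is the comparison the last part of the question asks about.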

Q8: Gross domestic product (GDP) is a major indicator of a nation's overall economic activity. It consists of personal consumption expenditures, gross domestic investment, net exports of goods and services, and government consumption expenditures. The GDP (in billions of current dollars) for the United States from 1980 to 2008 is stored in GDP.

  1. Plot the data
  2. Compute a linear trend forecasting equation and plot the trend line
  3. What are your forecasts for 2009 and 2010?
  4. What conclusions can you reach concerning the trend in GDP?
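A linear trend forecasting equation is an ordinary least-squares line fitted to the series with the years coded as X = 0, 1, 2, and so on. The sketch below shows that calculation under the assumption that the first year of the series is coded 0. The GDP file is not reproduced in this assignment, so the series used here is a placeholder that lies exactly on a line; it is not GDP data, and the function names are illustrative.

```python
def linear_trend(values):
    """Least-squares fit of Y = b0 + b1 * X with coded years X = 0, 1, 2, ..."""
    n = len(values)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(values) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, values))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_mean - b1 * x_mean
    return b0, b1


def forecast(b0, b1, first_year, target_year):
    """Evaluate the trend line at the coded value of target_year."""
    return b0 + b1 * (target_year - first_year)


# PLACEHOLDER values, not the real GDP series (which is in the GDP file).
# They lie exactly on a line so the fit is easy to check by eye.
placeholder_series = [100.0, 150.0, 200.0, 250.0, 300.0]
b0, b1 = linear_trend(placeholder_series)
next_value = forecast(b0, b1, first_year=1980, target_year=1985)
print(f"trend: Y = {b0:.1f} + {b1:.1f} * X, "
      f"forecast for coded X = 5: {next_value:.1f}")
```

With the real series loaded in place of the placeholder, forecasts for 2009 and 2010 would use `first_year=1980` and the corresponding target years.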

Q9: The data in Strategic represent the amount of oil, in billions of barrels, held in the U.S. strategic oil reserve, from 1981 through 2008.

  1. Plot the data
  2. Compute a linear trend forecasting equation and plot the trend line.
  3. Compute a quadratic trend forecasting equation and plot the results
  4. Compute an exponential trend forecasting equation and plot the results
  5. Which model is the most appropriate?
  6. Using the most appropriate model, forecast the number of barrels, in billions, in 2009. Check how accurate your forecast is by locating the true value for 2009 on the Internet or in your library

Q10: The following data (stored in Credit) are monthly credit card charges (in millions of dollars) for a popular credit card issued by a large bank (the name of which is not disclosed at its request):

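| Month | 2007 | 2008 | 2009 |
|---|---|---|---|
| January | 31.9 | 39.4 | 45.0 |
| February | 27.0 | 36.2 | 39.6 |
| March | 31.3 | 40.5 | |
| April | 31.0 | 44.6 | |
| May | 39.4 | 46.8 | |
| June | 40.7 | 44.7 | |
| July | 42.3 | 52.2 | |
| August | 49.5 | 54.0 | |
| September | 45.0 | 48.8 | |
| October | 50.0 | 55.8 | |
| November | 50.9 | 58.7 | |
| December | 58.5 | 63.4 | |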

  1. Construct the time-series plot
  2. Describe the monthly pattern that is evident in the data
  3. In general, would you say that the overall dollar amount charged on the bank's credit cards is increasing or decreasing? Explain
  4. Note that December 2008 charges were more than $63 million, but those for February 2009 were less than $40 million. Was February's total close to what you would have expected?
  5. Develop an exponential trend forecasting equation with monthly components.
  6. Interpret the monthly compound growth rate.
  7. Interpret the January multiplier
  8. What is the predicted value for March 2009?
  9. What is the predicted value for April 2009?
  10. How can this type of time-series forecasting benefit the bank?
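An exponential trend model with monthly components is usually fitted as a linear regression on the base-10 logarithm of the series: log10(Y) = b₀ + b₁X + b₂M₁ + ... + b₁₂M₁₁, where X is the coded month and M₁ to M₁₁ are dummy variables for eleven of the twelve months. The sketch below follows that form under two assumptions that the assignment does not state: January 2007 is coded X = 0, and December is the base month with no dummy. The helper names are illustrative, and the normal equations are solved by hand only to stay within the standard library.

```python
import math

# Monthly charges (millions of dollars) from the Q10 table, in time order.
charges = [
    31.9, 27.0, 31.3, 31.0, 39.4, 40.7, 42.3, 49.5, 45.0, 50.0, 50.9, 58.5,  # 2007
    39.4, 36.2, 40.5, 44.6, 46.8, 44.7, 52.2, 54.0, 48.8, 55.8, 58.7, 63.4,  # 2008
    45.0, 39.6,  # 2009 (January and February only)
]


def design_row(t):
    """Intercept, coded month t (January 2007 = 0), and 11 month dummies.

    ASSUMPTION: December is the base month, so it has no dummy of its own.
    """
    month = t % 12  # 0 = January, ..., 11 = December
    return [1.0, float(t)] + [1.0 if month == m else 0.0 for m in range(11)]


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


X = [design_row(t) for t in range(len(charges))]
log_y = [math.log10(value) for value in charges]
coef = fit_ols(X, log_y)


def predict(t):
    """Back-transform the fitted log10 value to millions of dollars."""
    return 10 ** sum(b * x for b, x in zip(coef, design_row(t)))


monthly_growth_rate = 10 ** coef[1] - 1  # compound growth per month
january_multiplier = 10 ** coef[2]       # January relative to the base month
march_2009 = predict(26)                 # t = 26 is March 2009
april_2009 = predict(27)                 # t = 27 is April 2009
print(f"monthly compound growth rate: {monthly_growth_rate:.4%}")
print(f"January multiplier: {january_multiplier:.4f}")
print(f"predicted March 2009: {march_2009:.1f}, April 2009: {april_2009:.1f}")
```

Because the fit is on the log scale, 10 raised to the slope minus 1 is the monthly compound growth rate, and 10 raised to a dummy coefficient is that month's multiplier relative to the base month.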

Appendix A

Auto

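| MPG | Horsepower | Weight |
|---|---|---|
| 43.1 | 48 | 1985 |
| 19.9 | 110 | 3365 |
| 19.2 | 105 | 3535 |
| 17.7 | 165 | 3445 |
| 18.1 | 139 | 3205 |
| 20.3 | 103 | 2830 |
| 21.5 | 115 | 3245 |
| 16.9 | 155 | 4360 |
| 15.5 | 142 | 4054 |
| 18.5 | 150 | 3940 |
| 27.2 | 71 | 3190 |
| 41.5 | 76 | 2144 |
| 46.6 | 65 | 2110 |
| 23.7 | 100 | 2420 |
| 27.2 | 84 | 2490 |
| 39.1 | 58 | 1755 |
| 28.0 | 88 | 2605 |
| 24.0 | 92 | 2865 |
| 20.2 | 139 | 3570 |
| 20.5 | 95 | 3155 |
| 28.0 | 90 | 2678 |
| 34.7 | 63 | 2215 |
| 36.1 | 66 | 1800 |
| 35.7 | 80 | 1915 |
| 20.2 | 85 | 2965 |
| 23.9 | 90 | 3420 |
| 29.9 | 65 | 2380 |
| 30.4 | 67 | 3250 |
| 36.0 | 74 | 1980 |
| 22.6 | 110 | 2800 |
| 36.4 | 67 | 2950 |
| 27.5 | 95 | 2560 |
| 33.7 | 75 | 2210 |
| 44.6 | 67 | 1850 |
| 32.9 | 100 | 2615 |
| 38.0 | 67 | 1965 |
| 24.2 | 120 | 2930 |
| 38.1 | 60 | 1968 |
| 39.4 | 70 | 2070 |
| 25.4 | 116 | 2900 |
| 31.3 | 75 | 2542 |
| 34.1 | 68 | 1985 |
| 34.0 | 88 | 2395 |
| 31.0 | 82 | 2720 |
| 27.4 | 80 | 2670 |
| 22.3 | 88 | 2890 |
| 28.0 | 79 | 2625 |
| 17.6 | 85 | 3465 |
| 34.4 | 65 | 3465 |
| 20.6 | 105 | 3380 |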

Standby

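| Standby | Total Staff | Remote | Dubner | Total Labor |
|---|---|---|---|---|
| 245 | 338 | 414 | 323 | 2001 |
| 177 | 333 | 598 | 340 | 2030 |
| 271 | 358 | 656 | 340 | 2226 |
| 211 | 372 | 631 | 352 | 2154 |
| 196 | 339 | 528 | 380 | 2078 |
| 135 | 289 | 409 | 339 | 2080 |
| 195 | 334 | 382 | 331 | 2073 |
| 118 | 293 | 399 | 311 | 1758 |
| 116 | 325 | 343 | 328 | 1624 |
| 147 | 311 | 338 | 353 | 1889 |
| 154 | 304 | 353 | 518 | 1988 |
| 146 | 312 | 289 | 440 | 2049 |
| 115 | 283 | 388 | 276 | 1796 |
| 161 | 307 | 402 | 207 | 1720 |
| 274 | 322 | 151 | 287 | 2056 |
| 245 | 335 | 228 | 290 | 1890 |
| 201 | 350 | 271 | 355 | 2187 |
| 183 | 339 | 440 | 300 | 2032 |
| 237 | 327 | 475 | 284 | 1856 |
| 175 | 328 | 347 | 337 | 2068 |
| 152 | 319 | 449 | 279 | 1813 |
| 188 | 325 | 336 | 244 | 1808 |
| 188 | 322 | 267 | 253 | 1834 |
| 197 | 317 | 235 | 272 | 1973 |
| 261 | 315 | 164 | 223 | 1839 |
| 232 | 331 | 270 | 272 | 1935 |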

Q11: The marketing manager of a large supermarket chain would like to use shelf space to predict the sales of pet food. A random sample of 12 equal-sized stores is selected, with the following results (stored in Petfood):

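| Store | Shelf Space (Feet) | Location | Weekly Sales (Dollars) |
|---|---|---|---|
| 1 | 5 | Back | 160 |
| 2 | 5 | Back | 220 |
| 3 | 5 | Back | 140 |
| 4 | 10 | Back | 190 |
| 5 | 10 | Back | 240 |
| 6 | 10 | Front | 260 |
| 7 | 15 | Back | 230 |
| 8 | 15 | Back | 270 |
| 9 | 15 | Front | 280 |
| 10 | 20 | Back | 260 |
| 11 | 20 | Back | 290 |
| 12 | 20 | Front | 310 |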

a) Construct a scatter plot. For these data, b₀ = 145 and b₁ = 7.4.

b) Interpret the meaning of the slope, b₁, in this problem

c) Predict the weekly sales of pet food for stores with 8 feet of shelf space for pet food.
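The intercept and slope quoted in part (a) can be checked directly from the table, and the prediction in part (c) is then a single substitution into the fitted line. A minimal sketch using the least-squares formulas b₁ = SSXY / SSX and b₀ = Ȳ - b₁X̄ follows; the function name is illustrative.

```python
# Shelf space (feet) and weekly sales (dollars) for the 12 stores in Q11.
shelf_space = [5, 5, 5, 10, 10, 10, 15, 15, 15, 20, 20, 20]
weekly_sales = [160, 220, 140, 190, 240, 260, 230, 270, 280, 260, 290, 310]


def simple_linear_regression(x, y):
    """Return (b0, b1) for the least-squares line y = b0 + b1 * x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    ssxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    ssx = sum((xi - x_mean) ** 2 for xi in x)
    b1 = ssxy / ssx
    b0 = y_mean - b1 * x_mean
    return b0, b1


b0, b1 = simple_linear_regression(shelf_space, weekly_sales)
predicted_sales_8ft = b0 + b1 * 8  # store with 8 feet of shelf space
print(f"b0 = {b0:.1f}, b1 = {b1:.1f}, "
      f"predicted weekly sales at 8 ft = {predicted_sales_8ft:.1f}")
```

The fitted values reproduce the b₀ and b₁ given in part (a), which confirms that the table and the quoted coefficients describe the same data.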

Q12: The marketing manager used shelf space for pet food to predict weekly sales (stored in Petfood). For those data, SSR = 20,535 and SST = 30,025.

a) Determine the coefficient of determination, r², and interpret its meaning.

b) Determine the standard error of the estimate

c) How useful do you think this regression model is for predicting sales?
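Both quantities in Q12 follow from the two sums of squares. The coefficient of determination is r² = SSR / SST, and the standard error of the estimate is the square root of SSE / (n - 2), where SSE = SST - SSR. The sketch below assumes n = 12, taken from the 12 stores described in Q11, since Q12 itself does not restate the sample size.

```python
import math

SSR = 20535.0  # regression sum of squares, given in Q12
SST = 30025.0  # total sum of squares, given in Q12
N = 12         # ASSUMPTION: the 12 stores from Q11; Q12 does not restate n
K = 1          # one independent variable (shelf space)


def coefficient_of_determination(ssr, sst):
    """Share of the variation in Y explained by the regression."""
    return ssr / sst


def standard_error_of_estimate(ssr, sst, n, k=1):
    """Square root of SSE / (n - k - 1), with SSE = SST - SSR."""
    sse = sst - ssr
    return math.sqrt(sse / (n - k - 1))


r_squared = coefficient_of_determination(SSR, SST)
s_yx = standard_error_of_estimate(SSR, SST, N, K)
print(f"r^2 = {r_squared:.4f}, standard error of the estimate = {s_yx:.2f}")
```

The standard error is in the same units as weekly sales (dollars), so it can be read against the range of sales in the table when judging how useful the model is.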
