Develop a good regression model with X variables in the regression equation. Complete each part and support your written responses with Minitab/Excel work.
Be sure to comment on each of the 10 points below.
1. Note any seasonality in the Y data using the autocorrelation function (ACF) of Y.
2. Determine if any of the variables require transformation. If they do, calculate the transformed values and create a scatter plot with a regression line and run a correlation with Y for each transformed X. Create a table for the Y, X and X transformed values.
3. Determine if the model requires dummy variables (e.g., for Y variable seasonality or significant events) and include a table of the dummy variable values for regression analysis. To handle seasonality, either use Decomposition (the centered moving average, CMA, of Y and the seasonal indices, SI) to seasonally adjust the Y variable, or use dummy X variables in the regression.
4. Use regression to evaluate the variable combinations to determine the best regression model. Note that if any seasonal dummy variables are used, all of the seasonal dummy variables must be used. Use R-squared and F as the primary determinants of the best model.
Note the significance of each slope term in the model. Rule: if a coefficient is not significant, you may not use the model to forecast.
5. Investigate the best model using appropriate statistics or graphs to comment on possible:
a. Autocorrelation (serial correlation) with the Durbin-Watson (DW) statistic
b. Heteroscedasticity with a residuals versus order plot (look for a megaphone effect)
c. Multicollinearity with the variance inflation factor (VIF) statistic
6. Determine the best remedies for any of the problems identified in 5 above and make the appropriate changes to the regression model if required. Rerun the model and evaluate the fit again, including error measures, adjusted R-squared, F value, slope coefficient significance, DW and VIF.
7. Evaluate model fit with 2 error measures (RMSE and MAPE).
8. Evaluate the model fit residuals and comment on their randomness using autocorrelation functions (ACFs), a histogram and a normality plot (use a four-in-one graph set along with residual ACFs).
9. Forecast for the holdout period (8 quarters) using the holdout X values to forecast Y. In Minitab, use the Regression - Options menu and place the columns for the holdout X values and any dummy variable predictions in the "Prediction intervals for new observations" area. If using the Decomposition indices, make sure you seasonalize the holdout forecast Y values.
10. Evaluate the forecast error measures and residuals to determine if the error is acceptable or has systematic variation. Write a conclusion on the acceptability of the forecast.
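
For anyone who wants to cross-check the Minitab/Excel output by hand, several of the statistics named above (the ACF in points 1 and 8, seasonal dummy variables in point 3, the DW statistic in point 5a, and RMSE and MAPE in points 7 and 10) can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the required work. It assumes quarterly data, uses Q4 as the omitted baseline quarter for the dummies, and divides by n (not by degrees of freedom) in the RMSE. All function names and the sample numbers are made up for illustration.

```python
import math


def acf(series, lag):
    """Sample autocorrelation of `series` at a given lag (lag >= 1)."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom


def quarterly_dummies(n_obs, first_quarter=1):
    """Rows of (Q1, Q2, Q3) indicators; Q4 is the omitted baseline (an assumption)."""
    rows = []
    for i in range(n_obs):
        quarter = (first_quarter - 1 + i) % 4 + 1
        rows.append(tuple(1 if quarter == q else 0 for q in (1, 2, 3)))
    return rows


def rmse(actual, fitted):
    """Root mean squared error, dividing by the number of observations."""
    errors = [a - f for a, f in zip(actual, fitted)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def mape(actual, fitted):
    """Mean absolute percentage error in percent; actual values must be nonzero."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, fitted)) / len(actual)


def durbin_watson(residuals):
    """Sum of squared successive residual differences over the sum of squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den


if __name__ == "__main__":
    # Made-up quarterly series with an obvious 4-period pattern (illustration only).
    y = [10, 20, 30, 40] * 3
    print("ACF at lag 4:", acf(y, 4))
    print("First dummy rows:", quarterly_dummies(len(y))[:4])

    # Made-up fitted values, used only to exercise the error measures.
    fitted = [12, 19, 29, 42] * 3
    residuals = [a - f for a, f in zip(y, fitted)]
    print("RMSE:", rmse(y, fitted))
    print("MAPE (%):", mape(y, fitted))
    print("DW:", durbin_watson(residuals))
```

A large ACF value at lag 4 is the kind of pattern point 1 asks you to look for in quarterly data. A DW value near 2 suggests little first-order serial correlation, while values toward 0 or 4 suggest positive or negative serial correlation respectively. The same rmse and mape functions can be applied to the 8-quarter holdout forecasts in point 10.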