Multiple Regression Analysis
1. Write an overview of correlation and regression analysis (3-4 pages) with references
2. Construct a multiple regression model using the data in "Census data by neighborhood"
The data set contains a large number of variables for 333 neighborhoods, from the 2011 census. Select 8 census variables as independent variables and treat "average house value" as the dependent variable to construct a regression equation (see the "Census variable definition" document).
Suggested steps:
1) Select 8 variables as independents; explain in what way you believe they influence house value in a neighborhood
2) Perform a correlation analysis to check for collinearity among the 8 chosen variables. If some of them are highly correlated, explain why.
3) Use the stepwise multiple regression method to construct a regression model (i.e., generate a regression equation)
4) Interpret the model
5) Assess/verify the model
6) Use the model to predict the average house value for the neighborhood assigned to you in the following table.
Neighborhood | Student
Oakville - 2 |
Mississauga - 7 |
Lawrence Park South |
Rosedale-Moore Park |
Kingsway South |
Forest Hill South |
Markham - 20 |
Mississauga - 8 |
Mississauga - 20 |
Vaughan - 6 |
St.Andrew-Windfields |
Lawrence Park North |
Mississauga - 5 |
Vaughan - 3 |
Bedford Park-Nortown |
Casa Loma |
Princess-Rosethorn |
Leaside-Bennington |
Oakville - 11 |
Richmond Hill - 6 |
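
Suggested steps 2 through 6 can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the required method. The data are synthetic because the real census file is not reproduced here. The variable names income, pct_degree, density and noise_var are hypothetical and are not taken from the "Census variable definition" document. The selection routine is a simple forward selection on adjusted R-squared, which stands in for, but is not identical to, the F-test-based stepwise procedure found in statistical packages. The 0.8 correlation threshold and the 0.01 minimum gain are arbitrary choices.

```python
import numpy as np

def highly_correlated_pairs(X, names, threshold=0.8):
    """Step 2: list predictor pairs whose |Pearson r| meets the threshold."""
    corr = np.corrcoef(X, rowvar=False)
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) >= threshold:
                pairs.append((names[i], names[j], float(corr[i, j])))
    return pairs

def fit_ols(X, y):
    """Least-squares fit with an intercept; returns (coefficients, adjusted R^2)."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    k = A.shape[1] - 1  # number of predictors, excluding the intercept
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
    return beta, adj_r2

def forward_stepwise(X, y, names, min_gain=0.01):
    """Step 3 (simplified): add the predictor that most improves adjusted R^2,
    stopping when the best improvement falls below min_gain."""
    selected, best_adj = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        scores = [(fit_ols(X[:, selected + [j]], y)[1], j) for j in remaining]
        adj, j = max(scores)
        if adj - best_adj < min_gain:
            break
        selected.append(j)
        remaining.remove(j)
        best_adj = adj
    beta, adj = fit_ols(X[:, selected], y)
    return [names[j] for j in selected], beta, adj

# Synthetic stand-in for the census data (hypothetical variables, 333 rows).
rng = np.random.default_rng(0)
n = 333
income = rng.normal(80, 20, n)
pct_degree = 0.5 * income + rng.normal(0, 3, n)  # built to be collinear with income
density = rng.normal(50, 10, n)
noise_var = rng.normal(0, 1, n)                  # unrelated to house value
names = ["income", "pct_degree", "density", "noise_var"]
X = np.column_stack([income, pct_degree, density, noise_var])
house_value = 100 + 5 * income - 2 * density + rng.normal(0, 10, n)

# Step 2: collinearity check among the chosen predictors.
pairs = highly_correlated_pairs(X, names)

# Steps 3 to 5: build the model and report adjusted R^2 as a basic fit check.
chosen, beta, adj_r2 = forward_stepwise(X, house_value, names)

# Step 6: predict for one neighborhood (made-up predictor values).
new_area = {"income": 120.0, "pct_degree": 60.0, "density": 40.0, "noise_var": 0.0}
predicted = beta[0] + sum(b * new_area[v] for b, v in zip(beta[1:], chosen))

print("Highly correlated pairs:", pairs)
print("Selected variables:", chosen)
print("Coefficients (intercept first):", np.round(beta, 3))
print("Adjusted R^2:", round(adj_r2, 3))
print("Predicted house value:", round(float(predicted), 1))
```

With the real data, X would hold the 8 chosen census variables, house_value the "average house value" column, and new_area the values for the assigned neighborhood. A fuller assessment in step 5 would also examine residual plots and coefficient significance, which this sketch omits.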
Attachment: Assignment.rar