Instructions:
This is a project where you can work alone or with up to two other students (a maximum group size of three). All group members will receive the same mark for the assignment, and all group members must be enrolled in the same tutorial. The assignment must be submitted in the form of a (brief) business report of approximately 8-14 pages. You must submit an electronic copy of your assignment in Blackboard; hard copies will not be accepted. SHOW YOUR WORK for calculation-based questions.
This assignment requires the use of Microsoft Excel. If you have Windows, you will also need to use the Data Analysis ToolPak. If you have a Mac with Excel 2011, you will need to use StatPlus:MAC LE.
Problem Description:
The Australian Housing and Urban Research Institute at RMIT has contacted us to explore housing affordability in Melbourne. It wishes to explore whether the inner suburbs have become unaffordable for young first-time homebuyers.
A sample of 100 sales of homes (excluding apartment units) across Melbourne has been obtained from the Victoria Valuers General office. Several characteristics of the homes, including the number of bedrooms, land area and distance to Melbourne's CBD, have been included in the table.
You will use descriptive statistics, inferential statistics and your knowledge of multiple linear regression to complete this task.
Price (Dependent Variable) and several characteristics (Independent Variables) are given in the Excel file: Thursday.xlsx.
Here is a table describing the variables in the data set:
Variable | Definition
price | Transaction price of home sale
bedrooms | Number of bedrooms in home
land_area | Area of land of property in square metres
meldist | Distance of home to Melbourne's CBD in km
2014 | Home sold in 2014
2013 | Home sold in 2013
2012 | Home sold in 2012
2011 | Home sold in 2011
2010 | Home sold in 2010
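The year columns (2010 to 2014) are dummy variables. As a rough illustration only, the Python sketch below (Python rather than Excel, with a handful of made-up rows that are not values from Thursday.xlsx) shows that layout under the assumed coding of 1 if the home sold in that year and 0 otherwise. The helper names `add_year_dummies` and `dummy_share` and the `sold_year` field are hypothetical. The mean of a 0/1 column is the proportion of rows coded 1.

```python
from statistics import mean

# Made-up rows for illustration only; these are NOT values from Thursday.xlsx.
# "sold_year" is a hypothetical helper field used to build the dummy columns.
sales = [
    {"price": 650000, "bedrooms": 3, "land_area": 450, "meldist": 8.0, "sold_year": 2014},
    {"price": 420000, "bedrooms": 3, "land_area": 600, "meldist": 22.5, "sold_year": 2012},
    {"price": 780000, "bedrooms": 4, "land_area": 520, "meldist": 6.5, "sold_year": 2014},
    {"price": 390000, "bedrooms": 2, "land_area": 380, "meldist": 18.0, "sold_year": 2010},
    {"price": 510000, "bedrooms": 3, "land_area": 700, "meldist": 27.0, "sold_year": 2013},
]

YEARS = [2010, 2011, 2012, 2013, 2014]


def add_year_dummies(rows, years=YEARS):
    """Return copies of the rows with one 0/1 column per year ('2010', ..., '2014').

    Assumed coding: 1 if the home sold in that year, 0 otherwise.
    """
    out = []
    for row in rows:
        new_row = {key: value for key, value in row.items() if key != "sold_year"}
        for year in years:
            new_row[str(year)] = 1 if row["sold_year"] == year else 0
        out.append(new_row)
    return out


def dummy_share(rows, column):
    """Mean of a 0/1 column, which equals the proportion of rows coded 1."""
    return mean(row[column] for row in rows)


table = add_year_dummies(sales)
print(dummy_share(table, "2014"))  # share of the made-up sales that occurred in 2014
```

Because exactly one year column equals 1 in every row, the five year columns always sum to 1, which is why one of them has to be left out of a regression that includes an intercept.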
Required:
A. Calculate the descriptive statistics from the data and display them in a table. Be sure to comment on the central tendency, variability and shape for Price, Bedrooms, Land Area and Distance to Melbourne. How would you interpret the mean of dummy variables such as 2014?
B. Draw a graph that displays the distribution of the distance of the homes to Melbourne's CBD. Be sure to comment on the distribution.
C. Create a box-and-whisker plot for the distribution of the prices and describe the shape. Is there evidence of outliers in the data?
D. There is a growing belief that homes in the inner suburbs are increasingly unaffordable for young first-time home buyers. What is the likelihood that a home less than 15 km from the CBD will sell for more than $500,000? Is the price of homes statistically independent of the distance to the CBD? Use a contingency table.
E. Estimate the 99% confidence interval for the population mean number of bedrooms.
F. One of the implied benefits of living further away from the CBD is the increased land area for raising children. Test the claim at the 1% level of significance that the land area of homes further than 15 km from the CBD is larger than the 510 sq metres of the typical lot in the inner suburbs.
G. Run a multiple linear regression using the data and show the output from Excel. Exclude the dummy variable "2010" from the regression results.
H. Is the coefficient estimate for distance to Melbourne statistically different from zero at the 5% level of significance? Set up the correct hypothesis test using the results found in the table in Part (G), using both the critical value and p-value approaches. Interpret the coefficient estimate of the slope.
I. Interpret the remaining slope coefficient estimates. Discuss whether the signs are what you are expecting and explain your reasoning.
J. Interpret the value of the Adjusted R². Is there a large difference between the R² and the Adjusted R²? If so, what might explain the difference?
K. Is the overall model statistically significant at the 5% level of significance? Use the p-value approach.
L. Based on the results of the regression, what other factors would have influenced the price of homes? Provide a couple of possible examples and indicate their predicted relationship with the price if they were included.
M. If it is appropriate to do so, predict the average price of a 3-bedroom home on 300 square metres of land, 10 km from the city, that sold in 2014. Show the predicted regression equation.
N. Do the results suggest that the data satisfy the assumptions of a linear regression: Linearity, Normality of the Errors, and Homoscedasticity of Errors? Show this using scatter diagrams, normal probability plots and/or histograms, and explain.
O. Would these results tell us anything about the affordability for non-investor purchasers of these properties? If not, describe how you would construct a sample of households that are looking to occupy a home.
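Part D asks for a contingency table. As a rough illustration of the mechanics only, the Python sketch below cross-classifies homes by distance (under 15 km or not) and price (over $500,000 or not), computes a conditional relative frequency, and forms a chi-square statistic for independence. The counts are invented, the function names are hypothetical, and the chi-square test is one common way to assess independence, not necessarily the method intended here; comparing the conditional probability with the overall probability of selling above $500,000 is another.

```python
def conditional_probability(table, row, col):
    """P(col | row): the share of the given row's count that falls in the given column."""
    row_total = sum(table[row].values())
    return table[row][col] / row_total


def chi_square_statistic(table):
    """Chi-square statistic for independence from a dict-of-dicts table of counts."""
    rows = list(table)
    cols = list(table[rows[0]])
    row_totals = {r: sum(table[r].values()) for r in rows}
    col_totals = {c: sum(table[r][c] for r in rows) for c in cols}
    n = sum(row_totals.values())
    statistic = 0.0
    for r in rows:
        for c in cols:
            # Expected count if distance and price band were independent.
            expected = row_totals[r] * col_totals[c] / n
            statistic += (table[r][c] - expected) ** 2 / expected
    return statistic


# Invented counts for illustration only; NOT taken from Thursday.xlsx.
counts = {
    "under_15km": {"over_500k": 30, "not_over_500k": 20},
    "15km_or_more": {"over_500k": 10, "not_over_500k": 40},
}

p_over_given_inner = conditional_probability(counts, "under_15km", "over_500k")
chi2 = chi_square_statistic(counts)

# Critical value for a 2x2 table (1 degree of freedom) at the 5% level,
# taken from a standard chi-square table.
CHI2_CRITICAL_5PCT_DF1 = 3.841

print(p_over_given_inner)
print(chi2, chi2 > CHI2_CRITICAL_5PCT_DF1)
```

A statistic above the critical value would lead to rejecting independence at the chosen significance level; with real data the counts come from tallying the 100 sales into the four cells.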
Attachment:- day.xlsx
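Parts E and F rest on the t-based confidence interval for a mean and the one-sample, upper-tail t-test. The Python sketch below shows one way those calculations might be laid out. It is a sketch under stated assumptions: the samples are made up, the function names are hypothetical, and the critical values are read from a standard t table instead of being computed.

```python
from math import sqrt
from statistics import mean, stdev


def t_confidence_interval(sample, t_critical):
    """Two-sided interval: mean +/- t_critical * s / sqrt(n), with s the sample std dev."""
    n = len(sample)
    centre = mean(sample)
    margin = t_critical * stdev(sample) / sqrt(n)
    return centre - margin, centre + margin


def one_sample_t_statistic(sample, hypothesised_mean):
    """t = (sample mean - hypothesised mean) / (s / sqrt(n))."""
    n = len(sample)
    return (mean(sample) - hypothesised_mean) / (stdev(sample) / sqrt(n))


# Made-up samples for illustration only; NOT taken from Thursday.xlsx.
bedrooms = [2, 3, 3, 4, 3, 5, 2, 4, 3, 3]
land_area_far = [520, 610, 480, 700, 650, 590, 560, 630]

# 99% two-sided critical value for df = n - 1 = 9, from a standard t table.
# With the assignment's 100 sales, df = 99 and the value is roughly 2.626.
T_CRIT_99_DF9 = 3.250
lower, upper = t_confidence_interval(bedrooms, T_CRIT_99_DF9)
print(lower, upper)

# Upper-tail test at the 1% level: H0: mu <= 510 versus H1: mu > 510.
# Critical value for df = n - 1 = 7, from a standard t table.
T_CRIT_01_DF7 = 2.998
t_stat = one_sample_t_statistic(land_area_far, 510)
print(t_stat, t_stat > T_CRIT_01_DF7)
```

The degrees of freedom, and therefore the critical value, depend on how many homes fall in the sample being tested, so the table lookups above would need to be redone for the real data.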