STATISTICAL ANALYSIS PROJECT
This project leads you through a statistical analysis of residential property data from a given non-capital city or town in Australia. This property data is also compared with property data from another non-capital city or town.
Project Situation
To analyse the real estate market in non-capital cities and towns, Safe-As-Houses Real Estate, a large national real estate company, has collected data from random samples of residential properties for sale in a selection of non-capital cities and towns in States A, B and C.
As a research assistant for Safe-As-Houses Real Estate, you are analysing this data for the town or city specified by your sample. In addition, you will compare the price data for this location with price data from another town or city. For example, if your student ID number ends in 8, your sample is Sample 8. That is, you will be analysing the real estate market in Regional City 1, State B. You will also compare the residential property price data in Regional City 1, State B with the price data for Regional City 2, State A.
In each part of the project, you are required to analyse your sample data in response to given questions and provide a written answer. You can assume that the written answers are components of a longer report on the real estate market in your given city or town.
Data Analysis Project Part A
Purpose: To
- introduce you to the project data, situation and Excel
- use Excel to graph data and calculate summary statistics
- interpret and communicate Excel results.
Part A Question
From past research, Safe-As-Houses Real Estate is aware that the majority of first homebuyers purchase properties with three bedrooms.
You are asked to provide information on the price of three-bedroom residential properties for sale in the location and state specified by your sample. In particular, information on the minimum, maximum and average price is required, as is an estimated price range for a three-bedroom property.
Complete the following tasks:
1) Download and save your data.
2) Download the Project Part A coversheets, then name and save this file as
"Family Name_First Name_Part_A_Campus".
3) Enter your Sample Number on page 2 of the Part A coversheets.
4) Statistical Tasks
Using Price $000 (first column of your data), explore prices of three-bedroom residential properties by using Excel to:
- Construct a frequency histogram or polygon
- Calculate descriptive statistics
Note: The required data for three bedrooms is in the first rows of your sample.
5) Written Task - Component of a Longer Report
Using the instructions given on page four of the Part A coversheets, introduce your data and the results of your investigation of prices of three-bedroom properties for sale in the location and state specified by your sample.
This should be one to three pages and 300 to 500 words.
Use an appropriate style, without statistical jargon and equations, to clearly communicate your results.
6) Complete Coversheets 1 and 2, save and submit Part A of the project online using the Project Part A link in Submit Project by the due date, Tuesday 20 March 2018.
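The statistical tasks in step 4 are to be done in Excel, but the results can be cross-checked with a few lines of code. The following is a minimal Python sketch only: the prices are made-up values in $000 (not real sample data), and the function names, class start and class width are illustrative assumptions.

```python
import statistics

# Hypothetical three-bedroom prices in $000 (made-up values, not real sample data).
prices = [310, 325, 340, 355, 360, 372, 385, 390, 410, 455]


def describe(values):
    """Return the summary statistics the Part A question asks about."""
    return {
        "n": len(values),
        "minimum": min(values),
        "maximum": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "std_dev": statistics.stdev(values),  # sample standard deviation (n - 1)
    }


def frequency_table(values, start, width, classes):
    """Count values in equal-width classes [lower, upper), as a histogram would."""
    counts = [0] * classes
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < classes:
            counts[index] += 1
    return [(start + i * width, start + (i + 1) * width, counts[i])
            for i in range(classes)]


summary = describe(prices)
table = frequency_table(prices, start=300, width=50, classes=4)

for lower, upper, count in table:
    print(f"{lower} to under {upper}: {count}")
print(summary)
```

The frequency table corresponds to the bars of a frequency histogram, and the summary values correspond to entries in Excel's descriptive statistics output.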
Data Analysis Project Part B
Tasks
Task 1 Part A Self-Marking
When directed to do so during Week 5, complete the following tasks:
1) Open your saved copy of your submission for Part A.
2) Replace the Part A coversheets (three pages) with the Part B coversheets (first four pages).
3) Rename and save this file as
"Family Name_First Name_Part_B_Campus".
4) Use the solution template and marking guide provided to mark your submission for Part A. Enter recommended marks on the self-marking sheet for Part A, page 3 of the file in 3) above.
5) Write a short (approximately 200 words) reflection/feedback on your submission and marking of Part A. In particular:
- consider the good aspects of your submission: what did you do well?
- identify where you made mistakes, and how you would avoid them in the future
- consider what you learnt from submitting and marking Part A.
This is to be entered in the space at the bottom of the self-marking sheet for Part A.
6) Save the file. This is to be submitted with Part B, due Sunday 29 April 2018.
Task 2 Part B Appendix - Statistical Inference Tasks
The following statistical tasks should appear as appendices to your written answer. These should include all necessary steps and appropriate Excel output.
These appendices should come after your written answer within your single Word document for Part B.
In preparing your appendices you may use one of the following formats:
- Word with Excel output added.
- Handwritten with Excel output added. This will then need to be scanned and added to your Word document.
Statistical Inference
Choose a level of significance for any hypothesis tests and a level of confidence for any confidence intervals. Enter these values on page 2 of the Part B coversheets along with the sample number from Part A.
Question 1 - Topic 5
Older buyers are often looking to downsize, moving from a four or more bedroom house to a smaller two or three bedroom unit.
Explore whether older buyers wishing to downsize have a reasonable choice of units by using the Type data (6th column of your data) for ALL 125 residential properties for sale and an appropriate statistical inference technique to answer the following question:
- What proportion of residential properties for sale, in the location and state specified by your sample, are units?
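One plausible reading of "an appropriate statistical inference technique" here is a confidence interval for a population proportion. The sketch below is a hedged illustration in Python, not the required Excel working: it assumes the normal (z) approximation, a 95% level of confidence, and a made-up count of 40 units among the 125 properties. The function name `proportion_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, confidence=0.95):
    """Normal-approximation (z) confidence interval for a population proportion."""
    p_hat = successes / n
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - margin, p_hat + margin


# Hypothetical count: 40 units among the 125 properties (made-up figure).
p_hat, lower, upper = proportion_interval(successes=40, n=125, confidence=0.95)
print(f"sample proportion {p_hat:.3f}, interval ({lower:.3f}, {upper:.3f})")
```

The normal approximation is usually considered reasonable when both the number of units and the number of non-units in the sample are at least 5.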
Question 2 - Topic 6
From past research, Safe-As-Houses Real Estate is aware that many potential buyers consider a non-capital city or town too expensive if the average house price is more than half a million dollars.
Explore whether potential buyers would consider house prices in the location and state specified by your sample too expensive by using the Price $000 data (first column of your data) for ALL houses for sale and an appropriate statistical inference technique to answer the following question:
- In the location and state specified by your sample, is the mean house price more than $500,000?
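A question of the form "is the mean more than $500,000?" is commonly handled with a one-sample, upper-tail t test of H0: mean ≤ 500 against H1: mean > 500 (prices in $000). The following Python sketch illustrates that calculation under stated assumptions: the house prices are made-up values, the level of significance is taken as 0.05, and the critical value is supplied by hand from a t table or from Excel's `T.INV(0.95, 9)`.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(values, hypothesised_mean):
    """Return (t statistic, degrees of freedom) for a one-sample t test."""
    n = len(values)
    standard_error = stdev(values) / sqrt(n)
    t_stat = (mean(values) - hypothesised_mean) / standard_error
    return t_stat, n - 1


# Hypothetical house prices in $000 (made-up values, not real sample data).
house_prices = [480, 520, 510, 545, 495, 530, 560, 505, 515, 540]

t_stat, df = one_sample_t(house_prices, hypothesised_mean=500)

# Assumed critical value: upper tail, alpha = 0.05, 9 degrees of freedom.
t_critical = 1.833
reject_h0 = t_stat > t_critical

print(f"t = {t_stat:.3f} with {df} degrees of freedom; reject H0: {reject_h0}")
```

With real data, the degrees of freedom and the critical value change with the number of houses in the sample, so both need to be recomputed.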
Task 3 - Part B Written Task - Components of a report
For each question, present the results of your calculations, with your interpretation and conclusion as components of a longer report on the residential property market.
Use the instructions given on page five of the Part B coversheets.
This should be one to three pages and 200 to 400 words.
It should be submitted as a Word file with Excel output included.
Make sure you:
- Introduce each question and put it in context.
- Answer the question in non-statistical language.
- Present the results of your intervals or tests without unnecessary statistical jargon.
- Include conclusions which answer the given questions.
Data Analysis Project Part C
Task 1 Part C - Appendix Statistical Inference and Regression and Correlation Tasks
The following statistical tasks should appear as appendices to your written answer. These should include all necessary steps and appropriate Excel output.
These appendices should come after your written answer within your single Word document for Part C.
In preparing your appendices you may use one of the following formats:
- Word with Excel output added.
- Handwritten with Excel output added. This will then need to be scanned and added to your Word document.
Choose a level of significance for any hypothesis tests. Enter this value on page 2 of the Part C coversheets along with the sample number from Part A.
Use your sample and appropriate statistical inference and regression and correlation techniques to answer the following questions.
Question 1 Statistical Inference Topic 7
Safe-As-Houses Real Estate is comparing residential property prices in different locations. In particular, they are interested in whether there is a difference in average price between two given locations.
You are required to decide if there is a difference in average price between the residential properties for sale in the location and state specified by your sample and those in the location and state specified in the last column of your data.
For example, if your student ID number ends in 2, you will be comparing residential property prices in Coastal City 1 State A with those in Coastal City 1 State B.
To provide a justified decision, use Price $000 (first column of your data) and Location X State Y Price $000 (last column of your data) for ALL 125 residential properties for sale in each sample, with an appropriate statistical inference technique, to answer the following question.
- Is there a difference in the mean price of residential properties for sale in the two locations?
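A two-tailed test for the difference between two means from independent samples is one candidate technique for this question. The Python sketch below uses the separate-variance (Welch) form of the t statistic; this choice is an assumption, and a pooled-variance t test may be the version your course materials expect. The prices are made-up values in $000, and the critical value is supplied by hand for a 0.05 level of significance.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_1, sample_2):
    """Return (t statistic, approximate degrees of freedom) for two independent samples."""
    n1, n2 = len(sample_1), len(sample_2)
    v1 = variance(sample_1) / n1  # squared standard error of mean 1
    v2 = variance(sample_2) / n2  # squared standard error of mean 2
    t_stat = (mean(sample_1) - mean(sample_2)) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df


# Hypothetical prices in $000 for the two locations (made-up values).
location_1 = [400, 420, 440, 460, 480]
location_2 = [380, 390, 400, 410, 420]

t_stat, df = welch_t(location_1, location_2)

# Assumed critical value: two-tailed, alpha = 0.05, df truncated to 5.
t_critical = 2.571
reject_h0 = abs(t_stat) > t_critical

print(f"t = {t_stat:.3f}, df = {df:.2f}; reject H0 of equal means: {reject_h0}")
```

With 125 properties in each sample, the degrees of freedom are much larger than in this toy example, so the critical value must be looked up again.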
Questions 2 and 3 Simple and Multiple Linear Regression
Safe-As-Houses Real Estate is interested in developing a model to predict the price of a residential property for sale.
To develop such a model, first develop a simple linear regression model to predict price from internal area and then a multiple linear regression model to predict price from internal area, number of bedrooms and whether the property is a unit or a house. Finally, choose or construct, and then interpret, the linear model that best fits your data.
Question 2 Simple Linear Regression Model Topic 8
To explore the relationship between the internal area of a residential property and its price, use Internal Area m^2 (independent variable - second column of your data) and Price $000 (dependent variable - first column of your data) for all 125 residential properties for sale in your sample. Using this data, develop and then explore a simple linear relationship between the two variables by:
- Plotting the data with a scatter plot.
- Calculating the least squares regression equation, correlation coefficient and coefficient of determination.
- Interpreting the gradient and vertical intercept of the simple linear regression equation.
- Interpreting the correlation coefficient and coefficient of determination. Are these values consistent with your scatter plot?
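The calculations listed above can be sketched in Python as a cross-check on the Excel output. This is a minimal illustration, assuming made-up areas and prices rather than real sample data; the function name `simple_regression` is hypothetical.

```python
from math import sqrt


def simple_regression(x, y):
    """Least squares line y = intercept + slope * x, with r and r squared."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)  # correlation coefficient
    return slope, intercept, r, r ** 2


# Hypothetical data (made-up values): internal area in m^2 and price in $000.
area = [80, 100, 120, 150, 200]
price = [300, 340, 400, 450, 560]

slope, intercept, r, r_squared = simple_regression(area, price)

print(f"price = {intercept:.1f} + {slope:.3f} * area")
print(f"r = {r:.4f}, r squared = {r_squared:.4f}")
```

Here the gradient is the estimated change in price ($000) for each additional square metre of internal area. The vertical intercept is the predicted price at zero area, which may have no practical meaning if zero lies outside the range of the data.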
Question 3 Multiple Linear Regression Model Topic 9
To explore what other factors may have an influence on the price of a residential property for sale, use Internal Area m^2, Bedrooms and Type (three independent variables - second, third and sixth columns of your data) and Price $000 (dependent variable - first column of your data) for all 125 residential properties for sale in your sample. Using this data, develop and then explore the relationship between these four variables by:
- Calculating the multiple regression equation, multiple correlation coefficient, and coefficient of multiple determination.
- Interpreting the values of the multiple regression coefficients.
- Interpreting the values of the multiple correlation coefficient and coefficient of multiple determination. Compare these values with the corresponding values for the simple linear regression model.
Then determine the best model to predict the price of a residential property for sale by:
- Using appropriate tests to determine which independent variables make a significant contribution to the regression model.
- Using the results of the above tests to give or calculate the simple or multiple regression equation which best fits the data.
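The model-building steps above can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not the required Excel working: the data are made-up values, Type is coded as a dummy variable (1 for a unit, 0 for a house), and the helper names `invert` and `multiple_regression` are hypothetical. Each t statistic would be compared with a two-tailed critical value (for example from Excel's `T.INV.2T`) to judge whether that variable makes a significant contribution.

```python
from math import sqrt


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular; check for redundant predictors")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def multiple_regression(rows, y):
    """Least squares fit with an intercept.

    Returns (coefficients, t statistics, R squared, residual degrees of freedom).
    """
    X = [[1.0] + list(row) for row in rows]
    n, k = len(X), len(X[0])
    xtx = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)]
           for a in range(k)]
    xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    inv = invert(xtx)
    coef = [sum(inv[a][b] * xty[b] for b in range(k)) for a in range(k)]
    fitted = [sum(c * v for c, v in zip(coef, row)) for row in X]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    mean_y = sum(y) / n
    sst = sum((yi - mean_y) ** 2 for yi in y)
    df = n - k
    mse = sse / df
    # Standard error of each coefficient is sqrt(MSE * diagonal of (X'X)^-1).
    t_stats = [c / sqrt(mse * inv[a][a]) for a, c in enumerate(coef)]
    return coef, t_stats, 1 - sse / sst, df


# Hypothetical data (made-up values, not real sample data).
area = [80, 100, 120, 140, 160, 180, 200, 220]      # internal area, m^2
bedrooms = [2, 2, 3, 3, 3, 4, 4, 4]
is_unit = [1, 1, 1, 0, 1, 0, 0, 0]                  # dummy: 1 = unit, 0 = house
price = [290, 335, 400, 450, 470, 560, 590, 640]    # price, $000

rows = list(zip(area, bedrooms, is_unit))
coef, t_stats, r_squared, df = multiple_regression(rows, price)
_, _, r_squared_simple, _ = multiple_regression([[a] for a in area], price)

for name, c, t in zip(["intercept", "area", "bedrooms", "unit"], coef, t_stats):
    print(f"{name}: coefficient {c:.3f}, t = {t:.3f}")
print(f"R squared: multiple {r_squared:.4f}, simple {r_squared_simple:.4f}")
```

Adding independent variables never lowers R squared, so a higher value for the multiple model does not by itself show that the extra variables are significant; the individual t tests address that.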
Task 2 - Written Answer - Components of a report
For Question 1, and for Questions 2 and 3 combined, present the results of your calculations, with your interpretation and conclusions, as components of a longer report on the residential property market.
Use the instructions given on page four of the Part C coversheets.
This should be 300 to 700 words and three to six pages.
It should be submitted as a Word file with Excel output embedded.
Make sure you:
- Introduce each question and put it in context.
- Answer the questions in non-statistical language.
- Present the results of your calculations and tests without unnecessary statistical jargon.
- Include conclusions which answer the given questions.
In particular, for Question 2:
- Explain the choice of independent and dependent variables.
- Include your scatter plot and discuss any apparent relationship between internal area and price. Comment on the strength, shape and sign of the relationship.
In particular, for Questions 2 and 3:
- Include and justify the best model.
- Discuss and interpret the values of the regression and correlation coefficients of the best model.
Note - Only Part C needs to be done.