Problem 1:
1. Plot a scatter chart to examine the relationship between the Year Movie Released and Sales. Include a trend line for this scatter chart. What does the scatter chart indicate about Sales over time for all the movies?
2. Generate a frequency distribution, percent frequency distribution, and histogram for Sales. Use bin sizes of $100 million. Interpret the results. Do any data points appear to be outliers in this distribution?
3. Use a PivotTable to generate a crosstabulation for movie type and rating. Determine which combinations of type and rating are most represented in the top 50 movie data.
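As a rough illustration of part 2, the binning step can be sketched in Python. This is only a minimal sketch, not the expected Excel solution: the function name `frequency_distribution` and the sales figures are hypothetical and are not taken from problem1.xlsx. Sales are assumed to be in $ millions, so a bin width of 100 corresponds to $100 million.

```python
from collections import Counter


def frequency_distribution(values, bin_width=100):
    """Return (lower, upper, count, percent) rows for fixed-width bins."""
    # Bin index k covers the half-open interval [k*bin_width, (k+1)*bin_width).
    counts = Counter(int(v // bin_width) for v in values)
    total = len(values)
    rows = []
    for k in range(min(counts), max(counts) + 1):
        count = counts.get(k, 0)  # keep empty bins so gaps stay visible
        rows.append((k * bin_width, (k + 1) * bin_width, count, 100.0 * count / total))
    return rows


# Hypothetical sales figures in $ millions (assumption, NOT from problem1.xlsx).
sample_sales = [120, 180, 210, 250, 260, 310, 330, 390, 410, 760]

# Print the frequency table with a simple text histogram.
for lower, upper, count, pct in frequency_distribution(sample_sales):
    print(f"{lower:>4}-{upper:<4} {count:>2} {pct:5.1f}% {'*' * count}")
```

Empty bins are kept on purpose: a run of empty bins between the bulk of the data and an isolated value is one informal hint that the isolated value may be an outlier.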
Problem 2:
1. Develop numerical summaries of the data (mean, median, standard deviation, range, and the correlation for each pair of variables). Present your results in tabular form.
2. Develop an estimated simple linear regression model that can be used to predict the contributors giving rate, given the graduation rate. Is there a relationship between the two variables? Discuss your findings and explain how you reached your conclusions.
3. Develop an estimated multiple linear regression model that could be used to predict the contributors giving rate, using the graduation rate, percentage of classes offered with fewer than 40 students, and Student to teacher ratio as independent variables. Discuss your findings and explain how you reached your conclusions.
4. Based on the results in parts 2 and 3, do you believe another regression model may be more appropriate? Estimate this model and explain how you reached your conclusion.
5. Results suggest that the relationship between the contributors giving rate and the graduation rate may be nonlinear. Do you agree with this statement? Explain why. If you agree, estimate and discuss the new nonlinear model.
6. What conclusions and recommendations can you derive from your analysis? Which schools are achieving a substantially higher contributors giving rate than would be expected? Which schools are achieving a substantially lower contributors giving rate than would be expected?
7. Do you think that other independent variables could be included in the model? Why? Give examples of such independent variables.
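Parts 2 and 6 can be sketched together in Python: fit a simple least-squares line, then use the residuals (actual minus predicted) to see which schools sit above or below what the model would expect. This is a minimal sketch under stated assumptions: the helper names, school labels, and numbers are all hypothetical and are not taken from problem2.xlsx.

```python
def fit_simple_regression(x, y):
    """Ordinary least squares for y = b0 + b1*x; returns (b0, b1, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    b1 = sxy / sxx                      # slope
    b0 = mean_y - b1 * mean_x           # intercept
    r_squared = sxy ** 2 / (sxx * syy)  # squared correlation of x and y
    return b0, b1, r_squared


def residuals(x, y, b0, b1):
    """Actual minus predicted value for each observation."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]


# Hypothetical data (assumption, NOT from problem2.xlsx), rates in percent.
schools = ["School A", "School B", "School C", "School D", "School E"]
grad_rate = [60, 70, 80, 90, 100]
giving_rate = [10, 14, 25, 30, 41]

b0, b1, r2 = fit_simple_regression(grad_rate, giving_rate)
res = residuals(grad_rate, giving_rate, b0, b1)

# Positive residual: giving rate higher than the fitted line predicts.
for name, r in sorted(zip(schools, res), key=lambda pair: pair[1], reverse=True):
    print(f"{name}: residual {r:+.2f}")
```

A clear pattern in the residuals, for example negative in the middle of the graduation-rate range and positive at both ends, would be one informal sign that a nonlinear model such as the one asked about in part 5 is worth estimating.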
Attachment:- problem1.xlsx
Attachment:- problem2.xlsx