Big Data and Analytics Group Assignment - ANALYTIC REPORT
Learning Outcomes Assessed: K3, S2, A1, V1, V2
Purpose: The purpose of this task is to give students practical experience working in teams to write a data analytics report that provides useful insights, patterns and trends from a given dataset. This activity gives students the opportunity to show resourcefulness and creativity by applying Watson Analytics, enabling them to design useful visualizations and predictive solutions for an analytics problem set.
Group Presentation: Week 10 (Scheduled Laboratory)
Learning Outcomes Assessed: K4, K5, A2, V1, V2
Purpose: The purpose of the oral presentation is to provide an opportunity for students to present the results of Data Analysis and to share this knowledge while practicing their verbal communication skills.
Project Details: Your task for this analytical project is to use the Watson Analytics tool to explore, analyze and visualize the given dataset: Solar Cities, power usage. You will need to provide interesting insights, trends and patterns from the data set features.
Data set background:
The Solar Cities project was led by the University of Ballarat (the former name of Federation University) and involved the recruitment of households and businesses across the Loddon Mallee and Grampians regions to monitor changes in energy consumption. The project looked at a number of factors that could influence energy consumption. These factors were broken up into sets of features, and measurements were taken for each feature. For example, a factor could relate to a dwelling's construction materials, in which case a feature could be "dwelling construction type", and a measurement would be taken to determine the construction type of each dwelling and stored in the data set. The feature "dwelling construction type" could, for instance, contain the values brick, brick veneer, etc. Many of these features are included in the provided Solar Cities data set.
The following are sets of features included in the provided data set:
- Adoption of solar energy technologies
- Geographic characteristics
- Physical characteristics of the dwellings, including such things as the dwelling's age, size, number of storeys, number of lights, insulation etc.
The main aim of this project is to understand the drivers of power consumption and, because a large percentage of electrical energy is generated by coal-fired plants, consequently the drivers of CO2 emissions.
Therefore, the two main questions that need to be asked as a researcher on the Solar Cities project are:
1. Which combinations of features highlight where efficiencies could be made in reducing energy consumption?
2. What would you include in a predictive model that would explain the demand on future energy use and CO2 emissions?
i.e. which features and their measurements contribute to either higher or lower energy consumption? (For a predictive model, Watson Analytics does have a rule-based engine...)
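One simple way to shortlist candidate drivers before building a predictive model is to rank numeric features by the strength of their correlation with energy use. The following is a minimal Python sketch of that idea. It is not how Watson Analytics works internally, and the feature names (`floor_area_m2`, `pv_capacity_kw`, `num_lights`) and all values are made up for illustration rather than taken from the Solar Cities data set.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical per-dwelling measurements (illustrative values only).
features = {
    "floor_area_m2": [100, 150, 200, 250],
    "pv_capacity_kw": [3.0, 2.0, 1.0, 0.0],
    "num_lights": [10, 30, 20, 40],
}
annual_kwh = [4000, 5000, 6000, 7000]

# Rank features by absolute correlation with annual usage, strongest first.
ranked = sorted(
    ((name, pearson(values, annual_kwh)) for name, values in features.items()),
    key=lambda pair: abs(pair[1]),
    reverse=True,
)

for name, r in ranked:
    direction = "higher" if r > 0 else "lower"
    print(f"{name}: r = {r:+.2f} (associated with {direction} usage)")
```

A strong correlation only flags a feature as worth investigating; it does not show that the feature causes higher or lower consumption, so any shortlist like this still needs to be checked against the visualizations and the predictive model.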
You have been commissioned as a research assistant with the main aim of finding the drivers of power usage, understanding the factors that contribute to it, and providing any useful and interesting insights, trends and patterns that could be presented to the stakeholders of this project. Your intended audience is the project manager and associated stakeholders.
Your findings will be presented to the project manager and the other stakeholders. You will outline the factors that contribute to power usage and your recommendations for reducing power consumption, given the dwellings that currently exist in the geographical locations in country Victoria (i.e. those found in the data set).
You are expected to present the data findings in visual forms (i.e., charts and graphs). This is a group assignment. You will complete it with your team (max 3 members enrolled in the same laboratory).
- It is expected that each team member will contribute equally to the project.
- Each team will turn in one joint document and give a joint presentation in the timetabled laboratory class in Week 10.
- In addition, each individual team member will turn in a short reflection as part of the report.
You will receive feedback on the draft about presentation choices, content, analysis, and style.
Data Set:
Please use the dataset provided on the Moodle Shell in the assessment section.
Your job is to examine the available data and present it in a set of informative graphs and text. You may use the following questions to determine the major factors driving power consumption in the dwellings found in the data set, but you should also look for further questions, relationships and patterns to gain a better understanding and better inform your recommendations. Remember, you will also need to answer the two main questions above.
Some Starting Questions
1. What is the contribution of power usage over a year by roof colour?
2. What is the contribution of power usage over a year by PV_Capacity?
3. What is the contribution of power usage over a year by PV_Capacity and Insulation?
4. What is the power usage by estimated age?
5. Over which months is the most power used?
6. Over which months is the least power used?
7. What are the top drivers of power usage?
8. Which suburbs have the most houses with PV_Capacity?
9. Which age houses are more likely to have PV_Capacity?
10. Are houses that are owned more likely to use less power than the ones that are rented?
11. Which suburb dwellings use the most power?
12. Do houses with a larger floor area (in square metres) use more power than smaller houses, and does having two storeys make a difference?
13. Which light types in dwellings use more power?
14. Does having more lights of any type mean the house will use more power?
15. What age houses have what type of wall construction?
16. What age houses and from which areas and with how many bedrooms use the most power?
17. Do roof colour and roof material make a difference to power consumption?
18. Do dwellings that have double glazed windows and with window coverings use less power?
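Several of the starting questions reduce to grouping records by a feature and totalling power usage for each group. As a way to sanity-check results outside Watson Analytics, a minimal Python sketch of that grouping follows. The field names (`roof_colour`, `month`, `kwh`) and the sample rows are hypothetical and made up for illustration; the real Solar Cities columns and values will differ.

```python
from collections import defaultdict

# Made-up sample rows for illustration only; field names are hypothetical
# and will differ from the real Solar Cities data set.
records = [
    {"roof_colour": "dark", "month": "Jan", "kwh": 400.0},
    {"roof_colour": "dark", "month": "Jul", "kwh": 600.0},
    {"roof_colour": "light", "month": "Jan", "kwh": 300.0},
    {"roof_colour": "light", "month": "Jul", "kwh": 500.0},
]

def total_by(rows, feature, value="kwh"):
    """Sum `value` for each distinct value of `feature`."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[feature]] += row[value]
    return dict(totals)

def share_by(rows, feature, value="kwh"):
    """Each group's share of the overall total, as a fraction between 0 and 1."""
    totals = total_by(rows, feature, value)
    grand_total = sum(totals.values())
    return {key: amount / grand_total for key, amount in totals.items()}

# Contribution of power usage by roof colour (as in questions 1 and 17).
usage_by_colour = total_by(records, "roof_colour")
colour_share = share_by(records, "roof_colour")

# Months with the most and least power used (as in questions 5 and 6).
usage_by_month = total_by(records, "month")
peak_month = max(usage_by_month, key=usage_by_month.get)
low_month = min(usage_by_month, key=usage_by_month.get)
```

Passing a different feature name to `total_by` covers the other single-feature questions; questions that combine two features (such as PV_Capacity and Insulation) would need a grouping key built from both fields.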
Task 1 - Background information
Write a description of the selected dataset and its importance given the Solar Cities project, i.e. the Solar Cities project is attempting to understand power usage in dwellings, in order to then recommend how to improve the overall efficiency of power consumption in dwellings. Information must be appropriately referenced. [1 Page]
Task 2 - Reporting / Dashboards
For your project, perform the relevant data analysis tasks by
- answering the above questions
- finding new relevant questions
- answering the two main questions
You should also identify the visualizations and dashboards you need to develop to best communicate your findings to the stakeholders. [2-3 Pages]
Task 3 - Research
Justify why the BI reporting solutions/dashboards in Task 2 (Reporting / Dashboards) were chosen, and why the data set features are presented and laid out in the way you proposed (feel free to include any other relevant justifications).
Note: To ensure that you discuss this task properly, you must include visual samples of the reports you produce (i.e. screenshots of the BI reports/dashboards must be presented and explained in the written report; use the 'Snipping Tool'), and also include any assumptions you made about the analysis in Task 2. [1-2 Pages]
Task 4 - Recommendations for the project manager and project stakeholders
Based on your BI analysis and the insights gained from the data set in the previous tasks, make logical recommendations to the stakeholders. Justify your answers to the two main questions, show where power consumption efficiencies could be made, and describe what future consumption will look like. Give 2 possible scenarios. If you have a predictive model, also show which features and measurements lead to power reductions or increases, to support your recommendations. Do this with the help of appropriate references from peer-reviewed sources. [1-2 Pages]
Task 5 - The Reflection
Each team member is expected to write a brief reflection about this project in terms of challenges, learning and contribution.