Essay on data warehousing


Part 1: Demonstrate applied knowledge of people, markets, finances, technology and management in a global context of business intelligence practice (data warehouse design, data mining process, data visualisation and performance management) and resulting organisational change and how these apply to implementation of business intelligence in organisation systems and business processes

Part 2: Identify and solve complex organisational problems creatively and practically through the use of business intelligence and critically reflect on how evidence based decision making and sustainable business performance management can effectively address real world problems

Part 3: Demonstrate the ability to communicate effectively in a clear and concise manner in written report style for senior management with correct and appropriate acknowledgment of main ideas presented and discussed.

Assignment Task 1: Data Warehouse Concepts

Drawing on relevant and current literature on data warehouses, write a short essay on data warehousing that addresses three sub tasks:

Task 1.1) Provide a concise definition of a data warehouse, then identify and describe two ways in which a data warehouse differs from a transactional database (10 marks, about 250 words).

Task 1.2) Identify and describe the three main types of data warehouse

Task 1.3) Define the concept of a data lake, and discuss two advantages and two disadvantages of a data lake compared with a data warehouse.

Assignment Task 2: Exploratory Data Analysis and Linear Regression Analysis

Carefully study the Data Dictionary for Boston Housing Data Set (See Table 1) and accompanying description of each variable. It is important to understand this data set as it is used for Task 2 and Task 3 in Assignment 2. Each record in the housing.csv data set describes a Boston suburb or town. The data was drawn from the Boston Standard Metropolitan Statistical Area (SMSA) in 1970.

Assignment Task 2.1) Conduct and report on exploratory data analysis (EDA) of the housing.csv data set using the RapidMiner Studio data mining tool. Note that this will require the use of several RapidMiner operators.

Provide the following for Task 2.1:

(i) A screen capture of your final EDA process, with a brief description of that process.

(ii) A summary of the key results of your exploratory data analysis in Table 2.1 Results of Exploratory Data Analysis for housing.csv. Table 2.1 should include key characteristics of each variable in the housing.csv data set, such as maximum and minimum values, average, standard deviation, most frequent values (mode), missing values and invalid values.

(iii) A discussion of the key results of the exploratory data analysis presented in Table 2.1, with a rationale for selecting the top 5 variables for predicting median house value (medv). Focus in particular on the relationships of the independent variables with each other and with the dependent variable median house value (medv), drawing on the results of your EDA and relevant literature on the determinants of house prices.

Hint: The Statistics tab and Charts tab in RapidMiner Studio provide a lot of descriptive statistical information and the ability to create useful charts such as bar charts, scatter plots and box plots for EDA. You might also like to run correlations and/or chi-square tests, as appropriate, to determine which variables contribute most to predicting median house value (medv).
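The per-variable statistics requested for Table 2.1 can be cross-checked outside RapidMiner. The following is a minimal sketch in Python using only the standard library, not a replacement for the RapidMiner process the task requires. The function name `summarise`, the treatment of blank or "NA" cells as missing, and the four inline rows are assumptions made for illustration; the rows are made-up toy values, not records from housing.csv.

```python
import csv
import io
import statistics

# Made-up toy rows for illustration only; these are NOT values from housing.csv.
SAMPLE = """crim,age,ptratio,medv
0.1,60,15.0,24
0.2,,16.0,30
0.3,80,abc,18
0.2,80,17.0,36
"""


def summarise(rows, columns):
    """Return min, max, mean, stdev, mode, missing and invalid counts per column.

    Assumption: a blank cell or the text NA/NaN counts as missing, and any
    other non-numeric cell counts as invalid. Each column needs at least two
    valid numeric values, otherwise statistics.stdev raises an error.
    """
    summary = {}
    for col in columns:
        values, missing, invalid = [], 0, 0
        for row in rows:
            text = (row.get(col) or "").strip()
            if text == "" or text.upper() in {"NA", "NAN"}:
                missing += 1
                continue
            try:
                values.append(float(text))
            except ValueError:
                invalid += 1
        summary[col] = {
            "min": min(values),
            "max": max(values),
            "mean": statistics.mean(values),
            "stdev": statistics.stdev(values),  # sample standard deviation
            "mode": statistics.mode(values),
            "missing": missing,
            "invalid": invalid,
        }
    return summary


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
summary = summarise(rows, ["crim", "age", "ptratio", "medv"])
for name, stats in summary.items():
    print(name, stats)
```

To run the same check on the real file, `io.StringIO(SAMPLE)` could be swapped for an open file handle on housing.csv, assuming the file has a header row with these column names.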

Assignment Task 2.2) Build and report on a Linear Regression model for predicting medv using a RapidMiner data mining process, an appropriate set of data mining operators and a reduced set of variables from the housing.csv data set, as determined by your exploratory data analysis in Task 2.1.

Provide the following for Task 2.2:

(i) A screen capture of your Final Linear Regression Model process, with a brief description of that process.

(ii) Table 2.2 named Results of Final Linear Regression Model for Task 2.2 for housing.csv data set.

(iii) Discuss the results of Final Linear Regression Model for housing.csv data set drawing on key outputs (coefficients, standardised coefficients, t-statistics values, p-values and significance levels etc) for predicting median house value (medv) and relevant supporting literature on interpretation of a Linear Regression Model.

Include all appropriate outputs such as RapidMiner Processes, Graphs and Tables that support key aspects of exploratory data analysis and linear regression model analysis of the housing.csv data set in your Assignment 2 report.
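To help interpret the coefficients and t-statistics that Task 2.2 asks you to discuss, the sketch below shows how those outputs arise from ordinary least squares. It is a standard-library Python illustration under stated assumptions, not the RapidMiner Linear Regression operator: the helper names `solve` and `ols`, the generic predictors `x1` and `x2`, and the five toy observations are all hypothetical. p-values are omitted because they need the t-distribution, which the Python standard library does not provide.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(xs, y):
    """Fit y = b0 + b1*x1 + ... by least squares.

    Returns (coefficients, standard errors, t-statistics), intercept first.
    """
    design = [[1.0] + list(row) for row in xs]  # prepend intercept column
    n, k = len(design), len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    residuals = [t - sum(b * v for b, v in zip(beta, r)) for r, t in zip(design, y)]
    sigma2 = sum(e * e for e in residuals) / (n - k)  # residual variance
    std_err = []
    for j in range(k):
        unit = [1.0 if i == j else 0.0 for i in range(k)]
        inv_jj = solve(xtx, unit)[j]  # j-th diagonal entry of inverse(X'X)
        std_err.append(math.sqrt(sigma2 * inv_jj))
    t_stats = [b / s for b, s in zip(beta, std_err)]
    return beta, std_err, t_stats


# Toy observations for illustration only; NOT taken from housing.csv.
xs = [(-1, -1), (-1, 1), (1, -1), (1, 1), (0, 0)]
y = [1.0, 3.0, 5.0, 8.0, 4.0]
beta, std_err, t_stats = ols(xs, y)
for name, b, s, t in zip(["intercept", "x1", "x2"], beta, std_err, t_stats):
    print(f"{name}: coef={b:.3f} se={s:.3f} t={t:.2f}")
```

A large absolute t-statistic means the coefficient is large relative to its standard error, which is the basis for the significance levels reported alongside a regression model.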

Task 3: Tableau Desktop View of Weather Traffic Volume

After connecting to the housing.csv data set in Tableau Desktop, consider binning variables such as age, crim (crime rate) and ptratio (pupil to teacher ratio) to create categorical variables.

Task 3.1) Create a Tableau Text Table or Graph view that displays median house values by age of houses and other relevant data using the housing.csv data set. Comment on (1) the process of preparing a Text Table or Graph view using Tableau Desktop and (2) the key trends and patterns that are apparent in the Tableau view you have created (8 marks, about 50 words).

Task 3.2) Create a Tableau Text Table or Graph view that displays median house values and the potential impact of crime rate and other relevant data using the housing.csv data set.

Comment on (1) the process of preparing a Text Table or Graph view using Tableau Desktop and (2) the key trends and patterns that are apparent in the Tableau view you have created.
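The binning suggested for Task 3 can be prototyped before building the views in Tableau. The sketch below, in standard-library Python, groups records into age bins and reports the median medv per bin. The bin edges (35 and 70), the bin labels, the helper names `bin_label` and `median_by_bin`, and the six toy records are all assumptions chosen for illustration; they are not values from housing.csv and not a recommended binning scheme.

```python
import statistics
from collections import defaultdict


def bin_label(value, edges, labels):
    """Return the label of the first bin whose upper edge is >= value.

    `edges` holds the upper bounds of every bin except the last, so
    len(labels) must equal len(edges) + 1.
    """
    for upper, label in zip(edges, labels):
        if value <= upper:
            return label
    return labels[-1]


def median_by_bin(records, key, target, edges, labels):
    """Group records by the binned `key` and return the median `target` per bin."""
    groups = defaultdict(list)
    for rec in records:
        groups[bin_label(rec[key], edges, labels)].append(rec[target])
    # Keep the label order and skip bins that received no records.
    return {label: statistics.median(groups[label]) for label in labels if groups[label]}


# Hypothetical bin edges and toy records; NOT values from housing.csv.
AGE_EDGES = [35, 70]
AGE_LABELS = ["low (0-35)", "mid (36-70)", "high (71-100)"]
records = [
    {"age": 20, "medv": 30},
    {"age": 30, "medv": 34},
    {"age": 50, "medv": 25},
    {"age": 90, "medv": 15},
    {"age": 95, "medv": 19},
    {"age": 80, "medv": 17},
]

result = median_by_bin(records, "age", "medv", AGE_EDGES, AGE_LABELS)
for label, median_value in result.items():
    print(label, median_value)
```

The same two helpers could be reused for crim or ptratio by passing a different `key` and a different set of edges and labels.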
