Draw a barchart of the number of respondents per brand


I need help with R code and visualizations for the questions below. The datasets are included in R packages, which can be installed from within R.

Q1: Bodyfat

The dataset bodyfat is available in the MMST package. It provides estimates of the percentage of body fat of 252 men, determined by underwater weighing, and body circumference measurements. The dataset is used as a multiple regression example to see if body fat percentage can be predicted using the other measurements. Draw a parallel coordinate plot for the dataset.

a) Are there any outliers? What can you say about them?

b) Can you deduce anything about the height variable?

c) What can you say about the relationship between the first two variables, density and bodyfat?

d) Do you think the ordering of the variables is sensible? What alternative orderings might be informative?
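As a starting point for Q1, here is a minimal sketch of a parallel coordinate plot using parcoord() from the MASS package. The helper name draw_pcp is hypothetical, and the code assumes that MMST is installed and provides bodyfat as described in the question; the plot is skipped if the package is missing.

```r
library(MASS)  # parcoord() lives in MASS, a recommended package shipped with R

# Hypothetical helper: draw a pcp of the numeric columns of `df`,
# optionally restricted to (and ordered by) the column names in `vars`.
draw_pcp <- function(df, vars = names(df)) {
  num <- df[, vars, drop = FALSE]
  num <- num[, sapply(num, is.numeric), drop = FALSE]
  # parcoord() rescales every axis to its own min-max range.
  # Semi-transparent lines make dense regions and stray cases easier to see.
  parcoord(num, col = rgb(0, 0, 0, 0.3), var.label = TRUE)
  invisible(names(num))
}

# Assumption: the MMST package is installed and exposes `bodyfat`.
if (requireNamespace("MMST", quietly = TRUE)) {
  data(bodyfat, package = "MMST")
  draw_pcp(bodyfat)
}
```

Passing a different vars vector redraws the plot with another axis ordering, which is one way to explore part d). Outliers for part a) tend to show up as single lines that reach the top or bottom of an axis well away from the main bundle.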

Q2: Wine

The wine dataset can be found in the packages gclus, MMST, pgmm, and rattle. All of these versions take the data from the UCI Machine Learning Repository [Bache and Lichman, 2013]. The original source is an Italian software package [Forina et al., 1988]. The version in pgmm has about twice as many variables as the others, and the version in MMST includes the names of the three classes of wine, rather than the numeric coding that the other versions use.

a) Use parallel coordinate plots (pcps) to investigate how well the variables separate these classes.

b) Are there any outliers?

c) Is there evidence of subgroups within the classes?
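For Q2, one possible approach is to colour the lines of the pcp by wine class. The sketch below uses a hypothetical helper, pcp_by_class, and assumes the gclus version of wine, where the class is taken to be stored in a column named Class; check names(wine) before relying on that.

```r
library(MASS)  # for parcoord()

# Hypothetical helper: pcp of all columns except `class_col`,
# with one line colour per level of the class variable.
pcp_by_class <- function(df, class_col) {
  cls <- factor(df[[class_col]])
  num <- df[, setdiff(names(df), class_col), drop = FALSE]
  cols <- rainbow(nlevels(cls))  # one colour per class
  parcoord(num, col = cols[as.integer(cls)], var.label = TRUE)
  invisible(table(cls))  # class sizes, returned for a quick sanity check
}

# Assumption: gclus is installed and its `wine` data frame has a `Class` column.
if (requireNamespace("gclus", quietly = TRUE)) {
  data(wine, package = "gclus")
  pcp_by_class(wine, "Class")
}
```

Variables on which the colours form separate bands distinguish the classes well. Drawing one plot per class, by subsetting the data frame first, can make within-class subgroups for part c) easier to spot.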

Q3: Whisky

The package bayesm includes the dataset Scotch, which reports which brands of whisky 2218 respondents consumed in the previous year.

a) Draw a barchart of the number of respondents per brand. What ordering of the brands do you think is best?

b) There are 20 named brands and a further category Other.brands. That entails drawing a lot of bars. If you decided to plot only the biggest brands individually and group the rest all together in the 'Other' group, what cutoff would you use for defining a big brand?

c) Another version of the dataset called whiskey is given in the package flexmix. It is made up of two data frames, whiskey with the basic data, and whiskey_brands with information on whether the whiskeys are blends or single malts. How would you incorporate this information in your graphics, by using colour, by using a different ordering, or by drawing two graphics rather than one?

d) Which of the spellings, 'whisky' or 'whiskey', is more appropriate for this dataset?
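For parts a) and b) of Q3, a sketch along the following lines may help. It assumes that Scotch is a data frame with one 0/1 column per brand and one row per respondent. The helper brand_counts is hypothetical, and the cutoff of 100 is only an illustrative value, not a recommendation.

```r
# Hypothetical helper: count respondents per brand, sorted from largest to
# smallest. Brands with fewer than `cutoff` respondents, plus any columns
# named in `other_col`, are merged into a single "Other" bar.
brand_counts <- function(df, cutoff = 0, other_col = character(0)) {
  counts <- colSums(df)
  small <- union(names(counts)[counts < cutoff], other_col)
  big <- sort(counts[setdiff(names(counts), small)], decreasing = TRUE)
  if (length(small) > 0) {
    # Count respondents who consumed at least one small brand. Summing the
    # small-brand counts would double count people who named several brands.
    big <- c(big, Other = sum(rowSums(df[, small, drop = FALSE]) > 0))
  }
  big
}

# Assumption: bayesm is installed and `Scotch` has an `Other.brands` column.
if (requireNamespace("bayesm", quietly = TRUE)) {
  data(Scotch, package = "bayesm")
  counts <- brand_counts(Scotch, cutoff = 100, other_col = "Other.brands")
  op <- par(mar = c(4, 10, 1, 1))  # widen the left margin for brand names
  # rev() puts the largest brand at the top of a horizontal barchart.
  barplot(rev(counts), horiz = TRUE, las = 1, xlab = "Number of respondents")
  par(op)
}
```

Ordering the bars by size, rather than alphabetically, makes the ranking of brands readable at a glance. Calling brand_counts(Scotch) with the default cutoff shows every brand, which helps in judging where a natural gap in the counts might suggest a cutoff.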
