Problem: Computer Science
1. An analysis of factors affecting usage of public transportation (e.g., buses, trains, ferries) is being conducted. Survey responses are collected, recording various categorical demographic and socioeconomic variables as well as the number of times the respondent used any form of public transportation during the last week. Describe what type of model you would use to analyse this data and why.
2. The survey in part (1) is to be modified. The same categorical covariates are collected, but this time the survey asks which mode of public transportation is preferred (e.g., buses, trains, ferries, light rail, none of them, no preference). Discuss the relative advantages and disadvantages of a classification tree model versus a random forest model in analysing this data.