Statistical and Optimization Methods for Engineers Assessment
Modelling Project:
You are to first obtain data suitable for developing a statistical model. Any type of model described in the text is suitable; examples include linear regression, count data, and discrete choice models. You are strongly encouraged to obtain data from within your field of interest, or some other data set that interests you. If you design a data collection instrument (i.e. a survey) as part of this modelling exercise, you will receive up to 10 extra points towards your overall score. If the survey is of low quality or the sample size is very low, you may receive no extra points.
Data obtained from a third party should be of extremely high quality: e.g. it should have been used as the basis for a published peer-reviewed journal article or government report. Many federal, state, and regional government websites have data available for analysis. Alternatively, you may devise a survey or questionnaire and collect the data yourself. In that case, provide a reference to an existing survey on which your survey design is based, and include it in your references section.
The following steps should be taken:
1. Identify a phenomenon or data generating process that is interesting and nontrivial.
2. Explore/research the known theories on this phenomenon, what is already known about it, etc.
3. From step 2, identify the population and sample appropriate to support the model.
4. Develop a theoretical model of the process/behaviour/etc., and then use this to devise a questionnaire/survey.
5. Administer the survey. If you are administering your own survey, collect at least 60 independent observations. An external database would typically include hundreds or thousands of observations.
6. Develop statistical models from the data.
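As a rough illustration of steps 5 and 6, the sketch below checks the 60-observation minimum and then fits a one-predictor linear regression by ordinary least squares. It is only one possible starting point: the function name `fit_simple_ols`, the constant `MIN_OBSERVATIONS`, and the data are hypothetical and invented for illustration, and a real project would normally use a statistics package and the model type suited to its own data.

```python
from statistics import mean

# Minimum sample size for a self-administered survey (step 5 of the brief).
MIN_OBSERVATIONS = 60


def fit_simple_ols(x, y):
    """Fit y = b0 + b1 * x by ordinary least squares.

    Returns (b0, b1, r_squared). Hypothetical helper for illustration only.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if len(x) < MIN_OBSERVATIONS:
        raise ValueError(
            f"need at least {MIN_OBSERVATIONS} observations, got {len(x)}"
        )
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    if sxx == 0 or ss_tot == 0:
        raise ValueError("x and y must both vary")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                 # slope estimate
    b0 = y_bar - b1 * x_bar        # intercept estimate
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1 - ss_res / ss_tot
    return b0, b1, r_squared


# Made-up data (not real observations): y = 2 + 0.5x with a small
# alternating perturbation, 60 observations.
x = [float(i) for i in range(60)]
y = [2.0 + 0.5 * xi + (0.3 if i % 2 == 0 else -0.3) for i, xi in enumerate(x)]

b0, b1, r2 = fit_simple_ols(x, y)
print(f"intercept={b0:.3f}, slope={b1:.3f}, R^2={r2:.3f}")
```

The sample-size check only counts observations; whether they are independent depends on how the survey was administered and has to be argued in the report.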
Write a quality report on the activities conducted, with all supporting statistical documentation provided. All tables and figures should be numbered and referenced in the report, and a final model should be put forth.
The report submission is individual. The report will contain the following sections:
1. Abstract: (what you did, why you did it, and what you found): maximum 350 words
2. Background: (introduce the phenomenon, research question, or data generating process of interest, and describe what others have found about this process/phenomenon, etc.). This section should include references to at least 3 peer-reviewed sources on the topic (no more than 3 pages).
3. Population and sample: (describe each, and how sampling was conducted, and potential limitations/biases) (1 page maximum)
4. Methods: Describe why the data meet the requirements for your modelling method and describe the method briefly (1 page maximum)
5. Analysis Results: (2 to 4 pages). This should include discussion of at least two models and include formatted, easy-to-read model output translated into tables (like the textbook model output). Model assumptions should be checked and discussed. The discussion should include a variable-by-variable description of the impacts of the independent variables on the outcome variable. It should also include a discussion of the overall goodness-of-fit measures of the model.
6. Conclusions (1 page max). This should NOT summarize what was done but instead focus on what was learned from the model. What relationships became evident? What do the relationships mean? What are the limitations? Are there policy implications? Are there practical implications? How should the model be interpreted?
Length: 2500 words.
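The comparison of at least two models asked for in the Analysis Results section could be sketched as follows. This is a minimal example under stated assumptions, not a required method: it compares an intercept-only model with a one-predictor linear model using R-squared, adjusted R-squared, and a Gaussian AIC. The helper names `ols_fit` and `fit_summary` and the data are hypothetical. The residuals are returned so they can be plotted against fitted values when checking model assumptions.

```python
import math
from statistics import mean


def ols_fit(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return y_bar - b1 * x_bar, b1


def fit_summary(y, fitted, n_params):
    """Goodness-of-fit measures for a Gaussian linear model.

    n_params counts the estimated mean parameters, including the intercept.
    """
    n = len(y)
    y_bar = mean(y)
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    ss_res = sum(r ** 2 for r in residuals)
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_params)
    # AIC up to an additive constant that is the same for every model
    # fitted to the same y, so only differences in AIC are meaningful.
    aic = n * math.log(ss_res / n) + 2 * n_params
    return {"r2": r2, "adj_r2": adj_r2, "aic": aic, "residuals": residuals}


# Made-up data (not real observations), 60 observations.
x = [float(i) for i in range(60)]
y = [2.0 + 0.5 * xi + (0.3 if i % 2 == 0 else -0.3) for i, xi in enumerate(x)]

# Model A: intercept only (every fitted value is the sample mean).
summary_a = fit_summary(y, [mean(y)] * len(y), n_params=1)

# Model B: intercept plus one predictor.
b0, b1 = ols_fit(x, y)
summary_b = fit_summary(y, [b0 + b1 * xi for xi in x], n_params=2)

# Print a small comparison table, one row per model.
print(f"{'Model':<18}{'R2':>8}{'Adj R2':>10}{'AIC':>10}")
for name, s in (("A: intercept only", summary_a), ("B: with predictor", summary_b)):
    print(f"{name:<18}{s['r2']:>8.3f}{s['adj_r2']:>10.3f}{s['aic']:>10.1f}")
```

A lower AIC and a higher adjusted R-squared favour a model, but the report should still interpret each coefficient and discuss whether the model assumptions are plausible for the data.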