
Define the parameter of interest state the hypotheses and

1. Discussion 1 - Dataset. The chosen dataset's file name should end with .xls, .csv, or .txt.

Before choosing your dataset you will want to review the Final Project instructions below to understand the requirements and expectations. The dataset you choose in this discussion will be the one you use throughout.

For the final project you will be creating a report based upon a pre-existing dataset. In this module you will start by choosing a dataset that interests you. See the attached PDF for potential datasets. Once you have chosen the dataset, attach the file to your initial post.

In the reading this week you were exposed to the different types of variables that you may encounter. Use this information to determine whether the variables in your chosen dataset are continuous, dichotomous, ordinal, or nominal. Your chosen dataset must have at least four variables: two must be continuous, one must be dichotomous, and one can be dichotomous, ordinal, or nominal. Clearly state the type for each variable in your dataset.

Link for suggested CDC website for dataset: https://www.cdc.gov/HealthyYouth/

2. Discussion 2 - Research Questions

Now that you have chosen your dataset, it is time to focus on potential research questions for your final project. Using the dataset chosen last week, develop three related research questions that you can answer (although you may not be interested in each of these).

-One question should relate your two continuous variables to each other.

-One question should relate your dichotomous variable to your categorical variable (which could be dichotomous, nominal, or ordinal).

-One question should relate one of your continuous variables to your dichotomous variable.

-Each research question should have a clear explanatory and response variable. It is important that your research questions directly relate to your dataset. If you are not able to answer your research question with the chosen dataset, you will not be able to complete this project. Make sure to attach your chosen dataset to your initial post.

Discussion 2: Written Assignment  

  • In this assignment review the "Comorbid Psychiatric Disorders in Youth in Juvenile Detention" paper. Read through the article paying careful attention to the section describing how the subjects were selected and the study or experiment was designed, then answer the following questions.

Article: Abram KM, Teplin LA, McClelland GM, Dulcan MK. Comorbid Psychiatric Disorders in Youth in Juvenile Detention. Arch Gen Psychiatry. 2003;60(11):1097-1108. doi:10.1001/archpsyc.60.11.1097. https://archpsyc.jamanetwork.com/article.aspx?articleid=208029

  • What is the population that the researchers are studying?
  • Describe how the data was collected:
  • How was the sample chosen?
  • Do you think the sample is representative of the population? Explain.
  • Is this an experiment or a study? Make sure to clearly explain how you came to this conclusion and support your claim.
  • Describe how this experiment or study was designed.
  • Do you think the results of this experiment/study could be applied to youth detention centers in your own state or region? Explain.

3. Discussion 3: Written Assignment

  • This week you will use Stata to construct your graphs and tables. Stata is a data analysis and statistical software package.
  • In this assignment you will describe the dataset you have chosen for your final project. For each of your key variables (that you described in Discussion 1), provide graphical and numeric summaries for them using Stata.
  • Categorical variable: For the graphical summary you will need a bar chart. For the numeric summary you will need a table of frequencies.
  • Continuous variable: For the graphical summary you will need either a histogram or a boxplot. For the numerical summary you will need means, standard deviations, and sample sizes.

In addition, provide appropriate numerical and graphical summaries using Stata for the two-way associations of interest described by your three research questions. Note: You will be including these three in your final project.

  • Continuous-continuous association: For the graphical summary create a scatterplot. For the numeric summary you will create a correlation.
  • Categorical-categorical association: For the graphical summary create a mosaic plot or two-way table. For the numerical summary you will need a two-way contingency table.
  • Continuous-categorical association: For the graphical summary create side-by-side boxplots. For the numeric summary you will need means, standard deviations, and sample sizes for the continuous variable in each category.
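
As a rough illustration of the numeric summaries listed above (a frequency table for a categorical variable, and the mean, standard deviation, and sample size of a continuous variable within each category), here is a minimal Python sketch. The assignment itself asks for Stata output; the variable names and the six records below are hypothetical and only show what each summary contains.

```python
from collections import Counter
from statistics import mean, stdev

# Hypothetical records: (smoker status, weight in kg). Not from any real dataset.
records = [
    ("yes", 70.0), ("no", 64.0), ("no", 82.0),
    ("yes", 75.0), ("no", 58.0), ("yes", 68.0),
]

# Numeric summary for the categorical variable: a table of frequencies.
freq = Counter(group for group, _ in records)

# Numeric summary for the continuous-categorical association:
# sample size, mean, and standard deviation within each category.
by_group = {}
for group, value in records:
    by_group.setdefault(group, []).append(value)

summary = {g: (len(v), mean(v), stdev(v)) for g, v in by_group.items()}

for group, (n, m, sd) in summary.items():
    print(f"{group}: n={n}, mean={m:.2f}, sd={sd:.2f}")
```

The same three numbers per group are what a side-by-side boxplot is meant to accompany in the report.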

4. Discussion 4: Confidence Intervals

-The focus this week is on confidence intervals. Use the following information to answer the questions below: A 95% confidence interval for the true population mean systolic blood pressure, obtained from a sample of 100 outpatients, is (114 mmHg, 120 mmHg).

-Provide a correct interpretation of this interval. Can you think of other interpretations that would also be correct?

-This confidence interval came from a single sample. Would we get the same interval if we obtained a different set of 100 patients? What does this imply about your interpretation of the given interval?

-If we wanted a 99% confidence interval instead, can you tell whether it would be narrower or wider? Can you tell by how much?
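
One way to reason about the last question is to back out the standard error from the given interval and rebuild it at the 99% level. The sketch below is a Python illustration, not part of the assignment, and it assumes the interval was constructed as mean plus or minus a normal critical value times the standard error. With n = 100, a t-based interval would use slightly different critical values.

```python
from statistics import NormalDist

# Values taken from the prompt: a 95% interval of (114, 120) mmHg.
lower_95, upper_95 = 114.0, 120.0

center = (lower_95 + upper_95) / 2          # sample mean
half_width_95 = (upper_95 - lower_95) / 2   # margin of error at 95%

# Assumption: normal critical values (z), not t with 99 degrees of freedom.
z_95 = NormalDist().inv_cdf(0.975)   # about 1.96
z_99 = NormalDist().inv_cdf(0.995)   # about 2.576

# The standard error is the same at both levels; only the multiplier changes.
standard_error = half_width_95 / z_95
half_width_99 = z_99 * standard_error

lower_99 = center - half_width_99
upper_99 = center + half_width_99

print(f"99% interval: ({lower_99:.2f}, {upper_99:.2f}) mmHg")
print(f"Width ratio 99% / 95%: {half_width_99 / half_width_95:.3f}")
```

Under that assumption the 99% interval is wider by the ratio of the two critical values, roughly 1.31, because the center and the standard error do not change.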

5. Discussion 5: P-Values

Define what the p-value represents. Why can we compare it to alpha (the significance level of the test)? Last week we discussed confidence intervals. What does a p-value tell you that a confidence interval does not? What does a confidence interval tell you that a p-value does not? Is one more useful than the other? Make sure to support your ideas. (Not more than one page.)

6. Discussion 5: Written Assignment

  • Choose one of the research questions you have developed for your project and perform a hypothesis test to evaluate it. For your final project you will need to include the test used to evaluate each research question (section 2.b in the Final Project) and describe the results (section 4.a in the Final Project Document). In this assignment you will only need to evaluate the research question that relates your two continuous variables to each other. Make sure to explicitly show all your steps and check all necessary assumptions. The steps discussed in the lecture are the same steps you should use for all research questions.
  • The steps are:

1) Define the parameter of interest.

2) State the hypotheses

3) Determine the test statistic and p-value considering any necessary assumptions

4) Decide whether to reject or not reject the null hypothesis

5) Clearly state a conclusion in the context of the problem

The submission needs to clearly discuss each step to properly evaluate your primary research question. Please divide up the document with subheadings for each step.
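
The five steps above can be sketched for a continuous-continuous question as follows. This is a Python illustration under stated assumptions, not the Stata workflow the assignment expects: the data (hours and scores) are hypothetical, and the p-value comes from an exact permutation test, swapped in here because the Python standard library has no t distribution. Stata would report a t-based p-value instead.

```python
from itertools import permutations
from math import sqrt

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data for illustration only.
hours = [1, 2, 3, 4, 5, 6, 7]
scores = [52, 55, 61, 60, 68, 71, 75]
alpha = 0.05

# Step 1: the parameter of interest is the population correlation, rho.
# Step 2: H0: rho = 0 versus Ha: rho != 0.
# Step 3: test statistic (the usual t form) and a p-value.
n = len(hours)
r_obs = pearson_r(hours, scores)
t_stat = r_obs * sqrt((n - 2) / (1 - r_obs ** 2))

# Exact two-sided permutation p-value: the share of all re-pairings of the
# scores whose |r| is at least as large as the observed |r|.
count = 0
total = 0
for perm in permutations(scores):
    total += 1
    if abs(pearson_r(hours, perm)) >= abs(r_obs) - 1e-12:
        count += 1
p_value = count / total

# Step 4: decision.
reject_null = p_value < alpha

# Step 5: a conclusion in context would be written from these numbers.
print(f"r = {r_obs:.3f}, t = {t_stat:.2f}, p = {p_value:.4f}, reject H0: {reject_null}")
```

Checking assumptions (a roughly linear scatterplot, no extreme outliers) still belongs under step 3 of the write-up; the code does not do that for you.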

7. Discussion 6: Written Assignment

-Now that you have performed a hypothesis test on one of your research questions in Discussion 5, it is time to perform the appropriate hypothesis tests for your other two research questions. Using Stata, perform these tests on your dataset. Just like the assignment in Discussion 5, you will need to address each of the five steps. Make sure your submission includes the following for each question.

1) Define the parameter of interest.

2) State the hypotheses

3) Determine the test statistic and p-value considering any necessary assumptions

4) Decide whether to reject or not reject the null hypothesis

5) Clearly state a conclusion in the context of the problem

The submission needs to clearly discuss each step to properly evaluate your two remaining research questions. Please divide up your document with subheadings for each step.
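
For the categorical-categorical question, one common choice is a chi-square test of independence on the two-way table. The Python sketch below is only an illustration with a hypothetical 2x2 table of counts; it relies on the fact that a chi-square statistic with 1 degree of freedom is the square of a standard normal, so the p-value can be computed from the normal distribution. Larger tables would need a chi-square distribution, which Stata provides.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical 2x2 table of counts (rows: group A / group B, columns: yes / no).
observed = [[30, 20],
            [15, 35]]
alpha = 0.05

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected counts under H0 (the two variables are independent).
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Assumption check often used for this test: every expected count is at least 5.
expected_ok = all(e >= 5 for row in expected for e in row)

# Test statistic: sum of (observed - expected)^2 / expected over all cells.
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# p-value for 1 degree of freedom (valid for a 2x2 table only).
p_value = 2 * (1 - NormalDist().cdf(sqrt(chi2)))
reject_null = p_value < alpha

print(f"chi2 = {chi2:.3f}, p = {p_value:.4f}, reject H0: {reject_null}")
```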

8. Discussion 7: Linear Regression

-Up until this point you have assessed two continuous variables using a correlation. In this discussion you will run a simple linear regression to address the research question that relates your two continuous variables to each other. Take the regression equation you obtain from Stata and interpret the slope of the line. What information does this give you that you weren't able to obtain by running a simple correlation? Note that this regression does not need to be included in your final project.
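
To make the contrast between a slope and a correlation concrete, here is a minimal least-squares sketch in Python. The data are hypothetical and the variable names are placeholders; in the assignment the equation comes from Stata.

```python
from math import sqrt

# Hypothetical data: explanatory variable x and response variable y.
x = [1, 2, 3, 4, 5, 6, 7]
y = [52, 55, 61, 60, 68, 71, 75]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)

# Least-squares estimates for the line y = intercept + slope * x.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# The correlation is unitless; the slope is in units of y per unit of x.
r = sxy / sqrt(sxx * syy)

print(f"y = {intercept:.2f} + {slope:.2f} * x   (r = {r:.3f})")
```

The slope is read as the estimated change in the response for a one-unit increase in the explanatory variable, which is information the correlation alone does not carry.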


The final project for this course will be a final report. This report will be similar to a research article that you would submit to a journal, except you will not be using original research. You will base your final project on a preexisting dataset that you will choose in Discussion 1. From this dataset you will develop research questions. In addition, you will create appropriate visual displays and run statistical tests for at least three types of two-way associations:

1. Categorical response and categorical explanatory variable

2. Continuous response and categorical explanatory variable

3. Continuous response and continuous explanatory variable

Please note that categorical variables can be dichotomous, ordinal, or nominal. For association 2, make sure you use a categorical variable that is dichotomous.

You will interpret the results of these tests keeping in mind the limitations based on how the data was collected. This project will apply the items you learn throughout the course in a practical way.

Your report should include the following items.

1. Introduction [about one page in length]

a. Clearly state your primary research question as well as your two secondary research questions. Your questions should have a clear explanatory variable and a clear response variable.

b. Build a case for why your research question is important. Explain why your question is relevant and what the implications might be if you find an association.

c. Utilize at least two peer reviewed journal articles to support your claims.

2. Methods [about a half page in length]

a. State where your data came from and how it was collected

b. Discuss the statistical hypothesis test used for each research question. You should have three research questions, one for each potential association. There will be one research question that is your primary. You should go through the procedures you used to evaluate each research question.

3. Results: Basic Descriptive Statistics [about a page in length]

a. Examine your most important variables (primary response variable and your primary explanatory variable) individually. What does the distribution look like? Is the distribution symmetric or bell shaped? If not, which way is it skewed?

b. If you have a lot of variables you want to report on, use a table of summary statistics. Make sure everything is well labeled and has a title.

4. Results: Associations and Statistics [about two pages in length]

a. Include the results of your three statistical hypothesis tests. Here you should report your results and not the details of the test.

b. What are the most relevant associations you have examined so far? Clearly state what you have found and what the main features of your figures are.

c. There should be three different types of graphs and/or tables included to depict your findings:

i. A scatterplot (for continuous-continuous association)

ii. A mosaic plot or two-way table (for categorical-categorical association)

iii. Side-by-side boxplots (for continuous-categorical association)

5. Discussion/Conclusion [about a page in length]

a. Report on the associations found. "I found that..."

b. Discuss how the associations fit together.

c. Discuss a theme you see. If you do not see a theme, that is OK, but you will need to discuss the limitations, why no theme was found, other variables coming into play, etc.

d. In this section of the report answer the primary research question to the best of your ability based on your analysis so far. Also report on your secondary research questions.

e. What would you do next now that the report is complete?

6. Bibliography

a. Include a bibliography that cites all sources referenced throughout the paper using the most current version of AMA formatting.

Guidelines for Submission: This report should be 5-6 pages in length, not including the bibliography page. The submission should use the most current AMA format. The final project should be in the form of a report with a title (no abstract is needed). Use a standard font with a size of at least 11 points and margins of at least 1 inch. Throughout the course you will work on parts of your final project. Make sure to incorporate the feedback you receive from your instructor before the final submission.

10. Discussion 8 - Written Assignment

Using the provided dataset below, determine whether the mean birthweight for infants differs significantly according to mothers' smoking status by using a one-way ANOVA in Stata. Make sure to include all five hypothesis testing steps in your write-up. If you find a significant difference, how might you determine which specific groups differ? Are there any potential problems with the method you have proposed?

1) Define the parameter of interest.

2) State the hypotheses

3) Determine the test statistic and p-value considering any necessary assumptions

4) Decide whether to reject or not reject the null hypothesis

5) Clearly state a conclusion in the context of the problem

6) Specific Comparisons: How would you figure out which specific groups differ? 

7) Potential Problems: What are the potential problems with the method you have proposed to figure out which groups differ?
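
The sketch below shows, in Python rather than Stata, how the one-way ANOVA F statistic is assembled and why follow-up pairwise comparisons raise a multiple-testing concern. The group names and birthweights are hypothetical and are not taken from the attached dataset. The Bonferroni adjustment is shown as one possible way to handle steps 6 and 7, not as the required answer. The p-value, which comes from an F distribution with (k - 1, N - k) degrees of freedom, is left to Stata.

```python
# Hypothetical birthweights in grams by mother's smoking status.
groups = {
    "non-smoker": [3400, 3550, 3300, 3600],
    "light smoker": [3200, 3100, 3350, 3250],
    "heavy smoker": [2900, 3050, 2800, 3000],
}
alpha = 0.05

k = len(groups)                                   # number of groups
n_total = sum(len(v) for v in groups.values())    # total sample size
grand_mean = sum(sum(v) for v in groups.values()) / n_total

# Between-group and within-group sums of squares.
ss_between = sum(
    len(v) * (sum(v) / len(v) - grand_mean) ** 2 for v in groups.values()
)
ss_within = sum(
    sum((x - sum(v) / len(v)) ** 2 for x in v) for v in groups.values()
)

# F statistic: ratio of the between-group to the within-group mean square.
df_between = k - 1
df_within = n_total - k
f_stat = (ss_between / df_between) / (ss_within / df_within)

# Follow-up: number of pairwise comparisons and one common adjustment.
n_pairs = k * (k - 1) // 2
bonferroni_alpha = alpha / n_pairs

# Potential problem: chance of at least one false positive across n_pairs
# unadjusted tests at level alpha, treating the tests as independent.
familywise_error = 1 - (1 - alpha) ** n_pairs

print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
print(f"{n_pairs} pairwise comparisons, Bonferroni alpha = {bonferroni_alpha:.4f}")
print(f"Unadjusted family-wise error rate = {familywise_error:.3f}")
```

With three groups the unadjusted family-wise error rate is already about 0.14, which is the kind of problem step 7 asks you to discuss; the trade-off of an adjustment such as Bonferroni is reduced power for each individual comparison.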

Attachment:- Question 10 Data Set.rar
