Public Health Secondary Data Discussion Post: Data Manipulation
Because your secondary data may have been collected for a different purpose, you may not have the exact variables you need for your research. Data manipulation gives you the opportunity to move or rearrange your data without fundamentally changing it (Baty, 2009). The variables or records you need are often spread across several data sets, so you must combine those data sets before you can conduct your analysis. This process is called a data merge or append: a merge typically brings together different variables for the same records, whereas an append typically adds more records with the same variables.
For example, if the age for a given person is stored in one data set, whereas the gender for that same person is stored in a different data set, you would need to merge the two data sets in order to have both variables in one place. At other times, you may need to split a data set into two and then merge the pieces back together to meet the analysis requirements. For example, if you want to calculate the change from baseline, you would need the baseline value on every record. To get it, you would split off the baseline data and then merge them back into the main data set. The process of data manipulation can be challenging, especially when you are working with a large volume of data, but it can be used to answer very specific research questions.
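To make the two examples above concrete, here is a minimal sketch in plain Python. The records, the variable names (person_id, visit, sbp), the helper merge_on, and the choice of visit 0 as the baseline are all assumptions made for illustration. They are not drawn from any real data set, and statistical software such as SAS, R, or Stata offers built-in procedures for the same steps.

```python
# Hypothetical records: one data set holds age, another holds gender.
ages = [
    {"person_id": 1, "age": 34},
    {"person_id": 2, "age": 51},
]
genders = [
    {"person_id": 1, "gender": "F"},
    {"person_id": 2, "gender": "M"},
]


def merge_on(left, right, key):
    """Merge two lists of records on a shared key.

    Assumes each key value appears at most once in `right`.
    Records in `left` with no match in `right` are dropped.
    """
    lookup = {row[key]: row for row in right}
    return [{**row, **lookup[row[key]]} for row in left if row[key] in lookup]


# Example 1: merge so that age and gender sit on the same record.
demographics = merge_on(ages, genders, "person_id")

# Hypothetical repeated measurements (sbp = systolic blood pressure).
visits = [
    {"person_id": 1, "visit": 0, "sbp": 140},
    {"person_id": 1, "visit": 1, "sbp": 132},
    {"person_id": 2, "visit": 0, "sbp": 150},
    {"person_id": 2, "visit": 1, "sbp": 155},
]

# Example 2, step 1: split off the baseline rows (visit 0 is assumed to be
# the baseline) and rename the measurement so it does not collide on merge.
baseline = [
    {"person_id": row["person_id"], "baseline_sbp": row["sbp"]}
    for row in visits
    if row["visit"] == 0
]

# Example 2, step 2: merge the baseline back so every record carries it,
# then compute the change from baseline.
with_baseline = merge_on(visits, baseline, "person_id")
for row in with_baseline:
    row["change"] = row["sbp"] - row["baseline_sbp"]
```

In this sketch, the split isolates one baseline row per person, and the merge back copies that value onto each of the person's visits, which is what allows the change to be calculated record by record.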
Post a brief explanation of the concept of splitting data sets. Then, explain the basic elements for splitting data sets. Finally, provide two examples of when splitting data sets may be appropriate. Support your response.
The response should include a reference list. Use one-inch margins, 12-point Times New Roman font, double spacing, and APA style for writing and citations.