A. MICROBIOME
1) In the accompanying microbiome dataset 1, which includes the known 16S rRNA sequences of 5 different gut microbiota, identify which species the last sequence (seq5) corresponds to.
2) Identify a substring in each sequence of no more than 30 bases that uniquely distinguishes each species but comes from the same region of the sequence. Report the coordinates (relative to seq1) where your substring came from, and write out the 5 unique sequences as they would appear in a multiple alignment.
3) Why do you think the region where your substring came from is more variable than most of the rest of the sequence?
4) Dataset 2 and dataset 3 were derived from 16S sequencing of a malnourished and a well-fed individual respectively, with all other variables (e.g. age, sex) perfectly matched. Using the substrings above, count the number of 16S rRNA sequences derived from each of the 5 species you identified above. The "Find" function in Excel may be useful here. You can assume that no other species share the substrings you are measuring. Report your counts in table format with column headings: dataset 2, dataset 3, and row labels: your 5 species. (3 marks) N.B. ".tab" (tab-delimited) files can be opened with Excel. Mac users may need to change the extension to ".txt" before opening.
5) Which of the 5 species differs the most in its abundance between the two individuals?
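For those who prefer a script to Excel, the window search in question 2 and the counting in question 4 can be sketched in Python. This is only a minimal sketch under stated assumptions: the function names, the dictionary layout, and the toy sequences are all made up for illustration and are not part of the assignment files. It assumes the reference sequences have already been aligned (gapped rows of equal length) and that the reads have already been loaded into a list of strings. Window coordinates are alignment columns, so they still need converting to seq1 coordinates, and any gap characters must be stripped from a marker before searching raw reads. Reverse-complement matches are ignored, and "differs the most" is taken here to mean the largest absolute difference in counts, which is one possible reading.

```python
def find_distinguishing_window(aligned, max_len=30):
    """Find the narrowest alignment window (at most max_len columns) in
    which every sequence has a different slice.

    aligned maps a sequence name to its gapped row of the alignment.
    Returns (start, end) as 0-based, end-exclusive column indices, or
    None if no such window exists.
    """
    rows = list(aligned.values())
    length = len(rows[0])
    if any(len(row) != length for row in rows):
        raise ValueError("all aligned rows must have the same length")
    for width in range(1, max_len + 1):
        for start in range(length - width + 1):
            slices = {row[start:start + width] for row in rows}
            if len(slices) == len(rows):
                return start, start + width
    return None


def count_markers(reads, markers):
    """Count how many reads contain each species' marker substring."""
    counts = {species: 0 for species in markers}
    for read in reads:
        read = read.upper()
        for species, marker in markers.items():
            if marker.upper() in read:
                counts[species] += 1
    return counts


def most_different(counts_a, counts_b):
    """Return the species with the largest absolute difference in counts."""
    return max(counts_a, key=lambda s: abs(counts_a[s] - counts_b[s]))


# Made-up toy data, NOT taken from the assignment files.
toy_alignment = {
    "seqA": "ACGTACGTAA",
    "seqB": "ACGTTCGTAA",
    "seqC": "ACGTACCTAA",
}
window = find_distinguishing_window(toy_alignment)

toy_markers = {"species_1": "AAAC", "species_2": "GGGT"}
counts_first = count_markers(["TTAAACTT", "AAACGG", "CCGGGTCC"], toy_markers)
counts_second = count_markers(["GGGTAA", "TTGGGT", "GGGTGGGT", "AAAC"], toy_markers)
biggest_change = most_different(counts_first, counts_second)

print(window, counts_first, counts_second, biggest_change)
```

Each read is counted at most once per species, which mirrors checking each spreadsheet row for a match.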
B. CONSERVATION AND POSITIVE SELECTION OF NON-CODING RNA
Many non-coding RNAs are turning out to have very important conserved functions. The HAR1A sequence is expressed in the human cerebral cortex during early development. Part of the sequence appears to be under strong positive selection and is referred to as Human Accelerated Region 1.
1. Use BLAT to call up the sequence for Human Accelerated Region 1A (HAR1A) using the following segment of human HAR1A: AGACGTTACAGCAACGTGTCAGCTGAAATGATGGGCGTAGACGCACGT
Show the screenshot.
2. Is the conservation limited to this short search sequence? Show data.
3. Do a CLUSTALW alignment of the region in 1) plus 100 nucleotides on either side for vertebrates ranging from human to fish. Use at least 10 species and include available primates as well as the species in Question 5. Show the alignment.
4. Show the graphical phylogenetic relationship using the alignment.
5. How many changes are present between Human and Chimp? Between Chimp and Mouse? Between Chimp and Opossum?
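Once the alignment is available, the pairwise counts in question 5 can be cross-checked with a short script. The following is a minimal sketch, assuming the two rows are copied from the same alignment so that they have equal length and "-" marks a gap. The function name `count_differences` and the toy rows are hypothetical, not real HAR1A data. Whether a gap opposite a base counts as a change is a judgment call, so it is exposed as a parameter.

```python
def count_differences(row_a, row_b, count_gaps=True):
    """Count columns where two aligned rows differ.

    Columns where both rows have a gap are skipped. A gap opposite a
    base is counted as one change only when count_gaps is True.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    changes = 0
    for a, b in zip(row_a.upper(), row_b.upper()):
        if a == "-" and b == "-":
            continue  # shared gap, nothing to compare
        if a == "-" or b == "-":
            if count_gaps:
                changes += 1
        elif a != b:
            changes += 1
    return changes


# Made-up toy rows, NOT real HAR1A sequence.
toy_human = "ACGT-ACGTA"
toy_chimp = "ACGTTACGCA"

with_gaps = count_differences(toy_human, toy_chimp)
without_gaps = count_differences(toy_human, toy_chimp, count_gaps=False)
print(with_gaps, without_gaps)
```

Reporting both numbers makes it clear how much of the divergence comes from substitutions and how much from insertions or deletions.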
C. GENEMANIA AND BIOGRID
Links to these sites and descriptive material in Nucleic Acids Research are shown below.
GENEMANIA: https://genemania.org/
NAR: https://academic.oup.com/nar/article/38/suppl_2/W214/1126704/The-GeneMANIA-prediction-server-biological-network
BIOGRID: https://thebiogrid.org/
NAR: https://academic.oup.com/nar/article/45/D1/D369/2681732/The-BioGRID-interaction-database-2017-update
1. Go to GeneMANIA.
Choose a protein of interest that is present in any of the species (dropdown icon, upper left panel). On the right side of the upper left panel is a dropdown menu to toggle the various types of interaction on or off. Turn off everything except Physical interactions. Show the output.
2. The image will show reported physical interactions as well as predicted interactions. The displayed information will appear on the right-hand side of the page. Turn off everything except the one you searched for. Try toggling the various types of data on and off in the left-hand menu. Show at least three variations on the theme. Be sure to include Attributes alone.
3. Go to BioGRID and input the same protein/organism. The initial image displays all information. On the Switch View panel, click Interactors, then Interactions, and then Network. Show an image of the last one. How does the view compare to GeneMANIA?
Attachment: Assignment Files.rar