Use MegaBLAST to search the NCBI non-redundant nucleotide database with your cDNA sequence. It is important to use MegaBLAST to limit your search; you should get about 7 BLAST hits from this search.
Answer the following questions:
1. Are there nucleotide sequences in the NCBI database that correspond to your cDNA (i.e. derived from the same gene in the same organism)?
Describe your BLAST results, providing the accession number/s and nucleotide identity matches, and indicate whether each NCBI sequence was derived from a cloned mRNA or was computer generated.
2. Were there closely related sequences from other organisms that were similar to your cDNA but did not match completely? Briefly describe them.
3. Obtain the sequence of the NCBI mRNA you identified in question 1 and show the sequence. If you identified multiple sequences, take the longest one.
4. What is the organism that is the source of your cDNA clone? Apis mellifera (bee)
5. Determine the amino acid sequence encoded by the complete mRNA
6. What is the size and pI of this putative protein?
7. Are there any SNP differences between your cDNA sequence and the NCBI mRNA? If so, indicate the SNP and position and potential changes to the encoded protein sequence.
8. Do a MegaBLAST search of the appropriate EST database using the NCBI mRNA sequence you identified in question 1.
a. Are there EST sequences that match the mRNA? Briefly indicate the data that support your conclusion. Provide the accession numbers of the two longest EST reads and their nucleotide matches.
9. Using the protein sequence obtained in question 5, determine the number of exons encoding the complete protein with a TBLASTN search against the appropriate genome database. Map the approximate positions (within a few amino acids) of the exon-intron boundaries onto the protein sequence above.
10. Does the protein encoded by the bee cDNA have hydrophobic domains that would indicate that it is associated with membranes? If so:
a. indicate what types of domains are detected, i.e. signal peptides or transmembrane-spanning domains
b. indicate where the domains are located by underlining the regions within the protein sequence
c. indicate the potential orientation of the protein with regard to the membrane
11. Does the protein encoded by the bee cDNA have amino acid sequences conserved with known functional motifs? If so,
a. Indicate which motifs are found and their putative functions. Comment on the similarity between the bee protein and the motif, i.e. is it a meaningful match? Describe.
12. Is there a mouse homolog of the bee protein?
a. Provide the sequence, its name and accession number, and describe the amino acid similarities with the bee protein, i.e. % identical residues, protein length, etc. If there is more than one splice variant of this protein, choose just one to analyze.
b. Would you consider this protein to be orthologous to the bee protein? Describe the criteria you used for determining orthology.
c. Does the putative mouse ortholog contain the same hydrophobic domains as the bee protein? Describe.
d. Does the putative mouse ortholog contain the same conserved motifs as the bee protein? Show your evidence supporting this.
e. Show a global alignment of the mouse and bee proteins. (Don't do a BLAST alignment)
f. Are there specific amino acids that are critical for the motif/s identified in d, and are these amino acids conserved between the bee and mouse proteins? If so, indicate the conserved amino acids within your alignment in e.
13. Describe (in 1/4 to 1/2 page) the general structure, (potential) function, mode of action and evolution of this gene family, drawing on your analysis and on information present in the web-based databases.
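
Questions 5 and 6 can be checked offline before comparing with a web tool. The sketch below is only an illustration, not the method the assignment prescribes. It assumes the standard genetic code, searches only the three forward reading frames, and takes the longest complete ATG-to-stop open reading frame as the coding region. The function names (`longest_orf`, `molecular_weight`) and the toy sequence are hypothetical. The residue masses are commonly tabulated average values and should be verified against a tool such as ExPASy Compute pI/Mw. The pI is not computed here because it depends on the pK set chosen.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Average residue masses in daltons (assumed values; verify before relying on them).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528


def longest_orf(mrna):
    """Return the longest ATG-to-stop translation found in the forward frames."""
    seq = mrna.upper().replace("U", "T")
    best = ""
    for frame in range(3):
        protein = None  # None means we are not currently inside an ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            aa = CODON_TABLE.get(codon, "X")  # X marks an ambiguous codon
            if protein is None:
                if codon == "ATG":
                    protein = "M"
            elif aa == "*":
                if len(protein) > len(best):
                    best = protein
                protein = None
            else:
                protein += aa
    return best


def molecular_weight(protein):
    """Sum average residue masses and add one water for the free termini."""
    return sum(RESIDUE_MASS[aa] for aa in protein) + WATER_MASS


if __name__ == "__main__":
    toy_mrna = "CCATGGCTAAATGACC"  # hypothetical sequence, not the assignment cDNA
    protein = longest_orf(toy_mrna)
    print(protein, round(molecular_weight(protein), 2))
```

An open reading frame that runs off the end of the sequence without a stop codon is ignored, so a truncated cDNA may return a shorter protein than expected. `molecular_weight` raises a `KeyError` if the protein contains an ambiguous residue (`X`).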
Attachment: code.txt
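
Question 12e asks for a global alignment rather than a BLAST (local) alignment. The difference can be illustrated with a minimal Needleman-Wunsch sketch. It assumes a simple scoring scheme (match +1, mismatch -1, linear gap penalty -2), chosen only for illustration. Dedicated tools such as EMBOSS Needle use a substitution matrix and affine gap penalties, so their output will differ for real proteins. The function names and the toy peptides are hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align sequences a and b; return (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for a[i-1] against b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the dynamic-programming matrix, end gaps are penalised (global).
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align the two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


def percent_identity(aligned_a, aligned_b):
    """Identical columns as a percentage of the full alignment length."""
    same = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * same / len(aligned_a)


if __name__ == "__main__":
    # Hypothetical toy peptides, not the bee or mouse proteins.
    top, bottom, total = needleman_wunsch("MKTAYIAK", "MKTYIAK")
    print(top)
    print(bottom)
    print(total, percent_identity(top, bottom))
```

Because the alignment spans both sequences end to end, the percent identity is calculated over the full alignment length, including gap columns. This is why a global figure can be lower than the identity BLAST reports for its best local segment.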