1. Write a pattern matching statement using a regular expression that returns the GI number from the header of a sequence in a FASTA file.
Assume that the GI number occurs between the first pair of vertical bars.
For example, given this header:
>gi|1786181|gb|AE000111.1|AE000111 Escherichia coli K-12 MG1655
print out:
GI: 1786181
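One possible approach to exercise 1 is sketched here in Perl. It assumes the header always begins with ">gi|" and that the GI number consists only of digits. The variable names $header and $gi are illustrative and do not come from the exercise.

```perl
use strict;
use warnings;

# Example header copied from the exercise text.
my $header = '>gi|1786181|gb|AE000111.1|AE000111 Escherichia coli K-12 MG1655';

# Capture the run of digits between the first pair of vertical bars.
# The bars are escaped because an unescaped | means alternation in a regex.
my ($gi) = $header =~ /^>gi\|(\d+)\|/;

# Print only when the pattern matched, so a malformed header prints nothing.
print "GI: $gi\n" if defined $gi;
```

In list context the match returns its captured groups, so $gi is left undefined when the header does not fit the assumed layout.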
2. Write a Perl program that produces the following output when searching for the word RNA in the paragraph below:
Output:
(RNAi) ends at position 49
(dsRNA) ends at position 211
(ssRNAs) ends at position 374
mRNAs. ends at position 431
Search Text (you can assign this to a string in your program):
"Several rapidly developing RNA interference (RNAi) methodologies hold the promise to selectively inhibit gene expression in mammals. RNAi is an innate cellular process activated when a double-stranded RNA (dsRNA) molecule of greater than 19 duplex nucleotides enters the cell, causing the degradation of not only the invading dsRNA molecule, but also single-stranded (ssRNAs) RNAs of identical sequences, including endogenous mRNAs."
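A minimal Perl sketch for exercise 2 follows. The expected output lists only the tokens containing RNA that end in a closing parenthesis or a period, so the pattern below is an assumption inferred from that output, not the only possible answer. The reported positions agree with the zero-based offset of the last matched character, which is pos() minus 1, provided the string starts directly at "Several" with no leading quote or whitespace. The names $text and @hits are illustrative.

```perl
use strict;
use warnings;

# Search text from the exercise, stored without the surrounding quotes.
my $text = 'Several rapidly developing RNA interference (RNAi) methodologies hold the promise to selectively inhibit gene expression in mammals. RNAi is an innate cellular process activated when a double-stranded RNA (dsRNA) molecule of greater than 19 duplex nucleotides enters the cell, causing the degradation of not only the invading dsRNA molecule, but also single-stranded (ssRNAs) RNAs of identical sequences, including endogenous mRNAs.';

my @hits;

# Assumed pattern: optional non-space prefix, the letters RNA, optional
# word characters, then a closing parenthesis or a period.
while ($text =~ /(\S*RNA\w*[).])/g) {
    # pos() is the offset just past the match, so subtract 1 to get the
    # zero-based offset of the last matched character.
    my $end = pos($text) - 1;
    push @hits, "$1 ends at position $end";
}

print "$_\n" for @hits;
```

The /g modifier in a while loop resumes each search where the previous match ended, which is what makes pos() meaningful on every iteration.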