Bioinformatics assignment - construct regular expressions, Biology


In this assignment, you should check the following sequence and test whether it contains the following restriction cut sites. The search should be global, that is, it should find all possible restriction sites. If restriction sites are present, print out the regex, the substring that matched the regex, and the position where the cut begins.

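Hint: the pos function returns the offset at which the most recent m//g match on a string ended, so the start of the match is that offset minus the length of the matched substring. Play around with it to see how it works.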
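As a minimal sketch of the hint, the loop below runs a global match over a toy sequence and prints what pos reports after each match. The sequence and the GAATTC motif are illustrative assumptions, not part of the assignment. Note that pos gives the offset just past the match, so the start is recovered by subtracting the match length.

```perl
use strict;
use warnings;

# Toy sequence and motif, chosen only for illustration.
my $seq = 'TTGAATTCAAGAATTCTT';
my @starts;

# In scalar context, m//g resumes from where the previous match ended.
while ($seq =~ /(GAATTC)/g) {
    my $end   = pos($seq);          # offset just past the match
    my $start = $end - length($1);  # offset where the match began
    push @starts, $start;
    print "matched '$1' start=$start end=$end\n";
}
# Prints:
# matched 'GAATTC' start=2 end=8
# matched 'GAATTC' start=10 end=16
```

The offsets are 0-based, as pos reports them; add 1 if 1-based positions are wanted.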

Construct regular expressions for the two restriction enzyme motifs. Each restriction enzyme motif should be represented by one regular expression:

CACNNN/GTG (so CACNNN or CACGTG), where N represents A, C, T, or G

GCCWGG, where W represents A or T
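One way to turn ambiguity codes such as N and W into a regular expression is to map each code to a character class. The sketch below assumes a hypothetical helper named iupac_to_regex and a lookup table covering only the two codes used here; how the "/" in the first motif should be handled is left to the reader.

```perl
use strict;
use warnings;

# Hypothetical lookup table: ambiguity code => regex character class.
my %iupac = (
    N => '[ACGT]',
    W => '[AT]',
);

# Hypothetical helper: expand ambiguity codes in a motif into a regex string.
# Letters without an entry in the table are kept as literal bases.
sub iupac_to_regex {
    my ($motif) = @_;
    return join '', map { exists $iupac{$_} ? $iupac{$_} : $_ } split //, $motif;
}

my $pattern = iupac_to_regex('GCCWGG');
print "$pattern\n";    # GCC[AT]GG

# Compile once and reuse the compiled regex for matching.
my $re = qr/$pattern/;
print "match\n" if 'AAGCCTGGAA' =~ $re;
```

Writing the character class by hand, for example GCC[AT]GG, works just as well for a fixed pair of motifs; the table is only convenient when more codes are involved.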

The DNA sequence you will be searching in is this one, which you will paste into your program:

$dna = 'AACAGCACGGCAACGCTGTGCCTTGGGCACCATGCAGTACCAAACGGAACGATAGTGAAAACAATCACGA

ATGACCAAATTGAAGTTACTAATGCTACTGAGCTGGTTCAGAGTTCCTCAACAGGTGAAATATGCGACAG

TCCTCATCAGATCCTTGATGGAGAAAACTGCACACTAATAGATGCTCTATTGGGAGACCCTCAGTGTGAT

GGCTTCCAAAATAAGAAATGGGACCTTTTTGTTGAACGCAGCAAAGCCTACAGCAACTGTTACCCTTATG

ATGTGCCGGATTATGCCTCCCTTAGGTCACTAGTTGCCTCATCCGGCACACTGGAATTTAACAATGAAAG

CTTCAATTGGACTGGAGTCACTCAAAATGGAATCAGCTCTGCTTGCAAAAGGAGATCTAATAACAGTTTC

TTTAGTAGATTGAATTGGTTGACCCACTTAAAATTCAAATACCCAGCATTGAACGTGACTATGCCAAACA

ATGAAAAATTTGACAAATTGTACATTTGGGGGGTTCACCACCCGGGTACGGACAATGACCAAATCTTCCT

GTATGCTCAAGCATCAGGAAGAATCACAGTCTCTACCAAAAGAAGCCAACAGACTGTAATCCCGAATATC

GGATCTAGACCCAGAGTAAGGAATATCCCCAGCAGAATAAGCATCTATTGGACAATAGTAAAACCGGGAG

ACATACTTTTGATTAACAGCACAGGGAATTTAATTGCTCCTAGGGGTTACTTCAAAATACGAAGTGGGAA

AAGCTCAATAATGAGATCAGATGCACCCATTGGCAAATGCAATTCTGAATGCATCACTCCAAATGGAAGC

ATTCCCAATGACAAACCATTTCAAAATGTAAACAGGATCACATATGGGGCCTGGCCCAGATATGTTAAGC

AAAACACTCTGAAATTGGCAACAGGGATGCGAAATGTACCAGAGAAACAAACTAGAGGCATATTTGGCGC

AATCGCGGGTTTCATAGAAAATGGTTGGGAAGGAATGGTGGATGGTTGGTACGGTTT'

If you print out $dna, you may notice that the sequence is wrapped at roughly 70 characters per line. This means that $dna currently contains \n characters, which affect how a regex matches against the string: a restriction site that spans a line break will not be found. To correctly identify all possible restriction sites, you first need to remove those newline characters. This can be done by applying the substitution operator right after the variable declaration (similar to what we did in the vim writing exercises):

$dna =~ s/\s//g; # What would happen if the 'g' modifier is removed?
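To see the effect of the 'g' modifier, the sketch below applies the substitution with and without it to a short wrapped string. The string and the GACC motif are made up for illustration.

```perl
use strict;
use warnings;

# Three short lines standing in for a wrapped sequence.
my $wrapped = "ACGT\nTTGA\nCCGG\n";

(my $all   = $wrapped) =~ s/\s//g;  # removes every whitespace character
(my $first = $wrapped) =~ s/\s//;   # removes only the first one

print length($all), "\n";    # 12
print length($first), "\n";  # 14

# GACC spans the second line break, so it is only found once
# all newlines are gone.
print "found in \$all\n"   if $all   =~ /GACC/;   # prints
print "found in \$first\n" if $first =~ /GACC/;   # does not print
```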

The following is the expected output. Instead of "$pattern1" and "$pattern2", you should be printing out the actual regular expression that you used to match the restriction enzyme motif. I did not print it out because that would give you part of the answer.

The program should include:

Two regular expressions, one for each enzyme.

One variable that contains the DNA sequence.

Optional: if you would like to challenge yourself, include some code that accepts one command-line argument. If one is given, replace $dna above with the sequence provided by the user. Ensure that the provided sequence is a DNA sequence; otherwise, print a helpful message back to the user and end the program.

One subroutine called find_cut_sites that will accept 2 parameters: a DNA sequence and a regular expression. The subroutine should match the regular expression against the sequence and print the positions of all found cut sites. The position printed should be the starting position of where the site was found. Nothing should be explicitly returned by this subroutine. Whenever a subroutine does not explicitly return anything, it is known to be a void subroutine (void because a result is not provided back to the caller).

(There should be two subroutine calls for find_cut_sites(): one for each regular expression.)

Comments describing your subroutine (what it accepts, what it returns, what it does) and any other ambiguous code.
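The requirements above can be sketched roughly as follows. This is one possible shape, not the expected solution: the toy sequence, the is_dna helper, the output wording, and the choice of 0-based offsets are all assumptions, and only the second motif is shown.

```perl
use strict;
use warnings;

# find_cut_sites: a sketch of the void subroutine described above.
# Accepts: a DNA sequence (string) and a regular expression (string).
# Prints:  the regex, then each matched substring with its 0-based start.
# Returns: nothing (bare return).
sub find_cut_sites {
    my ($sequence, $regex) = @_;
    print "Searching with $regex\n";
    while ($sequence =~ /($regex)/g) {
        # pos() is the offset just past the match; subtract its length.
        my $start = pos($sequence) - length($1);
        print "  $1 found at position $start\n";
    }
    return;
}

# Hypothetical validator for the optional command-line sequence:
# true only if the string consists entirely of A, C, G, and T.
sub is_dna {
    my ($candidate) = @_;
    return $candidate =~ /\A[ACGT]+\z/i ? 1 : 0;
}

my $dna = 'AAGCCTGGTTGCCAGGAA';   # toy sequence, not the assignment's
if (@ARGV) {
    is_dna($ARGV[0])
        or die "Error: the argument must contain only A, C, G, and T.\n";
    $dna = uc $ARGV[0];
}

find_cut_sites($dna, 'GCC[AT]GG');
```

With no command-line argument, this prints the two sites in the toy sequence, GCCTGG at position 2 and GCCAGG at position 10. A second call with the other enzyme's pattern would complete the two calls the assignment asks for.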

Attachment:- Assignment File.rar
