1. For each of the following cases, is it appropriate to model the data using an HMM? Explain your reasoning.
-Monthly precipitation data in Rochester
-A dataset of indoor images
-Handwriting recognition
-Daily S&P 500 stock market price
2. Follow the graph to decompose the joint probability P(x1, x2, x3, x4, x5, x6).
3. A DNA sequence is a series of symbols drawn from {A, C, G, T}. Assume the hidden variable X takes 2 possible state values {s1, s2}, and the parameters of the HMM are as follows:
Transition probabilities: P(s1|s1) = 0.8, P(s1|s2) = 0.2
P(s2|s1) = 0.2, P(s2|s2) = 0.8
Emission probabilities: P(A|s1) = 0.4, P(C|s1) = 0.1, P(G|s1) = 0.4, P(T|s1) = 0.1,
P(A|s2) = 0.1, P(C|s2) = 0.4, P(G|s2) = 0.1, P(T|s2) = 0.4
Initial probabilities: P(s1) = 0.5, P(s2) = 0.5
The observed sequence is Y = CGTCAG.
-Calculate P(Y|M), where M denotes the HMM defined above. (Hint: refer to the lecture slides to calculate the forward messages recursively. Each forward message is a vector containing the probabilities of the two hidden states at that time step.)
-Calculate P(x3 = s1|Y). (Hint: refer to the lecture slides for the recursive derivation of the forward-backward algorithm).
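
To check hand calculations for problem 3, the two recursions can be sketched in a few lines of Python. This is only an illustrative sketch, not the lecture's reference implementation: all names (INIT, TRANS, EMIT, forward, backward, likelihood, posterior) are hypothetical, and it assumes that P(sj|si) means the probability of moving to sj given the current state is si. Since the transition matrix here is symmetric, the opposite reading would give the same numbers.

```python
# Parameters copied from problem 3. All identifiers are hypothetical.
INIT = [0.5, 0.5]                      # P(x1 = s1), P(x1 = s2)
# Assumption: TRANS[i][j] = P(next state = s_{j+1} | current state = s_{i+1})
TRANS = [[0.8, 0.2],
         [0.2, 0.8]]
EMIT = [
    {"A": 0.4, "C": 0.1, "G": 0.4, "T": 0.1},   # emissions from s1
    {"A": 0.1, "C": 0.4, "G": 0.1, "T": 0.4},   # emissions from s2
]
N = len(INIT)


def forward(obs):
    """Return forward messages; alpha[t][i] = P(y_1..y_{t+1}, x_{t+1} = s_{i+1})."""
    alpha = [[INIT[i] * EMIT[i][obs[0]] for i in range(N)]]
    for y in obs[1:]:
        prev = alpha[-1]
        alpha.append([
            EMIT[j][y] * sum(prev[i] * TRANS[i][j] for i in range(N))
            for j in range(N)
        ])
    return alpha


def backward(obs):
    """Return backward messages; beta[t][i] = P(y_{t+2}..y_T | x_{t+1} = s_{i+1})."""
    beta = [[1.0] * N]                 # message at the final time step
    for y in reversed(obs[1:]):        # y is the observation one step ahead
        nxt = beta[0]
        beta.insert(0, [
            sum(TRANS[i][j] * EMIT[j][y] * nxt[j] for j in range(N))
            for i in range(N)
        ])
    return beta


def likelihood(obs):
    """P(Y | M): sum of the last forward message."""
    return sum(forward(obs)[-1])


def posterior(obs, t):
    """P(x_t = s_i | Y) for a 1-indexed time step t, as a list over states."""
    alpha, beta = forward(obs), backward(obs)
    joint = [a * b for a, b in zip(alpha[t - 1], beta[t - 1])]
    total = sum(joint)
    return [p / total for p in joint]


if __name__ == "__main__":
    Y = "CGTCAG"
    print("P(Y|M)       =", likelihood(Y))
    print("P(x3 = s1|Y) =", posterior(Y, 3)[0])
```

A useful sanity check is that the sum over states of alpha times beta gives the same value, P(Y|M), at every time step; if it does not, one of the recursions has an indexing error.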