Assignment task: Consider a variant of the LDS model in which the latent transition depends on an observed sequence of inputs y_{1:T}:
z_{t+1} = A z_t + B y_t + w_t
where the matrix B is an additional model parameter and y_t is the observed input vector at time t. The observation model is
x_t = C z_t + D d_t + v_t
where d_t is also observed, D is an additional model parameter, and v_t is the observation noise, distinct from the transition noise w_t.
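To make the model concrete, here is a minimal sampling sketch in Python with NumPy. The task does not specify the noise distributions or the initial state, so the Gaussian covariances Q (transition) and R (observation), the initial mean mu0 and covariance V0, and the function name simulate_lds_with_inputs are all assumptions made for illustration.

```python
import numpy as np

def simulate_lds_with_inputs(A, B, C, D, Q, R, y, d, mu0, V0, seed=0):
    """Sample latents z and observations x from the input-driven LDS.

    y: (T, dy) inputs to the transition, d: (T, dd) inputs to the observation.
    Q, R, mu0, V0 are assumed Gaussian noise / initial-state parameters.
    """
    rng = np.random.default_rng(seed)
    T = y.shape[0]
    nz, nx = A.shape[0], C.shape[0]
    z = np.zeros((T, nz))
    x = np.zeros((T, nx))
    z[0] = rng.multivariate_normal(mu0, V0)
    for t in range(T):
        # Observation: x_t = C z_t + D d_t + observation noise
        x[t] = C @ z[t] + D @ d[t] + rng.multivariate_normal(np.zeros(nx), R)
        if t + 1 < T:
            # Transition: z_{t+1} = A z_t + B y_t + transition noise
            z[t + 1] = A @ z[t] + B @ y[t] + rng.multivariate_normal(np.zeros(nz), Q)
    return z, x
```

Note the indexing convention: the input y[t] drives the step from z[t] to z[t + 1], matching the transition equation above.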
Provide a detailed mathematical explanation of how the Kalman filtering and smoothing updates change for this variant, and show mathematically how the EM-based parameter estimation procedure changes.
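As a starting point for the filtering and smoothing part, the following is a hedged sketch rather than a reference solution. It assumes zero-mean Gaussian noise with covariances Q (transition) and R (observation) and a Gaussian prior with mean mu0 and covariance V0, none of which are given in the task. The function names kalman_filter_inputs and rts_smoother are hypothetical. Under these assumptions the inputs shift only the means: B y_t enters the predicted mean and D d_t enters the predicted observation, while the covariance and gain recursions keep their standard form.

```python
import numpy as np

def kalman_filter_inputs(x, y, d, A, B, C, D, Q, R, mu0, V0):
    """Kalman filter for z_{t+1} = A z_t + B y_t + w_t, x_t = C z_t + D d_t + v_t."""
    T, nz = x.shape[0], A.shape[0]
    mu_p = np.zeros((T, nz)); V_p = np.zeros((T, nz, nz))  # predicted moments
    mu_f = np.zeros((T, nz)); V_f = np.zeros((T, nz, nz))  # filtered moments
    for t in range(T):
        if t == 0:
            mu_p[t], V_p[t] = mu0, V0
        else:
            # Prediction: the input adds B y_{t-1} to the mean only
            mu_p[t] = A @ mu_f[t - 1] + B @ y[t - 1]
            V_p[t] = A @ V_f[t - 1] @ A.T + Q
        S = C @ V_p[t] @ C.T + R                      # innovation covariance
        K = np.linalg.solve(S, C @ V_p[t]).T          # gain K = V_p C^T S^{-1}
        # Innovation: the predicted observation includes D d_t
        resid = x[t] - (C @ mu_p[t] + D @ d[t])
        mu_f[t] = mu_p[t] + K @ resid
        V_f[t] = (np.eye(nz) - K @ C) @ V_p[t]
    return mu_p, V_p, mu_f, V_f

def rts_smoother(mu_p, V_p, mu_f, V_f, A):
    """Rauch-Tung-Striebel backward pass; inputs enter only through mu_p."""
    T = mu_f.shape[0]
    mu_s, V_s = mu_f.copy(), V_f.copy()
    for t in range(T - 2, -1, -1):
        J = np.linalg.solve(V_p[t + 1], A @ V_f[t]).T  # J = V_f A^T V_p^{-1}
        mu_s[t] = mu_f[t] + J @ (mu_s[t + 1] - mu_p[t + 1])
        V_s[t] = V_f[t] + J @ (V_s[t + 1] - V_p[t + 1]) @ J.T
    return mu_s, V_s
```

The backward recursion has the same form as in the standard LDS because the input term is already contained in the predicted mean. For EM, the M-step then estimates [A B] jointly by regressing z_{t+1} on the stacked vector [z_t; y_t], and [C D] jointly by regressing x_t on [z_t; d_t], with the required expectations taken under the smoothed posterior.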