1. Which of the following functions cannot be a valid autocorrelation function? Justify your answer for each function.
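(a) $R_x(\tau) = 2e^{-\tau^2}$, $-\infty < \tau < \infty$
(b) $R_x(\tau) = |\tau| e^{-|\tau|}$, $-\infty < \tau < \infty$
(c) $R_x(\tau) = \left( \frac{\sin(\pi\tau)}{\pi\tau} \right)^2$, $-\infty < \tau < \infty$
(d) $R_x(\tau) = \left( 0.2\cos(3\pi\tau) \right)^3$, $-\infty < \tau < \infty$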
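As a numerical sanity check, the sketch below tests two necessary properties of an autocorrelation function, evenness and a maximum magnitude at the origin, on a finite grid. The grid, the tolerance `tol`, and the variable names are assumptions made for illustration. Passing both tests does not prove validity (a full argument also needs a nonnegative power spectral density), so this supports the written explanation rather than replacing it.

```matlab
% Grid of lags built from integers so that tau = 0 is hit exactly.
tau = (-1000:1000)' / 100;

% Candidate functions (a)-(d); the third guards the 0/0 at tau = 0.
R = { @(t) 2*exp(-t.^2), ...
      @(t) abs(t).*exp(-abs(t)), ...
      @(t) (sin(pi*t)./(pi*t + (t==0)) + (t==0)).^2, ...
      @(t) (0.2*cos(3*pi*t)).^3 };

tol = 1e-12;                      % illustrative numerical tolerance
ok  = false(1, numel(R));
for i = 1:numel(R)
    r  = R{i}(tau);
    r0 = R{i}(0);
    isEven     = max(abs(r - flipud(r))) < tol;   % R(tau) = R(-tau)
    peakAtZero = r0 >= max(abs(r)) - tol;         % R(0) >= |R(tau)|
    ok(i) = isEven && peakAtZero;                 % necessary, not sufficient
end
disp(ok)
```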
2. Write a MATLAB program to analyze a speech file and simultaneously, on one page, plot the following measurements.
1. The entire speech waveform (x-axis in seconds)
2. The short-time energy, $E_{\hat{n}}$
3. The short-time magnitude, $M_{\hat{n}}$
4. The short-time zero-crossing, $Z_{\hat{n}}$
Use the speech waveform in file timit samples/mdab0/sx229.wav to test your program. Choose an appropriate frame size (L), frame shift (R), and window type (Hamming or rectangular) for the analysis, and explain your choice of these parameters. Provide your MATLAB program in an ".m" file as your answer.
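The three short-time measurements can be sketched as follows. This is a minimal illustration on a synthetic tone, not on the speech file. The definitions used here (windowed sum of squares, windowed sum of magnitudes, and a count of sign changes per frame with a rectangular window) are assumed textbook forms and may differ from the course's exact definitions. The values of `fs`, `L`, and `R` are illustrative choices, not recommended answers.

```matlab
% Synthetic test signal (assumption: stands in for the speech file).
fs = 16000;                        % sampling rate in Hz (illustrative)
t  = (0:fs/2-1)' / fs;             % 0.5 s of samples
x  = sin(2*pi*200*t);              % 200 Hz tone

L = 400;                           % frame length: 25 ms at 16 kHz (illustrative)
R = 160;                           % frame shift: 10 ms (illustrative)
w = 0.54 - 0.46*cos(2*pi*(0:L-1)'/(L-1));   % Hamming window, no toolbox needed

nFrames = floor((length(x) - L)/R) + 1;
E = zeros(nFrames,1);              % short-time energy
M = zeros(nFrames,1);              % short-time magnitude
Z = zeros(nFrames,1);              % zero crossings per frame
for k = 1:nFrames
    seg  = x((k-1)*R + (1:L));     % samples of frame k
    E(k) = sum((seg .* w).^2);
    M(k) = sum(abs(seg) .* w);
    s = sign(seg);
    s(s == 0) = 1;                 % treat exact zeros as positive
    Z(k) = sum(abs(diff(s))) / 2;  % number of sign changes in the frame
end
frameTime = ((0:nFrames-1)' * R + L/2) / fs;   % frame centers in seconds
```

To place all four panels on one page, `subplot(4,1,k)` with `plot(t, x)` for the waveform and `plot(frameTime, E)`, `plot(frameTime, M)`, `plot(frameTime, Z)` for the measurements is one possible layout.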
3. Write a MATLAB program to compute the two variants of the short-time autocorrelation as we defined in Lecture 7, namely: short-time autocorrelation and modified short-time autocorrelation.
(a) For this exercise you will have to specify the frame length, L, and the maximum number of autocorrelation points, K. Compare the short-time autocorrelation estimates from the two variants for several voiced regions. Which short-time autocorrelation estimate would be better for pitch period detection, and why?
(b) Try to estimate a suitable threshold for selecting the autocorrelation peak in voiced regions, one that would be appropriate for pitch detection. By processing a speech file (you may use sx229.wav) on a frame-by-frame basis, determine the pitch contour of the file by detecting the peak of the autocorrelation function. You may classify a frame as voiced if the peak is above the threshold, or as unvoiced if the peak is below it.
You may plot detected pitch values for voiced frames. Which variant of the autocorrelation function works better?
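The two variants and the peak-picking step can be sketched for a single frame as follows. The definitions are assumptions, since Lecture 7 is not reproduced here: the short-time autocorrelation is taken as the autocorrelation of the Hamming-windowed L-sample frame, and the modified version correlates L unwindowed samples against a segment extended by K samples. The synthetic signal, the 80-400 Hz search range, and the threshold `thr` are illustrative placeholders, not tuned values.

```matlab
% Synthetic "voiced" signal with a 100 Hz fundamental (assumption).
fs = 16000;
n  = (0:fs/4-1)';
f0 = 100;                                   % pitch period = 160 samples
x  = sin(2*pi*f0*n/fs) + 0.5*sin(2*pi*2*f0*n/fs);

L = 480;                                    % frame length: 30 ms (illustrative)
K = 320;                                    % maximum lag: 20 ms (illustrative)
start = 1;                                  % first sample of the frame
w  = 0.54 - 0.46*cos(2*pi*(0:L-1)'/(L-1));  % Hamming window

xw = x(start:start+L-1) .* w;               % windowed frame, L samples
xl = x(start:start+L+K-1);                  % extended segment, L+K samples
Rst  = zeros(K+1,1);                        % short-time autocorrelation
Rmod = zeros(K+1,1);                        % modified short-time autocorrelation
for k = 0:K
    Rst(k+1)  = sum(xw(1:L-k) .* xw(1+k:L));    % fewer terms as k grows
    Rmod(k+1) = sum(xl(1:L)   .* xl(1+k:L+k));  % always L terms
end

% Peak picking over an assumed pitch range of 80-400 Hz.
kmin = round(fs/400);
kmax = round(fs/80);
[pk, idx] = max(Rmod(kmin+1:kmax+1));
lag = kmin + idx - 1;                       % lag of the peak in samples

thr      = 0.3;                             % illustrative threshold
isVoiced = pk / Rmod(1) > thr;              % peak normalized by lag-0 value
pitchHz  = fs / lag;
```

Repeating this per frame, advancing `start` by the frame shift and keeping `pitchHz` only when `isVoiced` is true, would give one possible pitch contour.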