E27: Computer Vision Spring 2016 - HOMEWORK 3
1. Image warping and invertible transformations
Given a digital image, and an invertible transformation H˜ of the form
p˜' ≡ H˜p˜
we would like to compute the warped image whereby each point p˜ in the original image is transformed to its new location p˜'. This type of image warping is exactly what the cv2.warpPerspective function does, for example.
We could envision a somewhat straightforward algorithm for performing this image warp: for each location p˜ in the original image, compute the nearest pixel location of the transformed point p˜' in the warped image, and copy the color found in p˜ to the warped image at location p˜'.
However, the vastly preferable algorithm is to loop over the destination pixels p˜' in the warp image, and use the inverse transformation H˜-1 to identify the nearest pixel p˜ in the source image and copy the color from that source pixel to the destination.
What is the difference between the two approaches? Why is the second one preferable?
2. Convolution vs. Correlation
Given two real-valued functions f and g of one variable, the convolution of f and g is defined as:
(f * g)(x) = -∞∫∞ f(x - u) g(u) du
The correlation of f and g is defined similarly:
(f ⊗ g)(x) = -∞∫∞f(x + u) g(u) du
a. How are convolution and correlation related? Given any functions f(x) and g(x), define a function h(x) such that
f * g = f ⊗ h
What is h(x) in terms of g(x)?
b. Filtering confusion. Although we have defined filtering operations in class in terms of convolution, many software packages (including OpenCV) define filtering in terms of correlation instead. As we will see, it doesn't make a huge difference.
Imagine that g(x) is a symmetric (even) function, such as a Gaussian, where
G(x) = g(-x)
Based upon your answer above, can you tell the difference between f * g and f ⊗ g? Why or why not?
Next, imagine that g(x) is antisymmetric (odd), where
G(x) = -g(-x)
Now can you tell the difference? What is the mathematical relationship between f * g and f ⊗ g in this case?
3. Image derivatives and separability
a. The Gaussian blur operation is seperable. Let I(x, y) be a continuous function specifiying image intensity at the point (x, y), and let G(x, y) be a Gaussian function:
G(x, y) = 1/Z2exp(-(x2+y2/2σ2))
where Z2 is a normalizing constant that makes G integrate (or sum, in the discrete domain) to 1. Show that the convolution of I with G can be represented as convolution with two functions gx and gy that depend only on the x and y coordinates respectively.
Remember, the discrete convolution I * G can be expressed as the sum
(I * G)(x, y) = ∑u∑vI(x - u, y - v)G(u, v)
What are the functions gx(x) and gy(y) such that I * G = I * gx * gy?
b. The partial derivative of a Gaussian is separable. Let G→(x, y) be defined as the derivative of the Gaussian function with respect to x:
G→(x, y) = (∂/∂x)G(x, y)
show that the convolution (I * G→) can also be represented as a convolution with two functions that depend only on x and y.
c. Why do we care about separability? Say we are convolving a w x h image I with an arbitrary square n x n kernel G. How many operations (in terms of big-O notation) does it take to evaluate (I * G) for each pixel in I? How many operations if G is separable and we can evaluate the convolution as ((I * gx) * gy), where gx and gy are both kernels with size n? In short, why do we prefer to use separable convolutions?
4. Non-linear filtering
The erosion, dilation, opening and closing operators that we discussed in class are sometimes considered "non-linear filters". How are they similar to convolutions, and how are they different? What are some other examples of non-linear filters and their uses?