E19: Numerical Methods for Engineering Applications Spring 2016 - PROJECT 4
Project: Benchmarking gradient-free optimizers
OVERVIEW
In this project, you will investigate the effectiveness of several gradient-free optimization techniques in solving a particular optimization problem: lossy image compression.
BACKGROUND
For this project, we will approximate a grayscale image using a sum of k 2D Gaussian functions. Each Gaussian function has six parameters influencing its shape, size, location, rotation, and intensity. Therefore, the overall image may be approximated by 6k numbers. Contrast this to the "traditional" method of representing a w-by-h image as a list of pixel intensities - if our Gaussian functions approximate the image well, and if 6k << w x h, we have effectively compressed the image by representing it with a much smaller list of numbers.
Our objective function f(x) for this optimization problem will be the sum of squared differences in pixel values between the original image and the Gaussian reconstruction. We will investigate how this objective is minimized by three different optimization methods: greedy hill climbing2, the Nelder-Mead method (which we discussed in class), and finally, Powell's method (which you will research on your own). Furthermore, since the objective function allows us to select different values of k, we can investigate how each method's performance scales with problem size.
Your goal is to generate a number of random intial x vectors for a variety of values of k, and test the performance of all of the methods on each vector. FYI, the function scipy.optimize.minimize() can be used to run either Nelder-Mead or Powell by passing in the appropriate method parameter.
DISCLAIMER
Real-life image compression (e.g. JPEG) doesn't work this way - it's much cleverer. We are just hacking together a silly representation of a compressed image that lets us play with optimizers!
TASKS
Download the starter code from the course website. It contains a very useful base for generating and visualizing sums of Gaussian functions as images.
Implement the objective function. Look at the portion of the starter code which maps the vector of parameters x to the resulting objective function value (on lines 167-174).
Your first task should be to wrap up all of the variables besides x (e.g. xmesh, ymesh, target) into a Python lambda object f which accepts x as input and provides the objective function value as output. Remember, since we are using lambdas, there is no need to define any new functions with the def keyword to define a function, or to use global variables anywhere!
Implement a mutation function, and run the hill climbing algorithm. Now create a lambda object called mutate which takes single parameter vector x as input and which outputs a perturbed version by calling the perturb_params function (your lambda should wrap up the w and h parameters required by this function).
Once you have f and mutate, you should be able to do the hill climbing method by passing these lambdas to the greedy_random_hillclimbing function. Try running it on a smallish (k = 5) parameter vector for a small number of iterations (perhaps 100 or so). Although the results will not substantially resemble the input image, you should at least be able to verify that the objective function is decreased by this function.
Benchmark the three methods on the image problem for k = 8, 16, and 32. For each value of k, test each algorithm 10 times on a new initial randomly generated vector. You should limit the number of function evaluations to 10,000. For hill climbing, this is passed in as a parameter to the function; for the others, you should provide options=dict(maxfev=10000) to the scipy.optimize.minimize function.
The results of your benchmarks should be summarized by including all of the following in your PDF write up:
- a set of bar graphs (with standard error bars) showing the mean runtimes of each optimization method as k is varied
- a set of bar graphs (with standard error bars) summarizing the objective function values obtained by each method as k is varied
- a set of representative images of the best and worst outputs of each optimization method (you can use matplotlib.pyplot.savefig to save figures)
Go further Take your lab a bit further by trying out one of the following ideas (or approach me if you are interested in coming up with your own idea):
- Research one of the other methods implemented by scipy.optimize.minimize and tell me what you discovered.
- Try out these optimizers on your own Engineering-inspired unconstrained multivariable minimization problem.
- Try to modify the program to compress a color image (see me for help).
WHAT TO TURN IN
You should submit all of your source code along with a PDF write-up containing your plots and answers to these questions (a few sentences on each should suffice).
- Which optimization method acheieves the best performance, and why? You may find it instructive to flip through the first few slides at https://informatik.unibas.ch/fileadmin/Lectures/HS2013/CS253/PowellAndDP1.pdf to learn a bit about Powell's method in order to explain the results.
- Which method is slowest? Fastest? Explain why you think that might be the case.
- Would it be theoretically possible to use a non-linear least-squares solver such as the Gauss-Newton method to solve this image compression problem? If so, why might we still prefer to use a method like Powell or Nelder-Mead? If not, why not?
Attachment:- Assignment.zip