Laboratory 9 - Hill Climbing- the Scales Problem
Exercise 1: Small Change Method
Add the ScalesSolution class and the CS2004 class from laboratory 8 into your project. Within ScalesSolution create a method called SmallChange as follows:
publicvoid SmallChange()
{
}
This method will change a random element of the scasol field/attribute. As in the lecture notes, we will generate a random integer (say p) between 0 and n-1 (n = scasol.length()). We will then change position p of scasol, i.e. it is a '1' we make it a '0' and if it is a '0' we set it to '1'.
However this might be a little less straight forward than expected since in Java, strings are considered almost constant, i.e. there is no method to change part of a string.
There are numerous ways in which we can do this. One option is as follows:
1) Create a random integer p that ranges between 0 and n-1.
2) Create an empty new string say x.
3) Copy from elements 0 to p-1 from scasol to x.
4) Copy the changed version of position p of string scasol to x.
5) Copy from p+1 to n-1 of scasol to x.
6) Set scasol to be x.
Alternatively the method getChars within theStringclass might be of use.
Test you code on a set of logical examples, e.g. a string equal to "11111" and "00000", and then some random strings. For example:
publicstaticvoid main(String args[])
{
ScalesSolution s = new ScalesSolution("11111");
s.println();
s.SmallChange();
s.println();
}
You might get the following output:
11111
11011
Exercise 2: Hill Climbing Method
Before we start on the Hill Climbing method, we need to create a way of copying an instance of the ScalesSolution class. The simplest way to do this is to add the following method to ScalesSolution:
public String GetSol()
{
return(scasol);
}
This method simply returns the string representation that the instance of the class contains. Try the following code:
ScalesSolution s1 = new ScalesSolution(10);
s1.println();
ScalesSolution s2 = new ScalesSolution(s1.GetSol());
s2.println();
You should get the same string being displayed twice. This is how we are going to copy a solution.
We are now going to write the code of the Random Mutation Hill Climbing Algorithm (RMHC) within the Lab9 class.Create a method as follows:
publicstaticScalesSolution RMHC(ArrayList weights,int n,int iter)
{
ScalesSolution sol = new ScalesSolution(n);
return(sol);
}
This method takes in an ArrayList of weights and a parameter n of what sized solution we are searching for and will return a ScalesSolution representation of the best solution (as found by RMHC) for solving the Scales problem applied to the first n weights of the ArrayList weights. The hill climbing algorithm will run for iter iterations.
We now need to implement the RMHC as in the lecture notes:
1) We need to add a For loop that iterates for the specified number of iterations.
2) We need to create an initial random solution of size n.
3) We need to evaluate the fitness of our current solution within the loop.
4) We need to copy the current solution (say oldsol).
5) We make a small change to the current solution and evaluate the fitness to another variable.
6) If the new fitness is worse than the old, we copy oldsol back to being our current solution.
7) After the For loop has completed we return the current solution.
It is often worth displaying within the loop, the current iteration and current fitness. This allows us to verify that the fitness is decreasing or remaining the same. If it goes up then we have made a mistake in our code.
Verify that the algorithm works on some small problems, e.g. the weights 1, 2, 3,4 and 10 as in the lecture.
Exercise 3: Reading in Data and Running Some Experiments
Once you have verified that the RMHC algorithm works read in the "1000 Primes.txt" file as we did in laboratory 8 and then run the algorithm for 1,000 iterations. Once this is working, run the algorithm ten times and record the best fitness for each run. Do you get the same average as in the lecture notes? Time how long each run takes for 1,000 iterations.
Now run the algorithm for as many times as possible within 5 minutes, running each repeat for 10,000 iterations. Does a run for 10,000 iterations take ten times longer than a run for 1,000? What average result do you get? How does it compare with the results in the lecture notes?
Finally do you think using a String for the representation was a good idea? If not, what would have been better?
Laboratory 10 - Parameter Estimation - Projectile Modelling
Exercise 1: The Cannon Class
In Appendix A there is a class that can be extracted, called Cannon.java.
This class has three methods (there are more but we will be using only three):
publicstatic Double GetMaxRange(Double angle,Double muzzlevelocity)
This method simulates a K12 projectile being fired at a given angle (in degrees between 25-55) and muzzle velocity (in metres per second between 1,500-1,650). The return value is the horizontal range in metres. Note that these two parameters are restricted as described in the previous section.
publicstatic ArrayList GetX()
This method returns an array list containing the simulated x-axis points the projectile passed through from the last call toGetMaxRange.
publicstatic ArrayList GetY()
This method returns an array list containing the simulated y-axis points the projectile passed through from the last call to GetMaxRange.
If you get both the x-axis and y-axis points, then you can plot the flight of the K12 projectile. The length of the return values of GetX and GetY should always be the same as each other (array list size). Tell Dr Swift if it isn't!
Implement the follow code fragment:
double r = Cannon.GetMaxRange(40.0,1575.0);
System.out.println(r);
ArrayList xt = Cannon.GetX();
ArrayList yt = Cannon.GetY();
System.out.println(xt.size());
System.out.println(yt.size());
This simulates the cannon being fired at an angle of 40 degrees with a muzzle velocity of 1,575 metres per second. Print out (display or write to a file)the paired values forx and y and then plot them in Excel.
Exercise 2: Hill Climbing Method
We are now going to design a hill climbing algorithm to solve the following problems:
1) What is the maximum range of K12? What angle and muzzle velocity is needed?
2) What is the minimum range of K12? What angle and muzzle velocity is needed?
3) What angle and muzzle velocity is needed to hit a target 75,000 metres away? How close can you get?
4) What angle and muzzle velocity is needed to hit a target 95,000 metres away? How close can you get?
5) What angle and muzzle velocity is needed to hit a target 65,000 metres away? How close can you get?
You can reuse some of your previous hill climbing code for these exercises.
We will need to design/decide upon the following elements:
1) A representation
2) A fitness function
3) A random starting point
4) A small change operator
5) The number of iterations to run for
Laboratory 11 - A Simple Genetic Algorithm Applied to the Scales Problem
Exercise 1: The Simple Genetic Algorithm
In Appendix A there are four classes that are to be extracted:
Class
|
Function
|
Lab11
|
The main class for running the experiments.
|
SimpleGeneticAlgorithm
|
The Genetic Algorithm Class. You will need to understand, use and modify this class.
|
ScalesChrome
|
The Chromosome class for the Scales problem. You will need to understand and use this class.
|
CompareScalesChrome
|
This is used by the sort method to sort the population of ScalesChromosomes. You need to use this class.
|
The SimpleGeneticAlgorithm class performs the Genetic Algorithm. It creates the initial random population containing a number of ScalesChromeobjects, it iterates for the specified number of generations or fitness function calls, and performs the genetic operators of Crossover, Mutation and Survival. The best individual from the final population is then returned as the solution to the specified size of Scales problem.
Create a project in Eclipse called Lab11and copy the classes into the project. Examine the main method of class Lab11.
The constructor for the SimpleGeneticAlgorithm class has the following format:
public SimpleGeneticAlgorithm(int ps,int gs,int nb,double mr,double cr)
The parameters are defined as follows:
Parameter
|
Function
|
ps
|
The population size for the genetic algorithm.
|
gs
|
The number of generations to run for.
|
nb
|
The number of weights (genes) we are solving the Scales problem for.
|
mr
|
The Mutation rate.
|
cr
|
The Crossover rate
|
We are going to run the Scales problem for 1000 weights (as in the example code). We are also going to run the Genetic Algorithm for 10,000 fitness function calls rather than the specified number of generations. This is so that we can compare the results for a number of different experiments. If we change any of the parameters, then we will create differing number of individuals between experiments, and thus it may appear that one run was better than another, but it may have been the case that the best result evaluated twice as many fitness calls that the other.Thus we have two versions of the RunSGA method. One version runs the GA for the specified number of generations, and the other runs it for a number of fitness function calls.
Read through the comments and look at the structure of the classes and then run the program. What do you think of the quality of the results when compared to your Hill Climbing results?
Exercise 2: Optimising the Simple Genetic Algorithm
You should have found that the each run of the GA produces absolutely useless (very poor) results. This is because the parameters are not set correctly. Look at the lecture notes and choose more sensible values for the population size, the crossover rate and the mutation rate. You should be aiming to be able to run the GA for 10 times and get a final solution of 1 each time!!!Turn on the reporting flag (change the second parameter in the RunSGA call to true), run the GA with your optimal parameters once, and then plot the average fitness and best fitness against generation number as in the diagram below.
Laboratory 12 - Further Genetic Algorithms
Exercise 1: Modifying the GA
We are going to modify Laboratory 11's simple Genetic Algorithm (GA). Perform the following steps:
1) Copy and rename the CompareScalesChrome.java, ScalesChrome.java and the SimpleGeneticAlgorithms.java files. In the filename and within the files themselves (use the EclipseFind/Replace menu option) change every instance of ScalesChrome to OneMaxChrome. Alternatively you can use the Eclipse Refactor tool.
2) Create a project Lab12 within Eclipse based around the Lab11 project.
3) The first task is to open up the CompareOneMaxChrome class and then change the signs of the two return statements(a 1 to a -1 and a -1 to a 1). This converts the problem from a minimisation problem to a maximisation one.
4) You should now only need to modify the OneMaxChrome.java file if you renamed everything correctly. Within this file you no longer need the weights field as the OneMax problem does not need any weights. Remove and modify the code so that there are no references to weights.
5) Remove the code within the GetFitness function and implement the OneMax fitness function.
Exercise 2: Experiments
Conduct a number of experiments to evaluate which version of the two Crossover operators performs best on the OneMax problem. Run each experiment at least ten times and vary the size of the problem you are solving.
Plot appropriate convergence graphs and record average performances.
Laboratory 14 - Data Clustering
Exercise 1: Running the KMeans Algorithm
Within the KMeans class, a clustering arrangement of n items is represented by an length vector, say C, where each element ci = x means that data item i is in cluster x. For example, imagine we had a result of {1,2,1,2,1,2} for 6 items, then rows/items {1,3,5} are in cluster 1 and rows/items {2,4,6} are in cluster 2.
We are now going to run the KMeans algorithm on the ClusterLab.txt dataset. Perform the following steps:
1) Create the expected clustering arrangement for the dataset as described in section 14.2. For example, the first 25 variables could be in cluster ID 1, the next 50 in cluster ID 2 and the last 25 in cluster ID 3. You will need to create an ArrayList to store the expected arrangement in.
2) Read in the ClusterLab.txt dataset using the ReadArrayFile method within the KMeans class.
3) Create a new KMeans object, specifying the dataset, number of columns and number of rows as required.
4) Run the KMeans algorithm by calling theRunIter method, specify three clusters (since that is what we are expecting), ten iterations (this should be enough) and the expected clustering arrangement. Store the result in an ArrayList.
5) Compare the result with the expected arrangement using the Kappa metric.
Exercise 2: How Consistent is KMeans?
Run the KMeans algorithm ten times and display the Kappa metric as in section 14.4 each time. What do you notice? Calculate the mean, maximum and minimum values.
Exercise 3: Clustering the Iris Dataset
We are now going to cluster Fisher's famous Iris dataset. Carry out the following steps:
1) Look at the following web address: https://archive.ics.uci.edu/ml/datasets/Iris.
2) Download the Bezdek version of the Irisdataset.
3) You will need to pre-process this dataset into a data text file and expected clustering text file. The best way to do this is to read the downloaded file into Excel.
4) Once you have imported the dataset into Excel, you should note that the actual data consists of 150 rows (instances)and 4 columns (variables). The assumption is that the rows will cluster according to the three classes, Iris Setosa, Iris Versicolour and Iris Virginica. You could use the search and replace facility in Excel to replace the class name for a cluster number (a different one per class).
5) Read in and cluster the dataset in the same manner as exercises 1 and 2.
Attaching link from where you can download Additional information to complete this task -
https://www.dropbox.com/sh/or8395x12nig6p3/AAD5B_ufgiOcFWNgOdfqgy3da?dl=0