Assignment 1:
Discussion Questions
Topic for discussion: Genetic algorithms
Please briefly discuss the major steps of the genetic algorithms. Based on your experience and your textbook reading, what are suitable problem areas for the application of genetic algorithms?
Neural Networks is a huge research topic and there are a variety of implementations depending on the application area. Please pick up one type of neural network and discuss its mechanisms, advantages/disadvantages, and possible application areas. You can discuss neural networks supported by the tool you choose for Assignment 2.
According to the textbook, what is a decision tree and why is it useful? In addition please describe a commonly used decision tree algorithm.
Range of values for entropy in the ID3 algorithm Attachment
We had some discussions in class regarding range of possible values for the entropy computation in the ID3 algorithm. I did some research after class. It turns out that the range of values will depend on the base of the logarithm function. Assuming that we have 10 independent events, if each one of them happens with equal probability of 0.1, then the entropy function (the version we discussed in class, which has a base of 2) will yield a value large than 1. If we adopt a logarithm function of base 10, then the maximum value of such entropy will be 1. For the ID3 algorithm, we adopt the entropy with base of 2.
For additional details you may refer to the following Wikipedia pages.
A broader discussion can be found at the following URL. It involves the original intention of entropy used by Shannon - the number of bits used for the observation of certain events, and how much information can be obtained through the observation. More formally, given two independent events, if the first event can yield one of n equiprobable outcomes and another has one of mequiprobable outcomes then there are mn possible outcomes of the joint event. This means that if log2(n) bits are needed to encode the first value and log2(m) to encode the second, one needs log2(mn) = log2(m) + log2(n) to encode both.
The reason that we adopt a base 2 was due to the use of binary systems as all information is represented in bits. We do have alternative choices such as trits for log3, nats for the natural logarithm ln and so on.
Materials for A* Attachment
Just in case that some of you might feel it difficult to understand and/or implement the A* algorithm, here are some external online materials that you may want to read. Both contain pseudo codes and case studies very similar to the robot navigation problem in the first assignment.
Wikipedia page:
https://en.wikipedia.org/wiki/A*_search_algorithm
A* path finding for beginners:
https://www.policyalmanac.org/games/aStarTutorial.htm
Testing whether two line segments intersect Attachment.
I was expecting someone to ask this question... Anyways you may find the following algorithm useful for your implementation of the first assignment.
*********************************
Here are the conditions for testing whether two line segments intersect:
Consider two line segments p and q.
Line segment p is represented by end points (p1 p2).
Point p1 has coordinates (p1x p1y).
Point p2 has coordinates (p2x p2y).
Line segment q is represented by end points (q1 q2).
Point q1 has coordinates (q1x q1y).
Point q2 has coordinates (q2x q2y).
Let:
k1 = p1x - p2x
k2 = q2y - q1y
k3 = p1y - p2y
k4 = q2x - q1x
k5 = p1x - q1x
k6 = p1y - q1y
d = (k1 * k2) - (k3 * k4)
a = ((k2 * k5) - (k4 * k6)) / d
b = ((k1 * k6) - (k3 * k5)) / d
Line segments p and q intersect if a and b are both between 0 and 1.
That is, iff:
(0 < a < 1) AND (0 < b 1).
Note when d=0, the line segments are parallel and do not intersect.
Assignment 2:
Part 1. Text Reading:
Genetic Algorithm (Chap. 4 Sec 4.1)
Neural Networks (Chapter 18 Sec 18.7)
Course Slides
Part 2. Problems:
(Note: Please include any external materials other than the textbook. Use the APA format where appropriate.)
2.1 Genetic Algorithm
The following algorithm is used to implement crossover in a genetic algorithm:
Input: Two strings of n bits x and y
Output: Two strings of n bits x' and y'
The crossover operator is applied as follows:
A crossover site is selected at random (with equal probability) that divides each string into two sub-strings of non-zero length. That is x = [x1 x2] y = [y1 y2], with length of x1 = length of y1.
The outputs are generated as x' = [x1 y2] and y' = [y1 x2]
Given that you start with (x1, y1) = ((0 1 0 1 0 1) (1 1 1 1 1 1)), specify which 6-bit strings are possible values obtained through crossover alone. Justify your answer.
2.2 Genetic Algorithm
A genetic algorithm uses the following mutation operator: the bits in the input string are considered one by one independently, with probability 0.03 that each bit is inverted. Given that you apply the mutate operator to the string (1 1 1 1), what is the probability that the output is: (0 0 0 0)? (0 1 0 0)? (1 0 1 0)? (1 1 1 1)? Show the process of your computation.
2.3 Neural Network
The data set in the file "data.txt" contains 300 observations for 4 input variables (Temp, Pres, Flow, and Process) and an output variable (Rejects). The first column "No." is simply an identifier. The table below reproduces the first 4 observations:
No. |
Temp |
Pres |
Flow |
Process |
Rejects |
1 |
53.39 |
10.52 |
4.82 |
0 |
1.88 |
2 |
46.23 |
15.13 |
5.31 |
0 |
2.13 |
3 |
42.85 |
18.79 |
3.59 |
0 |
2.66 |
4 |
53.09 |
18.33 |
3.67 |
0 |
2.03 |
Train a back-propagation neural network on approximately 80% of the observations, randomly selected. Test the trained network using the remaining 20% observations.
Please write a detailed report that includes the following.
1) A detailed discussion how you set up the key parameters of the tool and performed the experiments.
2) Answer: Will different parameters yield the same solutions based on your experiments? Please justify your choice on these parameters.
3) Present:
(i) A figure that plots the actual and predicted values of the output "Rejects" for the training and test data sets.
(ii) Sum of squared errors for the training and test data sets.
You should also show important intermediate results, and important steps of your experiments (if not reported previously).
Note: The easiest way to solve this problem is to use a Neural Network tool. However if you wish to implement your own neural networks, that is also fine.
Part 3. Practical Assignment: Genetic Algorithm
This practical assignment is intended for you to get familiar with some of the up-to-date AI tools. Please download a genetic algorithm (GA) tool (either a freeware or a trail version of a commercial product) from the Internet and run it on your computer.
1) Follow the instructions to configure and run the tool you chose. You are also required to go through an example (or a case study) to show that the tool really works.
2) Write a brief report. In your report, answer the following questions in your own words (please do not copy/paste from a tutorial or other online materials).
a) Where and when did you download the tool?
b) What kind of real-world problems can be solved using the tool?
c) What is the actual running environment (software and hardware) of the tool?
d) How will you evaluate the tool based on your own experience? Do you need to tune the parameters in order to solve the example problem? If so, how? If not, explain in detail.
e) In what aspects could the tool be improved?
3) Take a screenshot (usually by pressing Shift + PrintScreen) during the running of the tool and paste it in your lab report. In your lab report you can provide as many screenshots as you want and/or other output to show you have actually run the tool.
Attachment:- data.txt