Your task is to implement a simple automatic language


Assignment

Project 1: Snake Game

Snake (also called Nibbles) is a classic game whichis played on a rectangular game board. When the game starts, an apple is randomly added to the game board and the snake moves forward. The player uses the 4 arrow keys on the keyboard to change the current direction of the snake. When the snake eats the apple,the body of the snake grows longer, and another apple is randomly added to the gameboard. The game is over if the snake either runs into one of the 4 edges of the game board, or it runs into its own body.

In this project, you will modify the existing Snake game implemented by Jan Bodnar to gain experience with linked list data structures. This game program uses two arrays below to store the x and y coordinates of all joints of the snake:

private final int x[] = new int[ALL_DOTS];
private final int y[] = new int[ALL_DOTS];

You will remove these two arrays from the game program and implement SnakeLinkedList and SnakeNode classes to perform the tasks of these two arrays. Other existing variables and method names in the game program are unchanged (however some method bodies that contain the x and y coordinates can be modified).

The SnakeLinkedList class is either singly-linked list or doubly-linked list class. This class contains the basic variables and methods for a linked list and the snakeMove method to move the snake. You cannot use the existing LinkedList class in Java to implement this class.

The SnakeNode class contains the basic variables and methods for a linked list node, the x and y coordinates and colour (red, blue, white or green). The red head and joints on the body of the snake are instances of this SnakeNode class. Below is the required snake with a red head and 7 joints on its body.

Project 2. Automatic Language Identification

Your task is to implement a simple automatic language identification system that can identify 5 languages (English, French, German, Italian and Spanish) for a sequence of words input by a user at runtime (provided that all words in this sequence are in the same language).

You will implement two modes in the system: learning and identification. In the learning mode, the system learns a specific language by reading a text file that contains several words in that language and outputs a language model, which is a set of probability values of Markov chains (each word is represented as a Markov chain). In the identification mode, the userwill input a word sequence in an unknown language and the system compares this word sequence with 5 language models produced in the learning mode to calculate 5 matching scores. The system will then return the language whose model best matches the word sequence (i.e., the model that has highest matching score). Use Hashtable to store probability values of Markov chains.

The following shows you how to calculate probability values of Markov chains and matching scores.

A word is a sequence of letters. Let S be a set of 4 letters S = {b, e, l, u}. There are many combinations of these letters, for example, belu, beul, blue, bleu, etc. The combination blue is an English word and bleu is a French word.

The occurrences of letters in a word can be regarded as a stochastic process and hence the word can be represented as a Markov chain where letters are states. The occurrence of the first letter in the word is characterized by the initial probability of the Markov chain. The occurrence of other letter given the occurrence of its previous letter is characterized by the transition probability. After analysing several words in a specific language, a set of initial probabilities and transition probabilities will be collected. This set is called language model.

The initial probabilities and the transition probabilities P(letter x → letter y) for a language can be calculated as follows

P(letter x) = number of occurrences of x as the first letter / number of words

P(letter x → letter y) = number of pairs(x,y) / z∈ letter set∑ number of pairs (x,z)

For example, given the word set {added, bee, cab,dad} and the letter set {a, b, c, d, e} in English, the probabilities are calculated as follows:

• Four initial letters in the 4 words are a, b, c and d, so P(a) = P(b) = P(c) = P(d) = 1/4, P(e) = 0

• There are 10 pairs of letters: ad, dd, de, ed, be, ee, ca, ab, da and ad. Sort them as follows: ab, ad, ad, be, ca, da, dd, de, edand ee.

o From these 3 pairs ab, ad, and ad, we have P(a→a) = 0, P(a→b) = 1/3, P(a→c) = 0, P(a→d) = 2/3, P(a→e) = 0.
o From be, we have P(b→a) = P(b→b) = P(b→c) = P(b→d) = 0 and P(b→e) = 1
o From ca, we have P(c→a) = 1 and P(c→b) = P(c→c) = P(c→d) = P(c→e) = 0.
o From da, dd, and de, we have P(d→a) = 1/3, P(d→b) = 0, P(d→c) = 0, P(d→d) = 1/3, P(d→e) = 1/3.
o From edand ee, we have P(e→a) = 0, P(e→b) = 0, P(e→c) = 0, P(e→d) = 1/2, P(e→e) = 1/2.

Assume that we have calculated all language models. Given a sequence of words in an unknown language, now we can calculate the matching score that represents the occurrence of the word sequence in a language. The matching score is simply a product of the initial probability and transition probabilities that are used to generate the word sequence.

For example, the word sequence contains only a word, blue, there are two language models, English and French, and their initial probability and transition probabilities are as follows

Probability

From English model

From French model

P(b)

0.2

0.2

P(b→l)

0.3

0.3

P(l→u)

0.4

0.2

P(u→e)

0.5

0.3

The matching scores of the word blue with the English and French models are as follows

P(blue | English) = P(b) P(b→l) P(l→u) P(u→e) = 0.2*0.3*0.4*0.5 = 0.012
P(blue | French) = P(b) P(b→l) P(l→u) P(u→e) = 0.2*0.3*0.2*0.3 = 0.0036
Since 0.012 > 0.0036, the word blue is identified as an English word.

Hints will be given in lectures and tutorials.

Attachment:- Java-Assignment.rar

Request for Solution File

Ask an Expert for Answer!!
JAVA Programming: Your task is to implement a simple automatic language
Reference No:- TGS02761552

Expected delivery within 24 Hours