Can you please use the template below to create a simple spell checker program. Your program will have the following characteristics:
Your program will be capable of reading multiple files to be spell checked from command line arguments.
When your program starts up, you will read in the dictionary.txt file which contains a list of known words. Only read in this file once. Store these words in a HashSet data structure.
Create a TreeSet to store any miss-spelled words that you find.
For each file in the command line parameter list, print out the name of the file before reading in any words.
For each file parse out each of the words in the file and check to see if the word is in the HashSet you created with the dictionary.txt words or its already in your list of miss-spelled words.
If the word is in the dictionary HashSet or the miss-spelled word TreeSet then proceed to the next word.
If the word is not in the dictionary or the miss-spelled word list, give your user 2 choices: Add the word to the HashSet dictionary, or add the word to the list of miss-spelled words.
At the end of each file processed, dump out the list of miss-spelled words.
When you then start to read in the next file, make sure you clear the list of miss-spelled words. Keep any words added to the dictionary HashSet.
More details:
In parsing the words in these text files, you will need to filter out punctuation characters. I used the StringTokenizer, and I called the following constructor:
StringTokenizer st = new StringTokenizer(line, " \t,.;:-%'\"");
before checking the word against the dictionary, make sure you lower case the word
if the "word" doesn't start out with a letter (i.e. >= 'a' And <= 'z'), then skip it because it is likely a number
If the word couldn't be found in either list and ends with an 's', try removing the 's' to create a singular version of the word and try again to find the word in your dictionary or miss-spelled word list using the singular version of the word.
Only print out lines that contain a word needing a user response. See the Sample output below from the published answer.