First of all, the program should be written in Java using the jGRASP IDE, as the teacher requires. While writing the code, please add comments for everything you do so the grader can follow the steps of the code. At the end of the work, please write up some information about the project: the analysis, which algorithms and data structures you used, and why you used them in terms of efficiency and complexity. You should also test the program with different inputs or test cases.
Text Processing Functions
INFX 141 / CS 121 • BOZHENA BIDYUK • UC IRVINE • WINTER 2016
This assignment is to be done individually; you may not use code written by your classmates. Use code found over the Internet at your own peril -- it may not do exactly what the assignment requests. If you do end up using code you find on the Internet, you must disclose the origin of the code. As stated in the collaboration guidelines, concealing the origin of a piece of code is plagiarism. Use Piazza for general questions whose answers can benefit you and everyone.
General Specifications
You may use Java, Python, or Scheme/Racket for this assignment. Java is the safest choice because the assignment is written with Java in mind and contains a variety of helpful Java resources. Using Python or Scheme will require you to translate these Java resources.
If you use Java, your solution must fill out the program skeleton provided. (i) Fill in each method according to its Javadoc specification. (ii) Feel free to create additional methods / classes where necessary.
If you don't use Java, you should produce a similar skeleton to start with and fill it out. You should also be very precise with instructions for how to run your program -- what programs are needed, what versions, and so on. If the TA can't run your program, your grade will reflect that.
You should test your code thoroughly, of course, with test data you create. You may exchange test data with anyone in the class. We will test your program with our own text files.
At points, this assignment may be underspecified (i.e., not fully describe what to do in every situation). In those cases, post your questions on Piazza or check with the TA. For minor issues, make your own assumptions and document them.
Part A: Utilities
Write a method that reads in a text file and returns a list of the tokens (preferably alphanumeric) in that file. Write a method to print out frequency results.
Package: ir.assignments.two.a
File: Utilities.java
Method: tokenizeFile(File)
Method: printFrequencies(List)
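As a rough illustration of the tokenizing step in Part A, here is a minimal sketch. It assumes that tokens are lowercase runs of ASCII letters and digits and that every other character is a separator; the assignment leaves this open, so treat it as one documented assumption. The class name TokenizerSketch and the helper tokenizeLine are hypothetical and are not part of the provided skeleton, whose exact signatures may differ.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class; the real Utilities.java skeleton may differ.
class TokenizerSketch {

    // Splits one line into lowercase alphanumeric tokens.
    // Assumption: anything other than a-z or 0-9 (after lowercasing) is a separator.
    static List<String> tokenizeLine(String line) {
        List<String> tokens = new ArrayList<>();
        for (String piece : line.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            // split() produces an empty first piece when the line starts with a separator
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    // Reads the file line by line (UTF-8) and collects tokens in order of appearance.
    static List<String> tokenizeFile(File file) throws IOException {
        List<String> tokens = new ArrayList<>();
        for (String line : Files.readAllLines(file.toPath())) {
            tokens.addAll(tokenizeLine(line));
        }
        return tokens;
    }
}
```

Under these assumptions, the line "An input string, this is! (or is it?)" yields the tokens an, input, string, this, is, or, is, it. Each character is visited a constant number of times, so the work grows linearly with the size of the file.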
Part B: Word Frequencies
Count the total number of words and their frequencies in a token list.
Package: ir.assignments.two.b
File: WordFrequencyCounter.java
Method: computeWordFrequencies(ArrayList)
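One possible way to approach the counting in Part B is a hash map from word to count, sketched below. The class WordFrequencySketch, its method names, and the use of Map<String, Integer> as the result type are assumptions for illustration; the skeleton's Javadoc defines the real return type. The ordering (highest count first, ties broken alphabetically) is also an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the skeleton's WordFrequencyCounter.
class WordFrequencySketch {

    // Counts how often each token occurs. One pass over the list,
    // with expected constant-time map updates, so roughly O(n) overall.
    static Map<String, Integer> countWords(List<String> tokens) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : tokens) {
            counts.merge(token, 1, Integer::sum); // insert 1 or add 1 to the existing count
        }
        return counts;
    }

    // Orders entries by decreasing count, then alphabetically (assumed tie-break).
    // Sorting the k distinct words costs O(k log k).
    static List<Map.Entry<String, Integer>> sortedByFrequency(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return entries;
    }
}
```

The total number of words is simply the size of the token list, and the number of distinct words is the size of the map.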
Part C: 2-grams
A 2-gram is two words that occur consecutively in a file. For example, "two words", "words that", and "that occur" are all 2-grams from the previous sentence.
Count the total number of 2-grams and their frequencies in a token list.
Package: ir.assignments.two.c
File: TwoGramFrequencyCounter.java
Method: computeTwoGramFrequencies(ArrayList)
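The 2-gram count in Part C can be sketched as a sliding window of width two over the token list. The class TwoGramSketch and the choice to represent a 2-gram as the two words joined by a single space are assumptions made for this example, not requirements taken from the skeleton.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the skeleton's TwoGramFrequencyCounter.
class TwoGramSketch {

    // Counts each pair of consecutive tokens.
    // A list of n tokens contains n - 1 two-grams (or none when n < 2).
    // Assumes a random-access list such as ArrayList, so get(i) is constant time.
    static Map<String, Integer> countTwoGrams(List<String> tokens) {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i + 1 < tokens.size(); i++) {
            String twoGram = tokens.get(i) + " " + tokens.get(i + 1);
            counts.merge(twoGram, 1, Integer::sum);
        }
        return counts;
    }
}
```

For the tokens you, think, you, know, how, you, think there are six 2-grams in total, of which five are distinct, and "you think" occurs twice.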
Part D: Palindromes
A palindrome is a word or phrase that reads the same in both directions. For example, these are all palindromes: "kayak", "Do geese see god", "A man, a plan, a canal--Panama". Count the total number of palindromes and their frequencies in a text file.
Package: ir.assignments.two.d
File: PalindromeFrequencyCounter.java
Method: computePalindromeFrequencies(ArrayList)
Once you have implemented your palindrome counting algorithm, please perform a short analysis of its runtime complexity: Does it run in linear time relative to the size of the input? Polynomial time? Exponential time? This analysis should go in the analysis.txt file in this package.
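The palindrome definition in Part D can be checked with two indices moving inward from both ends of a string. The sketch below covers only that check, not the full counting algorithm, and it assumes that case, spaces, and punctuation are ignored, which is what the three examples suggest. The class PalindromeCheckSketch is hypothetical.

```java
// Hypothetical sketch of the palindrome test only; how candidate words and
// phrases are enumerated from the token list is left to the actual solution.
class PalindromeCheckSketch {

    // Returns true when the letters and digits of text read the same in both
    // directions, ignoring case and any other characters (assumption).
    // Note: a string with no letters or digits also returns true here.
    static boolean isPalindrome(String text) {
        int left = 0;
        int right = text.length() - 1;
        while (left < right) {
            char l = text.charAt(left);
            char r = text.charAt(right);
            if (!Character.isLetterOrDigit(l)) {
                left++;                       // skip punctuation or space on the left
            } else if (!Character.isLetterOrDigit(r)) {
                right--;                      // skip punctuation or space on the right
            } else {
                if (Character.toLowerCase(l) != Character.toLowerCase(r)) {
                    return false;             // mismatch: not a palindrome
                }
                left++;
                right--;
            }
        }
        return true;
    }
}
```

A single check is linear in the length of the string it examines. The overall runtime of a counting algorithm then depends mostly on how many candidate phrases it tests, which is the part worth discussing in analysis.txt.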
Attachment:- Assignment.zip