Overview: The UNIX operating system (and its variants, of which Linux is one) includes quite a few useful utility programs. One of those is wc, which is short for Word Count. The purpose of wc is to give users an easy way to determine the size of a text file in terms of the number of lines, words, and bytes it contains. (It can do a bit more, but that's all of the functionality that we are concerned with for this assignment.) Counting lines is done by looking for "end of line" characters (\n (ASCII 10) for UNIX text files, or the pair \r\n (ASCII 13 and 10) for Windows/DOS text files). Counting words is also straightforward: any sequence of characters not interrupted by "whitespace" (spaces, tabs, end-of-line characters) is a word. Of course, whitespace characters are characters, and need to be counted as such.
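The counting rules above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the required design: the class name CountSketch and the method count are hypothetical, the input is taken as a byte array already read from a file, and whitespace is limited to the four characters named in the overview (space, tab, \n, \r).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the name and structure are assumptions for illustration.
class CountSketch {
    /** Returns {lines, words, bytes} for the given raw file content. */
    static long[] count(byte[] data) {
        long lines = 0;
        long words = 0;
        boolean inWord = false;
        for (byte b : data) {
            if (b == '\n') {
                lines++;               // both \n and \r\n line endings end in \n
            }
            // Assumption: only the whitespace characters named in the overview.
            boolean whitespace = b == ' ' || b == '\t' || b == '\n' || b == '\r';
            if (whitespace) {
                inWord = false;
            } else if (!inWord) {
                inWord = true;         // first non-whitespace byte starts a new word
                words++;
            }
        }
        // Every byte counts, whitespace included.
        return new long[] { lines, words, data.length };
    }

    public static void main(String[] args) {
        byte[] sample = "one two\nthree  four\r\n".getBytes(StandardCharsets.US_ASCII);
        long[] c = count(sample);
        System.out.println(c[0] + " " + c[1] + " " + c[2]);   // prints "2 4 21"
    }
}
```

Because lines are counted by their \n characters, a final line with no trailing end-of-line character adds to the word and byte counts but not to the line count.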
A problem with wc is that it generates a very minimal output format. Here's an example of what wc produces on a Linux system when asked to count the content of a pair of files; we can do better!
$ wc prog3a.dat prog3b.dat
   2    6   38 prog3a.dat
  32  321 1883 prog3b.dat
  34  327 1921 total
Assignment: Write a Java program (completely documented according to the class documentation guidelines, of course) named Prog3.java that counts lines, words, and bytes (characters) of text files. The output format is shown in the Output section, below.
The user is to be able to supply the name(s) of the file(s) in two ways. The first is on the command line, as wc expects. We saw how to read command-line arguments recently, and there's an example program that demonstrates how to do it (T01n24). If there are no command-line arguments, your program is to display some usage information and prompt the user to enter the file name(s) on the keyboard.
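As a rough sketch of the two input paths just described: the program can check args.length and fall back to the keyboard when it is zero. The class name InputSketch and the method promptForNames are hypothetical, and the description and prompt wording are copied from the sample output in the Output section; this is not the class example T01n24.

```java
import java.util.Scanner;

// Hypothetical helper class; names are assumptions for illustration.
class InputSketch {
    /** Shows the usage text and prompt, then returns the line the user typed. */
    static String promptForNames(Scanner keyboard) {
        System.out.println();
        System.out.println("This program determines the quantity of lines, words, "
                + "and bytes in a file or files that you specify.");
        System.out.println();
        System.out.print("Please enter one or more file names, comma-separated: ");
        // An empty string stands in for "no input" (for example, end of input).
        return keyboard.hasNextLine() ? keyboard.nextLine() : "";
    }

    public static void main(String[] args) {
        if (args.length > 0) {
            // File names were given on the command line, one per argument.
            System.out.println("From command line: " + String.join(" ", args));
        } else {
            // No arguments: describe the program and ask for the names.
            String line = promptForNames(new Scanner(System.in));
            System.out.println("From keyboard: " + line);
        }
    }
}
```

Passing the Scanner in as a parameter, rather than creating it inside the method, makes the prompting code easy to test with canned input.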
Data: On the class web page you can find the two files, prog3a.dat and prog3b.dat, that I used to create the example above. These are just sample input files, meant to get you thinking about how wc behaves. You should plan to create several sample input files of your own to test further the behavior of your program. You can be sure that your section leader will be grading your program by testing it on a variety of files. The more testing you do, the greater the likelihood that your program will work correctly when graded.
Output: Your program is to produce counts of the number of lines, words, and characters (bytes) found in each readable file provided on the command line, and the output is to be displayed to the user in the well-structured, clearly labeled format shown below, starting with a blank line. The five lines above the table (a blank line, the description, another blank line, and the prompting line) are to be displayed only when no filenames are given on the command line. Here is an example of the output we expect when the user gives no file names on the command line but provides two when prompted:
This program determines the quantity of lines, words, and bytes in a file or files that you specify.
Please enter one or more file names, comma-separated: prog3a.dat, prog3b.dat
  Lines    Words    Bytes
-------- -------- --------
       2        6       38 prog3a.dat
      32      321     1883 prog3b.dat
-----------------------------------------
      34      327     1921 Totals
If the user supplies the name of only one existing, readable file, the last two lines (the line of hyphens and the line of totals) are not to be displayed. If, when prompted for file names, the user fails to give any usable file names, your program is to terminate after displaying some helpful instructions about what the program does, what input is expected from the user, and what output the user can expect to receive. As shown, we expect the list of file names to be comma-separated when received from the direct prompting (no commas are typed when names are given on the command line, in keeping with common UNIX command line behavior).
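The comma-separated input and the totals rule can be sketched as follows. This is a minimal illustration with assumed details: the names ReportSketch, splitNames, and report are hypothetical, each count is assumed to be a {lines, words, bytes} array for an already-verified readable file, and the 8-character right-justified columns are a guess at the intended spacing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; names and column widths are assumptions.
class ReportSketch {
    /** Splits a prompted line on commas, trimming blanks and dropping empty pieces. */
    static List<String> splitNames(String line) {
        List<String> names = new ArrayList<>();
        for (String piece : line.split(",")) {
            String name = piece.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    /** Builds the table; counts.get(i) holds {lines, words, bytes} for names.get(i). */
    static String report(List<String> names, List<long[]> counts) {
        StringBuilder out = new StringBuilder();
        out.append(String.format("  Lines    Words    Bytes%n"));
        out.append(String.format("-------- -------- --------%n"));
        long[] totals = new long[3];
        for (int i = 0; i < names.size(); i++) {
            long[] c = counts.get(i);
            out.append(String.format("%8d %8d %8d %s%n", c[0], c[1], c[2], names.get(i)));
            for (int k = 0; k < 3; k++) {
                totals[k] += c[k];
            }
        }
        // The separator and totals lines appear only when more than one file was counted.
        if (names.size() > 1) {
            out.append(String.format("-----------------------------------------%n"));
            out.append(String.format("%8d %8d %8d Totals%n", totals[0], totals[1], totals[2]));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> names = splitNames("prog3a.dat, prog3b.dat");
        // Counts taken from the sample output in this handout.
        List<long[]> counts = Arrays.asList(new long[] { 2, 6, 38 }, new long[] { 32, 321, 1883 });
        System.out.print(report(names, counts));
    }
}
```

If splitNames returns an empty list, no usable names were given, which is the case where the program should display its instructions and terminate.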