Assignment:
General information about assignments (important!):
https://cs.acadiau.ca/~jdiamond/comp2103/assignments/General-info.html
Information on passing in assignments:
https://cs.acadiau.ca/~jdiamond/comp2103/assignments/Pass-in-info.html
Information on coding style:
https://cs.acadiau.ca/~jdiamond/comp2103/assignments/C-coding-style-notes
[1] A filter program is a program which reads its input from "standard input" ("stdin") and writes its output to "standard output" ("stdout"). Filter programs are useful because they make it easy to combine the functions they provide to solve more complex problems using the standard shell facilities. Filter programs are also nice to write, because the programmer doesn't have to worry about writing code to open and close files, nor does the programmer have to worry about dealing with related error conditions. In some respects, filter programs are truly "win-win".
Write a filter program which uses getchar() to read in characters from stdin, continuing until end of file (read the man page and/or textbook to see the details on getchar(), or, heaven forbid, review the class slides). Your program must count the number of occurrences of each character in the input. After having read all of the input, it outputs a table similar to the one below which, for each character seen at least once, lists the total number of times that character was seen as well as its relative frequency (expressed as a percentage). Note that the characters \n, \r, \t, \0, \a, \b, \f, and \v (see man ascii) must be displayed with the appropriate "escape sequence". Ordinary printable characters must be output as themselves. Non-printable characters (see man isprint) must be printed with their three-digit octal code (see man printf).
You can get input into a filter program (a2p1 in this case) in three ways:
(a) "pipe" data from another program into it, like
$ echo blah blah | a2p1
(b) "redirect" the contents of a file into the program, like
$ a2p1 < some-file
(c) type at the keyboard, and (eventually) type ^D (control-d) at the beginning of a line to signify end of file.
Your output must look like the following, for this sample case:
$ echo ^Aboo | a2p1
Char | Count | Frequency
----------------------------
 001 |     1 |    20.00%
  \n |     1 |    20.00%
   b |     1 |    20.00%
   o |     2 |    40.00%
Note: in the above examples, and from now on in this course's assignments, text in red is text that the human types, and a "$" at the beginning of a line like that represents the shell prompt.
Note that I entered a ^A (control-a, not the circumflex character followed by the capital A) by typing ^V^A. The ^V tells your shell that you want it to interpret the next character literally, rather than to use any special meaning (during command line entry) that the next character might normally have. (Question: what does your shell do, when typing in a command line, if you type ^A without first entering ^V? Try it and see if you can figure it out. You might find it useful to know this, and to know what ^E and ^W do as well.)
You should run your program on a few different inputs to demonstrate to the marker that you have thought about (and programmed for!) the different cases that could occur. If you redirect input from a data file, rather than using "cat" to display the contents of your file when creating your transcript file, use a command like $ od -bc test-data-1 .
[2] The printf() function in C is very powerful and convenient, but it takes some getting used to.
This question will give you experience with this function.
When you are testing functions like printf() which produce formatted output, sometimes you want to make sure that the spaces in the output are the ones you expect. To make it obvious where all the white space came from, it is often convenient to enclose a format specification inside a pair of characters that are not otherwise used in that output. For example, if you use something like
printf("|%f|\n", x)
then you will know whether the %f format specification produced any spaces before the newline.
The constant M_PI, an approximation to the value of π, is defined in the system include file math.h. Write a program which prints out the value of M_PI using each of the following specifications, one specification per line of output:
(a) fixed point notation, field width of 9, 5 digits of precision, left justified
(b) fixed point notation, field width of 9, 5 digits of precision, right justified
(c) fixed point notation, field width of 9, no precision specification, right justified
(d) fixed point notation, field width unspecified, 5 digits of precision, right justified
(e) scientific notation, field width 9, 5 digits of precision, right justified
(f) scientific notation, field width 9, 5 digits of precision, left justified
(g) scientific notation, field width 14, 5 digits of precision, right justified
(h) scientific notation, field width 14, 5 digits of precision, left justified
As an example of what your output might look like, here is one sample line of output:
|3.14159 | is field 9, precision 5, left justified
Examine your output and try to understand what the printf() function is doing, especially any outputs that you find surprising.
[3] These days, many people are very concerned with the protection of private information. This program will do a very rudimentary form of encryption. (Don't use this for anything you want to keep secret!)
Write a filter program to encrypt standard input as follows:
(i) upper case letters between 'A' and 'M' are replaced with the lower case letter 13 positions further along in the alphabet (e.g., 'B' is replaced with 'o');
(ii) upper case letters between 'N' and 'Z' are replaced with the lower case letter 13 positions earlier in the alphabet (e.g., 'Z' is replaced with 'm');
(iii) lower case letters between 'a' and 'm' are replaced with the upper case letter 13 positions further along in the alphabet (e.g., 'd' is replaced with 'Q');
(iv) lower case letters between 'n' and 'z' are replaced with the upper case letter 13 positions earlier in the alphabet (e.g., 'y' is replaced with 'L');
(v) the four punctuation characters '.', ',', '!' and '?' are replaced with, respectively, '!', '?', '.' and ','.
Of course, an encryption program is no use without a corresponding decryption program. A bit of thought (or experimentation) should show that this program will decrypt its own output.
To test your program, create a few text files; some small, some big. Then run some tests which will convince the marker of the following things:
(i) the output of the program looks different than the input; and
(ii) running the output of the program through the program a second time recovers the original data.
You can make use of the Unix utilities diff and/or cmp, as well as cat, to help convince the marker that your program works. Here is a sample run with a small number of tests. Note that you don't need to cat big files into the script file, but you can use ls -l or wc to show the marker that the files were big.
$
$
$ cat test1.dat
This is a short file.
Is it long enough, or should it be longer?
$
$ a2p3 < test1.dat
gUVF VF N FUBEG SVYR!
vF VG YBAT RABHTU? BE FUBHYQ VG OR YBATRE,
$
$ a2p3 < test1.dat | a2p3
This is a short file.
Is it long enough, or should it be longer?
$
$ wc test2.dat
1241 3394 44616 test2.dat
$ a2p3 < test2.dat > test2.dat.crypt
$
$ wc test2.dat.crypt
1241 3394 44616 test2.dat.crypt
$
$ cmp test2.dat test2.dat.crypt
test2.dat test2.dat.crypt differ: byte 3, line 1
$
$ a2p3 < test2.dat.crypt > test2.dat.crypt.decrypt
$
$ cmp test2.dat test2.dat.crypt.decrypt
$
Notice that, as is common for Unix programs, cmp says nothing in the "success" case, which for cmp is when the two files are identical.
The third (red) command entered above is interesting. It shows how the Unix shell (command interpreter) allows you to send the output of one program (the first a2p3) into the input of another program (in this case the second a2p3). This is an extremely powerful feature of the shell, especially since your program does not have to do anything special to make this happen; as far as your program is concerned, it is reading from standard input and writing to standard output. The fact that those may be the keyboard, the screen, a file or another program generally doesn't matter to your program. (In fact, the occasional program does care, but that is an uncommon circumstance.)
In the partial script above I showed a test with a short file, and a long file. I showed that (some) letters got encrypted as they should, and that punctuation is properly encrypted. I also showed that my program works with a fairly long file, but you might want to try a bigger file yet. And you might consider using the head program to output just the first few lines of a very long file to your script file.
Should anything else be tested? Are there any boundary cases here?
Did you use functions in any of these questions? Should you have? Did you document them correctly?
Does your program "blow up" on unexpected input, or does it deal with bad input in a "graceful" way?
How does your program deal with boundary conditions, if there are any?
Did you remember to put all required comments in? Does your program call out for any other comments in the body of the code?