Python Programming Assignment
In this assignment you will write a program that reads a data file containing information in different formats, validates the data, and prints a report based on it. The data file describes a set of credit card transactions. Each transaction includes the name of the purchaser, the date of the transaction, the amount of the transaction, and the credit card number. The four values for each transaction are on a single line, separated by colons. Each value may be entered in any of several formats, and your program should recognize all of them. The possible formats are:
• Name - a first name consisting of letters, followed by a space, optionally followed by a middle initial (one letter, possibly with a period after it) and another space, followed by a last name consisting of letters, optionally followed by a space and one of Sr, Jr, III or IV, with a period possibly after the Sr or Jr.
• Date - (1) the date can be in a text form which is made up of the name of the month, followed by a space and then the day of the month (1 or 2 digits), followed by a comma and a space, followed by the year (4 digits); (2) the date can also be in a numerical form which is made up of the month as a number between 1 and 12 (1 or 2 digits), followed by a / or -, followed by the day (1 or 2 digits), followed by a / or -, followed by the year (2 or 4 digits; if only 2 digits are given, assume the missing first two digits are 20 when the given digits are 16 or less, and 19 otherwise).
• Amount - the monetary amount is given in the format of an optional leading dollar sign, followed by the dollar amount in digits, optionally followed by a period and 2 digits representing the cents.
• Credit card number - Three types of credit cards may be used: Visa, Master Card and American Express. Each of these has a particular format: Visa numbers are 16 digits long and may be a string of 16 digits without spaces or dashes, or they can be four groups of 4 digits with a space or dash between each group, and the number begins with the digit 4; Master Card numbers have the same formatting for 16 digits, but the number begins either with a value between 51 and 55, or between 2221 and 2720; American Express numbers are 15 digits long and may be a string of 15 digits without spaces or dashes, or they can be three groups of 4, 6 and 5 digits with a space or dash between each group, and the number begins with the value 34 or 37.
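One possible way to express the four formats above as regular expressions is sketched here. This is only an illustration under stated assumptions, not a reference solution: the names NAME_RE, DATE_TEXT_RE, DATE_NUM_RE, AMOUNT_RE, CARD16_RE, CARD15_RE, parse_date and classify_card are all hypothetical. The sketch also assumes that month names are spelled out in full in English, and that the two separators in a date or card number do not have to match each other.

```python
import re
from datetime import date

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

NAME_RE = re.compile(
    r"^(?P<first>[A-Za-z]+) "                 # first name and the space after it
    r"(?:(?P<middle>[A-Za-z])\.? )?"          # optional initial, optional period
    r"(?P<last>[A-Za-z]+)"                    # last name
    r"(?: (?P<suffix>Sr\.?|Jr\.?|III|IV))?$"  # optional suffix
)
DATE_TEXT_RE = re.compile(
    r"^(?P<month>" + "|".join(MONTHS) + r") (?P<day>\d{1,2}), (?P<year>\d{4})$"
)
DATE_NUM_RE = re.compile(
    r"^(?P<month>\d{1,2})[/-](?P<day>\d{1,2})[/-](?P<year>\d{4}|\d{2})$"
)
AMOUNT_RE = re.compile(r"^\$?(?P<dollars>\d+)(?:\.(?P<cents>\d{2}))?$")
# Layout only; the leading digits are checked separately in classify_card.
CARD16_RE = re.compile(r"^(?:\d{16}|\d{4}(?:[ -]\d{4}){3})$")
CARD15_RE = re.compile(r"^(?:\d{15}|\d{4}[ -]\d{6}[ -]\d{5})$")


def parse_date(text):
    """Return a datetime.date, or None if the text is not a valid date."""
    match = DATE_TEXT_RE.match(text)
    if match:
        month = MONTHS.index(match.group("month")) + 1
    else:
        match = DATE_NUM_RE.match(text)
        if not match:
            return None
        month = int(match.group("month"))
    year = int(match.group("year"))
    if len(match.group("year")) == 2:
        # Two-digit years: 00-16 mean 20xx, 17-99 mean 19xx.
        year += 2000 if year <= 16 else 1900
    try:
        # date() rejects impossible values such as month 13 or February 30.
        return date(year, month, int(match.group("day")))
    except ValueError:
        return None


def classify_card(text):
    """Return (card_type, digits), or None if the number is not recognized."""
    digits = re.sub(r"[ -]", "", text)
    if CARD15_RE.match(text) and digits[:2] in ("34", "37"):
        return "American Express", digits
    if CARD16_RE.match(text):
        if digits[0] == "4":
            return "Visa", digits
        if 51 <= int(digits[:2]) <= 55 or 2221 <= int(digits[:4]) <= 2720:
            return "Master Card", digits
    return None
```

Checking the card layout with a regular expression and the leading digits with integer comparisons keeps the 2221 to 2720 range readable; encoding that range purely as a regular expression is possible but considerably longer.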
Your program should take the name of a data file as a command line argument. After validating this argument, the program should read the data file line by line. The contents of each line should be validated and extracted using regular expressions as much as possible. If a data line contains a value that is not properly formatted, the program should display an appropriate error message describing what is wrong and ignore the data for that line.
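The reading loop described above might be organized along the following lines. This is a minimal sketch: the function names split_fields, read_records and main are hypothetical, and it assumes that blank lines are skipped silently, that whitespace around each field is not significant, and that error messages go to standard error. It only checks the number of fields; validating each field would happen where the records are consumed.

```python
import sys


def split_fields(line):
    """Return the four colon-separated fields of a line, or None."""
    fields = [field.strip() for field in line.rstrip("\n").split(":")]
    return fields if len(fields) == 4 else None


def read_records(lines):
    """Yield (line_number, fields) for each line that has four fields.

    `lines` is any iterable of strings, such as an open file.
    """
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # assumption: blank lines are not errors
        fields = split_fields(line)
        if fields is None:
            print(f"line {number}: expected 4 colon-separated fields",
                  file=sys.stderr)
            continue
        yield number, fields


def main(argv):
    """Validate the command line argument, then process the file."""
    if len(argv) != 2:
        print(f"usage: {argv[0]} DATAFILE", file=sys.stderr)
        return 2
    try:
        handle = open(argv[1], encoding="utf-8")
    except OSError as exc:
        print(f"error: cannot open {argv[1]!r}: {exc}", file=sys.stderr)
        return 1
    with handle:
        for number, fields in read_records(handle):
            print(number, fields)  # placeholder for per-field validation
    return 0
```

A script would typically finish with sys.exit(main(sys.argv)) so that the exit status reflects whether the argument was usable.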
All data that is valid should be organized into a report in the following way: the transactions should be ordered such that those that use American Express cards come first, followed by those that use Master Card, followed by Visa. Within each credit card group, the transactions should be ordered chronologically based on the dates, with those that occur on the same date being ordered alphabetically by the person's last name, and if there are multiple transactions for the same person on that date, they should be ordered by the amount of the transaction.
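The ordering rules above map naturally onto a single tuple sort key. The sketch below assumes each valid transaction has already been parsed into a hypothetical Transaction record with the amount stored as an integer number of cents. It also assumes that names are compared without regard to case, that amounts are sorted from smallest to largest, and that the first name is used as an extra tie-breaker so that one person's transactions stay together; none of these details are specified in the assignment.

```python
from collections import namedtuple
from datetime import date

# Hypothetical record type for one validated transaction.
Transaction = namedtuple("Transaction", "card_type when last first cents")

# Report order of the card groups.
CARD_ORDER = {"American Express": 0, "Master Card": 1, "Visa": 2}


def report_key(t):
    """Sort key: card group, then date, then last name, then amount."""
    return (CARD_ORDER[t.card_type], t.when, t.last.lower(),
            t.first.lower(), t.cents)


transactions = [
    Transaction("Visa", date(2015, 3, 5), "Doe", "Jane", 1250),
    Transaction("American Express", date(2016, 1, 2), "Smith", "Al", 500),
    Transaction("Master Card", date(2014, 7, 4), "Public", "John", 9900),
    Transaction("Visa", date(2015, 3, 5), "Doe", "Jane", 300),
    Transaction("Visa", date(2015, 3, 5), "Adams", "Zed", 7000),
]
ordered = sorted(transactions, key=report_key)
for t in ordered:
    print(t.card_type, t.when, t.last, t.cents)
```

Because Python compares tuples element by element, later elements are only consulted when all earlier ones are equal, which is exactly the nested ordering the report calls for.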
As the data for a transaction is printed out in this report it should be in a standard format, not in whatever format was entered in the file.
This format is:
• Credit Card numbers are listed as separate groups of digits with spaces between them.
• Dates are listed in the format MM/DD/YYYY, where MM is a two digit form of the month, DD is a two digit form of the day, and YYYY is a four digit form of the year.
• The name is printed as the last name followed by the optional Sr., Jr., III or IV and then a comma, followed by the first name and then the optional middle initial with a period after it.
• The amount is listed as a dollar sign, followed by the dollars, and followed by a period and two digits for the cents.
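The standard output formats listed above could be produced by small helper functions such as these. This is a sketch only: the function names are hypothetical, the amount is assumed to be held as an integer number of cents, the card number is assumed to be a string of 15 or 16 digits with separators already removed, and a single space is assumed after the comma in the name and before the suffix.

```python
from datetime import date


def format_card(digits):
    """Group 15 digits as 4-6-5 and 16 digits as 4-4-4-4, space separated."""
    sizes = (4, 6, 5) if len(digits) == 15 else (4, 4, 4, 4)
    groups, start = [], 0
    for size in sizes:
        groups.append(digits[start:start + size])
        start += size
    return " ".join(groups)


def format_date(when):
    """Format a datetime.date as MM/DD/YYYY with zero padding."""
    return f"{when.month:02d}/{when.day:02d}/{when.year:04d}"


def format_name(first, middle, last, suffix):
    """Format as 'Last Suffix, First M.'; middle and suffix may be None."""
    if suffix:
        suffix = suffix.rstrip(".")
        if suffix in ("Sr", "Jr"):
            suffix += "."  # Sr and Jr always get a period in the report
    family = f"{last} {suffix}" if suffix else last
    given = f"{first} {middle}." if middle else first
    return f"{family}, {given}"


def format_amount(cents):
    """Format an integer number of cents as $D.CC."""
    return f"${cents // 100}.{cents % 100:02d}"


print(format_card("378282246310005"))
print(format_date(date(2015, 3, 5)))
print(format_name("John", "Q", "Public", "Jr"))
print(format_amount(1250))
```

Storing amounts as integer cents avoids floating point rounding when the values are compared and printed.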
You should submit your commented source code, along with the data file you used to test your program, and the corresponding report made by your program for this file.