This task involves developing some functions that extract data from an SQL database. The scenario is that a company which owns an online vehicle search website wants to generate some statistics about their car dataset so that they can have a better idea about its distribution. You are asked to write four Python functions to generate the required statistics.
1. Write a Python function named top_N_models(make, N) that generates the number of occurrences of each model of a specified car make and prints the top N most frequent models for that car make together with their corresponding number of occurrences, one model per line. This function has two parameters. The first parameter is a string specifying a car make. The second parameter is an integer indicating the number of top models to be printed. For example, for function call top_N_models('BMW', 2), if the top two most frequent models of BMW are M3 with frequency 30 and M5 with frequency 25, your function is expected to produce the following result:
>>> top_N_models('BMW', 2)
M3 30
M5 25
Note that if there are no details for the specified make of car in the database an appropriate error message must be printed. (Hint: A non-existent row or field will be represented by the special Python value None.) See the unit tests in question file top_N_models_Q.py for further detail.
2. Write a Python function named top_N_expensive(N) that prints the make, model, and price of the top N most expensive cars, one car per line. The output should be ordered in descending order by price first, then in ascending order by car make, and finally in descending order by car model. Moreover, the output should not contain duplicates, i.e., any two lines in the output can't be exactly the same. This function has only one parameter, N, which indicates the number of makes to be printed.
For example, if the top five most expensive cars are MERCEDES-BENZ CLK320 with price $99,990, BMW X5 with price $89,400, BMW M5 with price $89,400, LEXUS LS430 with price $85,000, and BMW Z4 with price $85,000, the output of your function should be as follows.
>>> top_N_expensive(5)
MERCEDES-BENZ CLK320 $99990
BMW X5 $89400
BMW M5 $89400
BMW Z4 $85000
LEXUS LS430 $85000
3. Write a Python function named min_max(make)that prints the range of prices of the cars with the specified make. For example, if the maximum price and minimum price of all Toyotas in the database are $32,000 and $3,000, respectively, your function should produce the following result.
>>> min_max('TOYOTA')
TOYOTA ($3000-$32000)
Notice that the string printed should be formatted so that it is easy to read. Also, if there are no details for the specified make of car in the database an appropriate error message must be printed.
(Hint: A non-existent row or field will be represented by the special Python value None.) See the unit tests in question file min_max_Q.py for further detail.
Development hints
Before you can begin this task you must ensure that you have access to MySQL software and the MySQL Workbench so that you can create the database. You must also have access to an appropriate MySQL-Python module so that you can call MySQL functions from Python code. This will be either MySQLdb (for Windows users) or the MySQL Connector (for Mac users).
You need to write four Python programs for this task to retrieve data from the database. The four programs are independent from each other. You don't have to start from the first program. You can choose any one to start with.
Deliverables
The deliverables for this task are the four completed Python programs,
- min_max.py,
- top_N_expensive.py,
- top_N_makes.py and
- top_N_models.py.
Note that the "_Q" question suffixes have been removed from the file names. These programs must pass all of the unit tests in the supplied program templates without modification. You do not need to submit any SQL database scripts or dumps. We will use our own SQL database to test your software.