For this assignment, download the dataset "RS_FI_2018.csv" from StudyNet and work with it. The dataset contains annual financial statement data (balance sheet and income statement information) for over 15,000 banks from the 35 OECD countries over the period 2000-2014. The data come from unconsolidated bank statements and are sourced from the Bankscope database. All balance sheet and income statement positions are reported in million USD.
Assignment
1) Data sample
• Create a sample which includes all observations in the years 2005-2014 for banks with the following specialization: "Commercial Banks", "Cooperative Banks" and "Savings Banks"
• Create a summary table showing the number of banks per country and year and the total number of bank-year observations in the sample
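As a minimal sketch of the sample selection and the count table in base R: the toy data frame below stands in for "RS_FI_2018.csv", and the column names (bank_id, country, year, specialisation) are hypothetical and must be matched to the actual file.

```r
# Toy stand-in for RS_FI_2018.csv. All column names are hypothetical.
raw <- data.frame(
  bank_id = c(1, 1, 2, 2, 3, 4),
  country = c("CH", "CH", "CH", "CH", "DE", "DE"),
  year    = c(2004, 2005, 2005, 2006, 2005, 2005),
  specialisation = c("Commercial Banks", "Commercial Banks", "Savings Banks",
                     "Savings Banks", "Investment Banks", "Cooperative Banks"),
  stringsAsFactors = FALSE
)

# Keep the years 2005-2014 and the three bank types named in the task
keep_types <- c("Commercial Banks", "Cooperative Banks", "Savings Banks")
smpl <- subset(raw, year >= 2005 & year <= 2014 & specialisation %in% keep_types)

# Number of banks per country and year (one row per bank-year)
bank_year   <- unique(smpl[, c("bank_id", "country", "year")])
banks_by_cy <- table(bank_year$country, bank_year$year)
print(addmargins(banks_by_cy))

# Total number of bank-year observations in the sample
n_obs <- nrow(smpl)
```

With the real data, the data frame would come from read.csv() instead of being typed in by hand.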
2) Variables:
• Create a dataset with the following variables based on balance-sheet / income-statement items:
o "Equity_Ratio": Ratio of Equity to Total Assets
o "Loan_Ratio": Ratio of Loans to Total Assets
o "Other_Assets_Ratio": Ratio of Other Earning Assets to Total Assets
o "Deposit_Ratio": Ratio of Total Customer Deposits to Total Assets
o "Other_Liabilities_Ratio": Ratio of Non-Customer Deposit Liabilities to Total Assets
o "Interest_Income_Ratio": Ratio of Net Interest Revenue to Total Assets
o "Other_Income_Ratio": Ratio of Other Operating Income to Total Assets
o "Profit_Ratio": Ratio of Profit Before Tax to Total Assets
o "Size": Natural logarithm of Total Assets in USD
o "Bank_Type" which is 1 for commercial banks, 2 for cooperative banks and 3 for savings banks
o "Commercial_Bank" which is 1 for commercial banks and 0 for cooperative and savings banks
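The variable construction could look roughly like the following sketch, shown for a subset of the variables. The raw column names (total_assets, equity, loans, specialisation) are assumptions. Rescaling Total Assets from million USD to USD before taking logs is one reading of the "Size" definition; it only shifts "Size" by the constant log(1e6).

```r
# Toy data in million USD. Raw column names are hypothetical.
d <- data.frame(
  specialisation = c("Commercial Banks", "Cooperative Banks", "Savings Banks"),
  total_assets   = c(1000, 200, 50),
  equity         = c(100, 30, 5),
  loans          = c(600, 120, 30),
  stringsAsFactors = FALSE
)

# Ratios relative to Total Assets (the other ratios follow the same pattern)
d$Equity_Ratio <- d$equity / d$total_assets
d$Loan_Ratio   <- d$loans  / d$total_assets

# Size: natural log of Total Assets in USD (assumed rescaling from millions)
d$Size <- log(d$total_assets * 1e6)

# Bank_Type: 1 = commercial, 2 = cooperative, 3 = savings
type_levels <- c("Commercial Banks", "Cooperative Banks", "Savings Banks")
d$Bank_Type <- match(d$specialisation, type_levels)

# Commercial_Bank: 1 for commercial banks, 0 otherwise
d$Commercial_Bank <- as.integer(d$Bank_Type == 1)
```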
3) Data cleaning
• Drop observations which have missing values for any of the constructed variables
• Produce a table with descriptive statistics of all constructed variables.
• Drop observations which display "unreasonable" values in the constructed variables.
• Reproduce the table with descriptive statistics
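The cleaning steps can be sketched in base R as below. What counts as "unreasonable" is left to the student; the rule used here (share-type ratios must lie between 0 and 1) is only an assumed example and would not suit "Profit_Ratio", which can legitimately be negative.

```r
# Toy data with one missing value and two implausible values
d <- data.frame(
  Equity_Ratio = c(0.10, NA,   0.08, 1.50, 0.12),
  Loan_Ratio   = c(0.60, 0.55, 0.70, 0.40, -0.20)
)
vars <- c("Equity_Ratio", "Loan_Ratio")

# Drop observations with a missing value in any constructed variable
d_complete <- d[complete.cases(d[, vars]), ]
print(summary(d_complete[, vars]))   # descriptive statistics before cleaning

# Assumed plausibility rule: both ratios must lie in [0, 1]
in_unit <- function(x) x >= 0 & x <= 1
ok      <- in_unit(d_complete$Equity_Ratio) & in_unit(d_complete$Loan_Ratio)
d_clean <- d_complete[ok, ]
print(summary(d_clean[, vars]))      # descriptive statistics after cleaning
```

Comparing the two summary tables shows how strongly the chosen rule affects minima, maxima and means.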
4) Create a cross-sectional dataset (1 observation per bank)
• Create a dataset with
o the average of each constructed balance sheet and income statement ratio per bank over the period 2005-2014.
o the standard deviation of "Profit_Ratio" over the period 2005-2014.
o the bank type ("Bank_Type") and the country in which the bank is located.
• Create a table showing the number of banks in the dataset per country as well as the total number of banks in the dataset
• Drop banks from countries that have fewer than 10 banks in the sample
• Produce a table with descriptive statistics of all constructed variables.
• Label all variables in the data set with short, meaningful labels.
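One possible way to collapse the panel to one observation per bank with base R is sketched below. The identifier columns (bank_id, country) and the name Profit_SD are hypothetical. The toy example uses a threshold of 2 banks per country instead of 10 so that it is not empty. Base R has no built-in variable labels, so storing a "label" attribute is just one assumed convention.

```r
# Toy panel: three banks, two years each. Identifier names are hypothetical.
panel <- data.frame(
  bank_id      = c(1, 1, 2, 2, 3, 3),
  country      = c("CH", "CH", "CH", "CH", "DE", "DE"),
  Bank_Type    = c(1, 1, 3, 3, 2, 2),
  Equity_Ratio = c(0.10, 0.12, 0.08, 0.08, 0.20, 0.10),
  Profit_Ratio = c(0.01, 0.03, 0.02, 0.02, 0.00, 0.02)
)

# Per-bank averages of the ratios (bank type and country are constant per bank)
means <- aggregate(cbind(Equity_Ratio, Profit_Ratio) ~ bank_id + country + Bank_Type,
                   data = panel, FUN = mean)

# Per-bank standard deviation of Profit_Ratio
sds <- aggregate(Profit_Ratio ~ bank_id, data = panel, FUN = sd)
names(sds)[2] <- "Profit_SD"

cs <- merge(means, sds, by = "bank_id")

# Banks per country, then drop countries below the threshold
banks_per_country <- table(cs$country)
min_banks <- 2   # the assignment asks for 10; 2 is used for the toy data only
keep      <- names(banks_per_country)[banks_per_country >= min_banks]
cs_kept   <- cs[cs$country %in% keep, ]

# Assumed labelling convention: a "label" attribute on the column
attr(cs_kept$Profit_SD, "label") <- "SD of Profit_Ratio, 2005-2014"
```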
5) Univariate tests (T-tests)
• Use the cross-sectional dataset to test for a difference in mean leverage between Commercial Banks on the one hand and Cooperative and Savings Banks on the other
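A two-sample t-test of this kind might be run as in the sketch below. It assumes that leverage is proxied by "Equity_Ratio" (the task does not name the variable) and uses made-up numbers. Note that t.test() applies the Welch correction for unequal variances by default.

```r
# Made-up cross-section: 4 commercial banks and 4 other banks
cs <- data.frame(
  Commercial_Bank = c(1, 1, 1, 1, 0, 0, 0, 0),
  Equity_Ratio    = c(0.12, 0.10, 0.14, 0.11, 0.08, 0.07, 0.09, 0.06)
)

# Welch two-sample t-test of mean Equity_Ratio across the two groups
tt <- t.test(Equity_Ratio ~ Commercial_Bank, data = cs)
print(tt)

group_means <- unname(tt$estimate)   # ordered as group 0, then group 1
```

Passing var.equal = TRUE would give the classical pooled-variance test instead.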
6) Scatterplot
Use the cross-sectional dataset to produce two scatterplots examining:
• Scatterplot 1 shows whether risky banks (as measured by the standard deviation of "Profit_Ratio") have higher or lower leverage
• Scatterplot 2 shows whether larger banks have higher or lower leverage
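The two scatterplots could be produced along the following lines. The data are simulated purely for illustration, leverage is assumed to be proxied by "Equity_Ratio", and Profit_SD is a hypothetical name for the standard deviation of "Profit_Ratio". The plots are written to a PDF in a temporary directory so that the sketch runs without a graphics window.

```r
# Simulated cross-section for illustration only (not real results)
set.seed(1)
cs <- data.frame(
  Profit_SD = runif(50, 0, 0.02),
  Size      = rnorm(50, mean = 20, sd = 2)
)
cs$Equity_Ratio <- 0.05 + 2 * cs$Profit_SD + rnorm(50, 0, 0.01)

out_file <- file.path(tempdir(), "scatterplots.pdf")
pdf(out_file, width = 10, height = 5)
par(mfrow = c(1, 2))   # two panels side by side

# Scatterplot 1: risk vs. leverage proxy, with a fitted line
plot(cs$Profit_SD, cs$Equity_Ratio,
     xlab = "SD of Profit_Ratio", ylab = "Equity_Ratio",
     main = "Risk and leverage")
abline(lm(Equity_Ratio ~ Profit_SD, data = cs), lty = 2)

# Scatterplot 2: size vs. leverage proxy, with a fitted line
plot(cs$Size, cs$Equity_Ratio,
     xlab = "Size (log Total Assets)", ylab = "Equity_Ratio",
     main = "Size and leverage")
abline(lm(Equity_Ratio ~ Size, data = cs), lty = 2)

invisible(dev.off())
```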
7) Use the cross-sectional dataset to conduct linear regressions:
a. Relate "Equity_Ratio" to the riskiness of the bank (the standard deviation of "Profit_Ratio"), controlling for the size of the bank
b. Repeat regression (a) controlling for bank type
c. Repeat regression (b) controlling for the country in which the bank is located.
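The three nested regressions can be sketched with lm() as follows. The data are simulated for illustration, Profit_SD and country are hypothetical column names, and entering bank type and country as sets of dummy variables via factor() is one common way to "control for" them rather than a prescribed method.

```r
# Simulated cross-section for illustration only (not real results)
set.seed(42)
n  <- 120
cs <- data.frame(
  Profit_SD = runif(n, 0, 0.02),
  Size      = rnorm(n, mean = 20, sd = 2),
  Bank_Type = sample(1:3, n, replace = TRUE),
  country   = sample(c("CH", "DE", "FR"), n, replace = TRUE)
)
cs$Equity_Ratio <- 0.3 + 2 * cs$Profit_SD - 0.01 * cs$Size + rnorm(n, 0, 0.01)

# (a) Equity ratio on risk, controlling for size
m_a <- lm(Equity_Ratio ~ Profit_SD + Size, data = cs)

# (b) adds bank-type dummies (commercial banks are the reference group)
m_b <- update(m_a, . ~ . + factor(Bank_Type))

# (c) adds country dummies (country fixed effects)
m_c <- update(m_b, . ~ . + factor(country))

print(summary(m_c)$coefficients)
```

Using update() keeps each specification an explicit extension of the previous one, which makes the three result columns easy to compare.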
Dataset and Software
- The dataset for completing the assignment is available on StudyNet
- You are expected to use the software R to complete your assignment. R is available on the computer terminals in the computer labs.
Learning and using R:
- You are expected to familiarize yourself with the basic functions of the software, including the use of .R files to write and run code.
- A manual will be made available on StudyNet.
- On the R website you will find further material on the software.
Deliverables and deadline
Each student has to submit the following:
• R script file with the code used to complete the assignment.
• A Word or PDF document presenting the results in formatted tables and graphs, together with a brief description of the findings (max. 1-2 pages of description, followed by max. 5 pages of tables/graphs)
Attachment: README and DATA.rar