Empirical Research in Finance
Assignment 1: Return calculation, descriptive statistics and gretl
The purpose of this assignment is to gain some experience in computing returns, portfolios and summary statistics. You are encouraged to use gretl as much as possible (you will need it in later assignments as well), but if you are a die-hard Excel fan, you are allowed to use Excel instead. In any case, the data used must be submitted as a gretl datafile, so you will have a minimum of gretl exposure in this assignment.
This assignment covers some new material; working in teams stimulates discussion and helps you digest the material. Note that you should not cooperate with other teams!
Please strictly follow the guidelines below.
Assignment
Every team downloads from Datastream the return index for two stocks. (The Datastream datatype for return indices is "RI".) You are free to choose any two stocks as long as the following requirements are met:
*The stocks are liquid and are traded (almost) daily.
*They have a history of at least ten years; more specifically, you need data starting on 31 December 2003.
*The stocks need to trade in the same currency zone. This is because you also have to download the return index of a stock market index for this zone. Tip: you can use the MSCI index for this country or region. An alternative is the Datastream country or regional index.
Note: A Datastream terminal is available at the school, but also on the third floor of the University Library (Economics). If you are not familiar with Datastream, please consult the introductory text available on Blackboard.
You are asked to do the following:
1. Download the (three) daily return indices from Datastream, starting from 31 December 2003.
The daily data should be imported in gretl and saved as a gretl dataset. Of course, you do not have to print these data, but you do have to submit the gretl dataset together with your report. Also, you should clearly define the series that you eventually selected.
2. Compute the daily returns for both stocks. In addition, compute the returns of a daily rebalanced
a) Equal-weight portfolio
b) Value-weight portfolio. (Hint: the Datastream datatype for market value is "MV".)
Finally, compute the daily returns of the stock market index, so that you end up with five return series. Describe their statistical characteristics (e.g. histogram, moments, quantiles, ...). Discuss the results briefly and compare the individual series to the portfolio series.
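For teams that want to cross-check their gretl or Excel results, the return and portfolio computations in task 2 can be sketched in Python as follows. This is only a minimal sketch under stated assumptions: the numbers are made up, the names `simple_returns` and `portfolio_returns` are hypothetical helpers, and the value weights are taken from the previous day's market value (one common convention for a daily rebalanced portfolio, not one prescribed by this assignment).

```python
def simple_returns(index):
    """Simple returns R_t = RI_t / RI_{t-1} - 1 computed from a return index."""
    return [index[t] / index[t - 1] - 1.0 for t in range(1, len(index))]


def portfolio_returns(ret_a, ret_b, weights_a):
    """Daily rebalanced two-stock portfolio; weights_a[t] is the weight of stock A on day t."""
    return [w * ra + (1.0 - w) * rb for w, ra, rb in zip(weights_a, ret_a, ret_b)]


# Hypothetical return indices ("RI") and market values ("MV"); not real Datastream data.
ri_a = [100.0, 102.0, 101.0, 103.02]
ri_b = [50.0, 49.5, 50.49, 50.49]
mv_a = [300.0, 306.0, 303.0, 309.06]
mv_b = [100.0, 99.0, 100.98, 100.98]

r_a = simple_returns(ri_a)
r_b = simple_returns(ri_b)

# Equal-weight portfolio: 50% in each stock, reset every day.
ew = portfolio_returns(r_a, r_b, [0.5] * len(r_a))

# Value-weight portfolio: weights from the market values at the end of the
# previous day (assumption), so that the weight is known before the return is earned.
vw_weights = [mv_a[t - 1] / (mv_a[t - 1] + mv_b[t - 1]) for t in range(1, len(mv_a))]
vw = portfolio_returns(r_a, r_b, vw_weights)

print([round(100 * r, 2) for r in ew])  # returns in percent
print([round(100 * r, 2) for r in vw])
```

Using lagged weights avoids building today's return into today's weight; if you choose a different convention, state it in your report.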
3. Compute the correlation matrix of the five return series.
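As a rough illustration of task 3, the following Python sketch computes a Pearson correlation matrix by hand. The series names and values are hypothetical and only three short series are shown; your own matrix will have five rows and columns.

```python
import math
from statistics import mean


def correlation(x, y):
    """Pearson correlation coefficient of two equally long series."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(series):
    """Nested dict: matrix[a][b] is the correlation between series a and b."""
    names = list(series)
    return {a: {b: correlation(series[a], series[b]) for b in names} for a in names}


# Hypothetical daily returns; not real data.
demo = {
    "stock_a": [0.02, -0.01, 0.015, 0.00],
    "stock_b": [-0.01, 0.02, 0.00, 0.01],
    "index": [0.01, 0.00, 0.01, 0.005],
}
corr = correlation_matrix(demo)
for name, row in corr.items():
    print(name, [round(v, 2) for v in row.values()])
```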
4. You want to investigate to what extent the daily returns are normally distributed.
To this end, you compute
a) The four tests that are provided in gretl.
b) You also want to apply the chi-square one-sample test. (Hint: see section "Hypothesis tests on distributions" of Chapter 10 of the course text.) To this end, you compute the deciles of the distributions and compare the empirical frequencies with those expected if the data were normally distributed.
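One possible reading of the decile-based chi-square test in task 4b is sketched below in Python. The assumptions are mine, not the course text's: the ten bins are bounded by the deciles of a normal distribution fitted to the sample mean and standard deviation, so that each bin has an expected frequency of n/10, and the degrees of freedom are reduced by two for the estimated parameters. The function name `chi_square_normality` and the simulated sample are hypothetical; check Chapter 10 of the course text for the exact procedure expected.

```python
import random
from statistics import NormalDist, mean, stdev


def chi_square_normality(returns, bins=10):
    """Chi-square goodness-of-fit statistic against a fitted normal distribution."""
    fitted = NormalDist(mean(returns), stdev(returns))
    # Interior cut points: the 10%, 20%, ..., 90% quantiles of the fitted normal.
    cuts = [fitted.inv_cdf(k / bins) for k in range(1, bins)]
    observed = [0] * bins
    for r in returns:
        observed[sum(1 for c in cuts if r > c)] += 1
    expected = len(returns) / bins  # same expected count in every bin
    stat = sum((o - expected) ** 2 / expected for o in observed)
    return stat, observed


# Simulated normal "returns" for illustration only; not real data.
random.seed(0)
sample = [random.gauss(0.0, 0.01) for _ in range(2000)]
stat, observed = chi_square_normality(sample)

# Degrees of freedom: 10 bins - 1 - 2 estimated parameters = 7.
# The 5% critical value of the chi-square distribution with 7 df is about 14.07.
print(round(stat, 2), observed)
```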
5. Next, you want to know whether the distributions are constant over time. You divide the sample into two (approximately) equal subperiods and then you test
a) Whether average returns are constant over time;
b) Whether volatility has remained constant over time;
c) Whether the distributions have remained constant over time. (Hint: now use the two-sample chi-square test.)
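The three comparisons in task 5 can be sketched in Python as follows. The choices here are assumptions made for illustration: a Welch-type t statistic for the means, a simple variance ratio for the volatilities, and a two-sample chi-square statistic whose bins are bounded by the quantiles of the pooled sample. The function names and the data are hypothetical, and only the test statistics are computed; compare them with the appropriate critical values yourself.

```python
import math
from statistics import mean, quantiles, variance


def welch_t(x, y):
    """t statistic for equal means, allowing unequal variances."""
    return (mean(x) - mean(y)) / math.sqrt(variance(x) / len(x) + variance(y) / len(y))


def variance_ratio(x, y):
    """Ratio of sample variances; F distributed under normality and equal variances."""
    return variance(x) / variance(y)


def two_sample_chi_square(x, y, bins=10):
    """Chi-square statistic for equal distributions, with bins - 1 degrees of freedom."""
    cuts = quantiles(x + y, n=bins)  # bins - 1 cut points from the pooled sample

    def counts(sample):
        c = [0] * bins
        for v in sample:
            c[sum(1 for q in cuts if v > q)] += 1
        return c

    n = len(x) + len(y)
    stat = 0.0
    for a, b in zip(counts(x), counts(y)):
        col = a + b
        if col == 0:
            continue  # an empty bin contributes nothing
        ex, ey = len(x) * col / n, len(y) * col / n  # expected counts
        stat += (a - ex) ** 2 / ex + (b - ey) ** 2 / ey
    return stat


# Hypothetical subperiod returns; the second subperiod is twice as volatile.
first = [0.01, -0.02, 0.005, 0.0, 0.015, -0.01, 0.02, -0.005, 0.01, 0.0, 0.003, -0.007]
second = [2.0 * r for r in first]

print(welch_t(first, second), variance_ratio(second, first))
print(two_sample_chi_square(first, second, bins=4))
```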
6. Divide the entire sample into three subperiods based on the index returns in the following way:
a) index returns lower than 2 standard deviations below the mean;
b) index returns higher than 2 standard deviations above the mean;
c) All others.
Compute the average returns and standard deviations of both individual stocks conditional upon the index return falling in each of the three categories above.
Verify the law of iterated expectations and the decomposition of variance.
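The verification asked for in task 6 can be sketched in Python as follows. With p_k the share of observations in category k, the law of iterated expectations says that the overall mean equals the sum over k of p_k times the conditional mean, and the variance decomposition says that the overall variance equals the probability-weighted average of the conditional variances plus the probability-weighted variance of the conditional means. The data below are simulated and the function names are hypothetical. Note the assumption that population moments (division by n) are used throughout; with the sample variance (division by n - 1) the decomposition holds only approximately.

```python
import random
from statistics import mean, pstdev, pvariance


def classify(index_returns, k=2.0):
    """Label each day by whether the index return lies beyond k standard deviations."""
    m, s = mean(index_returns), pstdev(index_returns)
    labels = []
    for r in index_returns:
        if r < m - k * s:
            labels.append("low")
        elif r > m + k * s:
            labels.append("high")
        else:
            labels.append("other")
    return labels


def conditional_stats(stock_returns, labels):
    """Map each category to (share of observations, conditional mean, conditional variance)."""
    groups = {}
    for r, lab in zip(stock_returns, labels):
        groups.setdefault(lab, []).append(r)
    n = len(stock_returns)
    return {lab: (len(g) / n, mean(g), pvariance(g)) for lab, g in groups.items()}


# Simulated index and stock returns for illustration only; not real data.
random.seed(1)
index = [random.gauss(0.0, 0.01) for _ in range(500)]
stock = [0.8 * r + random.gauss(0.0, 0.01) for r in index]

stats = conditional_stats(stock, classify(index))
total_mean, total_var = mean(stock), pvariance(stock)

# Law of iterated expectations: E[R] = sum_k p_k * E[R | k]
iterated_mean = sum(p * m for p, m, _ in stats.values())
# Variance decomposition: Var[R] = E[Var(R | k)] + Var(E[R | k])
within = sum(p * v for p, _, v in stats.values())
between = sum(p * (m - total_mean) ** 2 for p, m, _ in stats.values())

print(total_mean, iterated_mean)
print(total_var, within + between)
```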
7. Repeat tasks #2 through #6 for monthly returns. (Hint: in gretl you can easily do this by compacting the daily dataset of return indices; save it first! This procedure is found in the Data menu. Use end-of-period values. You can then easily compute monthly returns.) Compare the results for this lower frequency to those for the daily data and discuss.
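What compacting with end-of-period values does can be illustrated outside gretl with a short Python sketch. The assumptions: dates are ISO strings ("YYYY-MM-DD") sorted in ascending order, the values are hypothetical, and `compact_to_month_end` is a made-up helper, not a gretl function.

```python
def compact_to_month_end(dates, values):
    """Keep the last available observation of each month (dates sorted ascending)."""
    last = {}
    for d, v in zip(dates, values):
        last[d[:7]] = v  # key "YYYY-MM"; later days overwrite earlier ones
    months = sorted(last)
    return months, [last[m] for m in months]


# Hypothetical daily return index; not real Datastream data.
dates = ["2003-12-31", "2004-01-15", "2004-01-30", "2004-02-13", "2004-02-27"]
ri = [100.0, 101.0, 102.0, 103.0, 104.04]

months, ri_monthly = compact_to_month_end(dates, ri)
# Monthly returns from month-end to month-end index values.
monthly_returns = [ri_monthly[t] / ri_monthly[t - 1] - 1.0 for t in range(1, len(ri_monthly))]

print(months)
print([round(100 * r, 2) for r in monthly_returns])  # returns in percent
```

Because the return index is compacted before returns are computed, each monthly return runs from one month-end to the next, which is why the sample has to start on 31 December 2003.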
The assignment should result in a properly formatted report. I should be able to read the written report without having access to Excel, gretl or any other statistical software package. This means that it should include tables with the results (in the text), graphs (also in the text), and of course your interpretation of the results in a well-structured manner. (In this respect: below you will find the general feedback I gave last year after the first assignment. Try to avoid the errors made by your predecessors.)
Provide also the properly documented Excel workbook (if any), the gretl session (and/or the command log) you have used to generate the results and the gretl dataset. Give every file the following name structure: "ERF1
If you use external material you should clearly mention this in your report and indicate your source.
Failure to do so amounts to fraud!
General feedback from the previous academic year
*Remember that your report should be written as a scientific report. So do not use "telegram style". (This feedback note does: professor's privilege!)
*Reports should not be anonymous. Clearly indicate your names.
*Number your tables and figures and add notes in order to make them self-explanatory.
This allows you to refer to your tables (and figures) in the text by their number: "As can be seen in Table 1 ..." is much better than "As can be seen in the table below". Indeed, there may be many tables below, and the wonders of your text processor may even have led the table to be printed above the text. Most text processors provide automatic numbering for tables and figures - use this facility.
*Also number your pages. Some text processors do not do this by default, so check it.
*Check the spelling of your text. (LyX, Word and most other word processors provide automatic spell checking - use it!)
*Describe everything that you did to produce the reported results. Anybody who reads your report should be able to reproduce it.
*Do not try to type formulas using "normal" text. Use your software's facilities for formulas. Note in this regard the following:
It is common usage to write variables in italics. Most equation editors do this by default. When you adopt this convention, do it consistently, also in the text.
In contrast, numbers and functions are usually not in italics.
Avoid using "*" as the multiplication operator. Use the more common "×" or "·".
As an example, we should have something like:
1 + R = exp(r), where R is the ordinary rate of return and r its continuous counterpart.
*When reporting numbers, be consistent in the use of the decimal separator. In English, the preferred use is the decimal point. So 9.5, not 9,5. However, if you want to use the latter, use it throughout your text, tables and figures. Consistency is king!
*Report only what you need to (or want to) report. Do not report all the output of the statistical software if you do not need such extensive results. For instance, Excel reports both standard deviation and variance. It is not likely that you want to have both in your report, as knowing either of them is sufficient. You would normally also comment on the items that you report. So that is the rule of thumb: if you are not going to discuss the stats, do not report them.
*Round your numbers and put returns in percentages. Numbers like 1E-205 can more compactly be written down as 0.00.
*When using a formula to compute something, explain it in general terms (unless it is so common that you can expect everybody to know what it is). Normally, you would never refer to the syntax of the software that you used to implement the formula. You can add a footnote saying which software you used, but that is it.
So, no sentences like "you will find this in cell A5 of sheet Alotofcrap," or "I have used the summary command to compute...".