Detailed Question:
Econometrics requires STATA skills.
1. Let n be some given integer. For an n-vector x, let f(x) = Σ_{i=1}^{n} Σ_{j=1}^{n} β_ij x_i x_j. If you knew the numbers β_ij, how would you check whether f(x) ≥ 0 for all x using STATA? Illustrate your method by an example. (Hint: Write the function of interest in terms of a vector x and a matrix B. Use one of my manuals to learn about calculating eigenvalues in STATA.)
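The idea in the hint can be sketched outside STATA as follows. This is a minimal Python illustration, not the intended STATA solution: it writes f(x) = x'Bx, replaces B by its symmetric part S = (B + B')/2 (which leaves x'Bx unchanged), and checks that the smallest eigenvalue of S is non-negative. To stay self-contained it handles only the 2 × 2 case, where the eigenvalues have a closed form. The function names and the two coefficient matrices are hypothetical, chosen only for illustration.

```python
import math

def symmetrize(B):
    """Return S = (B + B') / 2; x'Bx equals x'Sx for every x."""
    n = len(B)
    return [[(B[i][j] + B[j][i]) / 2.0 for j in range(n)] for i in range(n)]

def eigenvalues_2x2_symmetric(S):
    """Closed-form eigenvalues of a symmetric 2x2 matrix, smallest first."""
    a, b, d = S[0][0], S[0][1], S[1][1]
    mid = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return mid - radius, mid + radius

def is_nonnegative_form(B, tol=1e-12):
    """True if x'Bx >= 0 for all x (2x2 case only in this sketch)."""
    smallest, _ = eigenvalues_2x2_symmetric(symmetrize(B))
    return smallest >= -tol

# Hypothetical coefficients beta_ij, made up for this example.
B_good = [[2.0, 3.0], [-1.0, 2.0]]  # symmetric part [[2, 1], [1, 2]]
B_bad = [[1.0, 4.0], [0.0, 1.0]]    # symmetric part [[1, 2], [2, 1]]

result_good = is_nonnegative_form(B_good)
result_bad = is_nonnegative_form(B_bad)
```

For general n the same check applies with a numerical eigenvalue routine; in STATA, the matrix symeigen command is one candidate for that step, applied to the symmetrized matrix.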
2. Let A be a certain n × n matrix. Consider the problem of maximizing x'Ax (where x is an n-vector) subject to the condition x'x = 1. Show that the solution is a certain eigenvector of a certain matrix that can be derived from A. Hint: Set up the Lagrangian and use matrix differentiation techniques to characterize the first-order condition. Explicitly show that there is a certain solution to your first-order condition that will be the argmax of the given constrained optimization problem.
3. (a) Show that if A is symmetric positive definite and is partitioned as
    A = [ A11  A12 ]
        [ A21  A22 ]
with A11 and A22 square, then A11 and A22 are also positive definite.
(b) If in the above partition, A11 is 1 × 1, then A11A22 - A21A21' is also positive definite.
(c) Hence, show that there exists a lower triangular matrix P such that PP' = A.
This result is called the Cholesky Factorization Theorem and is very useful for computational purposes. (Hint: Use an induction argument. Suppose you have shown the result is true for matrices of size up to n and now let A be (n+1) × (n+1).
Because of the special nature of P, you need to solve for a scalar, a column vector of size n and an n × n square matrix...)
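To see the factorization numerically, here is a minimal Python sketch of a common column-by-column Cholesky routine. It is an illustration under stated assumptions, not the inductive proof the problem asks for: the function names and the 3 × 3 matrix are hypothetical, and the loop simply solves PP' = A one column at a time (a scalar on the diagonal, then the entries below it).

```python
import math

def cholesky_lower(A):
    """Return lower-triangular P with P P' = A, for symmetric positive definite A."""
    n = len(A)
    P = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Diagonal entry: what is left of A[j][j] after the earlier columns.
        s = A[j][j] - sum(P[j][k] ** 2 for k in range(j))
        if s <= 0.0:
            raise ValueError("matrix is not positive definite")
        P[j][j] = math.sqrt(s)
        # Entries below the diagonal in column j.
        for i in range(j + 1, n):
            partial = sum(P[i][k] * P[j][k] for k in range(j))
            P[i][j] = (A[i][j] - partial) / P[j][j]
    return P

def times_transpose(P):
    """Return the product P P'."""
    n = len(P)
    return [[sum(P[i][k] * P[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Hypothetical symmetric positive definite matrix, made up for this example.
A = [[4.0, 2.0, 2.0],
     [2.0, 5.0, 3.0],
     [2.0, 3.0, 6.0]]
P = cholesky_lower(A)
reconstructed = times_transpose(P)
```

Comparing reconstructed with A entry by entry confirms the factorization; the routine raises an error as soon as a non-positive pivot appears, which is how a matrix that is not positive definite shows up.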
4. Suppose y = (y1, y2, y3)' is distributed as N(µ, Σ), with
(a) What is the distribution of z = 4y1 - 6y2 + y3?
(b) What is the correlation coefficient between 3y1 - 2y2 and y2 + 7y3?
(c) Explicitly create three random variables, such that each is a linear combination of y1, y2 and y3 and they are pairwise independent.
Hint: Use the method of proof of one of the theorems on distribution of quadratic forms. Use STATA for computations.
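Parts (a) and (b) rest on two facts: a'y has mean a'µ and variance a'Σa, and Cov(c'y, d'y) = c'Σd. A minimal Python sketch of that arithmetic follows. The values of mu and Sigma below are hypothetical placeholders, made up for illustration and not the ones intended by the problem; the helper names are likewise assumptions.

```python
import math

def dot(u, v):
    """Inner product u'v."""
    return sum(a * b for a, b in zip(u, v))

def bilinear(u, S, v):
    """Return u' S v for vectors u, v and a square matrix S."""
    return sum(u[i] * S[i][j] * v[j]
               for i in range(len(u)) for j in range(len(v)))

# Hypothetical parameters (NOT the values from the problem).
mu = [1.0, 0.0, 2.0]
Sigma = [[2.0, 1.0, 0.0],
         [1.0, 3.0, 1.0],
         [0.0, 1.0, 4.0]]

# (a) z = 4*y1 - 6*y2 + y3 is normal with mean a'mu and variance a'Sigma a.
a = [4.0, -6.0, 1.0]
mean_z = dot(a, mu)
var_z = bilinear(a, Sigma, a)

# (b) correlation between c'y = 3*y1 - 2*y2 and d'y = y2 + 7*y3.
c = [3.0, -2.0, 0.0]
d = [0.0, 1.0, 7.0]
corr_cd = bilinear(c, Sigma, d) / math.sqrt(
    bilinear(c, Sigma, c) * bilinear(d, Sigma, d))
```

With the actual µ and Σ substituted in, the same three quantities answer parts (a) and (b); for jointly normal variables, zero covariance between two linear combinations is what delivers the independence needed in part (c).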
5. For the above random vector y, use STATA's built-in statistical functions to determine, for each i, the probability that yi lies between 0 and 2.
6. For the cork dataset we used in the last lab session, use the first 25 observations to estimate the mean and variance covariance matrix of the random vector (n, e, s, w)' (assume that the sample hails from a normal population). Now, use Fact 13 to predict the (s, w)' values given the (n, e)' values for the last 3 observations. Create confidence intervals for each of the two variables you are trying to predict as well.
7. Prof. Gasdupta wants to test whether the undergraduates in his class have the same ability as the graduate students (as evidenced by a recent exam score). Assume that Gasdupta has 28 graduate students, and 33 undergraduate students. The average score of the graduates is 70 and that of undergraduates is 72. The sample standard deviations are 9 and 8. Make the assumption that any two different scores are independent and both samples hail from a joint normal distribution.
(a) The first thing Gasdupta wants to check is whether the two population variances are equal. This is Ec507 material from last semester. If you have not seen a test for this before, look up the test for equality of variances in a statistics textbook (it is an F test, and the relevant statistic is based on the ratio of the sample variances). Show that Gasdupta cannot reject the null hypothesis that both distributions have the same population variance at the 5% level. Preferably, retrieve tail probability values using STATA rather than a table (check the help menu for "statistical functions"; for a distribution such as F, learn about the functions "F", "Ftail", "invF" and "invFtail"; corresponding functions exist for other distributions that crop up during testing).
(b) Theoretically justify your procedure, i.e. argue that the relevant statistic is indeed F-distributed under the null using the results you have learnt from Topic 2 (this should be easy).
(c) Now, assuming that the two populations have the same population variance, Gasdupta wishes to test whether they have equal means. This requires a "two-sample t-test". Again, this is Ec507 material from last semester, but if you haven't seen it before, consult a good statistics text, find the relevant test statistic, and conduct the test.
(d) Theoretically verify that the given statistic is indeed t-distributed under the null. This is not as straightforward as part (b), because now the statistics that appear in both the numerator and the denominator are functions of the pooled sample. But I am sure you can do the verification following these suggestions: Proceed in a similar fashion as shown in the justification of the 1-sample t-test. This time, you will need to define a N(0, I) random vector of size n1 + n2, where n1, n2 are the two sample sizes. To get the chi-squared variate in the denominator, you will have to define an (n1 + n2) × (n1 + n2) matrix imaginatively...
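The two test statistics in parts (a) and (c) can be sketched numerically as below. This is a minimal Python illustration using the summary figures given in the problem; it computes only the statistics and their degrees of freedom, and leaves the tail probabilities to STATA's statistical functions as the problem suggests. The function names are hypothetical, and putting the larger sample variance in the numerator of the F ratio is one common convention, assumed here.

```python
import math

def variance_ratio(s1, n1, s2, n2):
    """F statistic with the larger sample variance on top, and (df_num, df_den)."""
    v1, v2 = s1 ** 2, s2 ** 2
    if v1 >= v2:
        return v1 / v2, (n1 - 1, n2 - 1)
    return v2 / v1, (n2 - 1, n1 - 1)

def pooled_t(mean1, s1, n1, mean2, s2, n2):
    """Two-sample t statistic under equal population variances, and its df."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    std_err = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / std_err, df

# Summary statistics from the problem: graduates first, undergraduates second.
f_stat, f_df = variance_ratio(9.0, 28, 8.0, 33)
t_stat, t_df = pooled_t(70.0, 9.0, 28, 72.0, 8.0, 33)
```

The resulting f_stat is then compared with the upper-tail critical value of the F distribution with f_df degrees of freedom, and t_stat with the t distribution with t_df degrees of freedom.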
8. In this problem, we consider a standard portfolio optimization problem from Finance.
(a) First, you will solve the problem theoretically using matrix differentiation methods. Let there be n assets, each with an uncertain return. Let r̃_i represent the (net) return on the i-th asset, i.e. if I invest $1 today in the i-th asset, I will get back (1 + r̃_i) dollars tomorrow.
Let r represent the expectations of these returns written as an n × 1 column vector. Let Σ represent the variance-covariance matrix of the asset returns; if σij is the (i, j)-th entry of this matrix, then Cov(r̃_i, r̃_j) = σij. We assume that both r and Σ are known to the investor.
The investor has ω dollars to invest. He wants to allocate this money across the various assets so as to minimize his risk, as measured by the standard deviation of his portfolio, while making at least an amount r̄ on his investment (in expectation). Let w represent the vector of dollar amounts spent on the various assets. Convince yourself that in matrix notation the investor's problem is to choose w so as to
    Minimize    w'Σw
    subject to  ι'w = ω,
                r'w = r̄.
Using the usual Lagrangian approach, find the expression for the optimal w (it is, unfortunately, a bit ugly-looking). Note that there should be two Lagrange multipliers.
(b) Go to https://finance.yahoo.com. Download monthly price data for the last 12 months on three stocks of your choice. To do this, enter the ticker symbol of the stock in the box next to "get quotes" at the top of the page (if you don't know the ticker symbol, just start typing the company name and Yahoo will show you symbols that closely match the name you are typing). Once you arrive on the stock page, go to the link for historical prices (on the left) and you will be allowed to download monthly price data by choosing a start and an end date. The adjusted closing prices are what we are looking for. Note that there should be 12 data points. After you have imported the data into STATA, you should be able to create 11 return variables from the 12 price values for each stock (note that (P_{t+1} - P_t)/P_t is the return between periods t and t + 1 if the price in period t is P_t).
i. Find the mean returns and the variances using the summarize or tabstat command.
ii. Find the (sample) variance-covariance matrix using the corr command with the covariance option, as in "corr x y z, covariance".
iii. You now have a three-asset instance of the problem (i.e. you have your r and Σ). Now make up values for r̄ and ω and solve the optimization problem using EXCEL. My suggestion is to choose r̄ as the average of the three mean monthly returns times your wealth ω. To learn how to solve optimization problems with EXCEL, go through my handout posted in the "Manuals" folder. Now verify that EXCEL's solution tallies with your theoretical solution.
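The data preparation in part (b) can be sketched as follows. This is a minimal Python illustration of the arithmetic only, not a replacement for the STATA steps: it turns 12 prices into 11 returns using (P_{t+1} - P_t)/P_t and then computes a sample mean and a sample covariance with the n - 1 divisor. The two price series and all function names are hypothetical, made up for this example.

```python
def simple_returns(prices):
    """Return (P[t+1] - P[t]) / P[t] for each consecutive pair of prices."""
    return [(prices[t + 1] - prices[t]) / prices[t]
            for t in range(len(prices) - 1)]

def sample_mean(xs):
    """Arithmetic mean of a list of numbers."""
    return sum(xs) / len(xs)

def sample_cov(xs, ys):
    """Sample covariance with the n - 1 divisor."""
    mx, my = sample_mean(xs), sample_mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

# Hypothetical adjusted closing prices, 12 monthly observations per stock.
stock_x = [100.0, 110.0, 99.0, 104.0, 108.0, 101.0,
           97.0, 103.0, 109.0, 112.0, 106.0, 115.0]
stock_y = [50.0, 49.0, 52.0, 51.0, 55.0, 54.0,
           53.0, 56.0, 58.0, 57.0, 60.0, 59.0]

returns_x = simple_returns(stock_x)
returns_y = simple_returns(stock_y)

mean_x = sample_mean(returns_x)
var_x = sample_cov(returns_x, returns_x)
cov_xy = sample_cov(returns_x, returns_y)
```

Repeating the covariance calculation for every pair of stocks fills in the 3 × 3 matrix Σ, and the vector of mean returns gives r.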
9. Prove a slightly more general form of Fact 10: show that if x ∼ N(µ, Σ), then Ax + b ∼ N(Aµ + b, AΣA'). Do this via the characteristic function approach, i.e. write down explicitly the characteristic function of Ax + b, using the formula for the characteristic function of x given that x ∼ N(µ, Σ).
10. Prove Fact 11 by using Fact 10.
11. Prove Fact 12; that is, if
then show that the joint density function of can be written as the product of two different density functions, one for each subvector. Assume Σ11, Σ22 are positive definite matrices and make use of the pdf formula for multinormal vectors.
12. Prove Fact 13 following these suggestions. Show that x1 and y are independent, where y = x2 - Σ21 Σ11^{-1} x1 (I know, this sounds strange given that x1 appears in y, but the claim is true); also, find the distribution of y. Now use the fact that P[U ≤ u | V = v] = P[U ≤ u] when U and V are independent, together with the generalized version of Fact 10 you proved earlier, to infer Fact 13.