Assignment
This assignment uses data on natural gas bills for the house my wife and I bought in 2011. The gas is used mainly for heating. The data are in gasBills.rdata, which contains a single dataframe, gas. The variables in the dataframe are described on page 2.
You'll make one table with tabLego with three or four regressions for the questions below. You don't need to use var.tags when you make the table. You will need to put source("convenience.r") in your .rmd file. You can get convenience.r from the source directory.
Don't forget to correct standard errors for serial correlation!
1. The dependent variable throughout the assignment is bill. Run a Dickey-Fuller test (no trend) of whether it has a unit root. Describe how to interpret the results of the test.
Although the results of the D-F test turn out to be a little ambiguous, for the remainder of the assignment we'll proceed as though bill does not have a unit root (I think the test isn't very powerful since we have only 80 observations).
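If it helps to see the mechanics, here is a rough sketch of the Dickey-Fuller regression done by hand in base R. It uses a simulated series, not the gas data, and the names y, dy, ylag, dfReg, and dfStat are made up for illustration. Packaged versions exist too (for example ur.df in the urca package), but nothing here assumes you use one.

```r
# Simulated stand-in for a monthly series; this is NOT the gas data.
set.seed(1)
n <- 80
y <- numeric(n)
y[1] <- 100
for (t in 2:n) y[t] <- 40 + 0.6 * y[t - 1] + rnorm(1, sd = 10)

# Dickey-Fuller regression with a constant and no trend:
#   (y_t - y_{t-1}) = a + g * y_{t-1} + e_t
dy    <- diff(y)   # y_t - y_{t-1}, length n - 1
ylag  <- y[-n]     # y_{t-1}, aligned with dy
dfReg <- lm(dy ~ ylag)

# The test statistic is the t-ratio on the lagged level.
dfStat <- coef(summary(dfReg))["ylag", "t value"]
dfStat
```

The null hypothesis is a unit root (g = 0). The statistic is compared with Dickey-Fuller critical values rather than the usual t table; with a constant, no trend, and a sample of about this size, the 5% critical value is roughly -2.9, so only values more negative than that reject the unit root.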
2. The first regression in your table will report coefficients for the bivariate relationship between our gas bill and heating-degree days. There's a small complication, however: as explained on the last page, the dates are the months in which we paid the bill, but the bill is for gas two months earlier, e.g., we pay for January heating in March. How should you handle that in the regression?
3. We bought a new high-efficiency furnace in 2016. The second regression in your table should add the newFurnace variable. Interpret that coefficient in words.
4. A new furnace doesn't do anything for you during warm weather when there are no heating-degree days. It matters most when it's really cold, with lots of heating-degree days. Incorporate that insight into your third regression and explain briefly. (Hint: Think about, or sketch, the relationship between heating-degree days and gas used with the new efficient furnace compared to the old inefficient one.)
5. Referring to the coefficients, explain what your third regression says about the effect of our new furnace on our heating bills.
6. Describe a rationale for a fourth regression, add it to your table, and briefly discuss whether the regression supports your idea (it's completely OK if it doesn't).
Basically, your options for another regression boil down to (a) add one or more new variables or (b) change how variables enter the regression.
Variables in the gas bills dataframe (gas)
The gas dataframe has monthly observations from August 2011 to March 2018.
date The date coded as year + (month - 1)/12. In other words, January 2012 is 2012.0, February 2012 is 2012 + 1/12 = 2012.08333, and so forth.
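A quick sketch of that coding in R, in case you want to go back and forth between the decimal date and a year and month. The helper decimalDate is hypothetical (it is not in the dataframe or, as far as this handout says, in convenience.r).

```r
# Hypothetical helper: encode a year and month number as a decimal date.
decimalDate <- function(year, month) year + (month - 1) / 12

decimalDate(2012, 1)   # January 2012
decimalDate(2012, 2)   # February 2012

# Recover the year and month from a decimal date.
# round() guards against floating-point error in the fractional part.
d  <- decimalDate(2012, 2)
yr <- floor(d)
mo <- round((d - yr) * 12) + 1
```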
bill The dollar amount of the bill paid in each month. Important: The bill is for gas used two months earlier. In other words, the bill paid in February 2012 is for gas used in December 2011.
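To make the timing concrete, here is a small sketch of shifting a column down two rows so that each row's value refers to two months earlier. The data are made up, and toy, lagBy, and hddLag2 are hypothetical names. Whether and how to use something like this in a regression is up to you.

```r
# Toy data, not the real gas dataframe: six months of made-up values.
toy <- data.frame(
  month = 1:6,
  hdd   = c(900, 800, 600, 300, 100, 0),
  bill  = c(150, 170, 180, 160, 120, 60)
)

# Shift a vector down by k rows, padding the first k rows with NA.
lagBy <- function(x, k) c(rep(NA, k), head(x, -k))

# Each row now carries the hdd value from two months before.
toy$hddLag2 <- lagBy(toy$hdd, 2)
toy
```

The first two rows are NA because there is nothing two months before them, so any regression that uses the shifted column loses those observations.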
hdd Heating-degree days. The value for a day is 60 minus the average of the high and low temperatures for the day if the average is below 60 degrees Fahrenheit. For example, if the high was 55 and the low was 45, that day has 10 heating-degree days: 60 - (45 + 55)/2 = 10.
If the average of the high and low temperatures is 60 or above, the value for the day is zero. The value for the month (what's in the dataframe) is the sum of heating-degree days for the month.
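The definition can be sketched in a few lines of R. The helper dailyHdd and the three days of temperatures are made up for illustration; the dataframe already contains the monthly totals, so you don't need to compute this yourself.

```r
# Hypothetical helper: heating-degree days for each day, base 60 degrees F.
dailyHdd <- function(high, low, base = 60) {
  pmax(0, base - (high + low) / 2)
}

dailyHdd(55, 45)   # the example from the text: average is 50
dailyHdd(75, 65)   # average is 70, at or above 60, so zero

# The monthly value is the sum over days; these three days are made up.
high <- c(55, 75, 40)
low  <- c(45, 65, 20)
monthlyHdd <- sum(dailyHdd(high, low))
```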
newFurnace A dummy variable that is 1 starting in August 2016 when our new furnace was installed.
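For reference, a dummy like this could be built from the decimal date roughly as follows. This is only a sketch on made-up dates for 2016; the variable is already in the dataframe, and cutoff is a hypothetical name.

```r
# Jan-Dec 2016, coded the same way as the date variable.
date <- 2016 + (0:11) / 12

# August 2016 as a decimal date.
cutoff <- 2016 + (8 - 1) / 12

# 1 from August 2016 on, 0 before; the small tolerance guards against
# floating-point error in the comparison.
newFurnace <- as.numeric(date >= cutoff - 1e-8)
newFurnace
```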
minTemp The average minimum daily temperature for the month.
maxTemp The average maximum daily temperature for the month.
jan, feb,... Month dummies.
month Month numbers (January = 1, ..., December = 12).
year Duh.