Most of the information that the text contains about multiple regression requires some knowledge of linear algebra. For that reason, you'll probably only want to use the text for reference/clarification. Refer mostly to the transparencies and your notes from lecture.
Reminder: This assignment will not be collected. Its purpose is to help you review the material we have covered in class and in the text; it is not an exhaustive list of the topics and skills you need to know.
Problem 1 (Wonnacott and Wonnacott, problem 13-5)
In midterm U.S. congressional elections (those held between Presidential elections), the party of the President usually loses seats in the House of Representatives. To measure this loss concretely, we take as our base the average congressional vote for the President's party over the previous 8 elections; the amount that the congressional vote drops in a given midterm election, relative to this base, will be our standardized vote loss, Y.
Y depends on several factors, two of which seem important and easily measurable: X1 = Gallup poll rating of the President at the time of the midterm election (percent who approved of the way the President is handling his job) and X2 = change over the previous year in the real disposable annual income per capita.
Year Y X1 X2
1946 7.3% 32% -$40
1950 2.0% 43% $100
1954 2.3% 65% -$10
1958 5.9% 56% -$10
1962 -0.8% 67% $60
1966 1.7% 48% $100
From the above data (Tufte, 1974), the following multiple regression equation was computed: Yhat = 10.9 - 0.13*X1 - 0.034*X2
- If X2 is kept constant, estimate the change in Y for a unit change in X1.
- If X1 is kept constant, estimate the change in Y for a unit change in X2.
- Estimate the vote loss Y for a midterm election when X1=60% approval, and X2=$50 increase in real income.
- Using a computer, calculate the multiple regression coefficients, and verify that they agree with the given equation. Which coefficients are significantly different from 0, using alpha=0.05? Using alpha=0.01?
- Suppose it is known a priori that Y has no relation whatever to X2. When you regress Y on X1 alone, what coefficient do you get for X1?
- If R-squared is 0.8711, what is the adjusted R-squared?
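For the "using a computer" part, the fit can be sketched without any statistics package. The following is a minimal sketch, assuming the six rows of the table above are entered exactly as printed; the helper names (`cross`, `fit_two_regressors`, `adjusted_r_squared`) are illustrative choices and do not come from the text. It solves the normal equations for two regressors and computes R-squared and adjusted R-squared. It does not compute standard errors, so the significance questions still need a regression package or a hand calculation.

```python
# Midterm-election data from the table above (Tufte, 1974).
Y = [7.3, 2.0, 2.3, 5.9, -0.8, 1.7]      # standardized vote loss (%)
X1 = [32, 43, 65, 56, 67, 48]            # Gallup approval rating (%)
X2 = [-40, 100, -10, -10, 60, 100]       # change in real income ($)


def mean(v):
    return sum(v) / len(v)


def cross(u, v):
    """Sum of products of deviations from the means."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def fit_two_regressors(y, x1, x2):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 via the normal equations."""
    s11, s22, s12 = cross(x1, x1), cross(x2, x2), cross(x1, x2)
    s1y, s2y = cross(x1, y), cross(x2, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    b0 = mean(y) - b1 * mean(x1) - b2 * mean(x2)
    return b0, b1, b2


def r_squared(y, x1, x2, coefs):
    """1 - (residual sum of squares) / (total sum of squares)."""
    b0, b1, b2 = coefs
    sse = sum((yi - (b0 + b1 * a + b2 * b)) ** 2
              for yi, a, b in zip(y, x1, x2))
    return 1 - sse / cross(y, y)


def adjusted_r_squared(r2, n, k):
    """Adjust R-squared for n observations and k regressors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


b0, b1, b2 = fit_two_regressors(Y, X1, X2)
r2 = r_squared(Y, X1, X2, (b0, b1, b2))
print(f"Yhat = {b0:.1f} + ({b1:.2f})*X1 + ({b2:.3f})*X2")
print(f"R-squared = {r2:.4f}, adjusted = {adjusted_r_squared(r2, 6, 2):.4f}")
```

The closed-form solution is used here only because there are exactly two regressors; with more regressors a matrix-based routine would be the usual choice.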
Problem 2 (Wonnacott and Wonnacott, problem 13-9)
In the previous problem, the congressional vote loss of the President's party in midterm elections (Y) was related to the President's Gallup rating (X1) and change in real income over the previous year (X2). Specifically, the following regression was computed from n=6 points (standard errors in parentheses below each slope coefficient):
Yhat = 10.9 - 0.13*X1 - 0.034*X2
             (0.046)    (0.010)
- Find a 99% confidence interval for the coefficient of X2.
- Perform a hypothesis test to determine whether the coefficient of X1 is significantly different from zero. (Use alpha=0.05.)
- What assumptions are you making in the above questions?
- Which regressor gives the strongest evidence of being statistically discernible?
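The mechanics of a t ratio and a confidence interval for a single coefficient can be sketched as follows. This is a minimal sketch, assuming that the numbers in parentheses are the standard errors of the X1 and X2 coefficients (in that order) and that the degrees of freedom are n - k - 1 = 6 - 2 - 1 = 3. The critical values are copied from a standard two-sided t table rather than computed, and the function names are illustrative. The demonstration uses the X1 coefficient at the 95% level; the same two functions apply to any coefficient and level.

```python
# Two-sided critical values of Student's t with 3 degrees of freedom,
# copied from a standard t table (keys are alpha levels).
T_CRIT_DF3 = {0.05: 3.182, 0.01: 5.841}


def t_ratio(coef, se):
    """Estimated coefficient divided by its standard error."""
    return coef / se


def confidence_interval(coef, se, t_crit):
    """Interval coef +/- t_crit * se."""
    return coef - t_crit * se, coef + t_crit * se


# X1 coefficient and its assumed standard error from the equation above.
b1, se1 = -0.13, 0.046
t1 = t_ratio(b1, se1)
low, high = confidence_interval(b1, se1, T_CRIT_DF3[0.05])

print(f"t ratio for X1: {t1:.3f} (critical value {T_CRIT_DF3[0.05]})")
print(f"95% CI for X1 coefficient: ({low:.4f}, {high:.4f})")
```

A coefficient is statistically discernible at level alpha when the absolute t ratio exceeds the critical value, which is equivalent to the matching confidence interval excluding zero.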
Problem 3 (Wonnacott and Wonnacott, problem 13-11)
To determine the effect of the various influences on land value in Florida, the sale price of residential lots in the Kissimmee River Basin was regressed on several factors. With a data base of n=316 lots, the following multiple regression was calculated (Conner et al., 1973; via Anderson and Sclove, 1978):
Yhat = 10.3 - 1.5*X1 - 1.1*X2 - 1.34*X3 + ...
where
Y = price per front foot (in dollars)
X1 = year of sale (X1=1,...,5 for 1966,...,1970)
X2 = lot size (acres)
X3 = distance from the nearest paved road (miles)
- Other things being equal, such as year of sale and distance from the nearest paved road, was the price (per front foot) of a 5 acre lot more or less than a 2 acre lot? How much?
- Other things being equal, how much higher was the price (per front foot) if the lot was 1/2 mile closer to the nearest paved road?
- Was the average selling price of a lot (per front foot) higher in 1970 than in 1966? How much?
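Each "other things being equal" question reduces to multiplying one slope by the change in its regressor, because every other term in the equation (including the omitted "..." terms) cancels in the difference. A minimal sketch, using the slopes exactly as printed in the equation above and a hypothetical comparison that is not one of the questions; the dictionary and function names are illustrative:

```python
# Slope coefficients as printed in the fitted equation above.
SLOPES = {"X1": -1.5, "X2": -1.1, "X3": -1.34}


def price_change(regressor, delta):
    """Change in fitted price per front foot ($) when one regressor
    changes by `delta` and all other regressors are held fixed."""
    return SLOPES[regressor] * delta


# Hypothetical comparison: a lot 2 miles farther from a paved road.
change = price_change("X3", 2.0)
print(f"Fitted price changes by {change:+.2f} dollars per front foot")
```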
Problem 4 (Wonnacott and Wonnacott, problem 14-1)
To help firms determine which of their executive salaries might be out of line, a management consultant fitted the following multiple regression equation from a data base of 270 executives under the age of 40:
SAL = 43.4 + 1.24*EXP + 3.60*EDUC + 0.74*MALE
            (0.30)     (1.20)      (1.10)
(standard errors in parentheses below each coefficient)
residual standard deviation s=16.4
where
SAL = the executive's annual salary ($000)
EDUC = number of years of post-secondary education
EXP = number of years of experience
MALE = dummy variable, coded 1 for male, 0 for female
- From this regression a firm can calculate the fitted salary of each of its executives. If the actual salary is much lower or higher, it can be reviewed to see whether it is appropriate.
Fred, who is 32, has been with the firm since he was 25; his annual salary is $126,000. He earned a 2-year MBA following a 4-year undergraduate degree.
- What is Fred's fitted salary?
- How many standard deviations is his actual salary away from his fitted salary? Would you therefore call his salary exceptional?
- Closer inspection of his record showed that he spent 2 years studying at Oxford as a Rhodes Scholar before obtaining his MBA. In light of this information, recalculate your answers to the previous two questions.
- In addition to identifying unusual salaries in specific firms, this regression can be used to answer questions about the economy-wide structure of executive salaries in all firms. For example,
- Is there evidence of sex discrimination?
- Is it fair to say that each year's education (beyond high school) increases the income of the average executive by $3600 a year?
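The fitted-salary review described above can be sketched in a few lines. This is a minimal sketch using the printed coefficients and residual standard deviation; the executive in the example is hypothetical (10 years of experience, 4 years of post-secondary education, female, actual salary $90,000) and is not Fred, and the function names are illustrative.

```python
S_RESID = 16.4  # residual standard deviation, in $000


def fitted_salary(exp_years, educ_years, male):
    """Fitted annual salary in $000 from the regression above."""
    return 43.4 + 1.24 * exp_years + 3.60 * educ_years + 0.74 * male


def standardized_residual(actual, fitted):
    """Number of residual standard deviations between actual and fitted."""
    return (actual - fitted) / S_RESID


# Hypothetical executive, for illustration only.
fit = fitted_salary(exp_years=10, educ_years=4, male=0)
z = standardized_residual(actual=90.0, fitted=fit)
print(f"fitted = {fit:.1f} ($000), standardized residual = {z:.2f}")
```

A common rule of thumb treats a residual beyond about 2 standard deviations as unusual; the appropriate cutoff is a judgment call, not something the regression itself dictates.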
Problem 5 (Wonnacott and Wonnacott, problem 14-18)
Suppose a study of quantity supplied as a function of price gave the following regression:
log(Q) = 5.2 + 1.3*log(P)
For this question, keep in mind that a small change in log(X) is approximately equal to the relative change in X itself.
- If price increased by 3%, by about how much would the quantity supplied increase?
- What price increase would be required to increase the quantity supplied by 10%?
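The approximation in the hint can be checked numerically. In a log-log equation the slope acts as an elasticity: the relative change in Q is roughly the slope times the relative change in P. A minimal sketch, using the slope 1.3 from the equation above and a hypothetical 5% price increase (not one of the values asked about); the function names are illustrative:

```python
ELASTICITY = 1.3  # slope on log(P) in the fitted equation above


def approx_quantity_change(price_change):
    """Approximate relative change in Q: slope times relative change in P."""
    return ELASTICITY * price_change


def exact_quantity_change(price_change):
    """Exact relative change in Q, since the equation implies that
    Q is proportional to P ** ELASTICITY."""
    return (1 + price_change) ** ELASTICITY - 1


def approx_price_change_needed(quantity_change):
    """Invert the approximation: relative change in P for a target change in Q."""
    return quantity_change / ELASTICITY


dp = 0.05  # hypothetical 5% price increase
print(f"approximate: {approx_quantity_change(dp):.4f}")
print(f"exact:       {exact_quantity_change(dp):.4f}")
```

The two figures agree closely for small changes and drift apart as the change grows, which is why the hint is stated as an approximation.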