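Course name: Labor Economics I
Course type: Economics department elective
Instructor: Prof. 樊家忠
College: College of Social Sciences
Department: Department of Economics
Exam date (Y/M/D): 2017/11/08 (Wed)
Time limit (minutes): 14:20~16:20, 2 hours in total
Exam questions:
Full score: 180 points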
Question 1 (50 points)
Elliott is contemplating an estimation of the causal effect of the 90-day
operation of sobriety checkpoints, starting on June 1, 2012, on traffic
incidents caused by driving under the influence (DUI). Elliott's data report the
daily number of DUI accidents in each country in 2011 and 2012. The sample
spans 150 days prior to June 1 and 180 days after June 1, for both years. No
other anti-DUI policy was carried out during the two years. The regression is
specified as:
DUI(c,d,t) = α + βT + γI + ρ(I * T) + X(c,d,t)π + ε(c,d,t) (1)
In equation (1), DUI(c,d,t) is the number of DUI accidents in country c on day d
of year t. I is a dummy variable indicating the post-intervention time period.
T is the treatment indicator, set to 1 for the treatment year. X(c,d,t)
refers to other control variables.
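As a rough illustration of how the interaction coefficient ρ in equation (1) works, the sketch below drops the controls X(c,d,t) and uses made-up daily counts (the `rows` list is hypothetical, not Elliott's data). In that stripped-down case, the OLS estimate of ρ equals the difference-in-differences of the four cell means; with controls included, the two would generally differ.

```python
from statistics import mean

# Hypothetical toy data: (T, I, DUI count). Not taken from the exam.
rows = [
    (0, 0, 10), (0, 0, 12),  # comparison year, before June 1
    (0, 1, 11), (0, 1, 13),  # comparison year, after June 1
    (1, 0, 9), (1, 0, 11),   # treatment year, before June 1
    (1, 1, 6), (1, 1, 8),    # treatment year, after June 1
]


def cell_mean(rows, t, i):
    """Mean outcome for observations with T == t and I == i."""
    return mean(y for (tt, ii, y) in rows if tt == t and ii == i)


def did_estimate(rows):
    """Double difference of cell means (rho when there are no controls)."""
    treated_change = cell_mean(rows, 1, 1) - cell_mean(rows, 1, 0)
    comparison_change = cell_mean(rows, 0, 1) - cell_mean(rows, 0, 0)
    return treated_change - comparison_change


rho_hat = did_estimate(rows)
print(rho_hat)
```

The function only compares averages across the four (T, I) cells, so it says nothing about standard errors or about trends within each period.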
(a) What is the economic meaning of the coefficient α?
(b) What is the control group in this setting? Why is it a proper control
group? [Explain in 30 words]
(c) https://imgur.com/a/jNRNh
Figure 1 presents DUI accidents for 2012 (black curve) and 2011 (gray
curve). The 30-day period that begins with June 1 is called period 1; the
next 30 days is period 2, whereas the 30 days before June 1 is period 0,
and so on. Given the information provided by Figure 1, what do you expect
the signs of β, γ, and ρ to be? Your answers can be positive, negative,
or zero. Briefly explain each of your answers.
(d) Given the information provided by Figure 1, which of the following do you
think X(c,d,t) should include? Why? [Explain in 30 words]
(1) A trend variable
(2) A squared term of the trend variable
(3) Both a trend variable and its squared term
(4) None of the above
(e) Design a regression to test the parallel trends assumption. As in
equation (1), specify the regression equation and explain the dependent
variable, the independent variables, and the sample. In your regression, which
coefficients should be used to test the parallel trends assumption?
DUI(c,d,t) = α + βT + γTrend + δ(Trend * T) + X(c,d,t)π + ε(c,d,t)
Question 2: (40 points)
Lee (2008) estimated the causal effect of party incumbency on re-election
probabilities, using U.S. data. His interest is whether the Democratic candidate
for the seat in the U.S. House of Representatives has an advantage if his
party won the seat last time. The widely-noted success of House incumbents
raises the question of whether representatives use the privileges and
resources of their office to gain advantage for themselves or their parties.
Lee applied a regression discontinuity design (RDD).
Figure 2 plots the probability a Democrat wins against the difference between
Democratic and Republican vote shares in the previous election. The dots in
the figure are local averages (the average win rate in non-overlapping
windows of share margins that are .005 wide). The probability of a Democratic
win at election t + 1 is an increasing function of the vote share won by the
Democratic candidate minus the vote share won by the Republican candidate at
election t. The most important feature of the plot is the dramatic jump in win
rates at the 0 mark, the point where the two candidates get the same votes.
Based on the size of jump, incumbency appears to raise party re-election
probabilities by about 40 percentage points.
https://imgur.com/a/t5l0V
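The local averages behind the dots in Figure 2 can be sketched as follows. This is only an illustration under stated assumptions: the function name `local_averages` and the toy observations are hypothetical, and the windows are assumed to be aligned so that 0 is a window edge (no window straddles the cutoff).

```python
import math
from collections import defaultdict


def local_averages(margins, wins, width=0.005):
    """Average win indicator within non-overlapping margin windows.

    Window b covers margins in [b * width, (b + 1) * width).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for margin, win in zip(margins, wins):
        b = math.floor(margin / width)  # index of the window holding margin
        sums[b] += win
        counts[b] += 1
    return {b: sums[b] / counts[b] for b in sorted(counts)}


# Hypothetical toy observations: Democratic vote-share margin at election t
# and an indicator for a Democratic win at election t + 1.
margins = [-0.012, -0.011, -0.003, -0.001, 0.001, 0.004, 0.006, 0.008]
wins = [0, 1, 0, 0, 1, 1, 1, 0]

averages = local_averages(margins, wins)
for b, rate in averages.items():
    print(f"window starting at {b * 0.005:+.3f}: win rate {rate:.2f}")
```

Plotting each window's win rate against its margin would give a figure of the same type as Figure 2, with the jump read off by comparing the windows just left and just right of 0.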
(a) Define the treatment variable and outcome variable in this study.
(b) Is this a sharp or fuzzy RDD? Why?
(c) The two validity requirements for an RDD are (1) smoothness of the density
distribution of observations across the cutoff point; and (2) smoothness
of the means of all observables at the cutoff point. To examine the
validity of his RDD, Lee presented Figure 3, shown below. The variable
on the vertical axis refers to the number of Democratic victories in the
elections before election t. Is Figure 3 useful to support requirement (1),
(2), neither, or both? Why?
https://imgur.com/a/VBD8z
(d) Design and lay out the regression equation for the estimation of Lee's RDD.
Carefully define the dependent variable and independent variables used in
the equation. Specify the coefficient that indicates the jump in win rates
at the cutoff point.
Question 3: (50 points)
Dr. John Snow is regarded as one of the founding fathers of modern epidemiology.
As London suffered a series of cholera outbreaks during the mid-19th
century, Snow theorized that cholera reproduced in the human body and was
spread through contaminated water. This contradicted the prevailing theory that
diseases were spread by "miasma" in the air. This question concerns his
research design.
During the 1854 outbreak, there were only two water suppliers in London: the
Southwark & Vauxhall Water (SVW) Company and the Lambeth Water (LW) Company.
Note that SVW pumped water from a part of the River Thames that was contaminated
with sewage, while LW pumped its water from further upstream, where the River
Thames was clean. All of London can be categorized into three areas:
● Area A was supplied only by SVW
● Area B was supplied only by LW, and
● Area C by both. In area C, each household was supplied by either SVW or LW;
adjacent houses often had different water suppliers, and households often did
not know who their supplier was.
(a) With death data for all three areas in 1854, Snow intends to
estimate the causal effect of contaminated water on the hazard of cholera,
defined as deaths caused by cholera per 10,000 houses. There are two
simple-difference approaches to estimate the effect: (1) comparing area A
as the treatment group and area B as the control group; and (2) comparing
households supplied by SVW in area C as the treatment group, and those
supplied by LW in area C as the control group. Which approach is more
desirable? Why or why not?
Snow obtained a list of all cholera deaths in areas A and B during the first
seven weeks of the 1854 outbreak. For each death, he determined the water
supplier to the house of the deceased. The following table presents his data.
______________________________________________________________________________
              │ Number of houses   Deaths from cholera   Deaths per 10,000 houses
______________________________________________________________________________
SVW (area A)  │      40,048              1,263                    315
LW (area B)   │      26,107                 98                     38
Other Cities  │     256,483              1,488                     58
______________________________________________________________________________
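The last column of the table above can be reproduced from the first two. The short sketch below (the names `table` and `deaths_per_10000` are illustrative only) divides deaths by houses, scales to 10,000 houses, and rounds to the nearest whole number.

```python
# Figures copied from the table: (number of houses, deaths from cholera).
table = {
    "SVW (area A)": (40048, 1263),
    "LW (area B)": (26107, 98),
    "Other Cities": (256483, 1488),
}


def deaths_per_10000(houses, deaths):
    """Cholera deaths per 10,000 houses, rounded to the nearest integer."""
    return round(deaths / houses * 10000)


rates = {name: deaths_per_10000(h, d) for name, (h, d) in table.items()}
for name, rate in rates.items():
    print(f"{name}: {rate}")
```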
(b) The treatment effect can be expressed using conditional expectation functions:
E(Y,dirty | SVW) - E(Y,clean | SVW), where SVW indicates the treatment
(water supplied by SVW) and Y is the potential hazard if treated (Y,dirty)
or not (Y,clean). Since E(Y,clean | SVW) cannot be observed, we can only
measure E(Y,dirty | SVW) - E(Y,clean | LW) as a proxy. Please use the figures
in the above table to calculate the treatment effect.
(c) In 30 words, explain why your answer in part (a) may suffer from selection
bias.
(d) Present the selection bias using conditional expectation functions.
(e) Suppose that back in 1849, LW still sourced dirty Thames water, before the
company moved its source upstream in 1853. Now Snow also collects data
from the 1849 outbreak in London. Please construct a
difference-in-differences regression to estimate the treatment effect using
the following two dummy variables:
1. SVW: a dummy variable indicating water being supplied by Southwark &
Vauxhall Water Company (as opposed to LW).
2. Year1854: a dummy variable indicating year 1854 (as opposed to year 1849).
Question 4: (40 points)
Elliott designs the following experiment. He picks a large primary school in
Taipei that has 24 year-one classes every year. At the beginning of the school
year, he randomly selects 50% of year-one students for the
treatment group and places the remaining 50% in the control group. Students in the
treatment group are assigned to a class according to their months of birth.
That is, January-born students are assigned to one class, February-born
students to another class, etc. In total, there will be 12 classes in the
treatment group. For the control group, all students are randomly assigned to
another 12 classes without considering their months of birth. Teachers are
randomly assigned to the 24 classes. One year later, Elliott administers a
mathematics test to all the students in the experiment. Assume that no student
or teacher quits the experiment or switches to another group.
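The student assignment procedure described above can be sketched as follows. This is a minimal illustration under assumptions not stated in the exam: the cohort of 480 students with 40 per birth month is hypothetical, classes 1-12 are taken to be the treatment classes (numbered by birth month), classes 13-24 the control classes, and the random assignment of teachers is left out.

```python
import random


def assign_classes(students, seed=0):
    """students: list of (student_id, birth_month) pairs.

    Returns {student_id: (group, class_number)}.
    """
    rng = random.Random(seed)
    shuffled = list(students)
    rng.shuffle(shuffled)  # random order decides who is treated
    half = len(shuffled) // 2
    treatment, control = shuffled[:half], shuffled[half:]

    assignment = {}
    # Treatment group: one class per birth month (classes 1-12).
    for student_id, month in treatment:
        assignment[student_id] = ("treatment", month)
    # Control group: dealt into classes 13-24 from the shuffled order,
    # without looking at birth month.
    for k, (student_id, _month) in enumerate(control):
        assignment[student_id] = ("control", 13 + k % 12)
    return assignment


# Hypothetical cohort: 480 students, 40 born in each month.
students = [(i, i % 12 + 1) for i in range(480)]
assignment = assign_classes(students)
```

Because treatment classes are formed by birth month, their sizes depend on how many treated students were born in each month, whereas the control classes are equal-sized by construction in this sketch.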
(a) Students born in which month are the oldest at school entry in Taiwan? Why?
(b) If Elliott finds that August-born students in the treatment group perform
better than August-born students in the control group, provide a reason for
such a difference. [In 30 words]
(c) If Elliott finds that September-born students in the control group perform
better on the math test than September-born students in the treatment
group, provide a reason for such a difference. [In 30 words]
(d) Provide a reason that the comparison made in part (b) is biased because of
the Hawthorne Effect. [In 30 words]