Problem Set 5

AEM 4110/5111 – Introduction to Econometrics

Instructions

This problem set is due by 11/20 at 11:59pm.

•  Submit your answers via Canvas in the assignments section of the course.

•  Submit a zipped folder with the following documents.  The zipped folder should be named “PS5” followed by your last name.

1. A write-up in PDF format with your answers to the questions below and the full names of all your group members.

2. A do-file with the Stata code you use for your answers.  In the do-file, comment your script to indicate which sections correspond to each answer in your write-up.

3. For the questions that require filling in a table, you can create one in Excel or using LaTeX.

4. Important! Please write each answer on a separate page and clearly label it with the corresponding question number (for example, Question I.1.a, Question I.1.b, etc.).

Question I: RDD

Goal: In this problem, you will analyze the causal effect of harsher DUI (driving under the influence) punishments on recidivism (repeat offenses) using a Regression Discontinuity Design.

Set up

In many U.S. states, drivers arrested for driving under the influence (DUI) face different penalties depending on their blood alcohol content (BAC) at the time of arrest.  Specifically:

•  Drivers with BAC ≥ 0.08 face stricter punishments:  higher fines, longer license suspensions, mandatory jail time, and a permanent criminal record.

•  Drivers with BAC < 0.08 face lighter punishments:  smaller fines, shorter suspensions, and often no jail time.

This sharp cutoff at BAC = 0.08 creates a natural experiment.  Drivers just above and just below the threshold are likely very similar in terms of their drinking behavior, demographic characteristics, and driving patterns.  The only difference is that those just above 0.08 receive much harsher punishment.

You will use this discontinuity to estimate whether harsher DUI penalties reduce the likelihood that offenders commit another DUI offense within the next 4 years (recidivism).

The Data

The dataset hansen_dwi.dta contains information on DUI offenders in Washington State from 1999-2007. The key variables are:

•  bac1: Blood alcohol content (BAC) at the time of arrest (the running variable)

•  recidivism: Indicator for whether the individual was arrested for DUI again within 4 years (the outcome variable)

•  male: Indicator for male

•  white: Indicator for white

•  aged: Age at the time of arrest

•  acc: Indicator for whether the arrest involved an accident

1 Let’s start by thinking about why it’s hard to estimate the causal effect of the DUI penalty.

(a) Why would it be problematic to simply compare recidivism rates between all offenders with BAC ≥ 0.08 and all offenders with BAC < 0.08 (so, even individuals far away from the cutoff)? Explain in 2-3 sentences, being specific about the source(s) of bias and using the data to support your statement.

(b) Explain the key assumption that allows RDD to identify the causal effect of harsher punishment in this context. What must be true about offenders just above vs. just below the BAC cutoff of 0.08? Answer in 2-3 sentences.

(c) Is this a Sharp RDD or Fuzzy RDD design? Explain your reasoning based on the treatment assignment rule. Answer in 2-3 sentences.

2 Let’s now start with our regression analysis.   The simplest RDD specification estimates the following regression:

$$\text{recidivism}_i = \beta_0 + \beta_1\,\text{dui}_i + u_i \tag{1}$$

where $\text{dui}_i = 1$ if BAC ≥ 0.08, and 0 otherwise.

(a) Load the dataset in Stata.  Keep only observations where BAC is within 0.05 of the cutoff (i.e., BAC between 0.03 and 0.13). This is your analysis sample. How many observations remain in your sample?

Hint: Use keep if bac1 >= 0.03 & bac1 <= 0.13

(b) Generate the dui dummy variable.

(c) Run regression 1 in Stata.

Report the coefficient on dui and its standard error.

Coefficient: Standard Error:

(d) Is this coefficient statistically significant at the 5% level?  State your null hypothesis, alternative hypothesis, and decision rule clearly.

H0:

H1:
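
As a rough sketch of the mechanics in parts (a) through (c), the corresponding do-file section might look like the following. It assumes the dataset file is named hansen_dwi.dta and sits in the current working directory; adjust the path to match your setup.

```stata
* Question I.2: simple comparison around the cutoff (sketch)
* Assumption: hansen_dwi.dta is in the current working directory
use "hansen_dwi.dta", clear

* (a) Keep observations with BAC within 0.05 of the 0.08 cutoff
keep if bac1 >= 0.03 & bac1 <= 0.13
count

* (b) Treatment dummy: 1 if BAC is at or above 0.08, 0 otherwise
generate dui = (bac1 >= 0.08) if !missing(bac1)

* (c) Regression (1): outcome on the treatment dummy only
regress recidivism dui
```

One caveat: if describe bac1 shows that the variable is stored as float, comparisons against literals such as 0.08 are evaluated in double precision and can misclassify values lying exactly on a boundary. Writing float(0.08) is one common guard; whether it matters here depends on how bac1 is stored.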

3 Now estimate a more flexible RDD specification that controls for the running variable (BAC).

$$\text{recidivism}_i = \beta_0 + \beta_1\,\text{dui}_i + \beta_2\,\text{bac\_centered}_i + u_i \tag{2}$$

where $\text{bac\_centered}_i = \text{bac1}_i - 0.08$ (BAC centered at the threshold).

(a) First, create the centered BAC variable in Stata. What is the mean of bac_centered?

(b) Run the regression. Report the coefficient on dui and its standard error:

Coefficient: Standard Error:

(c) Has your estimate changed compared to Question 2?  If so, why? Explain in 2-3 sentences what role the bac_centered variable plays.
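
The mechanics of parts (a) and (b) could be sketched as below. The sketch assumes the analysis sample and the dui dummy from Question 2 are already in memory, and that the centered variable is named bac_centered (with an underscore).

```stata
* Question I.3: RDD controlling linearly for the running variable (sketch)
* Assumption: the analysis sample and dui from Question 2 are in memory

* (a) Center BAC at the threshold so that 0 marks the cutoff
generate bac_centered = bac1 - 0.08
summarize bac_centered

* (b) Regression (2): the coefficient on dui is the estimated jump at the cutoff
regress recidivism dui bac_centered
```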

4    [Optional question, not graded] Now estimate an RDD specification that allows for different slopes on either side of the cutoff:

$$\text{recidivism}_i = \beta_0 + \beta_1\,\text{dui}_i + \beta_2\,\text{bac\_centered}_i + \beta_3\,(\text{duiXbac\_centered}_i) + u_i \tag{3}$$

where duiXbac_centered is the interaction between dui and the centered BAC variable.

(a) Generate the interaction term and run this regression. Report the coefficient on dui and its standard error:

Coefficient: Standard Error:

(b) Provide an interpretation for the coefficient β3.  Can you reject the null hypothesis that the coefficient is 0 at the 5% level? What can you conclude? Use 2-3 sentences.

Hint:  Review Lecture 12 (dummy X continuous variable interaction).
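
A minimal sketch of part (a), assuming dui and the centered BAC variable (named bac_centered here) already exist from the earlier questions:

```stata
* Question I.4: allow different slopes on each side of the cutoff (sketch)
* Assumption: dui and bac_centered already exist in memory

* Interaction between the treatment dummy and the centered running variable
generate duiXbac_centered = dui * bac_centered

* Regression (3)
regress recidivism dui bac_centered duiXbac_centered

* Test of H0: the coefficient on the interaction equals 0
test duiXbac_centered
```

With a single restriction, the p-value from test matches the p-value of the t-statistic reported for the interaction in the regression table, so either can be used for part (b).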

5 Now we are going to test whether the RDD assumptions are satisfied.  A key assumption in RDD is that individuals just above and below the cutoff should be similar in terms of observable characteristics (other than the treatment). In other words, we want balance on covariates.

(a) For this exercise, we want to test whether there is a discontinuity in male (gender) at the threshold by running:

$$\text{male}_i = \beta_0 + \beta_1\,\text{dui}_i + \beta_2\,\text{bac\_centered}_i + u_i \tag{4}$$

Report the coefficient on dui and its p-value:

Coefficient: p-value:

(b) Is there evidence of a discontinuity in gender at the threshold? If this result holds for other covariates, what does it suggest about the validity of the RDD design? Explain in 2-3 sentences.
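
The balance check in part (a), and its extension to the other covariates mentioned in part (b), might be sketched as follows. The loop over white, aged, and acc is an illustration that goes beyond what the question requires; dui and the centered BAC variable (named bac_centered here) are assumed to exist already.

```stata
* Question I.5: covariate balance at the cutoff (sketch)
* Assumption: dui and bac_centered already exist in memory

* (a) Regression (4): is there a jump in the share of males at the cutoff?
regress male dui bac_centered

* Optional illustration: repeat the same check for the other covariates
foreach v of varlist white aged acc {
    regress `v' dui bac_centered
}
```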

6 Another key concern in RDD is whether individuals can manipulate the running variable to get on one side of the threshold or the other.

(a) In this context, why might we be worried about manipulation of BAC levels? Give one specific example of how manipulation could occur.

(b) Create a histogram of the BAC variable (bac1) using narrow bins (e.g., 0.002 width).  You can use the following Stata command:

histogram bac1, width(0.002) xline(0.08)

Based on the histogram, do you see any evidence of unusual bunching or gaps around the 0.08 threshold that would suggest manipulation? Explain what you observe in 2-3 sentences.

Optional Question (Not Graded) In a more advanced RDD analysis, researchers often test the validity of their design by examining whether there are discontinuities at placebo cutoffs (fake thresholds where there should be no effect).

Choose a placebo cutoff at BAC = 0.10 (above the real threshold) and re-run your RDD regression from question 3, but this time:

•  Define a new treatment indicator: placebo_dui = 1 if BAC ≥ 0.10, 0 otherwise

•  Use a new centered variable: bac_placebo_centered = bac1 - 0.10

•  Restrict your sample to BAC between 0.07 and 0.13

Report the coefficient on placebo_dui. Is it statistically significant? What does this placebo test tell you about the credibility of your main RDD results?
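
One way to run the placebo test without losing the main analysis sample is to wrap it in preserve and restore, as in the sketch below. The variable names with underscores (placebo_dui, bac_placebo_centered) are assumptions about the intended naming.

```stata
* Optional placebo test at BAC = 0.10 (sketch)
* preserve/restore keeps the main analysis sample intact afterwards
preserve

* Restrict to the placebo window
keep if bac1 >= 0.07 & bac1 <= 0.13

* Placebo treatment indicator and running variable centered at 0.10
generate placebo_dui = (bac1 >= 0.10) if !missing(bac1)
generate bac_placebo_centered = bac1 - 0.10

* Same specification as regression (2), with the placebo variables
regress recidivism placebo_dui bac_placebo_centered

restore
```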

Question II: Diff-in-Diffs

The Set-up For this problem, we will work on the same problem as Problem Set 4: estimating the treatment effect of a mentoring program in high school.  Now we continue to work under the assumption that there is self-selection into the mentoring program, that is, students voluntarily choose whether to enroll or not.

In Problem Set 4 you already verified that students who selected into the program have, on average, a lower GPA than those who did not.  You also saw that this generates a bias when we try to estimate the treatment effect of the program by comparing students who enrolled with those who did not.

Now you are going to use a Diff-in-Diffs estimator to estimate the treatment effect of the program, where you exploit the panel structure of your data.

The Data The dataset mentoring_data_panel.dta contains the following variables:

•  student_id: Student identifier

•  time: Time variable, = 0 if before treatment, = 1 if after treatment

•  income: Parental income

•  parent_educ: Parental education

•  treat: Dummy equal to 1 if the student self-selected into the mentoring program

•  gpa: Student's GPA

1 We saw that we can compute the Diff-in-Diff estimator by running the following regression:

$$\text{gpa}_{it} = \beta_0 + \beta_1\,\text{post}_t + \beta_2\,\text{treat}_i + \beta_3\,\text{treatXpost}_{it} + u_{it} \tag{5}$$

where post = 1 if time = 1, and treatXpost = 1 if post = 1 AND treat = 1, and 0 otherwise.

(a) Using 1-2 sentences each, please provide an interpretation for each coefficient.

(b) Run regression 5 and report the coefficients and p-values.

Note:  You need to generate the variables post and treatXpost.

β1: p-value:

β2: p-value:

β3: p-value:

(c)  [For AEM 5111 only] Based on the estimates you got in point (b), do you think that there’s a time trend in GPA? Explain in 2-3 sentences.
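
The variable construction and regression in part (b) could be sketched as below. The sketch assumes the file is named mentoring_data_panel.dta and sits in the current working directory, and it takes treatXpost to be the product of treat and post (the standard interaction term).

```stata
* Question II.1: Diff-in-Diffs regression (sketch)
* Assumption: mentoring_data_panel.dta is in the current working directory
use "mentoring_data_panel.dta", clear

* Post-period dummy and its interaction with treatment status
generate post = (time == 1) if !missing(time)
generate treatXpost = treat * post

* Regression (5): the coefficient on treatXpost is the Diff-in-Diffs estimate
regress gpa post treat treatXpost
```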

2 Let’s now think about the assumption we need to make to claim causality.

(a) Which assumption do you need to make in order to interpret the Diff-in-diff estimator as the causal effect of the program on GPA? Using 2-3 sentences, discuss the assumption and describe which data you would need to test it.

(b) Can you think of a case when the assumption in (a) may be violated? Discuss using 2-3 sentences.

(c)  Optional (Not Graded) You find that the grades of the students in the treated group are trending downward, whereas the GPA of the other students is approximately stable.

•  Why would this be a problem for your estimation of β3?

•  Under this scenario, how would your estimate of β3 compare to the one you estimated in the previous part?

Hint:  You can use a figure to help your reasoning.



