STAT0021: Assessment 4 Instructions

Term 1, 2023-24

1    Introduction

Please read and understand these instructions before you begin the assessment.

Assessment 4 will begin with the release of these instructions on the STAT0021 course Moodle page within the “Assessment 4 – Individual Coursework – Term 1” section at 1pm on Wednesday 13th December 2023.

The intention of the assessment is for you to apply the techniques you have learned during the course to a real-world dataset made up of a number of variables (percentage of the population double vaccinated for COVID-19, median household income, median house price, etc.) measured for subregions of London.

A copy of the data to be analysed is available as an Excel spreadsheet on the course Moodle page within the “Assessment 4 – Individual Coursework – Term 1” section.

Assessment 4 makes up 50% of your module mark for STAT0021.

2    Data

The data are real measurements for subregions (Middle Layer Super Output Areas, or MSOAs) of London. In total, there are 11 variables recorded for 982 observations. Vaccination data were recorded in December 2021. Demographic data are accurate as of the 2011 census, but can be treated as contemporaneous with the vaccination data.

Variable name    Description

ID               A unique identifying number assigned to each observation.
VaxPercent       The percentage of the population who have received at least two COVID-19 vaccination doses.
Political        An indicator of the political group which controls the borough in which the subregion is located. 0: Conservative; 1: Labour; 2: Other (Liberal Democrat or no majority party).
PopDensity       The population density (people/km²).
Over65           The percentage of the population who are aged 65 or over.
Obesity          The percentage of the population who are classified as obese (BMI ≥ 30).
PostALevel       The percentage of the population who have a qualification above A-Level (e.g. a university degree or similar vocational qualification).
Unemployment     The percentage of the population who are unemployed.
HHBenefit        The percentage of the population living in households reliant upon means-tested benefits.
MedHHInc         The median household income.
MedianHP         The median house price.

3    Submission structure

You should structure your analysis and subsequent write-up according to the headings below.

3.1    Exploratory data analysis

The first step in any data analysis is to explore the data to get a sense of what the variables represent and the potential for relationships between them.

Your submission should include three separate, distinct exploratory analyses, each of which contains all of:

A.   The results of a single numerical calculation (e.g. a summary statistic or the results of a hypothesis test).

B.   A single figure (generally containing a single plot, but potentially containing up to three related plots).

C.   A discussion of what your numerical result and figure tell us about London, and a justification of why this information is interesting.

Note that:

1.    Each of your exploratory analyses will be marked out of 6 marks (for a total of 3 × 6 = 18 marks overall for this section of the assessment).

2.    Marks will be awarded for the degree of insight shown in each part of the analysis. A numerical result and/or plot which is not discussed will receive a poor mark – a large proportion of the marks will be awarded based upon the degree to which your discussion correctly interprets your results and justifies why they are insightful.

3.   Variety across the three analyses will be rewarded. For example, submissions which repeat the same analysis and discussion for three sets of variables fail to show a breadth of understanding and will receive a poor mark.

4.    None of your discussions should include VaxPercent, as this is the focus of the later parts of the assessment.

5.   You are free to transform and/or combine variables, and to identify and potentially remove any outliers from the data. Any such decisions should be justified in your discussion.

3.2    Simple linear regression

VaxPercent is the focus of this part of the task. How can the other variables be used to explain the variability in VaxPercent via simple linear regression?

Your submission should include:

A.   Justification of which variable can be used as a covariate to produce the best simple linear regression model for the outcome VaxPercent.

B.   An interpretation of the estimated model coefficients for your best simple linear regression model.

C.   Comments on the fit of your best simple linear regression model.

D.   A plot of VaxPercent against the covariate in your best simple linear regression model with the accompanying regression line.

Note that:

1.   This component of your submission will be marked out of 9 marks.

2.    There is not a specific definition of the “best” model. It is likely to be based both upon how well the model fits the data and how well the assumptions underlying simple linear regression are satisfied (quantitative and qualitative evidence). Include in your justification why you would categorise your model as being the best and the steps you took to arrive at this best model.

3.    Your model can include a variable which is not present in the original dataset, but which has been obtained via a transformation or combination of variables in the original dataset. You should not bring in external data. You should provide a justification of why any new variable is useful/interesting if you haven’t already given an explanation earlier in your submission.

4.   You should support your justification, interpretation and comments with suitable Stata output.

5.    If there are any particularly unusual observations identifiable as a result of your analysis, you should mention them using their ID and justify why you do or do not believe them to be outliers. If you believe them to be outliers, then you can exclude them when fitting your model.
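One way to put numbers on the fit and on unusual observations is the coefficient of determination together with the residuals. The sketch below is illustrative only: it uses Python rather than Stata, the data are made up and are not taken from the coursework spreadsheet, and the helper names `fit_line` and `r_squared_and_residuals` are hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def r_squared_and_residuals(x, y):
    """Return (R^2, residuals) for the fitted least-squares line."""
    b0, b1 = fit_line(x, y)
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    y_bar = mean(y)
    ss_res = sum(r ** 2 for r in residuals)
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot, residuals


# Made-up illustrative data (NOT the coursework data); ids mimic the ID column.
ids = [1, 2, 3, 4, 5, 6, 7, 8]
x = [10.0, 12.0, 15.0, 18.0, 20.0, 23.0, 25.0, 28.0]
y = [62.0, 64.5, 66.0, 69.5, 71.0, 55.0, 76.5, 79.0]

r2, residuals = r_squared_and_residuals(x, y)
# The observation with the largest absolute residual is a candidate for a closer look.
largest = max(range(len(ids)), key=lambda i: abs(residuals[i]))
print(f"R^2 = {r2:.3f}")
print(f"Largest |residual|: ID {ids[largest]} ({residuals[largest]:.2f})")
```

A large residual only flags an observation for inspection; whether it is an outlier still needs the justification described above.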

Your submission should also include:

E.    The lower quartile, median, and upper quartile values for the covariate in your best simple linear regression model.

F.    A mathematical equation to indicate how your best simple linear regression model can be used to make predictions of VaxPercent.

G.   Predictions of the value of VaxPercent when the covariate in your best simple linear regression model takes its lower quartile, median, and upper quartile values.

Note that:

6.   This component of your submission will be marked out of 3 marks.

7.    If your best model includes variable x as the covariate, you should use Stata to calculate the lower quartile, median, and upper quartile values of x. Then, calculate the corresponding predicted values of VaxPercent according to your best model.
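The calculation in note 7 can be sketched as follows. This is a minimal illustration in Python rather than Stata, using made-up data that are not the coursework data; `fit_line` and `predict_at_quartiles` are hypothetical helper names. Quantile conventions differ between packages, so the quartiles from `statistics.quantiles` may not match Stata's output exactly.

```python
from statistics import mean, quantiles


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def predict_at_quartiles(x, y):
    """Map each quartile label to (covariate value, predicted outcome)."""
    b0, b1 = fit_line(x, y)
    q1, q2, q3 = quantiles(x, n=4)  # lower quartile, median, upper quartile
    labelled = (("lower quartile", q1), ("median", q2), ("upper quartile", q3))
    return {label: (q, b0 + b1 * q) for label, q in labelled}


# Made-up illustrative data (NOT the coursework data).
x = [10.0, 12.0, 15.0, 18.0, 20.0, 23.0, 25.0, 28.0]
y = [62.0, 64.5, 66.0, 69.5, 71.0, 73.5, 76.5, 79.0]

for label, (value, prediction) in predict_at_quartiles(x, y).items():
    print(f"{label}: x = {value:.2f}, predicted y = {prediction:.2f}")
```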

3.3    Multiple linear regression

VaxPercent is again the focus of this part of the task. How can the other variables be used to explain the variability in VaxPercent via multiple linear regression?

Your submission should include:

A.   Justification of which variables can be used as covariates to produce the best multiple linear regression model for the outcome VaxPercent.

B.   An interpretation of the estimated model coefficients for your best multiple linear regression model.

C.    Comments on the fit of your best multiple linear regression model.

Note that:

1.   This component of the assessment will be marked out of 9 marks.

2.    There is not a specific definition of the “best” model. It is likely to be based both upon how well the model fits the data and how well the assumptions underlying multiple linear regression are satisfied (quantitative and qualitative evidence). Include in your justification why you would categorise your model as being the best and the steps you took to arrive at this best model.

3.    Your model can include variables which are not present in the original dataset, but which are obtained via a transformation or combination of variables in the original dataset. You should not bring in external data. You should provide a justification of why your new variables are useful/interesting if you haven’t already given an explanation earlier in your submission.

4.   You should support your justification, interpretation and comments with suitable Stata output.

5.    If there are any particularly unusual observations identifiable as a result of your analysis, you should mention them using their ID and justify why you do or do not believe them to be outliers. If you believe them to be outliers, then you can exclude them when fitting your model.

Your submission should also include:

D.   The lower quartile, median, and upper quartile values for each covariate in your best multiple linear regression model.

E.    A mathematical equation to indicate how your best multiple linear regression model can be used to make predictions of VaxPercent.

F.    Predictions of the value of VaxPercent when the covariates in your best multiple linear regression model jointly take their lower quartile, median, and upper quartile values.

Note that:

6.   This component of the assessment will be marked out of 3 marks.

7.    If your best model includes variables x1, x2, … as the covariates, you should use Stata to calculate the lower quartile, median, and upper quartile values of x1, x2, … . Then, calculate the corresponding predicted values of VaxPercent according to your best model. That is, you should submit three predicted values: one for when all of your covariates take their lower quartile values, one for when they all take their median values, and one for when they all take their upper quartile values.
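The idea of covariates "jointly" taking their quartile values can be sketched as below. This is an illustration under stated assumptions, not the expected workflow: it is written in Python rather than Stata, fits the model through the normal equations with a small hand-written solver, and uses made-up data. The names `solve_linear_system`, `fit_ols` and `predict_at_joint_quartiles` are hypothetical, and the quartiles may differ slightly from Stata's because packages use different quantile conventions.

```python
from statistics import quantiles


def solve_linear_system(a, b):
    """Solve a * coef = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    coef = [0.0] * n
    for r in reversed(range(n)):
        known = sum(aug[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (aug[r][n] - known) / aug[r][r]
    return coef


def fit_ols(columns, y):
    """Least-squares coefficients [b0, b1, ..., bp] via the normal equations."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(design[0])
    xtx = [[sum(row[j] * row[k] for row in design) for k in range(p)] for j in range(p)]
    xty = [sum(row[j] * yi for row, yi in zip(design, y)) for j in range(p)]
    return solve_linear_system(xtx, xty)


def predict_at_joint_quartiles(columns, y):
    """Return three predictions: all covariates at Q1, at the median, at Q3."""
    coef = fit_ols(columns, y)
    per_covariate = [quantiles(col, n=4) for col in columns]  # [Q1, median, Q3] each
    predictions = []
    for level in range(3):
        values = [q[level] for q in per_covariate]
        predictions.append(coef[0] + sum(b * v for b, v in zip(coef[1:], values)))
    return predictions


# Made-up illustrative data (NOT the coursework data).
x1 = [10.0, 12.0, 15.0, 18.0, 20.0, 23.0, 25.0, 28.0]
x2 = [4.0, 7.0, 3.0, 6.0, 5.0, 9.0, 2.0, 8.0]
y = [63.0, 62.5, 67.5, 68.0, 71.5, 70.0, 78.0, 76.5]

for label, pred in zip(("lower quartiles", "medians", "upper quartiles"),
                       predict_at_joint_quartiles([x1, x2], y)):
    print(f"All covariates at their {label}: predicted y = {pred:.2f}")
```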

3.4    Linear regression with a factor variable as a covariate

Linear regression is useful for understanding how continuous variables influence other continuous variables. There may be occasions when we would like to understand how categorical variables, also referred to as factor variables, influence a continuous variable. Careful consideration of a factor variable can allow for its inclusion as a covariate within a linear regression. While linear regression with a factor variable as a covariate isn’t taught as part of STAT0021, you should be able to extend your knowledge of linear regression from STAT0021 to understand the basics of linear regression with a factor variable as a covariate through a small amount of research.

VaxPercent is the outcome of interest, with the link to Political being the aim of the investigation.

Your submission should include:

A.   Stata output including the results of an appropriate test taught as part of STAT0021 to determine whether the mean value of VaxPercent differs according to the levels of Political.

B.   An interpretation of those test results.

C.   A suitable plot to compare VaxPercent and Political.

D.   The plot (or plots) necessary to verify whether the assumptions of your test are satisfied.

Note that:

1.   This component of the assessment will be marked out of 3 marks.

Your submission should also include:

E.    Stata output including the results of a linear regression model for VaxPercent using Political treated as a factor variable as the covariate.

F.    An interpretation of the estimated model coefficients from that linear regression model.

Note that:

2.   This component of the assessment will be marked out of 3 marks.

Your submission should also include:

G.   A mathematical equation to indicate how this regression model can be used to make predictions of VaxPercent.

H.   Predicted values of VaxPercent when Political takes each of its three different levels.

Note that:

3.   This component of the assessment will be marked out of 3 marks.

Your submission should also include:

I.     Discussion of the benefits of building a linear regression model using Political as a factor variable covariate in contrast to the drawbacks of a linear regression model using Political as a continuous variable covariate when wishing to determine how VaxPercent varies with Political.

Note that:

4.   This component of the assessment will be marked out of 3 marks.
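As a starting point for the research mentioned above, the sketch below contrasts the two codings on a three-level variable. It is illustrative only: it is written in Python rather than Stata, the data are made up and are not the coursework data, and `fit_factor_model` and `fit_continuous_model` are hypothetical helper names. It relies on a standard result: with a single factor covariate, the least-squares intercept is the baseline group's mean and each remaining coefficient is that level's mean minus the baseline mean.

```python
from statistics import mean


def fit_factor_model(levels, y, baseline=0):
    """Regress y on indicator (dummy) variables for each non-baseline level.

    Returns (intercept, {level: coefficient}) using the closed-form solution
    for a single factor covariate (group means and differences of means).
    """
    group_means = {
        lvl: mean([yi for li, yi in zip(levels, y) if li == lvl])
        for lvl in sorted(set(levels))
    }
    intercept = group_means[baseline]
    effects = {lvl: m - intercept for lvl, m in group_means.items() if lvl != baseline}
    return intercept, effects


def fit_continuous_model(levels, y):
    """Regress y on the numeric codes themselves (a single slope for all levels)."""
    x_bar, y_bar = mean(levels), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(levels, y))
    sxx = sum((xi - x_bar) ** 2 for xi in levels)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Made-up illustrative data (NOT the coursework data): three levels coded 0, 1, 2.
levels = [0, 0, 0, 1, 1, 1, 2, 2, 2]
y = [70.0, 72.0, 74.0, 60.0, 62.0, 64.0, 66.0, 68.0, 70.0]

b0, effects = fit_factor_model(levels, y)
c0, c1 = fit_continuous_model(levels, y)
for lvl in (0, 1, 2):
    factor_pred = b0 + effects.get(lvl, 0.0)  # baseline level has no extra term
    continuous_pred = c0 + c1 * lvl
    print(f"level {lvl}: factor model {factor_pred:.2f}, continuous model {continuous_pred:.2f}")
```

In this sketch the factor model's predictions equal the group means, whereas the continuous model forces the same change in the outcome for every one-unit step in the numeric code, so its predictions depend on how the levels happen to be numbered.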

3.5    General marks

6 marks are available to submissions which:

A.   Are clear, well-written and formatted; with plots and Stata output adequately sized and labelled; and which correctly follow the submission format instructions.

4    Submission details

4.1    Submission format

You should submit a single file, saved as a PDF and named “Assessment 4 [your student number]”. For example, if your student number is 22000000 then your submission should be a single PDF file named “Assessment 4 22000000”.

Your submission should also include within it your student number, but should not contain your name.

4.2    Submission length

Your submission should be made up of no more than:

•    Five A4 pages of discussions which cover all of the requirements outlined in the previous section, with a font size no smaller than 10 points.

•    Ten pages of Stata output (as screenshots) and other relevant figures. Each figure should have a number by which it is referred to in your discussions. Figures should be of a suitable size and quality to be easily interpretable.

•    One page, if necessary, of references to journal articles, books, websites, AI tools, etc.

Requesting that the discussions and figures be separated in this way may seem unusual, but it is done to stress both that enormous amounts of writing are not expected for this assessment and that carefully chosen figures can be just as useful as (or even more useful than) a greater volume of text. The permitted length is an upper limit, not a guide for how much you are expected to submit. If you can explain your thoughts clearly and more concisely, a shorter submission will not automatically be marked lower.

Any submission which is over the permitted length will suffer a penalty of 10 percentage points, although any such penalty will not reduce a mark below the pass mark of 40%.

4.3    Submission procedure and deadline

You must complete your submission via the “Assessment 4 – Individual Coursework – Term 1” section of the STAT0021 course Moodle page before the deadline of 1pm on Wednesday 17th January 2024.

There are standard non-negotiable penalties for late submissions which you can read about in the UCL Academic Manual. Any extension to the deadline can only be granted where a student has a Summary of Reasonable Adjustments (SoRA) or has successfully claimed extenuating circumstances. Extenuating circumstances are handled by your parent department and not by the teaching department.

4.4    Stata

The submission structure above repeatedly asks for supporting evidence from Stata. Stata is referred to because its use has been taught as part of STAT0021. If you would prefer to make use of other software to perform the analyses, and you believe that you can obtain results just as good as those produced by Stata, then you are free to do so. If you are considering this, you are strongly encouraged to contact the course lecturer at [email protected] to discuss your decision.



