
FIT3154 Assignment 2

Due Date: 11:55PM AEST, Friday 24/10/2025

Introduction

There are a total of two questions worth 16 + 21 = 37 marks in this assignment. Please note that working and/or justification must be shown for all questions that require it.

This assignment is worth a total of 20% of your final mark, subject to hurdles and any other matters (e.g., late penalties, special consideration, etc.) as specified in the FIT3154 Unit Guide or elsewhere in the FIT3154 Moodle site (including Faculty of I.T. and Monash University policies).

Students are reminded of the Academic Integrity Awareness Training Tutorial Activity and, in particular, of Monash University's policies on academic integrity. In submitting this assignment, you acknowledge your awareness of Monash University's policies on academic integrity and that work is done and submitted in accordance with these policies.

Submission Instructions: Please follow these submission instructions:

1. No files are to be submitted via e-mail. Submissions are to be made via Moodle.

2. You may not use generative A.I. when answering this assignment.

3. Please submit:

(a) One PDF file containing non-code answers to all the questions that require written answers. This file should also include all your plots, and must be clearly written and formatted. Please ensure that your assignment answers the questions in the order specified in the assignment.

(b) A single R file containing R code answers. Please use comments to clearly indicate which bits of code are answering which questions.

Please read these submission instructions carefully and take care to submit the correct files in the correct places. GLHF!

Question 1 (16 marks)

In this question we will again examine Bayesian inference for the negative binomial distribution, but this time using a different prior distribution. Recall that the negative binomial distribution is a distribution used to model counts, and has the probability mass function

\[ p(y \mid \mu, r) = \frac{\Gamma(y + r)}{\Gamma(r)\, y!} \left( \frac{r}{r + \mu} \right)^{r} \left( \frac{\mu}{r + \mu} \right)^{y}, \quad y = 0, 1, 2, \ldots \tag{1} \]

where µ > 0 is the mean parameter of the distribution, r > 0 is the shape parameter (note that r can be a non-integer) and Γ(·) is the gamma function. In Assignment 1 we used a gamma prior distribution. This time we will use a special heavy-tailed prior distribution which we will call the squared half-Cauchy; this has the probability density

\[ \pi(\mu \mid s) = \frac{\sqrt{s}}{\pi \sqrt{\mu}\, (\mu + s)}, \quad \mu > 0 \tag{2} \]

where s > 0 is a scale parameter that can be used to encode our prior information. If a RV X follows a squared half-Cauchy with scale s, we say \( X \sim C^2_+(0, s) \). Our hierarchy is then

\[ y_i \mid \mu, r \sim \text{NegBin}(\mu, r), \quad i = 1, \ldots, n \tag{3} \]
\[ \mu \mid s \sim C^2_+(0, s) \tag{4} \]
\[ r \sim \pi(r) \tag{5} \]

where s can be thought of as setting the value of the "best guess" of µ before we see the data. For reference, the analysis in Assignment 1 used a gamma prior distribution with a choice of hyperparameters that encoded a prior best guess for µ of 14.5. The posterior mean was E[µ | y] ≈ 15.95 and the 95% credible interval was ≈ (15.06, 16.87).

Questions

Download the file nb.fit.shc.R. This contains the function nb.sample.shc(y,s,n.samples), which draws n.samples from the posterior distribution for the hierarchy (3–5), where y is the data and s is the hyperparameter for the squared half-Cauchy prior. Use this to generate 100,000 samples of µ and r for the COVID recovery data in covid.ass1.2025.csv.

1. Plot the histogram of the samples, and report on the posterior mean and the 95% credible intervals; compare these to the posterior mean and intervals found using the gamma prior from Assignment 1 (summarised above). [3 marks]
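As a starting point, here is a minimal sketch for Q1.1. The column name y in the CSV and the mu and r elements of the returned sample object are assumptions; adjust them to match the actual file and whatever nb.sample.shc() returns.

# Sketch for Q1.1; the column name "y" and the elements samples$mu, samples$r
# are assumptions about the data file and nb.sample.shc()'s return value
source("nb.fit.shc.R")
covid <- read.csv("covid.ass1.2025.csv")
y <- covid$y

samples <- nb.sample.shc(y, s = 14, n.samples = 1e5)  # s = 14 as in Q1.5

# Histogram, posterior mean and 95% credible interval for mu
hist(samples$mu, breaks = 50, xlab = "mu", main = "Posterior samples of mu")
mean(samples$mu)
quantile(samples$mu, c(0.025, 0.975))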

The second part of this question will look at finding the maximum a posteriori (MAP) estimator for µ, which is found by maximising the posterior, or equivalently minimising the negative log-posterior, i.e.,

\[ \hat{\mu}_{\text{MAP}} = \arg\min_{\mu > 0} \left\{ -\log p(y \mid \mu, r) - \log \pi(\mu \mid s) \right\} \tag{6} \]

where we will assume r is known or given. Please answer the following questions:

2. Write down the formula for the negative log-posterior, i.e., − log [p(y | µ, r) π(µ | s)], up to constants not dependent on µ. Make sure to simplify the expression. [2 marks]

3. Prove that maximising the posterior for µ is equivalent to solving the quadratic equation

\[ a \mu^2 + b \mu + c = 0 \tag{7} \]

where, writing \( S = \sum_{i=1}^{n} y_i \),

\[ a = nr + \frac{3}{2}, \qquad b = r(ns - S) + \frac{3r + s}{2}, \qquad c = -rs\left( S - \frac{1}{2} \right). \]

[2 marks]

4. Write a function nb.map(y,s,r) which takes a vector of data y, the prior hyperparameter s and shape r, and returns the MAP estimator by solving the quadratic (7) for the solution that minimises the negative log-posterior. (hint: refer to Studio 9 for tips) [1 mark]
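One possible shape for this function, hedged on the coefficients as reconstructed in (7) above; re-derive them from your own answer to Q1.3 before relying on this sketch.

# Sketch for Q1.4; the coefficients A, B, C follow the reconstruction of (7)
nb.map <- function(y, s, r) {
  n <- length(y)
  S <- sum(y)
  A <- n * r + 3 / 2
  B <- r * (n * s - S) + (3 * r + s) / 2
  C <- -r * s * (S - 1 / 2)

  # both roots of the quadratic A*mu^2 + B*mu + C = 0, keep the positive ones
  roots <- (-B + c(-1, 1) * sqrt(B^2 - 4 * A * C)) / (2 * A)
  roots <- roots[roots > 0]

  # negative log-posterior, up to constants not depending on mu
  nlp <- function(mu) {
    -sum(dnbinom(y, size = r, mu = mu, log = TRUE)) +
      0.5 * log(mu) + log(mu + s)
  }
  roots[which.min(sapply(roots, nlp))]
}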

5. Using this function and the data in covid.ass1.2025.csv, compute the MAP estimator using s = 14 (our prior best guess); for the value of r, use the posterior mean of r obtained from the samples you generated as part of Q1.1. Compare this estimate to the posterior mean found using the squared half-Cauchy in Question 1.1. Discuss why you think they are similar/different. [1 mark]

6. Using your function nb.map(y,s,r), explore the sensitivity of the MAP estimator for our COVID recovery data when using choices of prior best guess that are quite different from our guess based on the SARS recovery time. In particular, try the values

s = exp(seq(log(0.01),log(1e4),length.out=100))

Plot the MAP estimates found using these values of s against the values of s. How do these estimates compare to using s = 14? What does this say about the squared half-Cauchy prior?

(hint: use the log-scale for the x-axis of your plot) [2 marks]
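A minimal sketch of this sensitivity check, assuming nb.map() from Q1.4 and r.hat, the posterior mean of r from Q1.1 (both assumed names):

# Sketch for Q1.6; y and r.hat are assumed to be defined as in Q1.1/Q1.5
s.vals <- exp(seq(log(0.01), log(1e4), length.out = 100))
map.est <- sapply(s.vals, function(s) nb.map(y, s, r.hat))

# log-scale on the x-axis, as per the hint
plot(s.vals, map.est, type = "l", log = "x",
     xlab = "s (log scale)", ylab = "MAP estimate of mu")
abline(v = 14, lty = 2)  # our prior best guess, s = 14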

7. Find the asymptotic formulae for the MAP estimate of µ, i.e., the solution to (7) when (i) s → 0, and (ii) s → ∞. Comment on these formulae. [2 marks]

As discussed in Assignment 1, it is sometimes more natural to use the alternative parameterisation v = log µ for the negative binomial distribution.

8. Transform the squared half-Cauchy prior (2) on µ to a prior on v using the transformation of random variables technique from Lecture 5. [2 marks]
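For reference, under the density as given in (2) the change-of-variables step takes the following shape; check it against your own derivation:

\[ \pi_v(v \mid s) = \pi(e^{v} \mid s) \left| \frac{d\mu}{dv} \right| = \frac{\sqrt{s}}{\pi\, e^{v/2} (e^{v} + s)} \cdot e^{v} = \frac{\sqrt{s}\, e^{v/2}}{\pi (e^{v} + s)}, \quad v \in \mathbb{R}. \]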

9. Plot this prior distribution for the choice of hyperparameters s = 10, s = 10^2 and s = 10^3. How does s affect the prior on v? [1 mark]
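A plotting sketch for Q1.9; prior.v is a hypothetical helper implementing the transformed density derived above:

# Sketch for Q1.9; prior.v assumes the transformed density from Q1.8
prior.v <- function(v, s) sqrt(s) * exp(v / 2) / (pi * (exp(v) + s))

v <- seq(-10, 15, length.out = 500)
plot(v, prior.v(v, 10), type = "l", xlab = "v = log(mu)", ylab = "prior density")
lines(v, prior.v(v, 1e2), lty = 2)
lines(v, prior.v(v, 1e3), lty = 3)
legend("topright", legend = c("s = 10", "s = 10^2", "s = 10^3"), lty = 1:3)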

Question 2: Negative Binomial Regression (21 marks)

In this question we are going to examine an extension of the negative binomial distribution to a regression setting. Let y = (y1, . . . , yn) be n non-negative integer observations that we would like to model, and let xi = (xi,1, . . . , xi,p) be a vector of p predictors associated with the i-th data point. Negative binomial regression models extend the usual negative binomial distribution in the following way:

\[ y_i \mid \mu_i, r \sim \text{NegBin}(\mu_i, r), \qquad \log \mu_i = \beta_0 + \sum_{j=1}^{p} x_{i,j} \beta_j, \qquad i = 1, \ldots, n \tag{8} \]

where β = (β1, . . . , βp) are the coefficients relating the predictors to the target, β0 is the intercept and the negative binomial distribution has probability mass function given by (1). In this way the negative binomial regression model extends the regular linear regression model to modelling non-negative integer data. The effects of the predictors are additive on the log scale, and therefore multiplicative on the original scale.

Questions: Part I

The first part of this question will examine using the negative binomial regression model in the context of a generalized additive model (GAM), using the mgcv package. Download the file daily.cases.2025. This file contains the daily recorded COVID-19 case numbers for a 10-week period starting on Monday, 6th of June. The target, Cases, contains the daily case numbers, the predictor Day is the number of days since 6th of June, and Day.Of.Week is a categorical variable indicating what day of the week each day in the dataset is.

1. Using the gam() function, fit a negative binomial regression using maximum likelihood to the daily case data, using Cases as the target (hint: use the family=nb(link=log,theta=NULL) option to fit a negative binomial). We expect a nonlinear trend over time, so please model the Day predictor as a smooth function (hint: see Studio 8). We are also interested in seeing if there is an effect on reported daily case numbers based on the day of the week, so include the Day.Of.Week predictor in your model. Use summary() to inspect the fitted model; a sketch of the call is given after part (b). From these statistics please answer the following:

(a) Do you think there is evidence for a non-linear trend over time? Do you think there is evidence for an effect based on the day of the week? Justify these answers. [2 marks]

(b) How does being a Saturday affect case numbers, and how does being a Wednesday affect case numbers? [2 marks]
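One way the fit in Q2.1 could look; the .csv extension on the file name is an assumption, and the formula assumes Day.Of.Week should be treated as a factor:

# Sketch for Q2.1; the file extension ".csv" is assumed
library(mgcv)
daily <- read.csv("daily.cases.2025.csv")
daily$Day.Of.Week <- as.factor(daily$Day.Of.Week)

# smooth trend in Day plus a day-of-week effect
fit.gam <- gam(Cases ~ s(Day) + Day.Of.Week,
               data = daily,
               family = nb(link = "log", theta = NULL))
summary(fit.gam)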

2. We could fit an additive model using an identity link instead of a log-link, i.e., using

\[ \mu_i = \beta_0 + \sum_{j=1}^{p} x_{i,j} \beta_j \]

and clipping any µi < 0 to µi = 0. Fit a GAM using gam() with an identity link using the option family=nb(link=identity,theta=NULL). Using summary() and the statistics produced by gam(), compare this model to the GAM using a negative binomial distribution with log-link fitted above. Which one do you think fits the data better and why? [1 mark]

3. On the same plot, please plot the observed daily case numbers against Day, and then plot: (i) the predictions for each day based on your negative binomial GAM using the log-link, and (ii) the predictions for each day based on your negative binomial GAM using the identity link. Discuss the difference between the two predictions. Does the log-link seem to have an advantage at modelling the series, and if so, why? [2 marks]
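A plotting sketch, assuming fit.gam is the log-link model from Q2.1 and fit.gam.id is the identity-link model from Q2.2 (both names are assumptions):

# Sketch for Q2.3; fit.gam.id is the assumed name of the identity-link fit
plot(daily$Day, daily$Cases, pch = 19, xlab = "Day", ylab = "Cases")
lines(daily$Day, predict(fit.gam, type = "response"), col = "blue")
lines(daily$Day, pmax(predict(fit.gam.id, type = "response"), 0), col = "red")
legend("topleft", legend = c("log link", "identity link"),
       col = c("blue", "red"), lty = 1)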

Questions: Part II

There is no simple closed-form formula for the maximum likelihood estimates for a negative binomial regression. The second part will require you to write code to fit a negative binomial regression using maximum likelihood and gradient descent. To get you started I have provided a script called nbreg.gd.R which contains several functions: nbreg.gd(), my.standardise() and df2matrix().

The main function nbreg.gd() is a skeleton that takes a y and X, standardises the predictors (as per Lecture 3) using my.standardise() and then has space for you to fill in with your gradient descent code. Once your code has finished running, the function unstandardises the coefficients back to the original scale. The function called df2matrix() takes a data frame and a formula and returns an appropriate matrix of predictors X and vector of targets y. Answer the following questions:

4. Write down the expression for the negative log-likelihood of data y = (y1, . . . , yn) under the negative binomial regression model (8) with predictor matrix X of size n × p, coefficient vector β, intercept β0 and shape r. [2 marks]

5. Find the expressions for the gradient of the negative log-likelihood with respect to the intercept β0 and the coefficients β1, . . . , βp. [3 marks]
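For checking your own working on Q2.4 and Q2.5: under the PMF as given in (1), the µ-dependent part of the negative log-likelihood and its gradients take the following shape, with \( \mu_i = \exp(\beta_0 + \sum_{j} x_{i,j} \beta_j) \):

\[ L(\beta_0, \beta) = -\sum_{i=1}^{n} \left[ y_i \log \mu_i - (y_i + r) \log(\mu_i + r) \right] + \text{const}, \]
\[ \frac{\partial L}{\partial \beta_0} = -\sum_{i=1}^{n} \frac{r (y_i - \mu_i)}{\mu_i + r}, \qquad \frac{\partial L}{\partial \beta_j} = -\sum_{i=1}^{n} x_{i,j}\, \frac{r (y_i - \mu_i)}{\mu_i + r}. \]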

6. Write an R function that takes a vector of data y, a matrix of predictors, X, an intercept beta0, a vector of coefficients beta and a shape r and returns: (i) the negative log-likelihood and (ii) the gradients with respect to β0 and the coefficients β1, . . . , βp (hint: the lgamma(x) function implements log Γ(x)) [2 marks]
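A sketch of such a function, hedged on the expressions above; the name nbreg.nll.grad and the structure of the returned list are my own choices:

# Sketch for Q2.6; returns the NLL and the gradients w.r.t. beta0 and beta
nbreg.nll.grad <- function(y, X, beta0, beta, r) {
  eta <- beta0 + as.vector(X %*% beta)
  mu  <- exp(eta)

  # negative log-likelihood; lgamma(x) implements log Gamma(x)
  nll <- -sum(lgamma(y + r) - lgamma(r) - lgamma(y + 1) +
                r * log(r) + y * eta - (y + r) * log(mu + r))

  # d(nll)/d(eta_i) = -r * (y_i - mu_i) / (mu_i + r)
  d.eta <- -r * (y - mu) / (mu + r)

  list(nll = nll,
       grad.beta0 = sum(d.eta),
       grad.beta  = as.vector(t(X) %*% d.eta))
}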

7. Implement gradient descent to find the maximum likelihood estimates of the intercept β0 and coefficients β1, . . . , βp, given a shape r. Use the function in nbreg.gd.R as a basis to get started. The function takes as arguments:

• An X matrix of predictors and y vector of targets

• A value for the shape r

• An initial value for the learning rate κ.

• The size of the largest absolute value of the gradient used for termination (max.grad).

The function does some initialisation of the estimates of β0 and β for you by using least-squares on the log-transformed targets. These are good starting points for the search. You will need to write the code to find the maximum likelihood estimates \(\hat{\beta}_0\) and \(\hat{\beta}_1, \ldots, \hat{\beta}_p\) for the X and y that were passed, using the gradient descent algorithm. Once it has finished, it should return a list containing:

• The ML estimate of the intercept β0.

• The ML estimates of the coefficients β1, . . . , βp

• The negative log-likelihood at your final estimates

• The gradient at your final estimates

• The negative log-likelihood recorded at every iteration of your gradient descent loop.

Some further details

• Your learning rate, κ, must be adaptive (hint: see Lecture 8).

• Your algorithm should terminate when the largest absolute value of the gradients is less than the max.grad argument passed to your function.

• When writing your code, you should consider computational efficiency and ensure that it is cleanly written and commented.

To help you test your code I have provided a test dataset in the file nbreg.test.csv. This has three predictors and a target (y). Use r=2; if your code is working, then it should return maximum likelihood estimates close to 0.99, 1.94 and −3.00 for the three predictors and an intercept close to 2.03. A sketch of the inner loop is given below. [4 marks]
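One way the gradient descent loop inside nbreg.gd() could be structured. This is a sketch only: the variable names (beta0, beta, kappa, max.grad) are assumptions about the skeleton's internals, nbreg.nll.grad() is the hypothetical helper from Q2.6, and halving/growing kappa is one simple adaptive scheme, not necessarily the one from Lecture 8.

# Sketch of the Q2.7 loop; all names here are assumptions about nbreg.gd()
nll.trace <- numeric(0)
repeat {
  g <- nbreg.nll.grad(y, X, beta0, beta, r)
  nll.trace <- c(nll.trace, g$nll)

  # terminate once every gradient is small enough in absolute value
  if (max(abs(c(g$grad.beta0, g$grad.beta))) < max.grad) break

  # tentative step; shrink kappa until the NLL does not increase
  repeat {
    new.beta0 <- beta0 - kappa * g$grad.beta0
    new.beta  <- beta  - kappa * g$grad.beta
    if (nbreg.nll.grad(y, X, new.beta0, new.beta, r)$nll <= g$nll) break
    kappa <- kappa / 2
  }
  beta0 <- new.beta0
  beta  <- new.beta
  kappa <- kappa * 1.1   # cautiously grow the step size back
}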

8. Now, let us test our code properly. We can build our own additive model by including polynomial terms in our model. Fit a negative binomial regression to the data from Question Q2.1, including both Day.Of.Week and ten polynomial terms for Day (i.e., I(Day) + I(Day^2) + ...). You can use the estimate of r found in Q2.1 as the value of r when using your code (access this using fit.gam$family$getTheta(F)).

Report the fitted coefficients, and produce predictions for all the days for your fitted negative binomial regression model using your maximum likelihood estimates of the coefficients and the definition of the negative binomial regression model (8). Plot these along with the observed daily case numbers. How similar do they look to the predictions made by the GAM? [2 marks]
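A sketch of how this could be wired up; the interfaces assumed here (df2matrix() returning a list with X and y, nbreg.gd()'s argument names, and the names in its returned list) should all be checked against the provided nbreg.gd.R:

# Sketch for Q2.8; dm$X, dm$y, fit$beta0, fit$beta and the nbreg.gd()
# argument names are assumptions about the provided skeleton's interface
r.hat <- fit.gam$family$getTheta(F)

poly.terms <- paste0("I(Day^", 1:10, ")", collapse = " + ")
dm <- df2matrix(daily, as.formula(paste("Cases ~ Day.Of.Week +", poly.terms)))

fit <- nbreg.gd(dm$y, dm$X, r = r.hat, kappa = 1e-3, max.grad = 1e-4)

# predictions via (8): mu = exp(beta0 + X beta)
mu.hat <- exp(fit$beta0 + as.vector(dm$X %*% fit$beta))
plot(daily$Day, daily$Cases, pch = 19, xlab = "Day", ylab = "Cases")
lines(daily$Day, mu.hat, col = "blue")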

9. Imagine we wanted to use ridge regression instead of maximum likelihood to fit our negative binomial regression model. How would the objective function and the gradients change to accommodate a ridge penalty on the coefficients β? [1 mark]
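As a pointer for checking your answer: a ridge penalty adds a term of the form \( \lambda \sum_{j=1}^{p} \beta_j^2 \) to the negative log-likelihood (conventionally the intercept \( \beta_0 \) is left unpenalised), which in turn adds \( 2\lambda \beta_j \) to the gradient with respect to each \( \beta_j \).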

At the end of this assignment, you should reflect on how much you have learned since you started this unit. You have now learned how to use the Bayesian approach to statistical inference, you have ideas on how to analyse non-Gaussian data using regression models, and you have even managed to write a program to fit such a model using maximum likelihood!


