
Division of Biostatistics

Qualifying Exam, PhD in Biostatistics

Comprehensive Exam, MS in Biostatistics

June 17, 2024

9:00am – 3:00pm

The dataset Osteoporosis.csv will be used for problems 1 and 2.

These data come from a study of the association of demographic, lifestyle, and dietary factors with bone mineral density (BMD) and osteoporosis (Chaudhari et al., 2019, PMCID: PMC6556264). The study included 169 participants aged at least 50 years, seen in hospitals in Kathmandu, Nepal. The participants were administered questionnaires and had their bone density measured via dual-energy X-ray absorptiometry (DEXA scans) at three locations: lumbosacral spine, right femur, and left femur. The variables in the dataset are:

age   age in years

sex   participants’ sex (male, female)

occupation   participants’ occupation

ethnicity   participants’ ethnicity (Brahmin chhetri, Janjati, Newar, other)

bmi   body mass index (BMI), in kg/m²

bmd   bone mineral density (BMD) T-score, in standard deviations, compared to healthy 25-35 year olds of same sex and ethnicity. Computed as the lowest T-score from the three DEXA scan locations (lumbosacral spine, right femur, left femur)

diagnosis   osteoporosis (BMD ≤ −2.5), osteopenia (−2.5 < BMD ≤ −1), or normal (BMD > −1)

op   osteoporosis indicator (1 if BMD ≤ −2.5, 0 otherwise)

smoking   smoking status (yes, no)

alcohol   alcohol consumption (yes, no)

exercise   daily exercise (yes, no)

tea   tea consumption (yes, no)

calcium   estimated dietary daily calcium intake, in mg

vitamind   estimated dietary daily vitamin D intake, in IU

l.femur   BMD T-score measured in the left femur

r.femur   BMD T-score measured in the right femur

lumbosacral   BMD T-score measured in the lumbosacral region of the spine

Please use appropriate plots, statistics and explanations in your answers below.

1. Problem 1: Osteoporosis

Consider the relationship between the response bone mineral density (BMD) and the predictor BMI.

(a) Describe the linear association between BMD and BMI quantitatively in a simple sentence. However, show that model diagnostics suggest that the association may not be linear throughout the BMI range.

(b) A BMI ≥ 25 defines overweight. A careful analysis suggests that in this range, the benefit of BMI is greatly diminished. Fit a linear spline (broken-stick) model with a knot at BMI = 25. Is there statistical evidence for different BMI slopes when BMI < 25 and when BMI ≥ 25?
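One common way to code the broken-stick model in part (b) is with a hinge term, sketched below in Python (the exam does not prescribe software). The function names `spline_basis` and `predict_bmd` and the coefficient values are hypothetical, chosen only to illustrate that the slope below the knot is b1 and the slope above it is b1 + b2, so a test of b2 = 0 addresses whether the two slopes differ.

```python
def spline_basis(bmi, knot=25.0):
    """Return the two regressors of a linear spline: BMI and the hinge max(0, BMI - knot)."""
    return bmi, max(0.0, bmi - knot)


def predict_bmd(bmi, b0, b1, b2, knot=25.0):
    """Fitted value b0 + b1*BMI + b2*max(0, BMI - knot)."""
    x1, x2 = spline_basis(bmi, knot)
    return b0 + b1 * x1 + b2 * x2


# Hypothetical coefficients, NOT estimates from Osteoporosis.csv.
b0, b1, b2 = -5.0, 0.12, -0.10

# Change in fitted BMD per unit BMI on each side of the knot.
slope_below = predict_bmd(21.0, b0, b1, b2) - predict_bmd(20.0, b0, b1, b2)
slope_above = predict_bmd(31.0, b0, b1, b2) - predict_bmd(30.0, b0, b1, b2)
print(round(slope_below, 6), round(slope_above, 6))
```

With this coding the fitted line is continuous at BMI = 25, and the hinge coefficient b2 is the change in slope at the knot.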

(c) Fit the model BMD ∼ sex + age + sex × age. On a single plot show the relationship between BMD and age, separately for men and women. Use different colors for men and women in the plot.

(d) What is the age slope for men? Include a 95% CI and p-value. Does BMD decline with age for men?

(e) What is the age slope for women? Include a 95% CI and p-value. Does BMD decline with age for women?
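In the interaction model of part (c), the age slope for the reference sex is the age coefficient alone, while the slope for the other sex is the sum of the age and interaction coefficients. The sketch below shows how such a sum and its Wald interval could be computed from the coefficient estimates and their covariance. It assumes a normal approximation (a t reference distribution would be slightly wider), and the function name `combined_slope` and all numeric inputs are hypothetical, not results from the data. Which sex is the reference depends on how the factor is coded.

```python
from math import sqrt
from statistics import NormalDist


def combined_slope(b_age, b_int, var_age, var_int, cov_age_int, level=0.95):
    """Estimate, Wald CI and two-sided p-value for the sum b_age + b_int."""
    est = b_age + b_int
    # Var(b_age + b_int) = Var(b_age) + Var(b_int) + 2 Cov(b_age, b_int)
    se = sqrt(var_age + var_int + 2.0 * cov_age_int)
    z_crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(est / se)))
    return est, (est - z_crit * se, est + z_crit * se), p_value


# Hypothetical estimates and (co)variances, for illustration only.
est, ci, p = combined_slope(-0.02, -0.03, 0.0001, 0.0004, -0.0001)
print(est, ci, p)
```

Ignoring the covariance term would give the wrong standard error, since the age and interaction coefficients are typically correlated.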

The BMD variable was computed as the minimum T-score of the DEXA scans at three locations for each individual (lumbosacral spine, left femur, right femur). The BMD values from the individual locations are available in the dataset.

(f) Are there any systematic differences in mean BMD between the three DEXA scan locations? Produce an appropriate graph. Demonstrate the pairwise differences statistically, if there are any.

(g) Is a correction for multiple comparisons appropriate in the context of this analysis comparing the BMD at the three DEXA scan locations? If so, does this affect your results?

(h) Instead of calculating BMD as the minimum of the three DEXA measurements (analysis A), another approach is using the first principal component without centering (analysis B), while a third approach is using the average of the three measurements (analysis C). Compare these three BMD definitions graphically and quantitatively in terms of their values, the weights applied to the three individual scan locations, and interpretability. Do all three BMD definitions have the same units of measure?

2. Problem 2: Osteoporosis (Continued)

Please be sure you answer the questions which ask you to ‘summarize’ or ‘interpret’ or ‘discuss briefly’; please demonstrate your ability to communicate what you have done and what it means!! Credit will be given for all reasonable answers, even if not exactly as intended. :)

We use the data from the Osteoporosis study to conduct a brief study of the association of alcohol use with a diagnosis of osteoporosis.

We consider the biological variables age, sex and BMI, because they are well-known to be strongly associated with osteoporosis (younger age has lower risk of osteoporosis, low BMI has higher risk of osteoporosis because of associated hormonal changes). We also consider the dietary factors calcium intake and vitamin D intake (low is bad) and the modifiable health behaviors alcohol consumption (presumably bad), exercise (presumably good), and smoking (presumably bad). The variables are listed below for your convenience.

(a) First, we investigate the association of alcohol use with osteoporosis, without adjusting for any other variables. Write a short paragraph that summarizes the number of alcohol drinkers and non-alcohol drinkers in the study; the prevalence of osteoporosis among the drinkers and the non-drinkers; and gives the difference in risk of osteoporosis between alcohol drinkers and non-drinkers. Include confidence intervals where appropriate. Does this difference in risk demonstrate that alcohol consumption leads to a reduction in the risk of osteoporosis?
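The risk difference asked for in part (a) can be illustrated with a simple Wald interval for the difference of two proportions. This is a minimal sketch of one standard approach, not necessarily the intended one; the function name `risk_difference` and the counts are hypothetical and are not taken from Osteoporosis.csv.

```python
from math import sqrt
from statistics import NormalDist


def risk_difference(cases_exposed, n_exposed, cases_unexposed, n_unexposed, level=0.95):
    """Risk difference (exposed minus unexposed) with a Wald confidence interval."""
    p1 = cases_exposed / n_exposed
    p0 = cases_unexposed / n_unexposed
    rd = p1 - p0
    se = sqrt(p1 * (1 - p1) / n_exposed + p0 * (1 - p0) / n_unexposed)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return rd, (rd - z * se, rd + z * se)


# Hypothetical counts: 10 of 50 drinkers and 30 of 100 non-drinkers with osteoporosis.
rd, ci = risk_difference(10, 50, 30, 100)
print(rd, ci)
```

An interval that includes 0, as in this made-up example, would not support a difference in risk, and even an interval excluding 0 describes an unadjusted association rather than a causal effect.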

(b) Next, we investigate multivariate models which study the joint effects of predictors on the risk of osteoporosis. Create appropriate factors as defined below, and set the reference level for all predictors to the presumed or known low-risk category. Please use categorical versions of the variables for all of Question 2.

• Age60, an indicator for age 60 years or older

• LowBMI, an indicator of BMI < 25 (i.e. not overweight)

• LowCalc, an indicator of dietary calcium < 500 mg

• LowD, an indicator of vitamin D intake < 600 IU
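The four indicators defined above could be derived as in the following sketch. The thresholds come from the definitions in this question; the function name `make_indicators` and the example participant are hypothetical.

```python
def make_indicators(age, bmi, calcium, vitamind):
    """Code the four binary predictors of Question 2 as 0/1 indicators."""
    return {
        "Age60": int(age >= 60),        # 60 years or older
        "LowBMI": int(bmi < 25),        # not overweight
        "LowCalc": int(calcium < 500),  # dietary calcium below 500 mg
        "LowD": int(vitamind < 600),    # vitamin D intake below 600 IU
    }


# Hypothetical participant, for illustration only.
print(make_indicators(age=64, bmi=22.5, calcium=480, vitamind=650))
```

With this 0/1 coding, the value 0 is the presumed low-risk category for each indicator, which matches the requested reference levels.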

Fit a model with only the known biological predictors of age (categorical version), sex, and BMI (categorical version). (Main effects only- let’s keep it simple! :) ). Do these variables appear to be strongly associated with a diagnosis of osteoporosis, as expected? Explain briefly.

(c) Now we investigate the association of alcohol use with osteoporosis, adjusting for potential confounding factors. We consider the known risk factors age, sex, and BMI, and the potential confounders of dietary calcium, dietary vitamin D, smoking, and exercise. Before you begin your analysis, describe your model selection strategy (Main effects only- keeping it simple! :) ). Tell me why you don’t recommend allowing alcohol to be considered for inclusion/exclusion as part of any variable selection algorithm.

(d) Now, carry out your model selection strategy above. Call the result Model 1. Present Model 1 as a table of estimated coefficients on the odds ratio scale, with confidence intervals and p-values. Briefly summarize your results in a short paragraph. Be sure to report your main conclusion regarding alcohol use and risk of osteoporosis. (To keep the exam simple, do **NOT** present model diagnostics! In fact, the model fits pretty well.)

(e) Next we explore whether Model 1 has enough information to usefully distinguish between high risk and low risk subjects, using the easily observed variables which are included in the model (age, sex, BMI, smoking, etc.).

Use Model 1 to compute the estimated risk of osteoporosis, with appropriate confidence interval, for a hypothetical extremely high risk subject (e.g. an older woman smoker, non-drinker, with low BMI and low calcium) and for a hypothetical extremely low risk subject. Then compute the range and the quartiles of estimated risks for subjects actually observed in the data. Why do the hypothetical and the observed ranges differ? Do you think this model could potentially be useful in identifying people at high risk and at low risk of osteoporosis, for a similar population of patients? Discuss briefly.
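For a logistic regression model, one standard way to obtain an estimated risk with a confidence interval is to build the interval on the log-odds (linear predictor) scale and then back-transform, which keeps the limits inside (0, 1). The sketch below assumes that approach and a normal approximation; the function names `expit` and `risk_with_ci` and the input values are hypothetical.

```python
from math import exp
from statistics import NormalDist


def expit(x):
    """Inverse logit: maps a log-odds value to a probability."""
    return 1.0 / (1.0 + exp(-x))


def risk_with_ci(eta, se_eta, level=0.95):
    """Estimated risk and CI from a linear predictor eta and its standard error."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return expit(eta), (expit(eta - z * se_eta), expit(eta + z * se_eta))


# Hypothetical linear predictor and standard error, for illustration only.
risk, (lower, upper) = risk_with_ci(eta=0.0, se_eta=0.5)
print(risk, lower, upper)
```

The standard error of the linear predictor depends on the full covariance matrix of the coefficients, so it is typically larger for covariate patterns that are rare or absent in the data.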

(f) The association of alcohol use with lower osteoporosis risk may be surprising. However, in many observational studies of health outcomes, moderate alcohol use is associated with better outcomes. On the one hand, in some settings (e.g. red wine and heart disease) some people argue that there is a real causal effect, and on the other hand we know that there are often systematic differences between drinkers and non-drinkers. This motivates us to conduct further analysis.

Recalling a result from Problem 1, add the interaction of age and sex to Model 1, and call the result Model 2. Compare Model 1 and Model 2. Does this analysis increase your confidence that the apparent protective effect of alcohol use is real? Why might this be considered an exploratory analysis, rather than the primary analysis in your study? Explain in a few sentences.

(g) Briefly summarize your analysis, and discuss the extent to which these data provide evidence for a protective effect of drinking alcohol for osteoporosis prevention. Be clear, professional, and quantitative in your answer.

Figure 1: Mean composite driving score over time, for each treatment arm (and 95% confidence intervals).

3. Problem 3: Driving Miss Mary Jane

The file Driving.csv includes data from a double-blind, placebo-controlled parallel randomized clinical trial conducted at the UCSD Center for Medicinal Cannabis Research, aiming to determine the effect of cannabis on driving performance. A total of N = 190 cannabis users were asked to smoke at least 4 puffs from a cigarette containing either placebo, low dose THC, or high dose THC, according to the randomly assigned treatment arm. The participants completed computer-based driving simulations pre-smoking (baseline, 0 minutes) and at 4 timepoints after smoking: 0.5 hours (h), 1.5h, 3.5h, and 4.5h.

The outcome is composite driving score (CDS), measuring the driver’s overall performance, with higher values indicating worse performance. CDS is a standardized score with no units of measure. A score of 0 reflects average driving ability.

The given dataset is in the long format. The variables in the dataset are:

• pid: participant ID

• treatment (3 levels): placebo, low THC, or high THC

• THC : indicator of THC-containing treatment, 0 if placebo, 1 if low or high THC dose

• time_min: time since smoking in minutes (0 is pre-smoking)

• CDS: composite driving score

• frequent_user : 0 if current cannabis use < 4 times/week, 1 if ≥ 4 times/week

• age: participant’s age in years

• education: participant’s years of education

• gender : Male or Female

• miles_past_year : estimated self-reported number of miles participant drove in past year

(a) Add two additional time-related variables: occasion, a categorical version of the time variable; and an index variable that indexes the measurement time points in increasing order, from 1 (baseline), 2 (30 minutes), . . . , to 5 (270 minutes post-smoking).

Produce a boxplot of CDS as a function of occasion. Is the response distribution approximately normal?

Is there a data transformation that is obviously necessary?
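The two time variables requested in part (a) could be derived as in the sketch below, which works on long-format rows represented as dictionaries. The measurement times in minutes (0, 30, 90, 210, 270) follow from the schedule described above; the function name `add_time_variables`, the column name `index`, and the example rows are hypothetical.

```python
# Measurement times in minutes: baseline, 0.5h, 1.5h, 3.5h, 4.5h.
TIME_POINTS_MIN = [0, 30, 90, 210, 270]


def add_time_variables(rows):
    """Return copies of the rows with a categorical 'occasion' and an integer 'index'."""
    index_of = {t: i + 1 for i, t in enumerate(TIME_POINTS_MIN)}
    out = []
    for row in rows:
        new_row = dict(row)
        new_row["occasion"] = str(row["time_min"])    # categorical label
        new_row["index"] = index_of[row["time_min"]]  # 1 (baseline) to 5 (270 min)
        out.append(new_row)
    return out


# Hypothetical rows, NOT taken from Driving.csv.
rows = [{"pid": 1, "time_min": 0, "CDS": 0.1}, {"pid": 1, "time_min": 270, "CDS": 0.4}]
print(add_time_variables(rows))
```

Keeping both variables is convenient because `occasion` supports a response profile model, while the index orders the repeated measurements within a participant.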

(b) Consider modeling the CDS as a function of time using a longitudinal general linear model. What is a good choice of covariance structure for this general linear model? Use the chosen covariance structure in subsequent modeling.

(c) Choose an appropriate mean model for this randomized clinical trial: consider a response profile model (RPM), and parametric time models with either linear, quadratic, or cubic time effects. Discuss whether there is any advantage in this case in using a parametric model for time, instead of a response profile model.

(d) Based on the RPM, test whether overall there are any differences in the response profiles of the three study arms.

(e) The test at the previous question confirms what Figure 1 suggests: overall, there are significant differences in the trajectories of the three groups.

Follow up this analysis by testing for pairwise differences in the CDS trajectories between the three treatment arms. Are the results consistent again with what Figure 1 suggests? For each comparison state the null hypothesis in terms of the vector of mean coefficients β.

(f) Interestingly, in Figure 1 the low THC group has apparently worse driving scores than the high THC group. One hypothetical explanation is that the participants in the high THC group can feel the strong THC and do not smoke the entire cigarette. In any case, the previous analysis shows no significant difference in driving scores between the two THC groups. This suggests combining the two THC groups into a single group (see the variable THC in the dataset).

Fit an appropriate response profile model for the trajectories of the two groups over time: THC and placebo, and show that these trajectories differ.

(g) Following up on the previous question, estimate the causal effect of smoking THC on driving at each time point. Include 95% confidence intervals and p-values.

(h) Are the effects of smoking THC on driving large? To put this in perspective, in a cross-sectional study a Cohen’s d effect size d = 0.5 is considered moderate, and d = 0.8 is considered large.
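For reference, the usual two-sample Cohen's d is the difference in group means divided by the pooled standard deviation. The sketch below implements that textbook definition; whether this is the standardization intended for the longitudinal estimates in this problem is an assumption, and the function name `cohens_d` and the example values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d: (mean_a - mean_b) divided by the pooled standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical scores, for illustration only.
d = cohens_d([2.0, 3.0, 4.0], [1.0, 2.0, 3.0])
print(d)
```

Because CDS is already a standardized score, a between-group difference in mean CDS can be read on a roughly comparable scale.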

Finally, summarize in a few sentences the study findings regarding the effects of smoking cannabis on driving abilities.



