Final Project Part 1

Research Proposal and Data Introduction

Due: October 20, 2022 by 11:59PM ET

Goal of the Assessment:

This assessment has two purposes. First, it gives you a head start on your final project by having you find an area of interest to study and real-world data to work with. Second, it asks you to research your area of interest to see what has already been accomplished surrounding your question and to highlight the importance of your proposal.

The steps involved in completing this assignment encompass the general process of proposing a research question and will form the basis for a solid introduction section for your final project report (Part 3). Completing this assignment will also give you the chance to think about the appropriateness of linear regression as a tool for answering your proposed research question using your chosen data. Lastly, this assignment provides an opportunity to get some feedback on your writing and research question that can be used to improve your final report.

Instructions:

1.   Decide on one (or a few possible) areas of interest that you may want to explore. These areas of interest can be anything that matters or is of interest to you. Some examples could be (but are certainly not limited to) sports, medicine, public health, economics, video games, literature, etc. Pick something that you really care about.

2.   Next, think about possible research questions you may want to study in these areas. What do you want to know about this area? You want to make sure that your question can be answered/studied using linear regression models. So, you’ll want to frame your question as something related to modelling a relationship or predicting a value based on this relationship. You’ll also want to consider whether the variable of interest would allow the assumptions of linear regression to hold (see Module 3 content). See the workshop slides from September 23 for advice on framing your research question effectively.

3.   After producing a research question, you will need to find some open-source data that you may use in your data analysis. You want to make sure that the data you find has both: 1) your response variable of interest (or variables that could be used to create that variable), and 2) any other variables you may want to use as predictors. By looking for data online, you may realize you need to modify your research question slightly or pick another one if you can’t quite find the data you’re looking for. Alternatively, you can stick with your research question, but be sure to mention that you expect there to be many limitations to the dataset because it doesn’t quite meet your needs. Step 4 can also help you decide what predictors might be needed for you to answer your question.

Examples of open data sources:

o https://open.toronto.ca for open-data from Toronto

o https://data.ontario.ca for open-data from Ontario

o https://www150.statcan.gc.ca/n1/en/type/data?MM=1 for data collected by Statistics Canada

o https://sports-statistics.com/sports-data/ for various sports-related datasets

o https://data.oecd.org for data on various country-level variables

o https://mdl.library.utoronto.ca for links to many other data portals through the University of Toronto library

4.   Once you’ve found your dataset and have decided on your research question (or you can work on steps 2-4 simultaneously and use what you find in all of them to finalize your research question), you need to look at what others have studied in relation to your research question. Do a quick search on the University of Toronto library website or other databases that feature scholarly articles (see workshop slides from Sept. 23) to learn about anything related to your area of interest and research question. Look for academic papers (i.e., peer-reviewed work that has been published in reputable scholarly journals, not websites, blogs, or news articles) that studied the same research question or something related and that tell you more about what you may need to consider in your analysis. Also use the academic papers to justify why your research question is important.

o Focus on giving your reader a rough idea of how many academic papers have studied your research topic (or closely related concepts to your topic). This process of looking at the number of academic papers which describe a specific topic tells your audience how popular the area of research is and how much research has been done.

o Give examples from a few important papers about what was found or discovered to be important in relation to your question. This can be important variables, important results, surprising results, etc. The process of identifying and describing important papers tells your audience that you are aware of prior results and that you will be using these to plan your analysis.

o Think about how your research question fits into the general area of research about your topic. Is your research question different from the research questions in other studies? If so, how? A novel research question is one that nobody has studied before, or that has not been studied in the way you are approaching it or in the population you hope to examine. The process of examining whether your research question is novel tells your audience that you see the importance of what you are researching and can frame it against what has already been done.

Library resources:

o https://guides.library.utoronto.ca/librarysearchtips/gettingstarted for more details about searching for articles related to your question

o https://guides.library.utoronto.ca/citing for details about why and how to cite your references

o https://guides.library.utoronto.ca/c.php?g=251103&p=1673071 for help getting the correct citation format

5.   Lastly, perform a short exploratory data analysis of your chosen dataset. You will want to focus on identifying anything that you may need to consider moving forward. This includes identifying:

a.   skews,

b.   statistical outliers,

c.   variables with high spread or observations that don’t make sense,

d.   missing data

For step 5, you want to make sure you specifically mention the presence (or absence) of any of the characteristics in 5a-d and what this means for the analysis you will eventually perform. For example, this may include describing how any of the characteristics in 5a-d might cause problems (or not) with the results of linear regression or with generalizability. You will need to present numerical and/or graphical summaries describing the variables. Choose the options that highlight the features of the data that you want to point out but that will also let your reader clearly understand the data that you will be working with.
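As a rough illustration of the checks in 5a-d, the sketch below runs them in base R on a small made-up data frame. The variable names (income, age) and all values are hypothetical, and the 1.5 x IQR boxplot rule is only one common convention for flagging outliers, not one this assignment prescribes.

```r
# Hypothetical toy data standing in for a real open dataset.
dat <- data.frame(
  income = c(32, 35, 38, 41, 44, 47, 52, 58, 250, NA),
  age    = c(23, 31, 35, 38, 42, 47, 51, 56, 60, 64)
)

# 5d: count missing values in each variable.
n_missing <- colSums(is.na(dat))

# 5a: a mean well above the median suggests right skew.
income_mean   <- mean(dat$income, na.rm = TRUE)
income_median <- median(dat$income, na.rm = TRUE)

# 5b: flag values more than 1.5 * IQR beyond the quartiles (boxplot rule).
q <- quantile(dat$income, c(0.25, 0.75), na.rm = TRUE)
fence <- 1.5 * (q[2] - q[1])
is_outlier <- !is.na(dat$income) &
  (dat$income < q[1] - fence | dat$income > q[2] + fence)
outliers <- dat$income[is_outlier]

# 5c: numerical summary (min, quartiles, max) to spot implausible values.
summary(dat)
```

Base R's hist() and boxplot() give graphical counterparts to these numerical summaries. Code like this belongs in the R Markdown file rather than in the proposal itself.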

Guidelines for Picking a Dataset

o Government data portals often contain many datasets about diverse topics. If one dataset doesn’t have all the variables you might want to consider, feel free to combine different datasets.

o Just make sure that the unit being measured is the same in both datasets (i.e., it’s reasonable to treat both sets of measurements as being taken on the same units).

o There are many data repositories online. If you find a dataset there that is of interest to you, you MUST ensure that your question is different from the one the dataset was originally used to answer.

o YOU MAY NOT use any dataset that is part of any R package or library, or that is contained in a textbook. If you’re not sure, please ask the instructor or one of the TAs.

o You will need to make sure you have enough variables to be able to showcase the statistical methods that you will learn later in the course. This is because your final project replaces a final exam, so the teaching team needs to assess your knowledge of all topics covered in the course. Topics the teaching team will require include model validation and model refinement, so please ensure your dataset has at least 5 predictor variables.

o You will also need to make sure you have enough observations to be able to validate your model, which will involve splitting your dataset into two roughly equal parts.

o A good rule of thumb for a minimum number is to have about 10 observations per variable in each half of your dataset (e.g., 6 predictors x 10 observations/variable x 2 halves of the dataset = 120 observations in total).
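Two of the guidelines above, combining datasets measured on the same unit and checking that there are enough observations to split the data in half, can be sketched in base R as follows. The data frames, the column names, the helper min_observations() and the seed are all hypothetical and chosen only for illustration. The course may introduce its own validation procedure later.

```r
# Hypothetical datasets, both measured on the same unit (neighbourhoods).
crime  <- data.frame(neighbourhood = c("A", "B", "C"),
                     incidents = c(12, 30, 7))
income <- data.frame(neighbourhood = c("A", "B", "D"),
                     median_income = c(61, 48, 75))

# merge() defaults to an inner join: only units found in both are kept.
combined <- merge(crime, income, by = "neighbourhood")

# Units dropped by the join; worth reviewing before any analysis.
all_units <- union(crime$neighbourhood, income$neighbourhood)
unmatched <- setdiff(all_units, combined$neighbourhood)

# Rule of thumb: about 10 observations per predictor in each half.
min_observations <- function(n_predictors, per_variable = 10, n_parts = 2) {
  n_predictors * per_variable * n_parts
}
min_observations(6)  # 6 x 10 x 2 = 120, as in the example above

# Random split of n rows into two equal halves for model validation.
set.seed(1)  # arbitrary seed so the split is reproducible
n <- 120
train_rows <- sample(seq_len(n), size = floor(n / 2))
test_rows  <- setdiff(seq_len(n), train_rows)
```

Here the join silently drops neighbourhoods C and D because each appears in only one dataset. Checking unmatched units like these shows how many observations are lost when datasets are combined, which matters for the minimum-size rule.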

Proposal Content Requirements:

Your proposal should be created to satisfy the following requirements:

o The proposal should be organized clearly (consider using headings or sections) and include the following information:

a.   Your research question, why you chose it (i.e., why it’s of interest to you), and why it may be of interest to others.

b.   Summaries of academic papers related to your question or topic, highlighting similarities/differences to what you propose, and how you will incorporate this knowledge into your model/project.

c.   Details and summaries on your chosen dataset including the variables collected, the number of observations and anything that stands out in the data that would need to be addressed/investigated further in your analysis.

d.   A discussion of how and why a linear model fits your chosen data and will allow you to answer your proposed research question, as well as whether, based on your EDA, you anticipate any problems that may arise in your analysis.

e.   References for the source of your data and for your background research on your topic.

o The proposal should be written/presented for an audience that has some statistics background but is not necessarily familiar with the area of your research question or linear regression models,

o The proposal should contain figures and/or tables with proper labels/titles as appropriate in your Data Description - Exploratory Data Analysis section,

o The proposal should have references listed in proper APA format, and

o The proposal itself should not contain R code.

Technical Requirements:

Your submission to Quercus should include the following:

1.   A video that presents your proposed research area and question, the dataset you have chosen, and the exploration of your dataset.

o The video should be no more than 5 minutes in length

o You must display your U of T Student ID card (or a valid government-issued photo ID) at the beginning of your video.

o The presenter’s face must be visible throughout the video.

o The presentation should include an appropriate visual medium (e.g., slides) to display important information in an easily readable way.

o The video should be hosted on a video-sharing service (e.g., MS Stream or MyMedia, both of which are supported by the university).

2.   The proposed dataset you will use in your Final Project, as a csv or xlsx file, or if too large, as a link to cloud storage where the dataset is saved in csv or xlsx.

3.   A copy of the slides/visual aids used in your presentation saved as a PDF document.

4.   The R Markdown file containing the code used to produce your exploratory data analysis and tables/figures.

How to Upload:

o Link to Video Presentation – add as a comment to your submission

o Instructions for uploading to MS Stream: https://learn.microsoft.com/en-us/stream/portal-upload-video

o Instructions for uploading to MyMedia: https://ito-engineering.screenstepslive.com/s/ito_fase/a/1291600-how-do-i-upload-a-video-or-audio-file-to-mymedia

o Both require you to log in with your UofT credentials.

o R Markdown File – as a file upload on Quercus

o Slides used in Presentation – as a file upload on Quercus

o Chosen Dataset – either as a file upload OR as a comment to your submission (best option if the file is large)



