LING 226 Assignment 1, 2023

Short Program and Written Reflection 1 (25% of total grade)

The goal of this assignment is to develop a program that reads in text data, performs some preprocessing on the data, and then compares the effects of different preprocessing on various text metrics. You should construct a program with these functions:

•    a function to preprocess text data which can:

◦  remove punctuation

◦  remove stopwords

◦  lowercase all words

◦  remove words below/above a certain frequency

•    a function (or functions) to calculate these text metrics:

◦  total number of words

◦  overall lexical diversity of the text

◦  average lexical diversity of text sentences

◦  top ten most frequent words
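As a rough illustration of how the two groups of functions above might fit together, here is a minimal sketch using only the Python standard library. It is not the course notebooks' implementation. The names `preprocess`, `text_metrics`, `lexical_diversity`, and the tiny `STOPWORDS` set are assumptions made for this example, and lexical diversity is taken here to mean unique words divided by total words. Input is assumed to be a list of already tokenized sentences.

```python
import string
from collections import Counter

# Assumption: a tiny illustrative stopword set. A fuller list (for example
# from the course notebooks) would be used in a real experiment.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "on"}


def preprocess(sentences, remove_punct=False, remove_stop=False,
               lowercase=False, min_freq=None, max_freq=None):
    """Apply the selected preprocessing steps to a list of tokenized sentences."""
    if lowercase:
        sentences = [[w.lower() for w in sent] for sent in sentences]
    if remove_punct:
        # Drop tokens that consist only of punctuation characters.
        sentences = [[w for w in sent
                      if not all(c in string.punctuation for c in w)]
                     for sent in sentences]
    if remove_stop:
        sentences = [[w for w in sent if w.lower() not in STOPWORDS]
                     for sent in sentences]
    if min_freq is not None or max_freq is not None:
        # Frequencies are counted over the whole text, after the steps above.
        counts = Counter(w for sent in sentences for w in sent)
        low = min_freq if min_freq is not None else 0
        high = max_freq if max_freq is not None else float("inf")
        sentences = [[w for w in sent if low <= counts[w] <= high]
                     for sent in sentences]
    return sentences


def lexical_diversity(words):
    """Unique words divided by total words (0.0 for an empty list)."""
    return len(set(words)) / len(words) if words else 0.0


def text_metrics(sentences):
    """Compute the four metrics listed in the assignment for one text."""
    words = [w for sent in sentences for w in sent]
    non_empty = [sent for sent in sentences if sent]
    avg_sentence_diversity = (
        sum(lexical_diversity(sent) for sent in non_empty) / len(non_empty)
        if non_empty else 0.0
    )
    return {
        "total_words": len(words),
        "lexical_diversity": lexical_diversity(words),
        "avg_sentence_diversity": avg_sentence_diversity,
        "top_ten": Counter(words).most_common(10),
    }


# Hypothetical toy input: two tokenized sentences.
sample = [["The", "cat", "sat", "on", "the", "mat", "."],
          ["The", "dog", "sat", "too", "."]]
raw = text_metrics(sample)
clean = text_metrics(preprocess(sample, remove_punct=True,
                                remove_stop=True, lowercase=True))
print(raw["total_words"], clean["total_words"])
```

Keeping each preprocessing step behind its own keyword argument makes it easy to switch steps on and off one at a time, which is what the experiment described below requires.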

The course notebooks have everything you need to create these functions. You can reuse any and all of the functions in the course notebooks to create your program. After creating these functions, you need to conduct an experiment. The goal of your experiment is to compare the effects of preprocessing on the different text metrics. To do so, you need to use data from at least two sources:

1.   One of the built-in NLTK corpora resources (e.g., Brown, State of the Union)

2.   Data from The Current (data from at least two questions)

Using this data, look for trends and consistent effects that preprocessing has on various text metrics. Also look to see if there are any texts more or less immune to the effects of preprocessing. After conducting your experiment, write a short report (500-600 words) reflecting on your results. You should detail the comparisons and analyses that you conducted, what results you found, and your interpretation of the results. Specifically, you should focus on what happens to these metrics under different preprocessing conditions, and focus on making conclusions about their implications for text analysis in general.
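One possible way to organise the comparison is sketched below: apply every preprocessing condition to every text, then express each result as a change relative to the unprocessed baseline, so that texts that are more or less immune to a condition stand out. The helper names `compare_conditions` and `relative_change`, the condition labels, and the toy word lists are all hypothetical. In the actual assignment the word lists would come from an NLTK corpus and from The Current rather than being typed in by hand.

```python
def type_token_ratio(words):
    """Unique words divided by total words (0.0 for an empty list)."""
    return len(set(words)) / len(words) if words else 0.0


def compare_conditions(texts, conditions, metric):
    """Return {text_name: {condition_name: metric value}}."""
    results = {}
    for text_name, words in texts.items():
        results[text_name] = {
            cond_name: metric(transform(words))
            for cond_name, transform in conditions.items()
        }
    return results


def relative_change(results, baseline="raw"):
    """Express each condition as a fractional change from the baseline."""
    changes = {}
    for text_name, by_condition in results.items():
        base = by_condition[baseline]
        changes[text_name] = {
            cond_name: (value - base) / base if base else 0.0
            for cond_name, value in by_condition.items()
            if cond_name != baseline
        }
    return changes


# Hypothetical toy data standing in for real corpus samples.
texts = {
    "text_a": ["The", "the", "THE", "cat", "Cat", "sat"],
    "text_b": ["dog", "runs", "fast", "today"],
}
# Each condition maps a list of words to a preprocessed list of words.
conditions = {
    "raw": lambda words: words,
    "lowercase": lambda words: [w.lower() for w in words],
}

results = compare_conditions(texts, conditions, type_token_ratio)
changes = relative_change(results)
for text_name, by_condition in changes.items():
    print(text_name, by_condition)
```

In this toy run, lowercasing halves the diversity of `text_a` and leaves `text_b` unchanged, which is the kind of contrast the report is asked to discuss.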

You should submit your assignment as a .ipynb notebook file in Canvas by the due date. Your notebook should have a text cell at the start that includes your name, your student ID, and whether you are attempting to complete the challenge (see below). The notebook should include all of the code cells, plus your written report as text cells. You are free to mix code and text cells as you deem appropriate.

Marking Guidelines

A-level papers will run a number of comparisons and report the differences between text categories in a clear and descriptive manner. The written reflection will be equally descriptive and include insightful reflections and deductions on how preprocessing affects these text metrics. These reflections and deductions will be clearly connected to the data and results from the student’s analysis. All of the code cells will work properly. The paper includes a successful attempt at the challenge.

B-level papers will run the comparisons and note the differences between text categories. The written report will be partially descriptive but also include reflections and deductions on how preprocessing affects these text metrics. There are some connections made to the results of the student’s analysis. All of the code cells will work properly. A challenge is attempted with limited success.

C-level papers will run few comparisons and make note of important differences between text categories. The written reflection will be mostly descriptive. All of the code cells will work properly. No challenge is attempted.

D-level papers will run one or few comparisons between texts, make note of some differences between the texts, and include a written reflection which is too short and too descriptive. Some of the code cells may not work properly.

A-level Challenge

A-level papers need to go above and beyond the rest. Students need a way to play to their strengths. The challenge provides that opportunity. Students can either flex their computer science skills, showcase their critical thinking abilities and/or domain knowledge outside of computer science, or some combination of both. In either case, you should be driven by a desire to have your assignment used as an exemplar for next year’s cohort of students.

I want to leverage my computer science skills:

Go nuts with your program, but in a way that stays within the confines of the assignment prompt. You might want to develop new text metrics or improve upon the ones used in the notebook. You might find a way to efficiently compare data from multiple sources, combining the results computationally as a way to more empirically demonstrate the effects of preprocessing on text. You would still write a report which meets the A-level criteria.

I want to leverage my non-computer science knowledge:

Write a report which blows my mind in its ability to make connections between your results and the assignment prompt, but also goes a step further to consider other contexts and domains. You may want to draw from your domain knowledge in languages, linguistics, or other content areas to discuss what might happen in other languages or domains beyond the data used in this assignment. You might even go out and find some additional research or papers on the topic and integrate them into your assignment.



