
LING 226 Assignment 2, 2023

Short Program and Written Reflection 2 (25% of total grade)

The goal of this assignment is to ask and answer a question about the linguistic profiles of different texts. The question being asked should be motivated by your understanding of and possible assumptions about the texts. These motivations and assumptions do not necessarily have to be correct, but they should be logical/reasonable/justifiable. For example, we could assume that children’s books are written with less sophisticated vocabulary when compared to adult books – this is a reasonable assumption. We might also expect a horror novel to include more negative language when compared to a romance novel. In both cases, we can test these assumptions by constructing a linguistic profile of each text and comparing the profiles.

For the child/adult book comparison, we may want to create a lexical profile of frequency, diversity, concreteness, and potentially other measures of complexity. For the horror and romance comparison, we may want to construct a profile based on emotional vocabulary, sentiment, and other measures related to affect. In both cases, more than one measure would be employed to compare the texts (hence creating a linguistic profile). If more than one text is used for each category, we would calculate an average score for each category, and compare the results. Your job is to generate a research question which you answer by calculating and comparing the linguistic profiles of different categories of texts. Specifically, please generate and test one question related to:

1.  A small corpus of your own design which contains at least two categories of texts

A good minimum for your corpus will be ten texts per category, with two categories, with approximately equal total words per category.

You are free to use texts from anywhere, including data provided by Stephen, questions from The Current, or elsewhere.
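The per-category averaging and the word-count balance described above can be sketched as follows. This is only a minimal illustration, not the course notebooks' code: the `corpus` dictionary, its placeholder sentences, and the helper names `tokenize`, `type_token_ratio`, and `category_profile` are all assumptions made for this example, and type-token ratio simply stands in for whichever measure you choose.

```python
import re
from statistics import mean

def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def type_token_ratio(tokens):
    """Lexical diversity: unique words divided by total words."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def category_profile(texts):
    """Average one measure over all texts in a category and report size."""
    token_lists = [tokenize(t) for t in texts]
    return {
        "n_texts": len(token_lists),
        "total_words": sum(len(toks) for toks in token_lists),
        "mean_ttr": mean(type_token_ratio(toks) for toks in token_lists),
    }

# Hypothetical two-category corpus (placeholder sentences, not real data).
corpus = {
    "children": ["The cat sat on the mat.", "The dog saw the cat."],
    "adult": ["Markets responded cautiously to the announcement.",
              "The committee deferred its decision indefinitely."],
}

profiles = {name: category_profile(texts) for name, texts in corpus.items()}
for name, profile in profiles.items():
    print(name, profile)
```

Comparing `total_words` across categories is a quick way to check that the two halves of the corpus are roughly the same size before comparing their averages.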

In addition to developing and comparing the linguistic profiles of your texts, you need to further complicate your analysis by examining the nature of your results when accounting for at least one distributional or syntactic property of the texts. This means information such as part of speech, collocations, or word vs. phrase-level measures (ngrams) should be incorporated into your research questions. For example, instead of just comparing the lexical sophistication between children’s and adult books for all words in a text, you would instead run separate analyses for nouns and verbs. Or, you might be interested in profiling the nature of collocations and/or ngrams for different measures. The choice is yours and might further interact with your research question (e.g., you might find that sentiment ratings of adjectives pattern with specific nouns that follow).
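One way to run separate analyses for nouns and verbs is sketched below. It assumes the tokens have already been tagged with Penn Treebank-style tags by whatever tagger you use; the function names `mean_rating_by_pos` and `bigrams` are hypothetical, and the ratings in `TOY_RATINGS` are made-up placeholders rather than values from any real lexicon.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical concreteness ratings on a 1-5 scale. These numbers are
# placeholders for illustration only; substitute a real lexicon.
TOY_RATINGS = {"dog": 4.8, "idea": 1.6, "run": 4.0, "think": 2.0}

def mean_rating_by_pos(tagged_tokens, ratings):
    """Average a lexicon rating separately for nouns and for verbs.

    tagged_tokens: (word, tag) pairs with Penn Treebank-style tags,
    where noun tags start with "NN" and verb tags start with "VB".
    Words missing from the lexicon are skipped.
    """
    scores = defaultdict(list)
    for word, tag in tagged_tokens:
        rating = ratings.get(word.lower())
        if rating is None:
            continue
        if tag.startswith("NN"):
            scores["noun"].append(rating)
        elif tag.startswith("VB"):
            scores["verb"].append(rating)
    return {pos: mean(values) for pos, values in scores.items()}

def bigrams(tokens):
    """Return adjacent word pairs, the simplest ngram measure."""
    return list(zip(tokens, tokens[1:]))

# Hypothetical pre-tagged tokens for a single text.
tagged = [("Dog", "NN"), ("ideas", "NNS"), ("run", "VBP"),
          ("idea", "NN"), ("think", "VB"), ("quickly", "RB")]
result = mean_rating_by_pos(tagged, TOY_RATINGS)
print(result)
print(bigrams(["the", "old", "dog"]))
```

Note that "ideas" is skipped here because only the singular form is in the toy lexicon; whether to lemmatise before lookup is one of the preprocessing decisions worth explaining in the report.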

Comparison of lexicon and distributional/syntactic information

Lexicon Information | Distributional and Syntactic Information
Sentiment & Emotion Ratings | Bigrams, ngrams
Age of Acquisition | Part of speech
Concreteness | Collocates

Your Python Code

You should create code cells and functions which:

•    Load in and preprocess your text(s)

◦  (The specific choices you make for preprocessing should be appropriate for your analysis)

•    Analyse your data for various lexical and syntactic features

•    Output your analysis either into the notebook, or written to file in a spreadsheet/text document

•    Your data should be made available either by reading it in through URL or provided with your submission
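The steps listed above (load, preprocess, analyse, output) might fit together roughly as in the sketch below. It is not the course notebooks' code: the function names `preprocess`, `analyse`, and `write_report`, the chosen features, and the in-memory `texts` list are all assumptions for illustration, and the output goes to an in-memory buffer so the example runs without touching the file system.

```python
import csv
import io
import re
from collections import Counter

def preprocess(text):
    """Lowercase and keep alphabetic tokens; adapt to your analysis."""
    return re.findall(r"[a-z]+", text.lower())

def analyse(tokens):
    """Compute a few simple lexical features for one text."""
    counts = Counter(tokens)
    return {
        "tokens": len(tokens),
        "types": len(counts),
        "mean_word_length": sum(map(len, tokens)) / len(tokens) if tokens else 0.0,
    }

def write_report(rows, handle):
    """Write one row per text as CSV to an open file-like object."""
    fieldnames = ["text_id", "category", "tokens", "types", "mean_word_length"]
    writer = csv.DictWriter(handle, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)

# Hypothetical in-memory texts; in practice read from files or a URL.
texts = [
    ("t1", "children", "The cat sat. The cat ran."),
    ("t2", "adult", "Inflation expectations remained stable."),
]

rows = [{"text_id": tid, "category": cat, **analyse(preprocess(body))}
        for tid, cat, body in texts]

buffer = io.StringIO()  # swap for open("profile.csv", "w", newline="")
write_report(rows, buffer)
print(buffer.getvalue())
```

Keeping preprocessing, analysis, and output in separate functions makes it easier to change one preprocessing choice and rerun the whole profile.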

Just as before, the course notebooks have everything you need to create these functions. You can reuse any and all of the functions in the course notebooks to create your program. You will however likely need to make modifications in order to adapt these functions / code cells to your particular analysis.

Your Written Reflection

In your notebook, you should prepare a report describing your analysis, which should include these sections:

•   Research Questions

◦  Clear statements of your research questions

◦  The rationale behind your research questions

◦  Predictions of what you will find

•   Data

◦  Explanation of the data and what categories it represents

◦  How you gathered the data for your own corpus

•   Analysis

◦  Description of the linguistic profiles

◦  Decisions about preprocessing and other preparations of the texts

◦  Which lexical features were included, and why?

•   Results & Discussion

◦  Interpretation and discussion of your results (i.e., answering your research questions)

■   Did your expectations hold? Why or why not?

◦  Any remaining questions / limitations based on what happened during your analysis

Your report should be about 500-600 words long (up to 900 if attempting the challenge). You should submit your assignment as a .ipynb notebook file in Nuku/Canvas by the due date. You should also provide your corpus data (or load the data in via URL). Your notebook should have a text cell at the start which includes your name, your student ID, and whether you are attempting to complete the challenge (see below). The notebook should include all of the code cells, plus your written report as text cells. You are free to mix code and text cells as you deem appropriate.

Marking Guidelines

A-level papers will contain two clearly articulated research questions and the rationale behind the questions. There are clearly stated decisions behind why certain linguistic features are chosen to construct the linguistic profiles, as well as the choice(s) behind distributional and syntactic measures. The presentation and interpretation of the results are used to provide direct answers to the research questions. The report also reflects on any limitations or remaining questions. A link and/or data files are provided for the corpus. All of the code cells will work properly. The paper includes a successful attempt at the challenge.

B-level papers will contain two research questions and provide some rationale for the questions. The decisions behind why certain linguistic features were chosen may be unclear, as are the reasons for choosing particular distributional and syntactic measures. Answers to the research questions are provided. A link and/or data files are provided for the corpus. All of the code cells will work properly. A challenge may be attempted, but with limited success.

C-level papers will contain unclear research questions and/or research questions with unclear motivation or justification. The linguistic profiles may be under-explained and/or weakly connected to the research questions. Data presentation and interpretation will lack detail and/or not clearly connect to the research questions. Some attempt is made to answer the research questions. A link and/or data files are provided for the corpus. All of the code cells will work properly. No challenge is attempted.

D-level papers will have unclear research questions and/or poorly motivated linguistic profiles. Attempts are made to answer the research questions, which may be only partially successful. Data may not be included, and some code cells may not work. No challenge is attempted.

A-level Challenge

A-level papers need to go above and beyond the rest. Students need a way to play to their strengths. The challenge provides that opportunity. Students can either flex their computer science skills, showcase their critical thinking abilities and/or domain knowledge outside of computer science, or some combination of both. In either case, you should be driven by a desire to have your assignment used as an exemplar for next year’s cohort of students.

I want to leverage my computer science skills:

Just as in assignment 1, expand the boundaries of your program beyond the level of the code provided in the notebooks in order to provide more sophisticated presentation and analyses of your data. You might want to explore different visualisations in Python, and/or play around with some classification functions (e.g., some machine learning or statistical comparisons). You would still write a report which meets the criteria of A.
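As one possible example of a statistical comparison, the sketch below runs a two-sided permutation test on the difference in mean scores between two categories. The brief does not prescribe any particular test; the permutation test is a choice made for this illustration, the function name `permutation_test` is hypothetical, and the per-text scores are placeholder numbers, not real results.

```python
import random
from statistics import mean

def permutation_test(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns the observed difference and an approximate p-value: the
    share of random relabellings whose absolute mean difference is at
    least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one smoothing keeps the estimate away from an impossible p = 0.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value

# Hypothetical per-text scores for two categories (placeholder numbers).
children_scores = [0.41, 0.44, 0.39, 0.42, 0.40]
adult_scores = [0.55, 0.58, 0.53, 0.57, 0.56]

observed, p_value = permutation_test(children_scores, adult_scores)
print(f"difference = {observed:.3f}, p = {p_value:.4f}")
```

A permutation test makes few distributional assumptions, which suits a small corpus of around ten texts per category, but with so few texts the p-value should still be interpreted cautiously.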

I want to leverage my non-computer science knowledge:

Include more justification and external research for your research questions and analysis. You might conduct your analysis as a replication of some published research, or instead continue a research direction already established in prior literature. Your interpretation of your results is connected not only to your data, but also to the results of previous studies.




