代写CS917 Foundations of Computing 2024/2025代做留学生Python程序-留学生作业帮

代写CS917 Foundations of Computing 2024/2025代做留学生Python程序

CS917 Foundations of Computing

– Algorithms Coursework

2024/2025

Submission Details

This assessed piece of coursework must be submitted by Noon 12:00, Monday, 25th November 2024. Submission is done electronically via Tabula, and two files are expected for submission:

1. A PDF of your solutions named Answers.pdf.

2. A file named SuffixTreeSearch.py of your python solution for the parts of Question 7.

The solutions submitted must be your own work. You are expected to work indepen- dently, not in groups.

Marking

1. Please show the full work progress where appropriate, detailing how you have arrived at your answer.

2. We don’t subtract marks, but you earn marks. Marking criteria are correctness of results; quality of analysis and reasoning, which involves backing up your arguments with relevant examples; the relevance, quality and conciseness of your argumentation; organisation and presentation of results (correctly labelled graphs, etc.); quality of code (well structured, annotated and documented, computationally efficient).

3. To achieve very high marks (> 80%), you are expected to clearly exceed expectations of what is considered a correct solution together with an explanation. This could for example involve providing a highly efficient solution together with a proof of complex- ity, demonstrating that you engaged with the scientific literature, or demonstrating exceptional understanding of a problem through a concise discussion or interpretation.

4. Follow the guidelines on using generative AI tools found in the Handbook (https:// warwick.ac.uk/fac/sci/dcs/teaching/handbook/coursework), specifically on how to quote them and keeping a log of your chat protocol.

Questions

Question 1 (10 points)

For each function f(n) and time t in the following table, determine the largest size n (an integer) of a problem that can be solved in time t, assuming that the algorithm to solve the problem takes f(n) microseconds. Where possible, formulate the analytical solution to the problem by solving for n and compute the answer with the help of an electronic calculator (or Python). If you need to solve a problem numerically, you can use Python to assist in working out answers. Explain your work progress, not just write down the answers. (We will assess your work progress based on whether you have shown a clear understanding of the problem and that you have solved this problem independently; simply writing down the answers without explanation will receive no marks).

	1 second	1 minute	1 hour	1 day
nlog2 n
√2n2
n3
√n
n!

Question 2 (10 points)

Where possible, apply the master theorem for the following. If not possible, state the reason why.

(a) T(n) = 7T( n 2 ) + 47n

(b) T(n) = 2nT( n 2 ) + n

(d) T(n) = 4T( 5 3 n ) + 50n 3

(e) T(n) = 8T(n/4) + f(n) and all you know about f(n) is that f(n) = Ω(n)

Question 3 (10 points)

For each of the following input lists, state what you believe to be the most effective sorting algorithm. Use only the algorithms that are covered in the lecture. Justify your reasoning, taking into account complexity, number of operations, memory usage and any other properties of sorted lists. If there are features or attributes of the algorithm that are implementation dependent, state these in your answer.

(a) [1, 2, 4, 3, 5, 6, 7, 10]

(b) [13, 12, 9, 6, 5, 4, 3, 2, 1]

(d) [6, 5, 6, 5, 9, 11, 1, 23, 7]. Two objects with the same values need to preserve their order after sorting.

(e) [(3, 9), (4, 5), (4, 4), (9, 5), (8, 7), (10, 6)] where each tuple (key, value) is sorted by the key. The ordering of the values must be preserved where the keys are equivalent.

Question 4 (5 points)

Apply Dijkstra’s Algorithm to the following graph, computing the shortest path for all vertices from vertex A.

Write a paragraph to explain your work process and present the results by drawing the updated diagram after each vertex has been processed. In each diagram, highlight the vertices that have been processed and the labels of distance at each vertex.

Question 5 (5 points)

Apply Dijkstra’s Algorithm to the following directed graph, and compute the shortest path for all vertices from vertex A.

As in Question 5, write a paragraph to explain your progress and present your results by drawing the updated diagram after each vertex is processed. In each diagram, highlight the vertices that have been processed and the labels of distances at each vertex.

Question 6 (10 points)

For the following graph:

(a) Apply Prim’s Algorithm, beginning at vertex A. Show the results of each stage of the algorithm. In this case, we define a stage as the addition of a new vertex and a new edge to the MST S={V,E}.

Present the results in the format of the following table (note that values provided are only examples and do not correspond to the solution), and write a paragraph to explain your work process.

Stage	V	E
0	(A)	()
1	(A,B)	((A,B))
.	.	.
. .	. .	. .
n	(A,B,C.D,E,F)	((A,B),(B,C),(C,D),(D,E),(E,F))

(b) Apply Kruskal’s Algorithm. Show the results of each stage of the algorithm. In this case, we define a stage as the processing of an edge from the graph, whether or not it is added to the MST S={V,E}.

Present the results in the format of the following table (values provided are only examples and do not correspond to the solution), and write a paragraph to explain your work process.

Stage	Edges	Components	E
0	((A,B), (B,C), (C,D), (D,E), (E,F))	((A), (B), (C), (D), (E), (F))	()
1	((B,C), (C,D), (D,E), (E,F))	((A,B), (C), (D), (E), (F))	((A,B))
. . .	. . .	. . .	. . .
n		((A,B,C,D,E,F))	((A,B), (B,C), (C,D), (D,E), (E,F))

Note the division of the MST Vertices Set into Connected Components.

Question 7 (50 points)

In Lab 2 you have learnt about Word Trees and how you can efficiently search for words, for example in a dictionary. A practical application where Word Trees are used is for efficiently searching whether ˆ(s), a fragment of a genomic sequence in an experimental sample, belongs to a particular organism. Here we are interested in finding whether ˆ(s) can be found in one of three different Coronavirus (SARS-Cov 2 virus) strains (denoted as Wuhan, Omicron, Hiroshima). You are given the gene sequence data, a string s consisting of the letters of the genetic alphabet Γ = {A,T,C, G}, of the famous S-protein (spike protein) of each strain which is an important target of different vaccination strategies.

Precisely, we are asking whether ˆ(s) is a substring of gene s. Using what is called a suffix tree of s, we can perform. a substring test for ˆ(s) in time O(|ˆ(s)|), where | ˆ(s)| is the length of ˆ(s).

You will find different solutions for creating suffix trees in this context, many of which are more advanced, for example, allowing to process online data. However, for this coursework you are meant to stick as closely to the following definition as possible. Specifically, there is no need to implement suffix links or construct an implicit tree, for example.

Definition: Simple Suffix Tree

Let s = s1 , ..., sn be a string. A directed tree G = (V, E) with a root r is called a Simple Suffix Tree for s if it satisfies the following conditions:

(i) The tree has exactly n leaves which are labelled 1, ..., n

(ii) The edges of the tree are labelled with symbols from an alphabet Γ, or $ indicating an “end” character, needed when a suffix is a prefix of another suffix

(iii) All outgoing edges (from an inner vertex to its children) are labelled with pairwise different symbols

(iv) The path from the root to the leaf i is labelled si , ..., sn

[Hint: (ii) becomes clear if you think about the example s = ATGATG

An example of a Simple Suffix Tree for the sequence s = TACTAG is given in Figure 1.

Figure 1: Example of a Suffix Tree

(a) Simple Suffix Tree (25 points)

Write a Python program that creates a Simple Suffix Tree according to the previous Definition, and performs a substring search. Your program should return the start position of the substring, if found, and allow for multiple occurrences of a substring. In Answers.pdf, provide a few examples of your choice that show that your code correctly produces a Suffix Tree. These should be for arbitrary, short sequences, similar to the examples s = TACTAG or s = TACTAC, Demonstrate that you can perform. a search successfully. You can either draw Suffix Trees by hand and take a picture, or use tools such as networkx.

Your program SuffixTreeSearch.py should take three arguments: 1) a string with the filename of the file containing the sequences (for example ”Wuhan-S.txt”), 2) substringˆ(s) (for example ”AACCTTGG”), and 3) a string indicating which type of Suffix Tree to use, for Question 7a) this will be ”Simple Suffix Tree”, in 7c) this will be ”Compact Suffix Tree” . Inputs should be with quotes.

SuffixTreeSearch.py should return either: 0 (substring has not been found)

i (start index of substring ˆ(s) in s, i ∈ 1,..., n), or

multiple start indices, separated by a blank, e.g. ”3 127 3340”, returned without quotes.

You need to adhere to these conventions strictly, otherwise we cannot perform. auto-

mated tests to check the correctness of your algorithm.

(b) Substring tests for different S Proteins (10 points)

Use your Suffix Tree program to search whether the following substrings are contained in the three S proteins, with the genetic sequence provided in the files Hiroshima-S.txt, Omicron-S.txt and Wuhan-S.txt.

ˆ(s)1 = ACTGAAATCTATCAG

ˆ(s)2 = ACTGAAATCTATCAGGCC

ˆ(s)3 = ACTGAAATCTATCAGGCCGGTAGC State the results in Answers.pdf.

Implement a Compact Suffix Tree in SuffixTreeSearch.py, recognising in Figure 1 that consecutive edges could be collapsed into a single one, with the edge providing pointers to the start and end indices of the original string. For example, the branch labelled ”1” which has seven edges could be represented by three edges:

e1 = TA with pointers start=[1,4], end=[2,5], indicating that TA occurs twice in TACTAG, starting at the first and fourth position and ending at the second and fifth position.

e2 = G with pointers start=[6], end=[6]

e3 = CTAG with pointers start=[3], end=[6]

For genomic sequences which can be highly repetitive, for example having hundreds of ”A”s in a row, a Compact Suffix Tree can reduce memory requirements significantly.

In this task you will be given a lot more flexibility. For example, you could argue whether to maintain the $ ”end” character, or decide on different ways to use pointers. Think about borderline cases first, for example, what if s = AAAAAAAAAAAAAAAAAAA?

You will be able to still score high marks in this task, if you provide a good design concept, including considerations on space complexity, even without a full implemen- tation.

课程名

mktg2509 csci 2600 38170 lng302 csse3010 phas3226 77938 arch1162 engn4536/engn6536 acx5903 comp151101 phl245 cse12 comp9312 stat3016/6016 phas0038 comp2140 6qqmb312 xjco3011 rest0005 ematm0051 5qqmn219 lubs5062m eee8155 cege0100 eap033 artd1109 mat246 etc3430 ecmm462 mis102 inft6800 ddes9903 comp6521 comp9517 comp3331/9331 comp4337 comp6008 comp9414 bu.231.790.81 man00150m csb352h math1041 eengm4100 isys1002 08 6057cem mktg3504 mthm036 mtrx1701 mth3241 eeee3086 cmp-7038b cmp-7000a ints4010 econ2151 infs5710 fins5516 fin3309 fins5510 gsoe9340 math2007 math2036 soee5010 mark3088 infs3605 elec9714 comp2271 ma214 comp2211 infs3604 600426 sit254 acct3091 bbt405 msin0116 com107/com113 mark5826 sit120 comp9021 eco2101 eeen40700 cs253 ece3114 ecmm447 chns3000 math377 itd102 comp9444 comp(2041|9044) econ0060 econ7230 mgt001371 ecs-323 cs6250 mgdi60012 mdia2012 comm221001 comm5000 ma1008 engl642 econ241 com333 math367 mis201 nbs-7041x meek16104 econ2003 comm1190 mbas902 comp-1027 dpst1091 comp7315 eppd1033 m06 ee3025 msci231 bb113/bbs1063 fc709 comp3425 comp9417 econ42915 cb9101 math1102e chme0017 fc307 mkt60104 5522usst litr1-uc6201.200 ee1102 cosc2803 math39512 omp9727 int2067/int5051 bsb151 mgt253 fc021 babs2202 mis2002s phya21 18-213 cege0012 mdia1002 math38032 mech5125 07 cisc102 mgx3110 cs240 11175 fin3020s eco3420 ictten622 comp9727 cpt111 de114102d mgm320h5s bafi1019 math21112 efim20036 mn-3503 fins5568 110.807 bcpm000028 info6030 bma0092 bcpm0054 math20212 ce335 cs365 cenv6141 ftec5580 math2010 ec3450 comm1170 ecmt1010 csci-ua.0480-003 econ12-200 ib3960 ectb60h3f cs247—assignment tk3163 ics3u ib3j80 comp20008 comp9334 eppd1063 acct2343 cct109 isys1055/3412 math350-real math2014 eec180 stat141b econ2101 msinm014/msing014/msing014b fit2004 comp643 bu1002 cm2030

联系我们

EMail: 99515681@qq.com

QQ: 99515681

留学生作业帮-留学生的知心伴侣！

工作时间：08:00-21:00

微信客服：codinghelp

热门主题

课程名