Computer Science CSC263H
September 25, 2024
Homework Assignment #3
Due: October 9, 2024, by 11:00 am
• You must submit your assignment through the Crowdmark system. You will receive by email an invitation through which you can submit your work. If you haven’t used Crowdmark before, give yourself plenty of time to figure it out!
• You must submit a separate PDF document with for each question of the assignment.
• To work with one or two partners, you and your partner(s) must form a group on Crowdmark (one submission only per group). We allow groups of up to three students. Submissions by groups of more than three students will not be graded.
• The PDF file that you submit for each question must be typeset (not handwritten) and clearly legible. To this end, we encourage you to learn and use the LATEX typesetting system, which is designed to produce high-quality documents that contain mathematical notation. You can use other typesetting systems if you prefer, but handwritten documents are not accepted.
• If this assignment is submitted by a group of two or three students, for each assignment question the PDF file that you submit should contain:
1. The name(s) of the student(s) who wrote the solution to this question, and
2. The name(s) of the student(s) who read this solution to verify its clarity and correctness.
• By virtue of submitting this assignment you (and your partners, if you have any) acknowledge that you are aware of the homework collaboration policy for this course, as stated here .
• For any question, you may use data structures and algorithms previously described in class, or in prerequisites of this course, without describing them. You may also use any result that we covered in class (in lectures or tutorials) by referring to it.
• Unless we explicitly state otherwise, you should justify your answers. Your paper will be marked based on the correctness and efficiency of your answers, and the clarity, precision, and conciseness of your presentation.
• The total length of your pdf submission should be no more than 5 pages long in a 11pt font.
|
Question 1. (20 marks) A real estate agency maintains information about a set of houses for sale. For each house it maintains a record (p,s), where p is the price and s is the floor area of the house. Each record of this type is called a listing. You must design a data structure D of listings that supports the following operations:
• Insert(D, x): If x is a pointer to a new listing, insert the listing pointed to by x into D.
• Delete(D, x): If x is a pointer to a listing in D, remove that listing from D.
• MaxArea(D, p): Return the maximum floor area of listings whose price is at most p. If no listing has price at most p, then return −1.
The worst-case run time of each of the above operation should be O(log n) where n is the current number of listings in the data structure D.
In this question, you must describe how to implement the above using an augmented AVL tree T. Since this implementation is based on a data structure and algorithms described in class and in the AVL handout, your should focus on describing the extensions and modifications needed here.
a. Give a precise and full description of your data structure. In particular, specify:
• what are the fields contained in each node of your AVL tree,
• what is the key field, and
• what is (are) the augmented field(s), and what identity should this (these) field(s) satisfy at each node. Illustrate this data structure by giving an example of it (with a small set of your own choice).
b. Describe the algorithm that implements each one of the three operations above, and explain why each one takes O(log n) time in the worst-case. For Insert(D, x) and Delete(D, x), your algorithm descriptions should be in clear and concise English. For MaxArea(D, p), you should give the pseudo-code.
Question 2. (20 marks) A Scheduler S consists of a set of threads; each thread is a tuple t = (id,status) where id is a distinct positive integer and status ∈ {A,R, S}; intuitively, A means active, R means ready to be scheduled, and S means stalled. The operations that Scheduler S supports are:
NewThread(t): Given a thread t = (id,status), add thread t to S. You can assume that the id of t is
different from the id of any thread currently in S (so that all the threads in S have distinct ids). Find(i): If S has a thread t = (i,−) then return t, else return −1.
Completed(i): If S has a thread t = (i,−) then remove t from S, else return −1.
ChangeStatus(i,stat): If S has a thread t = (i,−) then set the status of t to stat, else return −1.
ScheduleNext: Find the thread t with smallest id among all the threads whose status is R in S, set
the status of t to A and return t; if S does not have a thread whose status is R then return −1.
Describe how to implement the Scheduler S described above using an augmented AVL tree such that each operation takes O(log n) time in the worst-case, where n is the number of threads in S. Since this implementation is based on a data structure and algorithms described in class and in the AVL handout, you should focus on describing the extensions and modifications needed here.
a. Give a precise and full description of your data structure. In particular, specify:
• what are the fields contained in each node of your AVL tree,
• what is the key field, and
• what is (are) the augmented field(s), and what identity should this (these) field(s) satisfy at each node.
b. Describe the algorithm that implements each one of the five operations above, and explain why each one takes O(log n) time in the worst-case. Your description and explanation should be in clear and concise English. For the operations ChangeStatus(i,stat) and ScheduleNext, you should give the algorithm’s high-level pseudocode.
Question 3. (20 marks) You are given a list L of n, not necessarily distinct, integers. Your task is to devise an algorithm that outputs a list of the distinct integers in L, in order of non-increasing frequency (the frequency of an element in L is the number of times it occurs in L). For example, if L is 2, 5, 4, 4, 2, 3, 4, 3, then a suitable output is 4, 3, 2, 5; the only other suitable output for this list is 4 , 2, 3, 5.
The expected running time of your algorithm should be O(n), under some assumptions discussed in class. Note: Do not assume that the integers in L are small, or they have a small range: they could be arbitrary. So a solution based on a direct access table or on linear time sorting is not acceptable.
a. Describe your algorithm in clear and precise English.
b. Explain why the expected running time of your algorithm is O(n). Explicitly state every assumption that you need for your complexity analysis.
c. What is the worst-case running time of your algorithm? Use the Θ notation and justify your answer.
Hint: Use hashing with chaining, and find away to sort n numbers in the range 0 to nin O(n) worst-case time.
[The questions below will not be corrected/graded. They are given here as interesting problems that use material that you learned in class.]
Question 4. (0 marks)
In this question, you must use the insertion and deletion algorithms as described in the “Balanced Search Trees: AVL trees” course note posted on Quercus.
a. Insert keys 17, 7, 8, 14, 19, 6, 10, 21, 15, 12, 9, 11 (in this order) into an initially empty AVL tree, and show the resulting AVL tree T, including the balance factor of each node.
b. Delete key 17 from the above AVL tree T, and show the resulting AVL tree, including the balance factor of each node.
In each of the above questions, only the final tree should be shown.
Question 5. (0 marks) Consider an abstract data type that consists of a set S of integers on which the following operations can be performed:
Add(i) : Adds the integer i to S. If this integer already is in S, then S does not change.
Average(t) : Returns the average of all elements of S that are smaller than or equal to the integer t. If all the elements of S are larger than t, then return 0.
Describe how to implement this abstract data type using an augmented AVL tree T. Each operation should run in O(log n) worst-case time, where n = |S| . Since this implementation is based on a data structure and algorithms described in class and in the AVL handout, you should focus on describing the extensions and modifications needed here.
a. Give a precise and full description of your data structure. In particular, specify:
• what are the fields contained in each node of your AVL tree,
• what is the key field, and
• what is (are) the augmented field(s), and what identity should this (these) fields satisfy at each node. Illustrate this data structure by giving an example of it (with a small set of your own choice).
b. Describe the algorithm that implements each one of the two operations above in clear and concise English. For the operation Average, you should also give the algorithm’s high-level pseudocode. Explain why each operation takes O(log n) time in the worst-case.
Question 6. (0 marks) Give an algorithm for the following problem. The input to the algorithm are two unsorted sequences X = {x1 , x2 ,..., xn} and Y = {y1 , y2 ,..., yn} of n distinct positive integers each, and a positive number d. The algorithm should determine whether there is an xi ∈ X and an yj ∈ Y such that if we walk horizontally east for distance xi and then walk vertically north for distance yj , we end up distance d from where we started: if it is possible then the algorithm should output “YES”, otherwise it should output “NO” .
The expected running time of your algorithm should be O(n), under some assumptions discussed in class. Note: Do not assume that the input numbers are small, or they have a small range: they could be arbitrary. So a solution based on a direct access table or on linear time sorting is not acceptable.
a. Describe your algorithm in clear and precise English, and then give the pseudo-code.
b. Explain why the expected running time of your algorithm is O(n). Explicitly state every assumption that you need for your complexity analysis.
c. What is the worst-case running time of your algorithm? Use the Θ notation and justify your answer. Hint: What data structures or algorithms that we learned in class involve reasoning about probability?