A3: Preemptive Threads
Due Oct 27 by 11:59p.m.
Points 90
Available after Oct 8 at 12a.m.
Concurrency - A Preemptive User-Level Threads Package
In the previous assignment, you implemented a cooperative, user-level thread package in which thread_yield causes control to be passed from one thread to the next one. This assignment has five goals:
1. Implement preemptive threading, in which simulated "timer interrupts" cause the system to switch from one thread to another.
2. Implement the thread_sleep and thread_wakeup scheduling functions. These functions will enable you to implement blocking mutual exclusion and synchronization.
3. Use the thread_sleep and thread_wakeup functions to implement thread_wait, which will block a thread until the target thread has finished executing.
4. Implement blocking locks for mutual exclusion.
5. Implement preemptive priority scheduling.
You will be working with the queue.c and thread.c code that you implemented in Assignments 1 and 2.
Outline
1. Background: Timer Signals
2. Setup
3. Task 1: Preemptive Threading
4. Task 2: Sleep and Wakeup
5. Task 3: Waiting for Threads to Exit
6. Task 4: Mutex Locks
7. Task 5: Preemptive Priority Scheduling
8. Hints and Advice
9. Frequently Asked Questions
10. Using Git and Submitting Your Assignment
Background: Timer Signals
User-level code cannot use hardware timer interrupts directly. Instead, POSIX operating systems provide a software mechanism called signals that can be used to simulate "interrupts" at the user level. These signals interrupt your program and give you a chance to handle that interrupt. For example, when you hit Ctrl-C to kill a program, that causes the OS to send the SIGINT signal to your program. Most programs don't handle this signal, so by default, the OS kills the process. However, if you wanted to save the state of your program before it was killed (e.g., a text editor could save any unsaved files), you can register a handler with the OS for SIGINT. Then, when the user hits Ctrl-C, the OS calls your handler; in your handler, you could write out the state of your program and then exit.
More generally, signals are a form of asynchronous, inter-process communication. A signal can be sent from one process to another, or from a process to itself. We will use the latter method to have the process that invokes your user-level scheduler, i.e., your thread library functions, deliver timer signals to itself.
The operating system delivers a signal to a target (recipient) process by interrupting the normal flow of execution of the process. Execution can be interrupted after any instruction. If a process has registered a signal handler, that handler function is invoked when the signal is delivered. After the signal handler finishes executing, the normal flow of execution of the process is resumed. Notice the similarities between signals and hardware interrupts.
Because signals can be delivered after any instruction, signal handling is prone to race conditions. For example, if you increment a counter (counter = counter + 1), either during normal execution or in your signal handler code, the increment operation may not work correctly because it is not atomic. The signal may be delivered between the instructions implementing the increment operation. To avoid this problem, you should disable signal delivery while the counter is being updated. Note that this is essentially the old OS strategy of disabling interrupts to protect critical sections on a uniprocessor.
Please read a short introduction to signals (http://en.wikipedia.org/wiki/Unix_signal) to understand how they work in more detail. Make sure to read the "Risks" section or else you may not be able to answer some questions below.
Now go over the code in the files interrupt.h and interrupt.c. (This is the same code we provide for Tutorial 4.) You do not need to understand all the code, but it will be helpful to know how these functions should be used. We will use the terms "interrupts" and "signals" interchangeably below.
void interrupt_init(bool verbose):
This function installs a timer signal handler in the calling program using the sigaction
(http://linux.die.net/man/2/sigaction) system call. When a timer signal fires, the function interrupt_handler in the interrupt.c file is invoked. With the verbose flag, a message is printed when the handler function runs.
void interrupt_end(void):
This function uninstalls the timer signal handler.
bool interrupts_set(bool enable):
This function enables timer signals when enable is 1, and disables (or blocks) them when enable is 0. We call the current enabled or disabled state of the signal the signal state. This function also returns whether the signals were previously enabled or not (i.e., the previous signal state). Notice that the operating system ensures that these two operations (reading the previous state and updating it) are performed atomically when the sigprocmask (http://linux.die.net/man/2/sigprocmask) system call is issued. Your code should use this function to disable signals when running any code that is a critical section (http://en.wikipedia.org/wiki/Critical_section) (i.e., code that accesses data that is shared by multiple threads).
Why does this function return the previous signal state? The reason is that it allows "nesting" calls to this function. The typical usage of this function is as follows:
fn() {
	/* disable signals, store the previous signal state in "enabled" */
	int enabled = interrupts_set(false);
	/* critical section */
	interrupts_set(enabled);
}
The first call to interrupts_set disables signals. The second call restores the signal state to its previous state, i.e., the signal state before the first call to interrupts_set, rather than unconditionally enabling signals. This is useful because the caller of the function fn may be expecting signals to remain disabled after the function fn finishes executing. For example:
fn_caller() {
	int enabled = interrupts_set(false);
	/* begin critical section */
	fn();
	/* code expects signals are still disabled */
	...
	/* end critical section */
	interrupts_set(enabled);
}
Notice how signal disabling and enabling are performed in "stack" order, so that the signal state remains disabled after fn returns.
The functions interrupts_on and interrupts_off are simple wrappers for the interrupts_set function.
bool interrupts_enabled():
This function returns whether signals are currently enabled or disabled. You can use this function to check (i.e., assert) whether your assumptions about the signal state are correct. Note that this function will always return false if the interrupt handler is not installed.
void interrupt_quiet():
This function turns off printing signal handler messages.
To help you understand how this code works, you should work through Tutorial 4 (https://q.utoronto.ca/courses/354291/assignments/1380467) .
Setup
You will be doing this assignment within the A3 directory of your MarkUs repository.
Log in to MarkUs (https://markus.teach.cs.toronto.edu/2022-01/) and go to CSC369. Select A3, and then click the button to add the starter files for this assignment. Most of these are the same base files as for Assignment 2, with the addition of the interrupt.[ch] files, more test programs, and an updated Makefile. queue.c, thread.c, and thread.h are not included in the starter code so that you remember to copy them from your A2.
Clone your MarkUs repo, or use git pull to update your existing checkout. The files for this assignment will be in the userid/A3 subdirectory.
Copy your queue.c, thread.c, and thread.h solution from your A2 directory to your A3 directory.
In thread.c, a new function needs to be defined:
void
set_priority(int priority)
{
/* TODO: set priority of the current thread and then schedule */
(void)priority;
}
In queue.c, implement the following function:
int
queue_count(fifo_queue_t * queue)
{
/* TODO: return the number of items currently in queue */
(void)queue;
return -1;
}
Lastly, add the updated files to your repo (git add *), commit locally, and push the changes back upstream to MarkUs.
Task 1: Preemptive Threading
Now you are ready to implement preemptive threading using the timer signals described above. However, before you start writing any code, make sure that you can answer the following questions:
1. What is the name of the signal that you will be using to implement preemptive threading?
2. Which system call is used by the process to deliver signals to itself?
3. How often is this signal delivered?
4. When this signal is delivered, which function in thread.c is invoked? What would this function do when it is invoked in a program that uses the thread library?
5. Is the signal state enabled or disabled when the function in thread.c above is invoked? If the signal is enabled, could it cause problems? If the signal is disabled, what code will enable them? Hint: look for sa_mask in interrupt.c.
6. What does unintr_printf do? Why is it needed? Will you need other similar functions? Reread a short introduction to signals (http://en.wikipedia.org/wiki/Unix_signal) to find out.
Signals can be sent to the process at any time, even when a thread is in the middle of a thread_yield, thread_create, or thread_exit call. It is a very bad idea to allow multiple threads to access shared variables (such as your ready queue) at the same time. You should therefore ensure mutual exclusion (http://en.wikipedia.org/wiki/Mutual_exclusion) , i.e., only one thread can be in a critical section (accessing the shared variables) in your thread library at a time.
A simple way to ensure mutual exclusion is to disable signals when you enter procedures of the thread library and restore the signal state when you leave.
Hint: think carefully about the invariants you want to maintain in your thread functions about when signals are enabled and when they are disabled. Make sure to use the interrupts_enabled function to check your assumptions.
Note that as a result of thread context switches, the thread that disables signals may not be the one that enables them. In particular, recall that setcontext restores the register state (https://q.utoronto.ca/courses/354291/assignments/1379802) saved by getcontext. The signal state is saved when getcontext is called and restored by setcontext (look at the save_interrupt function in the understand_interrupt.c file in Tutorial 4). As a result, if you would like your code to be running with a specific signal state (i.e., disabled or enabled) when setcontext is called, make sure that getcontext is called with the same signal state. Maintain the right invariants, and you'll have no trouble dealing with context switches.
It will be helpful to go over the manual pages (http://linux.die.net/man/2/setcontext) of the context save and restore calls again.
Implement preemptive threading by adding the necessary initialization, signal disabling and signal enabling code in your thread library in thread.c.
After you implement preemptive threading, you can test your code by running the test/preemptive program.
$ test/preemptive
...
9: thread 31 passes potato at time = 59.927460
9: thread 32 passes potato at time = 59.947266
9: thread 33 passes potato at time = 59.986412
cleaning hot potato
preemptive test done
$
Task 2: Sleep and Wakeup
Now that you have implemented preemptive threading, you will extend your threading library to implement the thread_sleep and thread_wakeup functions. These functions will allow you to implement mutual exclusion and synchronization primitives. In real operating systems, these functions would also be used to suspend and wake up a thread that performs IO with slow devices, such as disks and networks. The thread_sleep primitive blocks or suspends a thread when it is waiting on an event, such as a mutex lock becoming available or the arrival of a network packet. The thread_wakeup primitive awakens one or more threads that are waiting for the corresponding event.
The thread_sleep and thread_wakeup functions that you will be implementing for this assignment are summarized here:
Tid thread_sleep(fifo_queue_t *queue):
This function suspends the caller and then runs some other thread. The calling thread is put in a wait queue passed as a parameter to the function. Note that there can be many wait queues in the system, one per type of event or condition. Upon success, this function returns the identifier of the thread that took control as a result of the function call. The calling thread does not see this result until it runs later. Upon failure, the calling thread continues running, and returns one of these constants:
THREAD_INVALID: alerts the caller that the queue is invalid, e.g., it is NULL.
THREAD_NONE: alerts the caller that there are no more threads, other than the caller, that are ready to run. Note that if the thread were to sleep in this case, then your program would hang because there would be no runnable thread.
int thread_wakeup(fifo_queue_t *queue, int all):
This function wakes up one or more threads that are suspended in the wait queue. The awoken threads are put in the ready queue. The calling thread continues to execute and receives the result of the call. When "all" is 0 (false), one thread is woken up. In this case, threads must be woken up in FIFO order, i.e., the first thread to sleep must be the first to be woken up. When "all" is 1 (true), all suspended threads are woken up. The function returns the number of threads that were woken up. It should return zero if the queue is invalid, or if there were no suspended threads in the wait queue.
It may be helpful to remember that each thread can be in only one queue at a time (ready queue or any one wait queue).
When implementing thread_sleep, it will help to think about the similarities and differences between this function and thread_yield and thread_exit. Make sure that thread_sleep suspends (blocks) the current thread rather than spinning (running) in a tight loop; spinning would defeat the purpose of invoking thread_sleep because the thread would still be using the CPU.
All the thought that you put into ensuring that thread preemption works correctly previously will apply to these functions as well. In particular, these functions access shared data structures (which ones?), so be sure to enforce mutual exclusion.
Implement the thread_sleep and thread_wakeup functions in your thread library in thread.c.
After you implement the sleep and wakeup functions, you can test your code by running the test/wakeup program with an argument, e.g.:
$ test/wakeup 0 # all=0
starting wakeup test, all=0
initial thread returns from sleep(NULL)
initial thread returns from sleep(NONE)
wakeup test done
$ test/wakeup 1 # all=1
...
Recall that in the previous assignment you implemented thread_kill(tid), which ensured that the target thread (whose identifier is tid) did not run any further, and that this thread would eventually exit when it next ran. If another thread invokes thread_kill on a sleeping thread, you should immediately remove the sleeping thread from the associated wait queue and wake it up (i.e., make it runnable). Then, the thread can exit when it runs the next time.
Be warned that tasks 1 and 2 must be completed before starting any of the subsequent tasks!
Task 3: Waiting for Threads to Exit
Now that you have implemented the thread_sleep and thread_wakeup functions for suspending and waking up threads, you can use them to implement blocking synchronization primitives in your threads library. You should start by implementing the thread_wait function, which will block or suspend a thread until a target thread terminates (or exits). Once the target thread exits, the thread that invokes thread_wait should continue operation. As an example, this synchronization mechanism can be used to ensure that a program (using a master thread) exits only after all its worker threads have completed their operations.
You should now remove the placeholder implementation from A2 and start from scratch.
The thread_wait function is summarized below:
int thread_wait(Tid tid, int *exit_code):
This function suspends the calling thread until the thread whose identifier is tid terminates. A thread terminates when it invokes thread_exit. Upon success, this function returns zero. If exit_code is not NULL, the exit status of the thread that exited (i.e., the value it passed to thread_exit) will be copied into the location pointed to by exit_code. Upon failure, the calling thread continues running, and returns the constant THREAD_INVALID. Failure can occur for the following reasons:
Identifier tid is not a feasible thread id (e.g., any negative value of tid or tid larger than the maximum possible tid)
No thread with the identifier tid could be found.
The identifier tid refers to the calling thread.
One or more other threads are already in the process of reaping the target thread.
You will need to associate a wait queue with each thread. When a thread invokes thread_wait, it should sleep on the wait queue of the target thread. When the target thread invokes exit, it should wake up the threads in its wait queue.
There are a couple of technicalities that you need to consider:
More than one caller of thread_wait(tid,...) can succeed for a target thread tid, as long as the target thread has not exited. However, once the target thread becomes a zombie and its wait queue is non-empty, all calls to thread_wait(tid,...) should fail until the tid is reused by a subsequently created thread.
If a thread with id tid exits, i.e., calls thread_exit, before it is waited for, i.e., its wait queue is empty, a subsequent call to thread_wait(tid, &tid_status) must still be able to retrieve the exit status that thread tid passed to thread_exit. In this case, exactly one caller to thread_wait(tid,...) for that tid can succeed. The thread tid should be destroyed after exactly one other thread calls thread_wait on it.
If a thread with id tid is killed by another thread via thread_kill, set its exit code to THREAD_KILLED, then make it runnable again so that it may invoke thread_exit later. Note that you may have already implemented this for A2.
Threads are all peers. A thread can wait for the thread that created it, for the initial thread, or for any other thread in the process. One issue this creates for implementing thread_wait is that a deadlock may occur. For example, if Thread A waits on Thread B, and then Thread B waits on Thread A, both threads will deadlock. We do not expect you to handle this condition for the assignment, but it will be helpful to think about how you could implement thread_wait to avoid any deadlocks.
Implement thread_wait in your thread library and update any other relevant functions. After you implement this functionality, you can test your code by running the test/wait, test/wait_many, test/wait_kill.
$ test/wait_kill
starting wait_kill test
its over
Task 4: Mutex Locks
The next task is to implement mutual exclusion in your threads library. Recall that these primitives form the basis for managing concurrency, which is a core concern for operating systems, so your library would not really be complete without them.
For mutual exclusion, you will implement blocking locks.
The API for the lock functions are described below:
struct lock *lock_create():
Create a blocking lock. Initially, the lock should be available. Your code should associate a wait queue with the lock so that threads that need to acquire the lock can wait in this queue.
void lock_destroy(struct lock *lock):
Destroy the lock. Be sure to check that the lock is available when it is being destroyed.
void lock_acquire(struct lock *lock):
Acquire the lock. Threads should be suspended until they can acquire the lock, after which this function should return.
void lock_release(struct lock *lock):
Release the lock. Be sure to check that the lock had been acquired by the calling thread, before it is released. Wake up at least one thread that is waiting to acquire the lock.
The lock_acquire and lock_release functions access shared data structures (which ones?), so be sure to enforce mutual exclusion.
Implement these functions in your thread library in thread.c.
After you implement these functions, you can test your code by running the test/lock program.
Task 5: Preemptive Priority Scheduling
The final task is to implement preemptive priority scheduling in your threading library. Recall that a priority scheduler always schedules the thread with the highest priority (lowest value). In other words, the highest priority thread can run indefinitely until it either terminates or an even higher priority thread becomes ready to run. As such, we do not actually need the timer interrupt in order to implement a real-time priority scheduler. The main trick is to identify all parts of the API functions where either 1) a thread enters the ready queue, or 2) a runnable thread changes priority. For example, you may use this code fragment at the end of your thread_create function (note that it may not be the optimal solution, and you are encouraged to attempt a better implementation):
/* thread is the newly created thread */
int ret = scheduler->enqueue(thread);
assert(ret == 0);
if (scheduler->realtime) {
ret = thread_yield(THREAD_ANY);
assert(ret >= 0);
}
After we enqueue the thread to the scheduler, the ready queue may contain a thread whose priority is higher than that of the currently running thread. Therefore, we invoke thread_yield to attempt preemption. However, this poses a challenge: if no thread in the ready queue has a higher priority, then scheduler->dequeue() must indicate that no context switch is required. We suggest making dequeue return current_thread if it is still runnable and still the highest-priority thread in the system. Adjustments to thread_yield will be needed to make this work.
We also have an if statement that checks whether the scheduler is real-time before calling thread_yield. This ensures that no such preemption occurs when using other schedulers. Only the priority scheduler has realtime set to true.
In prio.c, you should implement each scheduler API function according to the specification outlined in schedule.h. You should have experience with this from implementing the first-come, first-served scheduler in A2. We suggest taking the existing FIFO queue implementation in queue.c and extending it with a function that performs sorted insertion. You can access this newly written function from another source file by using the extern keyword:
queue.c:
int
queue_push_sorted(fifo_queue_t * queue, node_item_t * node)
{
	...
}
prio.c:
/* declaration for a function defined elsewhere */
extern int queue_push_sorted(fifo_queue_t *, node_item_t *);
int
prio_enqueue(struct thread * thread)
{
	/* this will call the function in queue.c */
	int ret = queue_push_sorted(prio_queue, thread);
	...
}
Note that the suggested approach will result in O(n) performance for push. If you are a masochist like your instructor, you may attempt implementing a priority queue (https://en.wikipedia.org/wiki/Priority_queue) , which improves the performance of push to O(log n).
After you implement the scheduler functions and update the thread functions, you can test your code by running the test/prio program. You will likely get a lot of assertion errors that will force you to check the order of operations in functions where preemption can occur.
Implementation Details
The priority of the initial thread (thread 0) should be set to 0.
The highest possible priority is 0, and the lowest possible priority is INT_MAX.
When scheduling, in the case where the current thread's priority is the same as the highest priority thread in the ready queue, you should context switch.
You need to implement the set_priority function so that the current thread can change its own priority.
Hints and Advice
Start early, we mean it!
You are encouraged to reuse your own code that you might have developed in the first assignment or in previous courses for common data structures and operations such as queues, sorting, etc. Make sure to document the sources of anything you did not write specifically for this course.
You may not use code that subsumes the heart of this project (e.g., you should not base your solution on wrappers of or code taken from the POSIX thread library). If in doubt, ask.
This project does not require you to write a large number of lines of code. It does require you to think carefully about the code you write. Before you dive into writing code, it will pay to spend time planning and understanding the code you are going to write. If you think the problem through from beginning to end, this project will not be too hard. If you try to hack your way out of trouble, you will spend many frustrating nights on the assignment. This project's main difficulty is in conceptualizing the solution. Once you overcome that hurdle, you will be surprised at the simplicity of the implementation!
All the general hints and advice (https://q.utoronto.ca/courses/354291/pages/a2-faq) from Assignment 2 apply here as well.
Frequently Asked Questions
We have provided answers to various Frequently Asked Questions (https://q.utoronto.ca/courses/354291/pages/a3-faq) about the assignment. Make sure to go over them. Many of these questions were asked by students in previous years, so you will save time by reviewing the answers as you start working on the assignment.
Testing Your Code
You can test your entire code by running the programs in the tester folder. One week before the due date, the MarkUs autotester will become available with additional test cases. There will be no hidden test cases and you will receive exactly what you see after running the public test for the assignment.