CSCI4430
Assignment 2: Adaptive Video Streaming via CDN
Due: March 9th, 2025 @11:59 PM
Video traffic dominates the Internet. In this project, you will explore how video content distribution networks (CDNs) work. In particular, you will implement (1) adaptive bitrate selection through an HTTP proxy server and (2) load balancing.
This project is divided into Part 1 and Part 2. We recommend that you work on them simultaneously (both can be tested independently) and then integrate the two parts. This is a group project; you may work in groups of up to two people.
This project has the following goals:
● Understand the HTTP protocol and how it is used in practice to fetch data from the web.
● Understand the DASH MPEG video protocol and how it enables adaptive bitrate video streaming.
● Use polling to implement a server capable of handling multiple simultaneous client connections.
● Understand how Video CDNs work in real life.
Table of contents
1. Background
2. Getting Started
3. Part 1: HTTP Proxy
4. Part 2: Load Balancer
5. Autograder
Background
Video CDNs in the Real World
The figure above depicts a high-level view of what this system looks like in the real world. Clients trying to stream a video first issue a DNS query to resolve the service's domain name to an IP address for one of the CDN's video servers. The CDN's authoritative DNS server selects the "best" content server for each particular client based on (1) the client's IP address (from which it learns the client's geographic location) and (2) current load on the content servers (which the servers periodically report to the DNS server).
Once the client has the IP address for one of the content servers, it begins requesting chunks of the video the user requested. The video is encoded at multiple bitrates. As the client player receives video data, it calculates the throughput of the transfer and requests the highest bitrate the connection can support (i.e., one that plays the video smoothly, without buffering if possible). For instance, you have almost certainly used a system like this when using the default "Auto" quality option on YouTube:
Video CDN in this Assignment
Normally, the video player clients select the bitrate of the video segments they request based on the throughput of the connection. However, in this assignment, you will be implementing this functionality on the server side. The server will estimate the throughput of the connection with each client and select a bitrate it deems appropriate.
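As a rough illustration of server-side bitrate selection, the sketch below picks the highest available bitrate that does not exceed an estimated throughput. The function name select_bitrate, the Kbps units, and the fallback to the lowest bitrate are assumptions made for this example only; the selection rule you are actually required to implement may differ (for instance, by requiring some headroom above the bitrate).

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper (not part of the starter code).
// Returns the highest bitrate (assumed Kbps) that does not exceed the
// estimated throughput (assumed Kbps). If every bitrate is above the
// estimate, falls back to the lowest available bitrate.
int select_bitrate(std::vector<int> available_kbps, double throughput_kbps) {
    assert(!available_kbps.empty());
    std::sort(available_kbps.begin(), available_kbps.end());
    int chosen = available_kbps.front();  // lowest bitrate as the fallback
    for (int bitrate : available_kbps) {
        if (bitrate <= throughput_kbps) {
            chosen = bitrate;  // keep upgrading while the estimate allows it
        }
    }
    return chosen;
}
```

With available bitrates of 100, 500, and 1000 and an estimate of 600, this sketch would choose 500.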
You'll write the components highlighted in yellow in the diagram above (the proxy and the load balancer).
Clients: You can use an off-the-shelf web browser (Firefox, Chrome, etc.) to play videos served by your CDN (via your proxy). You can simulate multiple clients by opening multiple tabs of the web browser and accessing the same video, or even by using multiple browsers. You will use the network throttling options in your browser to simulate different network conditions (available in both Firefox and Chrome).
Video Server(s): Video content will be served from our custom video server; instructions for running it are included below. With the included instructions, you can also run multiple instances of the video server on different ports.
Proxy: Rather than modify the video player itself, you will implement adaptive bitrate selection in an HTTP proxy. The player requests chunks with standard HTTP GET requests; your proxy will intercept these and modify them to retrieve whichever bitrate your algorithm deems appropriate, returning the chunks to the client. Your proxy will be capable of handling multiple clients simultaneously.
Load Balancer: You will implement a simple load balancer that can assign clients to video servers either geographically or using a simple round-robin method. This load balancer is a stand-in for a DNS server; as we are not running a DNS protocol, we will refer to it as a load balancer. The load balancer will read in information about the various video servers from a file when it is created; it will not communicate with the video servers themselves.
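To make the round-robin method concrete, here is a minimal sketch that hands out video servers in rotation. The names ServerAddr and RoundRobinBalancer are hypothetical and not part of the starter code; the sketch also assumes the server list has already been loaded, since the format of the servers file is not covered here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical type: a video server identified by its (ip, port) pair.
struct ServerAddr {
    std::string ip;
    std::uint16_t port;
};

// Hypothetical class: cycles through a fixed list of servers in order.
class RoundRobinBalancer {
public:
    explicit RoundRobinBalancer(std::vector<ServerAddr> servers)
        : servers_(std::move(servers)) {
        assert(!servers_.empty());
    }

    // Returns the next server in rotation, wrapping around at the end.
    const ServerAddr& next() {
        const ServerAddr& server = servers_[next_index_];
        next_index_ = (next_index_ + 1) % servers_.size();
        return server;
    }

private:
    std::vector<ServerAddr> servers_;
    std::size_t next_index_ = 0;
};
```

With two servers on ports 8000 and 8001, successive calls to next() would return 8000, 8001, 8000, and so on.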
After you implement the basics of the HTTP Proxy and the Load Balancer, you will integrate the two together. The proxy can query the load balancer every time a new client connects to figure out which video server to connect to.
Important: IPs and Ports
In the real world, IP addresses disambiguate machines. Typically, a given service runs on a predetermined port on a machine. For instance, HTTP web servers typically use port 80, while HTTPS servers use port 443.
For the purposes of this project, as we want you to be able to run everything locally, we will instead distinguish different video servers by their (ip, port) tuple. For instance, you may have two video servers running on (localhost, 8000) and (localhost, 8001). We want to emphasize that this would not make much sense in the real world; you would probably use a DNS server for load balancing, which would point to several IPs where video servers are hosted, each using the same port for a specific service.
Getting Started
This project has been adapted so that it can be run and tested on your own device, without any need for a virtual machine. Although this leads to slightly less realism, we hope it makes development faster and easier.
Note: The only configuration that cannot be tested locally is running a geographic load balancer in conjunction with a load-balancing miProxy. This will have to occur on Mininet. However, you are able to locally test both (1) miProxy with a round-robin load balancer and (2) a geographic load balancer on its own.
To get started, clone this GitHub repository. The video files used by the repo are stored on Google Drive. You can use this link to download the tears-of-steel video files, and this link to download the Soar with CUHK video files. The downloaded video files should be placed under the videoserver/static/videos folder. The structure of videoserver/static should be:
You can then create your own private GitHub repository and push these files to that repo. Your repository should be shared only with your group members and should not be publicly accessible. Making your solution code publicly accessible, even by accident, will be considered a violation of the Honor Code. You can create a private repository through the GitHub website, and add it as a remote to the cloned repository with
$ git init
$ git remote add origin git@github.com:[Your-User:Your-Repo]
$ git add -A
$ git commit -m "Initial commit"
$ git push --set-upstream origin main
The structure of the files is as follows:
Your Code
As in Part 1, we will be using CMake as our build system. The top-level CMake file is at cpp/CMakeLists.txt. There are also CMakeLists.txt files in every subdirectory. These files have been filled out for you. We encourage you to take a look and see how they work. You may need to modify them if the structure of your code changes. You may not use any external packages other than the ones we provide: spdlog, cxxopts, pugixml, and boost::regex.
We have also included a common folder with a few network utility functions as well as the protocol definition for communicating with the load balancer. You can (and should!) add more network utility functions and other code that can be shared between miProxy and loadBalancer into the common folder.
The structure of the project is otherwise self-explanatory; your implementation of miProxy should go in the miProxy folder, and your implementation of the loadBalancer should go in the loadBalancer folder. The following commands should allow us to build your code from the base of the project:
$ mkdir build
$ cd build
$ cmake ../cpp
$ make
This should result in executables build/bin/miProxy and build/bin/loadBalancer.
We also encourage you to integrate CMake with your editor. For instance, VSCode has a CMake Tools extension. This enables VSCode's IntelliSense to properly find code dependencies, which can eliminate annoying fake syntax errors. There are many resources online to help you figure out how to do this.
The two parts of the project can each be tested on their own before the integrated version is tested. You may wish to parallelize work among your groupmates; feel free to do so, but please remember that all individuals are responsible for understanding the entire project.
Running the Video Server
We have provided a simple video server for you, implemented in the videoserver/ directory. First, you will need to unzip some of the video files:
$ cd videoserver/static/videos
$ tar -xvzf cuhk.tar.gz
$ tar -xvzf tears-of-steel.tar.gz
This should lead to two folders, cuhk and tears-of-steel, being created inside the videos/ folder.
For Python, we will be using the uv package manager. Please follow the instructions on the linked GitHub page to install uv on your machine.
Once you have installed uv, you can navigate to the videoserver/ directory and run
uv sync
This will download all necessary Python dependencies and create a virtual environment. You can then run
uv run launch_videoservers.py
to launch video servers. This takes the following command-line arguments:
● -n|--num-servers: Defaults to 1. Controls how many video servers will be launched.
● -p|--port: Defaults to 8000. Controls which port the video server(s) will serve on. For multiple video servers, the ports will be sequential; for instance, running the following command will launch three video servers on ports 8000, 8001, and 8002.
uv run launch_videoservers.py -n 3 -p 8000
Note: It is not necessary to have multiple video servers running for the early part of this project. You will only really need this if you want to test how well your miProxy works with load balancing and separating multiple clients.
Once you launch a video server (e.g. on port 8000), you can navigate to 127.0.0.1:8000 (or localhost:8000) in your browser to see it. It will look something like this:
You can click on the linked pages to play the videos. The first one (Tears of Steel) does not have audio, while the second one (CUHK Video) does; this can help vary your testing. The Tears of Steel video also has watermarks to indicate the bitrate of the current segment. The video server will also log helpful output to stdout that can help you debug.
Note that you are currently directly accessing the video server; when testing this project, you will instead navigate to the ip:port of your running proxy, which will communicate with the video server for you.
Libraries
We expect you to use cxxopts for parsing command-line options, and spdlog for variable-level logging. We will require certain logs to be printed using spdlog from both the HTTP proxy and the load balancer in order to facilitate autograding and debugging.
We have also included pugixml, a C++ XML-parsing library, and the boost::regex library in the CMake files. You do not have to use these libraries, but they will make parsing video manifest files and HTTP requests much easier. Documentation for these libraries is available online.
Why not use #include <regex>? The C++ standard library's regex header is widely known to be slow and inefficient. This means that you will instead see packages like RE2 or Boost used in production code.
We provide a script, download_deps.sh, to download these libraries. All of the downloaded libraries will be stored under the deps folder.
./download_deps.sh
After downloading, the structure of the deps folder should be:
You may have to install Boost on your system. If you are on a Mac, this is very easy. Simply use Homebrew and run
brew install cmake boost
and you're done and ready to skip to "Setting Up the Starter Code".
On Windows or Linux, installing CMake and Boost is also relatively simple. On Ubuntu/WSL, you can run
sudo apt-get install cmake libboost-all-dev
Part 1: HTTP Proxy
Many video players monitor how quickly they receive data from the server and use this throughput value to request higher or lower quality encodings of the video, aiming to stream the highest quality encoding that the connection can handle. Instead of modifying an existing video client to perform bitrate adaptation, you will implement this functionality in an HTTP proxy through which your browser will direct requests.
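The throughput value mentioned above is, at its simplest, the amount of data transferred divided by the time the transfer took. The sketch below shows that arithmetic only. The function name throughput_kbps, the choice of kilobits per second (1 Kbit = 1000 bits), and how the start and end of a transfer are timed are all assumptions for illustration, not requirements of this assignment.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical helper (not part of the starter code).
// Converts a byte count and an elapsed time into kilobits per second.
double throughput_kbps(std::size_t bytes, std::chrono::duration<double> elapsed) {
    assert(elapsed.count() > 0.0);  // avoid dividing by zero
    const double kilobits = static_cast<double>(bytes) * 8.0 / 1000.0;
    return kilobits / elapsed.count();
}
```

In a proxy, one would typically take a std::chrono::steady_clock timestamp before and after a segment transfer and pass the difference as elapsed. For example, 125000 bytes delivered in one second corresponds to 1000 Kbps under these assumptions.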
You are to implement a simple HTTP proxy, miProxy. It accepts connections from web browsers, modifies video chunk requests as described below, opens a connection with the resulting IP address, and forwards the modified request to the server. Any data (the video chunks) returned by the server should be forwarded, unmodified, to the browser. miProxy should:
1. Run as a server on a specified port
2. Accept connections from clients, including multiple clients simultaneously
3. Connect to a video server
4. Forward HTTP requests from clients to the appropriate video server
5. Forward HTTP responses from the video server to the appropriate client
6. Measure the throughput of each video segment to each client
7. Capture video manifest file HTTP requests, returning the no-list manifest file to clients while requesting the regular manifest file for itself
8. Capture video segment HTTP requests and modify the request to have the appropriate bitrate
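Handling multiple clients simultaneously (item 2) is commonly done by polling a set of descriptors and servicing only the ones that are ready. The sketch below uses the POSIX poll() call for this; whether you should use poll(), select(), or something else is not settled by this section, so treat that choice as an assumption. The function name poll_and_read is hypothetical, and the sketch reads from plain file descriptors rather than showing socket setup or HTTP parsing.

```cpp
#include <cassert>
#include <cstddef>
#include <poll.h>
#include <string>
#include <unistd.h>
#include <utility>
#include <vector>

// Hypothetical helper (not part of the starter code).
// Waits up to timeout_ms for any descriptor to become readable, then reads
// once from each ready descriptor. Returns (fd, data) pairs for those reads.
std::vector<std::pair<int, std::string>> poll_and_read(const std::vector<int>& fds,
                                                       int timeout_ms) {
    assert(!fds.empty());
    std::vector<pollfd> pfds;
    pfds.reserve(fds.size());
    for (int fd : fds) {
        pollfd p{};
        p.fd = fd;
        p.events = POLLIN;  // we only care about readability here
        pfds.push_back(p);
    }

    std::vector<std::pair<int, std::string>> received;
    int ready = poll(pfds.data(), static_cast<nfds_t>(pfds.size()), timeout_ms);
    if (ready <= 0) {
        return received;  // timeout (0) or error (-1): nothing to service
    }

    char buf[4096];
    for (const pollfd& p : pfds) {
        if (!(p.revents & (POLLIN | POLLHUP))) {
            continue;  // this descriptor is not ready
        }
        ssize_t n = read(p.fd, buf, sizeof(buf));
        if (n > 0) {
            received.emplace_back(p.fd, std::string(buf, static_cast<std::size_t>(n)));
        }
        // n == 0 means the peer closed; a real proxy would close and drop the fd.
    }
    return received;
}
```

A proxy built this way would call such a function in a loop, including the listening socket in the descriptor set so that new connections are accepted alongside reads from existing clients.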
You will implement two modes of miProxy:
1. No load balancing occurs, with a single video server for all clients
2. A load-balanced version, where miProxy queries your load balancer (implemented in Part 2) to figure out which video server to assign to each incoming TCP connection.
Clients vs. Incoming Sockets
For optimization, web browsers may open up several TCP connections for a single tab; this can lead to multiple sockets connecting to your miProxy server for a single client. We will use the term "client socket" to refer to an individual socket and "client" to refer to a group of sockets of a single tab that form one logical client.