ME5311 PROJECT
Instructions
Each student should use the dataset provided for the project and conduct a thorough analysis of it, using some of the tools learned during the course.
Report
You are required to compile a brief report using the template provided in Overleaf, without changing the font style, the font size, or the margins of the document. The report should:
• be at most 6 pages long (excluding references);
• contain at most 2 figures;
• contain at most 20 references (references do not count toward the page limit).
The report is due by 21st April at 23:59.
The Overleaf template can be found here: https://www.overleaf.com/read/mxyqfhkdtbhr
We ask you to copy the project and start drafting your own report. For those who have no experience with LaTeX, the only files you need to modify are ME5311-template.tex (for the report) and ME5311-references.bib (for the bibliography).
Grading
We will ultimately grade all projects based on the report; however, we also ask you to grade your own project and submit your self-assessment as part of the assignment.
Dataset (~ 2 × 1 GB): Spatio-temporal data
Description
The Earth’s climate, widely known as a multi-scale, high-dimensional, nonlinear and chaotic system, significantly influences ecosystems, societies, and economies globally. By combining historical observations and numerical model outputs, researchers are now able to provide a highly detailed understanding of past weather and climate using reanalysis datasets.
This dataset, with a spatial resolution of 0.5° × 0.5° and daily temporal resolution, spans from 1979-12-31 to 2022-12-31 and covers a portion of the Indo-Pacific region (70°E-150°E, 10°S-40°N). It includes two key atmospheric variables: sea level pressure and two-meter temperature. The dataset comprises 16,071 daily snapshots, each containing 16,261 (161 × 101) grid points. An example visualization of one time snapshot (2022-12-31) is provided in Fig. 1.
Sea level pressure and two-meter temperature are stored in “slp.nc” and “t2m.nc”, respectively. If your laptop lacks sufficient memory during the project, please refer to the attached code for instructions on how to downsample the data further.
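As a rough check on the memory requirement, the sketch below estimates the in-memory size of one variable and the grid shape obtained by keeping every second point along latitude and longitude. It assumes the values are held as 32-bit floats (4 bytes each); the actual dtype after loading may differ, so treat the figures as an estimate. The helper names array_size_gb and strided_length are illustrative only.

```python
import math

# Dimensions stated in the dataset description
n_samples, n_latitudes, n_longitudes = 16071, 101, 161

# Assumption: each value is a 32-bit float (4 bytes)
BYTES_PER_VALUE = 4


def array_size_gb(n_t, n_y, n_x, itemsize=BYTES_PER_VALUE):
    """Size in GB (1e9 bytes) of a dense (n_t, n_y, n_x) array."""
    return n_t * n_y * n_x * itemsize / 1e9


def strided_length(n, step):
    """Number of elements kept by slice(None, None, step) on an axis of length n."""
    return math.ceil(n / step)


full_gb = array_size_gb(n_samples, n_latitudes, n_longitudes)

# Keep every second grid point along each spatial axis
low_lat = strided_length(n_latitudes, 2)
low_lon = strided_length(n_longitudes, 2)
low_gb = array_size_gb(n_samples, low_lat, low_lon)

print(f"full grid: {n_latitudes} x {n_longitudes}, about {full_gb:.2f} GB")
print(f"low-res grid: {low_lat} x {low_lon}, about {low_gb:.2f} GB")
```

Halving the resolution along both spatial axes cuts the number of grid points, and hence the memory, to roughly a quarter.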
Figure 1: Daily mean sea level pressure and two-meter temperature on 2022-12-31. (T2m is converted from Kelvin to degrees Celsius by subtracting 273.15.)
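The unit conversion mentioned in the caption is a constant offset. A minimal sketch, using a small synthetic array in place of the real two-meter temperature values (the function name kelvin_to_celsius is illustrative):

```python
import numpy as np

KELVIN_OFFSET = 273.15  # 0 degrees Celsius expressed in Kelvin


def kelvin_to_celsius(t_kelvin):
    """Convert temperatures from Kelvin to degrees Celsius."""
    return np.asarray(t_kelvin, dtype=float) - KELVIN_OFFSET


# Synthetic stand-in for a small (latitude, longitude) temperature field in Kelvin
t2m_kelvin = np.array([[273.15, 300.15],
                       [298.15, 303.15]])

t2m_celsius = kelvin_to_celsius(t2m_kelvin)
print(t2m_celsius)
```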
How to load the data
An example of how you can load the data in Python is reported below (file is in the same folder as the dataset):
import os
import xarray as xr

# dimensions of data
n_samples = 16071
n_latitudes = 101
n_longitudes = 161
shape = (n_samples, n_latitudes, n_longitudes)

# load data
ds = xr.open_dataset('slp.nc')  # change to 't2m.nc' for temperature data

# visualize dataset content
print(ds)

# get 'slp'/'t2m' values
# 'msl' is the variable name for slp data, change to 't2m' for temperature data
da = ds['msl']
x = da.values
print(x)
print(x.shape)

# get time snapshots
da = ds['time']
t = da.values
print(t)

# get longitude values
da = ds['longitude']
lon = da.values
print(lon)

# get latitude values
da = ds['latitude']
lat = da.values
print(lat)

# # ONLY if not enough memory
# low_res_ds = ds[{'longitude': slice(None, None, 2), 'latitude': slice(None, None, 2)}]
# low_res_ds.to_netcdf(path='slp_low_res.nc')  # change to 't2m_low_res.nc' for temperature data
Listing 1: Loading the dataset in Python. See file load_data.py
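Many analysis tools expect a two-dimensional snapshot matrix rather than a stack of maps. The sketch below shows one way to flatten the spatial axes and remove the temporal mean at each grid point. It uses a small random array in place of the loaded values and assumes the axis order (time, latitude, longitude) suggested by the shape tuple in the listing; check x.shape on the real data before relying on it.

```python
import numpy as np

# Small synthetic stand-in for the loaded array x (assumed order: time, lat, lon)
rng = np.random.default_rng(0)
n_samples, n_latitudes, n_longitudes = 8, 5, 7
x = rng.standard_normal((n_samples, n_latitudes, n_longitudes))

# Flatten each map into one row: rows are time snapshots, columns are grid points
X = x.reshape(n_samples, n_latitudes * n_longitudes)

# Subtract the temporal mean at every grid point to obtain anomalies
X_mean = X.mean(axis=0)
X_anom = X - X_mean

# Any row can be mapped back to a (latitude, longitude) field for plotting
first_field = X[0].reshape(n_latitudes, n_longitudes)

print(X.shape)
print(first_field.shape)
```

With the full dataset the same reshape would give a matrix of 16,071 rows by 16,261 columns.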
Similarly, you can load the data in Matlab as follows (file is in the same folder as the dataset):
clc; clear; close all

% dimensions of data
n_samples = 16071;
n_latitudes = 101;
n_longitudes = 161;

% load data
ncfile = 'slp.nc'; % change to 't2m.nc' for temperature data

% visualize dataset content
ncinfo(ncfile)
ncdisp(ncfile)

% get 'slp'/'t2m' values
% 'msl' is the variable name for slp data, change to 't2m' for temperature data
x = ncread(ncfile, 'msl');

% get time snapshots
t = ncread(ncfile, 'time');

% get longitude values
lon = ncread(ncfile, 'longitude');

% get latitude values
lat = ncread(ncfile, 'latitude');
Listing 2: Loading the dataset in Matlab. See file load_data.m
Modeling suggestions
Although the provided dataset is related to climate, no background in climate science is required for its analysis.
• What kind of information can you retrieve about this type of data?
• For example, can we model it via a reduced-order model?
• Is the system predictable?
• How can you characterize this data from a dynamical systems point of view?
• Some methods introduced in the lectures seem applicable only to low-dimensional systems. Is it possible to apply them to high-dimensional systems?
The above are just some suggestions.
Pick one or more topics from the course, to make a compelling analysis of the data!
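As one possible starting point for the reduced-order-model question, the sketch below applies a singular value decomposition to a snapshot matrix and keeps the leading modes. This is only an illustration of a common approach, not a prescribed method: the data are synthetic (a rank-3 signal plus small noise), and the 99% energy threshold and all variable names are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_grid, true_rank = 200, 50, 3

# Synthetic snapshot matrix: low-rank signal plus small noise
# (rows are time snapshots, columns are grid points)
time_coeffs = rng.standard_normal((n_samples, true_rank))
spatial_patterns = rng.standard_normal((true_rank, n_grid))
noise = 0.01 * rng.standard_normal((n_samples, n_grid))
X = time_coeffs @ spatial_patterns + noise

# Remove the temporal mean at each grid point
X_anom = X - X.mean(axis=0)

# Thin SVD: columns of U are temporal coefficients, rows of Vt are spatial modes
U, s, Vt = np.linalg.svd(X_anom, full_matrices=False)

# Cumulative fraction of variance ("energy") captured by the leading modes
energy = np.cumsum(s**2) / np.sum(s**2)

# Smallest number of modes whose cumulative energy reaches 99% (assumed threshold)
r = int(np.searchsorted(energy, 0.99) + 1)

# Rank-r reconstruction and its relative error in the Frobenius norm
X_r = (U[:, :r] * s[:r]) @ Vt[:r]
rel_error = np.linalg.norm(X_anom - X_r) / np.linalg.norm(X_anom)

print(f"modes kept: {r}")
print(f"relative reconstruction error: {rel_error:.4f}")
```

On the real data, how quickly the cumulative energy saturates gives a first indication of how well a low-dimensional description can represent the fields.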