ENGI46415: Artificial Intelligence and Deep Learning

DEPARTMENT OF ENGINEERING COURSEWORK

Title:

AI and Deep Learning Applications

Time Required:

It is expected that you should spend approximately 100 hours on this coursework assignment. This includes all learning-related activities completed during the year (for example, attending lectures/workshops, completing Problem Sheets, etc.).

Deadline(s) for submission:

Monday 20 January 2025 at 14:00hrs.

Date for feedback:

Monday 17 February 2025

Submission instructions:

Your submission must be uploaded to Gradescope in advance of the deadline.

All submissions in the Department are electronic and no hard copy is required.

The maximum file size that can be accepted is 20 MB.

•    All submissions must be saved using the following naming convention: SURNAME-Firstname_ENGIXXXX.pdf

E.g. BLOGGS-Joanne_ENGI46415.pdf

Format:

• Reports should be submitted in PDF format

• Code files (.ipynb) should be submitted in a zip file

• The report should not be longer than 10 sides of A4 and the minimum allowed font size is 12pt. Margins should be no smaller than 2 cm.

• Everything including tables and references has to be within the 10 sides of A4. Appendices may be included but will not form part of the examined material nor count toward the page limit.

Coursework Brief:

In this assignment, you will learn about supervised learning, unsupervised learning, deep learning, data augmentation, and ensemble learning. You will apply these methods to the classification of diabetic diseases affecting the eyes and contribute to ongoing efforts to improve their early diagnosis.

Part of the assignment is programming numerical methods in Python, some of which have not been covered in lectures. To complete this assignment, you will therefore have to learn something new on your own. The assignment consists of four tasks, which require Python code that you have to implement, and a report that you have to write.

The report should not be longer than 10 sides of A4 and the minimum allowed font size is 12pt. Margins should be no smaller than 2 cm. Everything including tables and references has to be within the 10 sides of A4.

ENGI46415 : Artificial intelligence and Deep learning

Coursework Specification

Introduction

Welcome to the Artificial Intelligence and Deep Learning coursework! In this coursework, you will work with medical data that was collected under the utmost ethical standards. The data has been anonymized and labelled to classify two diseases: Diabetic Retinopathy (DR) and Diabetic Macular Edema (DME).

Understanding Diabetes and Its Impact on the Eyes

Diabetes is a condition where your blood sugar levels are too high. This can harm different parts of your body, including your eyes.

Diabetic Retinopathy (DR) happens when high blood sugar damages the blood vessels in the back of your eye, leading to vision problems or even blindness.

Diabetic Macular Edema (DME) is a type of disease where fluid leaks into the part of the eye responsible for sharp vision, causing it to swell and blur your sight.

It’s important to tell these two apart because they can affect the vision differently. Treating them correctly helps protect the eyesight.

Medical Imaging

OCT (Optical Coherence Tomography) is a special eye scan that takes detailed pictures of the inside of the eye. It is like cutting a slice of cake to see its layers: each slice (a B-scan) shows a cross-section of the retina (the light-sensitive part at the back of the eye).

With OCT, DR can be seen as damaged or leaking blood vessels, and DME appears as swelling or fluid buildup in the central part of the retina. Using OCT helps doctors see these problems clearly and decide the best treatment to protect the vision.

The Role of This Project:

This project plays a crucial role in the context of diagnosis of eye diseases. By utilising data gathered from eye images, it aims to develop machine learning and deep learning models capable of distinguishing between DR and DME. The careful analysis of OCT images, coupled with advanced computational techniques, can provide a non-invasive means to detect diabetes-related changes in the eye, potentially allowing for early diagnosis and intervention.

This early diagnosis could significantly impact the lives of individuals affected by diabetes, as early treatment can help manage the disease's progression and alleviate symptoms. It also underscores the broader potential of utilising cutting-edge technology in healthcare for the benefit of patients. This project offers not only a fascinating technical challenge but also the opportunity to make a meaningful contribution to the field of medical diagnostics.

Examining each OCT B-scan (a detailed cross-section of the eye) to identify signs of disease like DR or DME is very time-consuming and can be prone to error. This is because a clinician must carefully review each scan to detect subtle changes and measure how well a person's vision is performing, which involves subjective judgment and can vary between different doctors. We have a dataset where clinicians have already completed these detailed measurements and labelled important features related to the disease, known as biomarkers, and other clinical measurements. Our goal is to see if machine learning algorithms can use these biomarkers to accurately diagnose DR or DME.

Next, we want to determine which biomarkers and clinical measurements are most crucial for making a diagnosis. We also plan to explore if deep learning models can analyse the OCT images directly, possibly eliminating the need for manual labelling of biomarkers and clinical measurements. Finally, we will investigate if combining the most useful biomarkers and clinical measurements with OCT images can lead to even better diagnostic accuracy.

The dataset:

The dataset used in this coursework is a refined version of the one introduced in [1]. It includes clinical data, biomarkers, and OCT B-scans per patient. During a clinic visit, patients undergo several tests, including a test of how well they can see (Best Corrected Visual Acuity, or BCVA) and a measurement related to vision sharpness for reading and recognizing faces (recorded as Central Subfield Thickness, or CST). Each eye is also assigned a unique Eye ID. These three integer numbers are referred to as "clinical labels."

Additionally, each patient undergoes OCT imaging, capturing 49 2D images (B-scans) of each eye. A trained ophthalmologist, known as a grader, reviews these scans to identify 16 specific features, called biomarkers, which are represented as either present or absent (binary). These biomarkers help determine if a disease, such as DR or DME, is present.

The dataset contains information from 96 eyes, with data recorded during two visits per patient (the first and last visits), totalling 192 rows in the Excel file. The columns of the Excel file are organized as follows:

- Column 1: Path to the OCT images, including a part of text starting with "V" or "W" indicating the first visit (e.g., V1 or W0) or the last visit (e.g., V22 or W104).

- Columns 2-17: 16 binary biomarkers labelled by the grader for each eye at each visit.

- Columns 18-21: The 3 integer "clinical labels."

The OCT image paths contain 49 B-scans and additional eye images. For this project, you should only use the middle B-scan, which ends with "24." The last column of the Excel file provides the ground truth diagnosis for diseases (DR or DME), which will serve as the target for our AI algorithms.

For tasks 1 and 2, you will use clinical labels and binary biomarkers as inputs for machine learning algorithms. When splitting the data, make sure to split it by the patient's eye to prevent information leakage between the training and test sets. In task 3, you will focus on analysing OCT images instead of numerical data. For task 4, you'll combine both numerical information and images, as outlined in the task description.
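One way to keep both visits of an eye on the same side of the split is a group-aware splitter. The sketch below assumes scikit-learn is available and runs on synthetic stand-in data, not the coursework spreadsheet. The `eye_ids` array is a hypothetical per-row copy of the Eye ID clinical label, and the 80/20 ratio is an arbitrary choice.

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

rng = np.random.default_rng(0)

# Synthetic stand-in for the spreadsheet: 96 eyes x 2 visits = 192 rows.
n_eyes = 96
eye_ids = np.repeat(np.arange(n_eyes), 2)           # hypothetical Eye ID per row
X = rng.normal(size=(eye_ids.size, 19))             # 16 biomarkers + 3 clinical labels
y = np.repeat(rng.integers(0, 2, size=n_eyes), 2)   # assumed: one diagnosis per eye

# All rows that share an Eye ID end up together in either train or test.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train_idx, test_idx = next(splitter.split(X, y, groups=eye_ids))

train_eyes = set(eye_ids[train_idx])
test_eyes = set(eye_ids[test_idx])
print("eyes in train:", len(train_eyes), "eyes in test:", len(test_eyes))
print("shared eyes:", len(train_eyes & test_eyes))
```

A plain row-wise shuffle could place the first visit of an eye in the training set and its last visit in the test set, which is the leakage the paragraph above warns about.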

You will design machine learning and deep learning models to accomplish the following tasks:

1. Task 1: Supervised Machine Learning, Visualisation, and Feature Importance

Objective: In this task, your objective is to design supervised machine learning models for classifying DR and DME cases using clinical labels and binary biomarkers. You will also visualise the data distribution and explore the importance of features. For each sub-task, present and discuss your findings separately for clinical labels, binary biomarkers, and their combination.
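For the visualisation part of this task, one possible starting point is a two-component PCA computed separately for each input type. This is only a sketch: PCA is one choice among several (t-SNE or UMAP would also qualify), the arrays are synthetic, and `project_2d` is a hypothetical helper name.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic stand-ins with the shapes described in the dataset section.
biomarkers = rng.integers(0, 2, size=(192, 16)).astype(float)
clinical = rng.normal(size=(192, 3))

def project_2d(features):
    """Standardise each column, then project onto the first two principal components."""
    scaled = StandardScaler().fit_transform(features)
    pca = PCA(n_components=2, random_state=0)
    coords = pca.fit_transform(scaled)
    return coords, pca.explained_variance_ratio_

inputs = {
    "clinical": clinical,
    "biomarkers": biomarkers,
    "combined": np.hstack([clinical, biomarkers]),
}
embeddings = {}
for name, feats in inputs.items():
    coords, ratio = project_2d(feats)
    embeddings[name] = coords
    print(name, coords.shape, "explained variance:", ratio.round(3))
```

Each `coords` array can then be drawn as a scatter plot coloured by the DR/DME diagnosis to inspect how well the classes separate.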

Requirements:

1-1. Visualisation: Apply a dimensionality reduction technique of your choice to the data to visualize the distribution of data points. Partial credit will be given if you focus on only clinical labels or binary biomarkers.

1-2. Selecting Models: Implement two supervised machine learning models: support vector machines and neural networks. Use clinical labels, binary biomarkers, and their combination as inputs for these models. Partial credit will be given if you implement only one model or use only one type of input.

1-3. Model Comparison: Experiment with the two selected models using different types of input data to compare their performances. Use relevant metrics such as accuracy, precision, recall, F1-score, and ROC-AUC. Provide insights into why certain models or inputs perform better for this specific task. Partial credit will be given if you compare fewer than two models or focus on only clinical labels or binary biomarkers.

1-4. Evaluation of Feature Importance: Apply SHAP (SHapley Additive exPlanations) [2] to evaluate feature importance for the models you have developed. Use SHAP values to understand the contribution of each feature (clinical labels and binary biomarkers) to the model's predictions. Provide an analysis of which features are most influential and how they impact the classification of DR and DME.
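The model comparison in requirements 1-2 and 1-3 could be organised as in the following sketch. It assumes scikit-learn, uses synthetic data from `make_classification` in place of the real feature table, and picks hyperparameters arbitrarily. The simple slice used for the train/test split is for brevity only; the real data should be split by eye.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the 192 x 19 feature table (not the coursework data).
X, y = make_classification(n_samples=192, n_features=19, n_informative=6,
                           random_state=0)
X_train, X_test, y_train, y_test = X[:150], X[150:], y[:150], y[150:]

models = {
    "svm": make_pipeline(StandardScaler(),
                         SVC(kernel="rbf", probability=True, random_state=0)),
    "mlp": make_pipeline(StandardScaler(),
                         MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                                       random_state=0)),
}

def evaluate(model):
    """Fit on the training split and report the five metrics on the test split."""
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    prob = model.predict_proba(X_test)[:, 1]   # probability of class 1
    return {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred, zero_division=0),
        "recall": recall_score(y_test, pred, zero_division=0),
        "f1": f1_score(y_test, pred, zero_division=0),
        "roc_auc": roc_auc_score(y_test, prob),
    }

results = {name: evaluate(model) for name, model in models.items()}
for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

Running the same loop three times, with clinical labels only, biomarkers only, and both combined, gives the comparison table the requirements ask for.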

By conducting this task, you'll gain a deeper understanding of how to visualize data distributions, select and compare machine learning models, and evaluate the importance of different features. This approach encourages a thorough exploration of the problem space while managing the workload effectively.
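To make concrete what the SHAP values in requirement 1-4 measure, the sketch below computes exact Shapley values by enumerating every feature coalition for a toy logistic model. This is not the API of the SHAP package [2], which estimates the same quantity far more efficiently; brute force is only feasible for a handful of features. The model weights, the background sample, and the function name `shapley_values` are all assumptions made for illustration.

```python
import itertools
import math
import numpy as np

def shapley_values(predict, x, background):
    """Exact Shapley values for one sample x.

    Features outside a coalition take their values from the rows of
    `background`, and the prediction is averaged over those rows.
    """
    n_features = x.shape[0]

    def value(coalition):
        mixed = background.copy()
        idx = list(coalition)
        if idx:
            mixed[:, idx] = x[idx]      # fix coalition features to x
        return predict(mixed).mean()

    phi = np.zeros(n_features)
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(n_features):
            # Shapley weight for coalitions of this size.
            weight = (math.factorial(size)
                      * math.factorial(n_features - size - 1)
                      / math.factorial(n_features))
            for subset in itertools.combinations(others, size):
                phi[i] += weight * (value(subset + (i,)) - value(subset))
    return phi

rng = np.random.default_rng(0)
background = rng.normal(size=(50, 4))          # synthetic reference rows
weights = np.array([2.0, -1.0, 0.5, 0.0])      # toy model; feature 3 is unused

def predict(rows):
    """Toy logistic model returning the probability of class 1."""
    return 1.0 / (1.0 + np.exp(-(rows @ weights)))

x = np.array([1.0, 1.0, 1.0, 1.0])
phi = shapley_values(predict, x, background)
baseline = predict(background).mean()
print("contributions:", phi.round(4))
print("sum of contributions:", round(phi.sum(), 4))
print("prediction minus baseline:", round(predict(x[None, :])[0] - baseline, 4))
```

Two properties are worth checking when interpreting results: the contributions add up to the difference between the prediction and the average prediction, and a feature the model ignores receives a contribution of zero.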

2. Task 2: Unsupervised Learning

Objective: In this task, your goal is to design an unsupervised learning model for clustering the data and compare its performance with the provided labels.

Requirements:

2-1. Clustering Algorithm: Choose an appropriate unsupervised clustering algorithm such as k-means.

2-2. Comparative Study: Perform a comparative study between the clustering results and the ground truth labels. Use relevant metrics for evaluating clustering performance.

2-3. Discussion: Discuss the findings, highlighting any insights gained from the unsupervised clustering. Identify any discrepancies or agreements between clustering and the labelled classes.
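A comparison between clusters and ground truth might look like the sketch below. It assumes scikit-learn and uses two well-separated synthetic groups, so the scores it prints say nothing about the real data. The adjusted Rand index and normalised mutual information are chosen here because they ignore the arbitrary numbering of clusters; other metrics are equally valid.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import (adjusted_rand_score,
                             normalized_mutual_info_score, silhouette_score)

rng = np.random.default_rng(0)
# Two synthetic groups standing in for the DR and DME feature rows.
X = np.vstack([rng.normal(0.0, 1.0, size=(96, 5)),
               rng.normal(6.0, 1.0, size=(96, 5))])
y_true = np.array([0] * 96 + [1] * 96)

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
clusters = kmeans.fit_predict(X)

# External metrics compare clusters with the labels.
ari = adjusted_rand_score(y_true, clusters)
nmi = normalized_mutual_info_score(y_true, clusters)
# Internal metric: uses only the features and the cluster assignment.
sil = silhouette_score(X, clusters)
print(f"ARI={ari:.3f}  NMI={nmi:.3f}  silhouette={sil:.3f}")
```

Plain accuracy is not directly usable here, because cluster 0 may correspond to either diagnosis.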

3. Task 3: Convolutional Neural Network (CNN) for Disease Classification and Data Augmentation

Objective: Design a Convolutional Neural Network (CNN) architecture from scratch for classifying DR and DME cases from the middle scan of OCT data. Additionally, apply data augmentation techniques to assess their impact on the CNN's performance and use a pre-trained classifier to perform fine-tuning.

Requirements:

3-1. Network Design and Hyperparameter Optimization: Create a custom CNN architecture, defining the number of layers, types of layers (e.g., convolutional, pooling), activation functions, and other architectural choices. Discuss the optimisation of hyperparameters and network design using techniques such as Optuna, grid search, or any other method of your choice.

3-2. Data Augmentation: Apply four data augmentation techniques (rotation, flipping, scaling, and adding noise) to increase the diversity of the dataset. It's important to test all of them (with different ranges) and describe their suitability for this task. If any augmentation is deemed unsuitable, provide a clear explanation and exclude it from further consideration or limit its range.

3-3. Performance Analysis and Metrics: Show the performance of the CNNs with learning curves and analyse these curves in detail to understand how the model's performance evolves during training. Calculate performance metrics such as accuracy, precision, recall, F1-score, and ROC-AUC. Analyse these metrics in the context of disease classification and discuss the impact of data augmentation on these metrics. Make sure to compare the performance before and after augmentation to highlight its effects.

3-4. Fine-Tuning with Pre-trained Model: Select one pre-trained classifier (VGG16), and fine-tune it for the disease classification task. Discuss the depth of freezing in the pre-trained model and why you made this choice. Evaluate the performance of the fine-tuned model using the same performance metrics and learning curve analysis.

This comprehensive approach combines the analysis of learning curves and the assessment of performance metrics, including a clear comparison of performance before and after data augmentation. It encourages a thorough evaluation of the CNN's performance and the impact of data augmentation on disease classification.
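The four augmentations named in requirement 3-2 can be prototyped on a single array before they are wired into a training pipeline. The sketch below assumes NumPy and SciPy, treats a B-scan as a 2D grayscale array scaled to [0, 1], and uses a random array as a stand-in image. The function names and the parameter values (10 degrees, factors 1.2 and 0.8, noise sigma 0.05) are illustrative choices, not recommended ranges.

```python
import numpy as np
from scipy import ndimage

def flip_horizontal(img):
    """Mirror the image left to right."""
    return np.fliplr(img)

def rotate(img, degrees):
    """Rotate about the centre; reshape=False keeps the original size."""
    return ndimage.rotate(img, degrees, reshape=False, mode="nearest")

def scale(img, factor):
    """Zoom in (factor > 1) or out (factor < 1), returning the original size."""
    h, w = img.shape
    zoomed = ndimage.zoom(img, factor, order=1)
    zh, zw = zoomed.shape
    if factor >= 1.0:
        # Centre-crop the enlarged image back to (h, w).
        top, left = (zh - h) // 2, (zw - w) // 2
        return zoomed[top:top + h, left:left + w]
    # Paste the shrunken image into the centre of a zero canvas.
    out = np.zeros_like(img)
    top, left = (h - zh) // 2, (w - zw) // 2
    out[top:top + zh, left:left + zw] = zoomed
    return out

def add_noise(img, sigma, rng):
    """Add Gaussian noise and clip back to the valid intensity range."""
    return np.clip(img + rng.normal(0.0, sigma, size=img.shape), 0.0, 1.0)

rng = np.random.default_rng(0)
scan = rng.random((64, 96))   # synthetic stand-in for one B-scan

augmented = {
    "flip": flip_horizontal(scan),
    "rotate": rotate(scan, 10.0),
    "zoom_in": scale(scan, 1.2),
    "zoom_out": scale(scan, 0.8),
    "noise": add_noise(scan, 0.05, rng),
}
for name, img in augmented.items():
    print(name, img.shape)
```

Viewing each output next to the original is a quick way to judge whether a given range still produces an anatomically plausible scan.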

4. Task 4: Integrating Clinical Labels, Biomarkers, and OCT Images

Objective: In this task, you will combine clinical labels, binary biomarkers, and OCT images to improve the classification of DR and DME cases. You can only rely on the most important features identified in Task 1. Explore two approaches for combining these data sources:

Requirements:

4-1. Feature Fusion with CNN: Design a Convolutional Neural Network (CNN) as described in Task 3. After the flattening layer, integrate the most important numerical features (clinical labels and binary biomarkers) from Task 1 into the network. Assess how adding these features impacts the CNN's performance and provide an analysis of this combined approach.

4-2. Ensemble Learning: Create an ensemble model that combines predictions from the models developed in Task 1 and the CNN from Task 3 [3]. Use techniques such as averaging to combine these models [4]. Evaluate the performance of this ensemble approach and compare it to the individual models.
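For the feature fusion in requirement 4-1, the numerical features can be concatenated with the flattened image features before the dense layers. The sketch below assumes PyTorch (the specification does not name a framework). The class name `FusionCNN`, the layer sizes, the 64 x 64 input, and the choice of five tabular features are all hypothetical.

```python
import torch
from torch import nn

class FusionCNN(nn.Module):
    """Small CNN whose flattened output is joined with tabular features."""

    def __init__(self, n_tabular: int, n_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),   # fixed 16 x 4 x 4 output
            nn.Flatten(),
        )
        self.classifier = nn.Sequential(
            nn.Linear(16 * 4 * 4 + n_tabular, 32), nn.ReLU(),
            nn.Linear(32, n_classes),
        )

    def forward(self, image, tabular):
        flat = self.features(image)                 # (batch, 256)
        fused = torch.cat([flat, tabular], dim=1)   # fusion after flattening
        return self.classifier(fused)               # raw class scores (logits)

model = FusionCNN(n_tabular=5)
images = torch.randn(4, 1, 64, 64)   # synthetic grayscale batch
tabular = torch.randn(4, 5)          # synthetic selected features
logits = model(images, tabular)
print(logits.shape)
```

Because the tabular values enter next to 256 image features, scaling them to a range comparable with the CNN activations is usually worth considering.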

By completing this task, you will explore methods for integrating multiple types of data and assess how these combined approaches enhance classification accuracy for DR and DME.
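The averaging mentioned in requirement 4-2 is often called soft voting: each model outputs a class probability for the same test samples, and the ensemble averages them. The sketch below uses NumPy only. The probability arrays are made-up numbers for illustration, `average_ensemble` is a hypothetical helper, and the 0.5 threshold is an assumption.

```python
import numpy as np

def average_ensemble(prob_list, weights=None):
    """Soft voting: (optionally weighted) mean of per-model class-1 probabilities."""
    probs = np.stack(prob_list, axis=0)          # (n_models, n_samples)
    return np.average(probs, axis=0, weights=weights)

# Made-up class-1 probabilities from three models on the same five test rows.
svm_prob = np.array([0.9, 0.2, 0.6, 0.4, 0.1])
mlp_prob = np.array([0.8, 0.3, 0.4, 0.6, 0.2])
cnn_prob = np.array([0.7, 0.1, 0.7, 0.3, 0.6])

ensemble_prob = average_ensemble([svm_prob, mlp_prob, cnn_prob])
ensemble_pred = (ensemble_prob >= 0.5).astype(int)
print(ensemble_prob.round(3), ensemble_pred)
```

All models must be evaluated on the same test eyes in the same order for the average to be meaningful. Passing `weights` lets a stronger model count for more.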

Report

You can use the report to explain the methods you have implemented and discuss the results. In particular, you must include answers to all requirements in Tasks 1 to 4, details of the design of the model or the choices made, and your justification, as well as any diagrams or quantitative evidence. Feel free to discuss any other aspect of your work that you consider interesting within the space limitations given above.


