COMP5328 - Advanced Machine Learning
Assignment 2
Due: 11 November 2021, 23:59
This assignment is to be completed in groups of 2 to 3 students. It is worth
25% of your total mark.
Introduction
The objective of this assignment is to build a transition matrix estimator and two
classification algorithms that are robust to label noise.
Three input datasets are given. For each dataset, the training and validation
data contain class-conditional random label noise, whereas the test data is clean.
You need to build at least two different classifiers, trained and validated on
the noisy data, that achieve good classification accuracy on the clean test
data. You are required to compare the robustness of the two algorithms to label
noise.
For the first two datasets, the transition matrices are provided. You can directly
use the given transition matrices for designing classifiers that are robust to label
noise.
For the last dataset, the transition matrix is not provided. You are required to
build a transition matrix estimator to estimate it, and then use your estimated
transition matrix for classification. Your estimated transition matrix must be
included in your final report. To validate the effectiveness of your transition
matrix estimator, you can apply it to the first two datasets and compare its
estimates with the given transition matrices. The code contained in Tutorial 9
could be a good starting point.
Data preprocessing is allowed, but please remember to explain and justify it
carefully in the report.
1 A Guide to Using the Datasets
Three image datasets in .npz format are provided. You can download them from
Canvas.
1.1 Attributes Contained in a Dataset
The following code is used to load a dataset and check the shape of its attributes.
import numpy as np

# Remember to replace $FILE_PATH with the path to one of the .npz files.
dataset = np.load("$FILE_PATH")

Xtr_val = dataset['Xtr']  # features of the training and validation data
Str_val = dataset['Str']  # noisy labels of the training and validation data
Xts = dataset['Xts']      # features of the test data
Yts = dataset['Yts']      # clean labels of the test data

print(Xtr_val.shape)
print(Str_val.shape)
print(Xts.shape)
print(Yts.shape)
1.1.1 Training and validation data
The variable Xtr_val contains the features of the training and validation data.
The shape is (n, image_shape), where n is the total number of instances.
The variable Str_val contains the noisy labels of the n instances. The shape
is (n, ). For all datasets, the class set of the noisy labels is {0, 1, 2}.
Note: do not use all n examples to train your models. You are required to
independently and randomly sample 80% of the n examples to train a
model and use the remaining 20% of the examples to validate the model.
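The 80/20 split described above could be sketched as follows. This is only one possible reading of "independently and randomly sample" (a random permutation, i.e. sampling without replacement); the function name `split_train_val`, the `seed` parameter, and the stand-in arrays are assumptions for illustration and are not part of the provided material.

```python
import numpy as np

def split_train_val(X, S, train_frac=0.8, seed=0):
    """Randomly split features X and noisy labels S into train/validation parts."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    perm = rng.permutation(n)              # random order of all n indices
    n_train = int(round(train_frac * n))   # 80% of the examples by default
    train_idx, val_idx = perm[:n_train], perm[n_train:]
    return X[train_idx], S[train_idx], X[val_idx], S[val_idx]

# Stand-in arrays with the kind of shapes described above (not the real data).
X_demo = np.zeros((100, 28, 28))
S_demo = np.arange(100) % 3
X_tr, S_tr, X_val, S_val = split_train_val(X_demo, S_demo, seed=1)
print(X_tr.shape, X_val.shape)
```

Changing `seed` between runs gives a different training and validation set each time.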
1.1.2 Test data
The variable Xts contains the features of the test data. The shape is (m, image_shape),
where m represents the total number of the test instances.
The variable Yts contains the clean labels of the m instances. The class set
of the clean labels is also {0, 1, 2}.
1.2 Dataset Description
1.2.1 FashionMINIST0.3.npz
Number of the training and validation examples n = 18000.
Number of the test examples m = 3000.
The shape of each example image_shape = (28 × 28).
The transition matrix
\[
T = \begin{bmatrix} 0.7 & 0.3 & 0 \\ 0 & 0.7 & 0.3 \\ 0.3 & 0 & 0.7 \end{bmatrix}.
\]
1.2.2 FashionMINIST0.6.npz
Number of the training and validation examples n = 18000.
Number of the test examples m = 3000.
The shape of each example image_shape = (28 × 28).
The transition matrix
\[
T = \begin{bmatrix} 0.4 & 0.3 & 0.3 \\ 0.3 & 0.4 & 0.3 \\ 0.3 & 0.3 & 0.4 \end{bmatrix}.
\]
1.2.3 CIFAR.npz
Number of the training and validation examples n = 15000.
Number of the test examples m = 3000.
The shape of each example image_shape = (32 × 32 × 3).
The transition matrix T is unknown.
2 Performance Evaluation
The performance of each classifier will be evaluated with the top-1 accuracy metric,
that is,
\[
\text{top-1 accuracy} = \frac{\text{number of correctly classified examples}}{\text{total number of test examples}} \times 100\%.
\]
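The metric above can be computed in a few lines. The following is a minimal sketch; the function name `top1_accuracy` and the toy scores are illustrative assumptions, and `scores` is taken to be an array of per-class scores with one row per test example.

```python
import numpy as np

def top1_accuracy(scores, labels):
    """Top-1 accuracy in percent; scores has shape (m, num_classes)."""
    predictions = np.argmax(scores, axis=1)   # highest-scoring class per example
    return 100.0 * np.mean(predictions == labels)

# Toy example: four test instances, three classes.
scores_demo = np.array([[0.8, 0.1, 0.1],
                        [0.2, 0.5, 0.3],
                        [0.3, 0.3, 0.4],
                        [0.6, 0.3, 0.1]])
labels_demo = np.array([0, 1, 2, 1])
acc_demo = top1_accuracy(scores_demo, labels_demo)
print(acc_demo)
```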
For a rigorous performance evaluation, you need to train each classifier
at least 10 times with different training and validation sets generated
by random sampling. Then report both the mean and the standard
deviation of the test accuracy.
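The repeated-evaluation protocol could be organized as in the sketch below. The nearest-class-mean model in `train_and_test` is only a placeholder so that the example runs on its own; it is not a label-noise-robust method, and all names and the synthetic data are assumptions for illustration.

```python
import numpy as np

def train_and_test(X_tr, S_tr, X_val, S_val, X_ts, Y_ts):
    """Placeholder classifier (nearest class mean); swap in a real model.

    The validation split is unused here; a real model would use it for
    model selection or early stopping.
    """
    means = np.stack([X_tr[S_tr == c].mean(axis=0) for c in range(3)])
    dists = ((X_ts[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    return 100.0 * np.mean(dists.argmin(axis=1) == Y_ts)

def repeated_evaluation(X, S, X_ts, Y_ts, num_runs=10, train_frac=0.8):
    """Train on num_runs different random splits; return mean, std, all accuracies."""
    accs = []
    for run in range(num_runs):
        rng = np.random.default_rng(run)   # a different random split per run
        perm = rng.permutation(len(X))
        n_tr = int(train_frac * len(X))
        tr, val = perm[:n_tr], perm[n_tr:]
        accs.append(train_and_test(X[tr], S[tr], X[val], S[val], X_ts, Y_ts))
    return float(np.mean(accs)), float(np.std(accs)), accs

# Synthetic stand-in data: three Gaussian blobs in two dimensions.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
Y_all = rng.integers(0, 3, size=600)
X_all = centers[Y_all] + rng.normal(size=(600, 2))
Y_test = rng.integers(0, 3, size=300)
X_test = centers[Y_test] + rng.normal(size=(300, 2))

mean_acc, std_acc, all_accs = repeated_evaluation(X_all, Y_all, X_test, Y_test)
print(f"{mean_acc:.2f} +/- {std_acc:.2f}")
```

Note that `np.std` computes the population standard deviation by default; pass `ddof=1` if the sample standard deviation is preferred, and state the choice in the report.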
3 Tasks
You need to implement at least two label-noise-robust classifiers, at least
one of which is not taught in this course, and test their performance on the
three datasets. You need to implement an estimator to estimate the transition
matrix. The code must be written in Python 3. You are allowed to use external
libraries for optimization and linear algebra. If you are unsure whether you
can use a particular library or function, please post your question
on Canvas or Ed.
3.1 Image Classification with Known Flip Rates
For the first two datasets, the transition matrices are provided. You can directly
use the given transition matrices for designing classifiers that are robust to label
noise. As mentioned in Section 2, for each classifier, you should report the
mean and the standard deviation of the test accuracy.
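One common way to use a known transition matrix is a forward-corrected loss: the model's clean-class probabilities are mapped through T before being compared with the noisy labels. The sketch below illustrates that idea only; it is not necessarily the method expected in this course. It assumes T[i, j] = P(noisy = j | clean = i), and the function names are hypothetical.

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=1, keepdims=True)   # for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def forward_corrected_loss(logits, noisy_labels, T):
    """Mean cross-entropy between T-adjusted predictions and noisy labels.

    ASSUMES T[i, j] = P(noisy = j | clean = i), so that
    P(noisy | x) = P(clean | x) @ T.
    """
    clean_post = softmax(logits)       # model's estimate of P(clean | x)
    noisy_post = clean_post @ T        # implied estimate of P(noisy | x)
    picked = noisy_post[np.arange(len(noisy_labels)), noisy_labels]
    return float(-np.mean(np.log(picked + 1e-12)))

T = np.array([[0.7, 0.3, 0.0],
              [0.0, 0.7, 0.3],
              [0.3, 0.0, 0.7]])
logits_demo = np.array([[5.0, 0.0, 0.0],
                        [0.0, 5.0, 0.0]])
loss_demo = forward_corrected_loss(logits_demo, np.array([0, 2]), T)
print(loss_demo)
```

At test time, predictions are taken from the uncorrected `softmax(logits)`, since the test labels are clean. With T equal to the identity matrix, the loss reduces to the ordinary cross-entropy.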
3.2 Image Classification with Unknown Flip Rates
For the last dataset, since the transition matrix is not provided, you need to
implement an estimator to estimate the transition matrix. Then use the estimated
transition matrix to build a noise-robust classifier. Note that you can use the
provided transition matrices of the first two datasets to validate the effectiveness
of your transition matrix estimator. You need to include your estimated transition
matrix in the final report. You also need to report the mean and the standard
deviation of the test accuracy for each of your noise-robust classifiers.
Both the estimation accuracy of the transition matrix and the test accuracy on
the last dataset contribute to the final mark.
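As an illustration of what an estimator and its validation might look like, the sketch below uses an anchor-point style heuristic: for each class i, the example with the highest estimated noisy posterior for class i is treated as an approximate anchor point, and its noisy posterior vector is read off as row i of T. This is an assumption-laden sketch, not the contents of Tutorial 9; it assumes T[i, j] = P(noisy = j | clean = i), that such anchor points exist, and that the noisy posteriors come from a classifier trained on the noisy data (here they are generated synthetically).

```python
import numpy as np

def estimate_transition_matrix(noisy_post):
    """Anchor-point style estimate from noisy class posteriors of shape (n, C)."""
    num_classes = noisy_post.shape[1]
    T_hat = np.zeros((num_classes, num_classes))
    for i in range(num_classes):
        anchor = np.argmax(noisy_post[:, i])   # most confident example for class i
        T_hat[i] = noisy_post[anchor]          # its posterior approximates row i
    return T_hat

def estimation_error(T_hat, T_true):
    """Mean absolute entry-wise difference between two matrices."""
    return float(np.mean(np.abs(T_hat - T_true)))

# Synthetic check: exact noisy posteriors built from a known T.
T_true = np.array([[0.7, 0.3, 0.0],
                   [0.0, 0.7, 0.3],
                   [0.3, 0.0, 0.7]])
rng = np.random.default_rng(0)
clean_post = rng.dirichlet(alpha=[0.5, 0.5, 0.5], size=5000)
clean_post = np.vstack([clean_post, np.eye(3)])   # include exact anchor points
noisy_post_demo = clean_post @ T_true

T_est = estimate_transition_matrix(noisy_post_demo)
err = estimation_error(T_est, T_true)
print(np.round(T_est, 3), err)
```

On real data the posteriors are themselves estimated, so the recovered matrix will be noisier than in this idealized check. Running the same procedure on the first two datasets and computing `estimation_error` against the given matrices is one way to validate the estimator.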
3.3 Report
The report should be organized like a research paper and should contain the
following sections:
• In abstract, you should briefly introduce the topic of this assignment, your
methods, and describe the organization of your report.
• In introduction, you should first introduce the problem of learning with
label noise, and then its significance and applications. You should give an
overview of the methods you want to use.
• In related work, you are expected to review the main idea of related label
noise methods (including their advantages and disadvantages).
• In methods, you should describe the details of your classification models,
including the formulation of the cost functions, the theoretical foundations
or views (if any) of the cost functions, and the optimization methods. You
should describe the details of the transition matrix estimation methods,
theoretical foundations (if any), and optimization algorithms.
• In experiments, you should introduce your experimental setup (e.g., datasets,
algorithms, evaluation metric, etc.). Then, you should show the experimental
results, and compare and analyze them. If possible, give your personal
reflection or thoughts on these results.
• In conclusion, you should summarize your methods, results, and your
insights for future work.
• In references, you should list all references cited in your report and
format all references in a consistent way.
• In appendix, you should provide instructions on how to run your code.
The layout of the report:
• Font: Times New Roman; Title: font size 14; Body: font size 12
• Length: Ideally 10 to 15 pages - maximum 20 pages
Note: Submissions must be typeset in LaTeX using the provided template.
4 Submissions
Detailed instructions are as follows:
1. The submission contains two parts: report and source code.
(a) report (a pdf file): the report should include each member’s details
(student id and name).
(b) code (a compressed folder)
i. algorithm (a sub-folder): your code could be multiple files.
ii. data (an empty sub-folder): although the datasets should be inside
the data folder, please do not include them in the zip file. We will
copy the datasets to the data folder when we test the code.
2. The report (file type: pdf) and the code (file type: zip) must be named
with the student ID numbers of all group members separated by underscores. For
example, “xxxxxxxx_xxxxxxxx_xxxxxxxx.pdf”.
3. Only one student needs to submit the report (file type: pdf) to Assignment
2 (report) and upload the code (file type: zip) to Assignment 2 (codes).
4. Your submission should include the report and the code. A plagiarism
checker will be used.
5. You need to clearly provide instructions on how to run your code in the
appendix of the report.
6. Indicate the contribution of each group member.
7. A penalty of 5% of the marks applies for each day after the due date (email
late submissions to the TA and confirm late submission dates with the TA). The
maximum delay is 10 days; after that, assignments will not be accepted.
8. Remember, the submission deadline is 11 November 2021, 23:59.
5 Marking Scheme
Category [Marks]
  Criterion [Marks]
    • Comments

Report [80]
  Abstract [3]
    • problem, methods, and organization
  Introduction [6]
    • the problem you intend to solve
    • the importance of the problem
  Previous work [8]
    • previous relevant methods used in literature
    • their advantages and disadvantages
  Label noise methods with known flip rates [23]
    • pre-processing (if any)
    • label noise methods’ formulation
    • cross-validation method for model selection or avoiding overfitting (if any)
    • experiments
    • discussions
  Noise rate estimation method [12]
    • noise rate estimation method’s formulation
    • experiments
    • discussions
  Label noise methods with unknown flip rates [10]
    • pre-processing (if any)
    • label noise methods’ formulation (if different from above)
    • cross-validation method for model selection or avoiding overfitting (if any)
    • experiments
    • discussions
  Conclusions and future work [3]
    • meaningful conclusions based on the results
    • meaningful future work suggested
  Presentation [8]
    • academic style, grammatical sentences, no spelling mistakes
    • good structure and layout, consistent formatting
    • appropriate citation and referencing
    • use graphs and tables to summarize data
  Other [7]
    • at the discretion of the assessor: illustrate outstanding comprehensive
      theoretical analysis, demonstrate the insightful and comprehensive
      assessment of the significance of their results, provide descriptions and
      explanations that have depth but clarity, and are concisely worded
Code [20]
    • reasonable code running time
    • well organized, commented and documented

Note: Marks for each category are indicated in square brackets. The minimum mark for the assignment will be 0 (zero).


