
FOUNDATIONAL BUSINESS ANALYTICS

COURSEWORK 2025-2026

Release Date: 13th Oct 2025

Deadline Date: 11th Dec 2025, 3:00 pm

Submission: via Moodle coursework submission link on the FBA module web page

1. The Problem Definition:

To better protect public health, a major city’s Department of Health and Mental Hygiene historically inspected restaurants on a fixed timetable. While this ensured broad coverage, the department has been faced with the difficulty of efficiently managing public health risk, as truly problematic establishments may not be identified and inspected quickly enough. Traditionally, the department passively reacts to inspection cycles and only takes significant action after a poor inspection result is recorded. However, this passive strategy creates a potential for significant public health crises as high-risk establishments may operate for extended periods between inspections.

To ensure public safety and enhance operational efficiency, the department plans to launch a proactive risk management programme. The goal is to enable the department to “see into the future” and know in advance which restaurant inspection is more likely to result in a significant failure. With the ability to predict a high-risk inspection, the department can intervene at the earliest opportunity, prioritizing resources to prevent potential foodborne illness outbreaks and avoiding public health crises.

This data can shed light on the sort of establishments that will fail to meet hygiene standards and the restaurants that are low-risk. Your task, as a consultant, is to analyse the historical dataset and generate a model that can be used to predict if any given inspection will result in a high score.

As well as robustly testing, justifying and unpacking your selected model (guided by the Director’s needs, as detailed below), the department also wants you to produce some business recommendations: what you think the department should focus on as a result of your investigations. You will submit a formal business report (with a strict maximum of 8 pages and 3000 words). Additionally, you will submit your model implementation, with instructions on how to use it to test new data (written in Python, Orange3 or some combination of the two; formal specifications are detailed below). Good luck!

2. Important Message from the Director of Food Safety:

Our mission is to keep the public safe. A critical violation, like improper food temperatures or signs of vermin, can lead to serious foodborne illness outbreaks. We need to find these problems before they hurt people. Our inspectors’ time is our most valuable asset. Sending an inspector to a restaurant that we know is always pristine is an inefficient use of their time. The real failure is when a high-risk restaurant goes uninspected for too long and a health crisis occurs.

Our model must help us find the potential problems early. If we use the model to inspect a restaurant and find nothing wrong, we’ve only lost a few hours of an inspector’s day. But if we fail to inspect a restaurant that’s a ticking time bomb, the consequences, for both public health and public trust, are severe. Your system’s primary goal must be to successfully identify the restaurants that are likely to have serious issues.

3. The Available Dataset:

You will be working with a real-world dataset of all restaurant inspections conducted in the City. The data follows the schema below:


4. Formal Task Specification

•    You must provide a classification approach to predict which inspections are more likely to result in a high score (SCORE >= 14). This will require a stage of statistical analysis, a stage of model selection, a stage of final model training, and then an analysis of implications. You may use any software you desire for your analysis, but your model must be produced in either Python3 or Orange3 for this coursework (or some combination).

•    Your submission will consist of a zip of the files for your model, and a report of a maximum of 8 pages. Your model will be tested on a hidden dataset (with the same schema as the training dataset, but without the feature “SCORE” or “Y”).

•    Your report must strictly adhere to the following sections. Please take into account the marks available for each in structuring your submission:

Section A: Summarization [10 marks available]

□ In this section you must provide a summary statistical analysis of the dataset. Consider how each input feature present is related to the output variable (“Y”). Additionally, you may want to examine how they relate to each other. Please feel free to use tables, bar charts, or scatter graphs depending on the feature - it is totally up to you. Note, the point of this section is to be informative rather than overloading your client with information, so also summarize the key analytical points you have observed in the dataset.
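As one possible starting point for this kind of summary, the sketch below derives the target from the score threshold in the task specification and tabulates the high-score rate per category. It assumes pandas. The column names “SCORE” and “Y” come from this brief; “CUISINE” and the toy rows are hypothetical, since the schema is not reproduced here.

```python
import pandas as pd

# Toy rows for illustration only; the real schema is not shown in this brief.
# "CUISINE" is a hypothetical column name; "SCORE" and "Y" come from the brief.
df = pd.DataFrame({
    "CUISINE": ["Pizza", "Pizza", "Thai", "Thai", "Thai", "Bakery"],
    "SCORE": [10, 20, 14, 7, 30, 5],
})

# Target as defined in the task specification: a high score means SCORE >= 14.
df["Y"] = (df["SCORE"] >= 14).astype(int)

# Share of high-score inspections per category, alongside the number of
# inspections so that rates based on very few rows can be read with caution.
summary = (
    df.groupby("CUISINE")["Y"]
      .agg(inspections="count", high_score_rate="mean")
      .sort_values("high_score_rate", ascending=False)
)
print(summary)
```

The same pattern can be repeated for any categorical input feature; numeric features may be better served by binning or by a plot.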

Section B: Preparation and Feature Engineering [20 marks available]

□ Provide a comprehensive overview of your data preparation process. This should include a detailed description of how you curated and cleaned the dataset, discussing any methods employed to address missing values or outliers.

This section is the most critical for your model’s success. You must detail the advanced feature engineering you performed. The raw features alone are insufficient. You must create and justify new features that summarize a restaurant’s history prior to the inspection you are trying to predict. For each new feature, you must elucidate the rationale behind its creation and why you believe it will be valuable for predicting a high inspection score.

□ Examples of the type of feature engineering expected (you do not need to follow these examples exactly):

Time-based features: time_since_last_inspection, days_since_last_grade_A.

Historical performance features: average_score_in_last_3_inspections, number_of_past_high_score_failures, max_score_in_previous_year.

Trend features: is_score_trending_upwards, change_in_score_from_last_inspection.

Violation history features: count_of_past_critical_violations, frequency_of_specific_violation_codes (e.g., vermin-related codes).
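A few of the example features above could be sketched as follows, assuming pandas. The feature names are taken from the examples; the column names “restaurant_id” and “inspection_date” and the toy rows are hypothetical. This is one illustrative way to build history-only features, not a prescribed method.

```python
import pandas as pd

# Toy data; "restaurant_id" and "inspection_date" are hypothetical column names.
df = pd.DataFrame({
    "restaurant_id": ["A", "A", "A", "A", "B", "B"],
    "inspection_date": pd.to_datetime([
        "2022-01-10", "2022-07-10", "2023-01-10", "2023-07-10",
        "2022-03-01", "2023-03-01",
    ]),
    "SCORE": [10, 20, 30, 12, 5, 16],
})
df = df.sort_values(["restaurant_id", "inspection_date"]).reset_index(drop=True)
df["Y"] = (df["SCORE"] >= 14).astype(int)

g = df.groupby("restaurant_id")

# Days since the same restaurant's previous inspection (NaN for its first one).
prev_date = g["inspection_date"].shift(1)
df["time_since_last_inspection"] = (df["inspection_date"] - prev_date).dt.days

# shift(1) pushes each restaurant's history down one row, so the row being
# predicted never sees its own SCORE. The mean covers up to three earlier scores.
df["average_score_in_last_3_inspections"] = g["SCORE"].transform(
    lambda s: s.shift(1).rolling(3, min_periods=1).mean()
)

# Running count of earlier high-score inspections for the same restaurant.
prev_y = g["Y"].shift(1).fillna(0)
df["number_of_past_high_score_failures"] = (
    prev_y.groupby(df["restaurant_id"]).cumsum().astype(int)
)

# Change between the two most recent earlier inspections (not the current one).
df["change_in_score_from_last_inspection"] = g["SCORE"].shift(1) - g["SCORE"].shift(2)

print(df)
```

Every feature is computed from shifted values, so the first inspection of each restaurant has no history and yields NaN (or a count of zero); how to handle those rows is a separate preparation decision.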

Section C: Model Evaluation [20 marks available]

□ Select at least 3 different classification model classes (selecting only from those we cover in FBA lectures: Logistic Regression, Decision Trees, Random Forests, Naive Bayes Classifier and k-nearest neighbours), and assess their effectiveness in modelling your historical training dataset.

□ In your report, detail the models selected to test and why they were chosen. Detail the parameterizations you chose for each model, explaining why you have chosen the parameters that you have.

□ Describe the evaluation strategy you chose to compare models to each other. Your evaluation strategy MUST be based on a chronological, time-based split of the data. For example, you might train your model on all inspections occurring before January 1st, 2023, and test it on all inspections occurring after that date. A random shuffle-and-split of the data is inappropriate for this problem and will be penalized, as it does not reflect the real-world task of predicting future events and leads to data leakage.

□ Analyze the performance of each model using confusion matrices. Crucially, you must justify your choice of performance metric (e.g., Accuracy, Precision, Recall, F1-score) by explicitly linking it to the Director’s message in Section 2.
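The chronological split and confusion-matrix comparison described above might look like the sketch below, assuming scikit-learn and pandas. The data is synthetic, and every column name other than “Y” is hypothetical. Recall is reported only as one candidate metric (true positives divided by all actual positives); the choice of metric and of model parameters remains yours to justify.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, recall_score
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for an engineered dataset; the columns are hypothetical.
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "inspection_date": pd.date_range("2021-01-01", periods=n, freq="3D"),
    "prev_score": rng.integers(0, 40, size=n),
    "past_failures": rng.integers(0, 5, size=n),
})
# Invented relationship so the toy models have something to learn.
signal = 0.6 * df["prev_score"] + 3 * df["past_failures"] + rng.normal(0, 5, size=n)
df["Y"] = (signal >= 18).astype(int)

# Chronological split: train strictly before the cutoff, test on or after it.
features = ["prev_score", "past_failures"]
cutoff = pd.Timestamp("2023-01-01")
train = df[df["inspection_date"] < cutoff]
test = df[df["inspection_date"] >= cutoff]

# Illustrative parameter values only.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(max_depth=3, random_state=0),
    "naive_bayes": GaussianNB(),
}

results = {}
for name, model in models.items():
    model.fit(train[features], train["Y"])
    pred = model.predict(test[features])
    # With labels=[0, 1], ravel() returns tn, fp, fn, tp in that order.
    tn, fp, fn, tp = confusion_matrix(test["Y"], pred, labels=[0, 1]).ravel()
    results[name] = {
        "tn": tn, "fp": fp, "fn": fn, "tp": tp,
        "recall": recall_score(test["Y"], pred),
    }

for name, r in results.items():
    print(name, r)
```

No shuffling takes place at any point, so every test inspection is later than every training inspection.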

Section D: Final Assessment [5 marks available]

□ Given the analysis in Section C, justify a ‘winning’ classifier and why you have selected it for your final model, paying close attention to the business case in your consideration of measuring success.

Section E: Model Implementation [5 marks available for write-up]

□ Having selected the single, best-performing model, that model must then be trained against the whole training dataset ready for deployment. This section should specify that choice and briefly describe the resulting code/project files that are attached to your submission. In particular, this section should be used to supply brief instructions on how the recipient should use your submitted model code/files to process a new test set and make new predictions from your model.

N.b., marks awarded here are only for your write-up/instructions, with more marks available for the assessment of the model’s implementation code/files (see Further Available Marks).
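One way to package a final model so that a recipient can score new data is sketched below, assuming scikit-learn and joblib (which is installed alongside scikit-learn). The function names, the feature names and the file name are hypothetical, and the random forest stands in for whichever model you select.

```python
import os
import tempfile

import joblib
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Hypothetical engineered feature names; replace with your own.
FEATURES = ["prev_score", "past_failures"]


def train_and_save(train_df, path):
    """Fit on the whole training set and write the fitted model to disk."""
    model = RandomForestClassifier(n_estimators=50, random_state=0)
    model.fit(train_df[FEATURES], train_df["Y"])
    joblib.dump(model, path)


def predict_new(new_df, path):
    """Load the saved model and return 0/1 predictions for unseen rows."""
    model = joblib.load(path)
    preds = model.predict(new_df[FEATURES])
    return pd.Series(preds, index=new_df.index, name="Y_pred")


# Toy data for illustration only.
train_df = pd.DataFrame({
    "prev_score": [5, 8, 30, 35, 10, 28],
    "past_failures": [0, 0, 3, 4, 1, 2],
    "Y": [0, 0, 1, 1, 0, 1],
})
new_df = pd.DataFrame({"prev_score": [6, 33], "past_failures": [0, 3]})

model_path = os.path.join(tempfile.mkdtemp(), "final_model.joblib")
train_and_save(train_df, model_path)
print(predict_new(new_df, model_path))
```

In a real submission the new data would first pass through the same preparation and feature engineering steps as the training data before reaching `predict_new`.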

Section F: Business Case Recommendations [10 marks available]

□ Summarize the business case for the Department of Health. Provide actionable recommendations based on your model and analysis. For example, what types of cuisines or neighborhoods warrant more attention? Could inspection schedules be dynamically allocated based on your model’s risk scores?
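The idea of allocating inspections by risk score could be sketched as below, assuming scikit-learn. The restaurant identifiers, feature names, toy data and the `capacity` value are all hypothetical; the sketch only shows how predicted probabilities can be turned into a priority list.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

features = ["prev_score", "past_failures"]  # hypothetical engineered features

# Toy training data for illustration only.
train = pd.DataFrame({
    "prev_score": [5, 8, 30, 35, 10, 28, 12, 25],
    "past_failures": [0, 0, 3, 4, 1, 2, 0, 3],
    "Y": [0, 0, 1, 1, 0, 1, 0, 1],
})

# Restaurants due for inspection, described only by information known today.
upcoming = pd.DataFrame({
    "restaurant_id": ["R1", "R2", "R3", "R4"],
    "prev_score": [7, 22, 34, 15],
    "past_failures": [0, 2, 4, 1],
})

model = LogisticRegression(max_iter=1000).fit(train[features], train["Y"])

# Column 1 of predict_proba holds the probability of model.classes_[1],
# which is class 1 (high score) here.
upcoming["risk_score"] = model.predict_proba(upcoming[features])[:, 1]

# Hypothetical capacity: the number of inspections that can be scheduled first.
capacity = 2
priority = upcoming.sort_values("risk_score", ascending=False).head(capacity)
print(priority[["restaurant_id", "risk_score"]])
```

Ranking by probability rather than by a hard 0/1 prediction lets the size of the priority list follow inspector availability.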

Further Available Marks:

□ Overall presentation of your report, its argument, and professionalism → [5 marks available]

□ The standard of your submitted Evaluation/Final modelling code/workflows. It will be expected that in this code/workflow, you will also have supplied some means for the user to load new data (in the same format as your supplied dataset) and make new predictions. → [20 marks available]

□ The effectiveness of your model as assessed against our held-back test dataset → [5 marks available]

Note that the models you submit will be tested on another external dataset that I have held out (and which you will not have access to, reflecting the fact that these represent “future” inspection outcomes). Thus, as well as receiving marks for your report, your model implementation, and how well you have tested, evaluated, and justified its construction, there are also additional marks for how well it will predict our hidden test set!

6. Submission

→ In your submission, please submit a zip of the following files:

1.  Your Final Report (maximum 8 pages excluding the front page. Please indicate your word count at the end of your report).

2.  Your Evaluation Code / Workflow files and Final Model Code / Workflow

→ Submissions must be submitted via the Moodle submission link

→ Submission must be received by:

11th Dec 2025, 3:00 pm

Potential Penalties:

→ Late submissions will lose 5% of their final mark per day.

→ Submitted reports over 8 pages will be received, but only the first 8 pages will be assessed. This is a strict rule.

7. Final Important Note on Plagiarism

→ While everyone is provided with the same foundational dataset, your model performance is expected to vary. These differences will primarily stem from your unique feature engineering strategies, which are a key component of this assessment.

→ This means that the set of features you design, create, and justify in your Section B table is a direct reflection of your individual analytical approach. It is expected to be unique to you. Consequently, it will be a primary point of scrutiny for academic integrity.

→ Submissions with identical or suspiciously similar sets of engineered features, justifications, or implementation logic will be considered strong evidence of plagiarism.

→ All code and workflows will also be examined to ensure there is no repetition between submissions. While you are able to discuss high-level ideas and strategies with peers, the final implementation and analysis must be 100% your own individual work. Any plagiarised work will immediately receive zero marks and be reported to the School.

8. Some Additional Tips!

•    Throughout this coursework, showing thought processes and understanding of how you assess a model in light of the business case is more important than the final predictive test result.

•    AVOID DATA LEAKAGE AT ALL COSTS! This is the most common and most serious mistake in this type of project. Your features for predicting the outcome of an inspection on date ‘X’ can only use information known before date ‘X’. If you create features using any information from the inspection you are trying to predict (e.g., using the critical_flag count of an inspection to predict the score of that same inspection), your model’s performance will be artificially inflated and useless in practice. Your model must predict the future, not describe the present.

•    You may use any analysis tools to formulate your report, but your submitted model must be implemented in Python or Orange (or both). You can assume the recipient is using Python 3 and Orange respectively, and has sklearn, scipy, numpy, pandas, matplotlib and seaborn installed.

•    Note the page length available in total, and the available marks for each section to assess how much time and effort to place in each.

•    Note that presentation of your work is also being assessed. This is a formal report directed to a business professional, and should be formatted and worded accordingly.

•    Using Python rather than Orange will not necessarily gain you any extra marks. However, it will likely give you the opportunity to show off more sophisticated analysis, and so increase your potential for higher marks in those respective areas.
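The data leakage tip above can be checked mechanically. The sketch below, assuming pandas, changes the score of the inspection being predicted and confirms that a history-only feature is unaffected, while a naive feature that includes the current row does change. The function and column names other than “SCORE” are hypothetical.

```python
import pandas as pd


def add_safe_feature(df):
    """Mean of earlier scores only: shift(1) excludes the current inspection."""
    out = df.sort_values(["restaurant_id", "inspection_date"]).copy()
    out["prev_mean_score"] = out.groupby("restaurant_id")["SCORE"].transform(
        lambda s: s.shift(1).expanding().mean()
    )
    return out


def add_leaky_feature(df):
    """Mean of all scores, including the inspection being predicted (leaky)."""
    out = df.sort_values(["restaurant_id", "inspection_date"]).copy()
    out["mean_score"] = out.groupby("restaurant_id")["SCORE"].transform("mean")
    return out


# Toy data for illustration only.
df = pd.DataFrame({
    "restaurant_id": ["A", "A", "A"],
    "inspection_date": pd.to_datetime(["2022-01-01", "2022-07-01", "2023-01-01"]),
    "SCORE": [10, 20, 30],
})

# Perturb only the most recent inspection, i.e. the one being predicted.
changed = df.copy()
changed.loc[2, "SCORE"] = 99

safe_same = add_safe_feature(df)["prev_mean_score"].equals(
    add_safe_feature(changed)["prev_mean_score"]
)
leaky_same = add_leaky_feature(df)["mean_score"].equals(
    add_leaky_feature(changed)["mean_score"]
)
print("safe feature unchanged:", safe_same)
print("leaky feature unchanged:", leaky_same)
```

A feature that changes when only the current inspection’s outcome changes is using information that would not be available at prediction time.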



