N1550 Data Analytics for Accounting & Finance 
	Assessment Instrument Group Project (assessment type PRJ) 
	Your Assessment at a glance. 
	The aim of this assessment is to analyse a dataset of your choice using the techniques covered in the module.
	
		
	| Number of group members | Two |
	| Number of words | 2,000 +/- 10% as per Sussex policy. Word count includes tables and charts that are part of the main body (i.e., not part of any optional appendices). Word count excludes optional references and appendices. Please supply tables and charts inline (not at the end). Screenshot all Python code. References are optional in this assignment (apart from a reference to the dataset); if you include them, please use Harvard referencing style. |
	| Percentage of total mark | 40% |
	| Deadline | End of Week 10. Please check Sussex Direct for the definite date and time. |
	
	Choice of dataset 
	You may choose any dataset that meets the following criteria:
	1.   It must be a public domain, freely available dataset.
	2.   The dataset should ideally contain at least two tables connected by primary keys and foreign keys. If the dataset contains just one table, it should be clear that it has been denormalised.
	3.   The dataset must contain a metric variable which can realistically serve as a dependent variable (for example, a performance score of some kind).
	4.   The dataset must contain another metric variable which can realistically serve as an independent variable.
	5.   The dataset must contain at least one categorical variable (to assist with analysis). You could create a categorical variable from a metric variable using Python.
	6.   The dataset’s main table must contain at least 500 datapoints (double check with module convenor if you are very keen on a dataset which meets all other criteria, just not this one).
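	Criteria 3 to 6 can be checked quickly in Python before you request approval. The sketch below is illustrative only: the table, its column names (`revenue`, `score`, `revenue_band`) and the helper `check_dataset` are hypothetical, and the values are randomly generated. It also shows one possible way of creating a categorical variable from a metric variable, using `pandas.cut`. Passing these checks does not tell you whether a variable is a realistic dependent or independent variable; that remains a judgement you need to make.

```python
import numpy as np
import pandas as pd


def check_dataset(df: pd.DataFrame, min_rows: int = 500) -> dict:
    """Summarise how a main table compares with criteria 3 to 6."""
    metric_cols = df.select_dtypes(include="number").columns.tolist()
    # Non-numeric columns are treated here as candidate categorical variables.
    categorical_cols = df.select_dtypes(exclude="number").columns.tolist()
    return {
        "rows": len(df),
        "enough_rows": len(df) >= min_rows,
        "metric_columns": metric_cols,
        "enough_metric_columns": len(metric_cols) >= 2,
        "categorical_columns": categorical_cols,
        "has_categorical": len(categorical_cols) >= 1,
    }


# Hypothetical stand-in for a real main table (600 rows of made-up values).
rng = np.random.default_rng(seed=0)
main = pd.DataFrame({
    "revenue": rng.uniform(10, 100, size=600),
    "score": rng.uniform(0, 10, size=600),
})

# Criterion 5: derive a categorical variable from a metric variable.
main["revenue_band"] = pd.cut(
    main["revenue"],
    bins=[0, 40, 70, 100],
    labels=["low", "medium", "high"],
)

report = check_dataset(main)
print(report)
```

	With a real dataset you would replace the generated table with something like `pd.read_csv(...)` on your own file and choose bin edges that make sense for your variable.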
	A good place to look for suitable datasets is Kaggle (https://www.kaggle.com) but this is not required. The textbook has a list of suitable sites in Chapter 2, Exhibit 2-1, p. 55.
	
		To ensure there is no duplication, each dataset must be approved by the module convenor before the report is submitted. We approve datasets on a first-come, first-served basis, meaning that if a dataset is already used by other students you can no longer use it for your project.
	
	
		Approval does not necessarily mean that your dataset meets the above conditions: it remains your responsibility to ensure that it does.
	
	
		Email your approval request to [email protected]. To avoid large emails, please include only a link to the dataset, not the dataset itself.
	
	
		Any report with a dataset that does not meet the above criteria and is not pre-approved will normally be capped at 40%.
	
	
		Marking criteria 
	
	
		We will assess your report on the basis of the standard criteria for projects at the Year 2 Undergraduate Level, which you can find on Canvas.
	
	
		More specific marking guidance for this project is provided in the section “Structure of report” below.
	
	
		Structure of report 
	
	
		Use the following structure to write your report:
	
	
		
			
		| IMPACT Step | Mark weighting | Minimum required (mark guidance 40%-60%) | Going the extra mile (mark guidance 60%-80+%) |
		| 1. Identifying the questions | 15% | Introduce the dataset and three potential questions you wish to investigate. Include equal contribution statement (see below). | Introduce the dataset and three potential questions you wish to investigate. Include equal contribution statement (see below). |
		| 2. Mastering the Data | 25% | Produce a database model for the dataset, either ERD or UML. Identify primary and foreign keys. (The model may contain only one table, but you can and should still identify how the table was constructed from normalised tables.) Use Excel VLOOKUP or DB Browser for SQLite to access and join the data into a denormalised table. | Produce a database model for the dataset, either ERD or UML. There are multiple tables for the dataset, and one-to-many relationships are clearly identified. Identify primary and foreign keys. Use DB Browser for SQLite or Python to import the data. Join the datasets with Pandas and export the final dataset to Excel. |
		| 3. Performing test plan | 25% | Perform a regression analysis using Excel. Document the outcome. The regression result may relate to your questions. | Perform a regression analysis using Excel or Python. Use Python to import the dataset and highlight some unusual values. Document the outcome. The regression result should relate to your questions. |
		| 4. Address and Refine Results | 25% | Answer the three questions about your dataset, and use three appropriate visualisations to illustrate your answers. Provide a clear and concise narrative. | Answer the three questions about your dataset, and use three appropriate visualisations to illustrate your answers. Include traditional and non-traditional charts to illustrate your points (something other than pie charts, bar charts, or line charts). |
		| 5. Communicate Insights | 10% | Wrap up your report. Write in plain English what you have found. | Wrap up your report. |
		| 6. Optional References | | | |
		| 7. Optional Appendices | | | |
		
	
	
		For a definition of some of the terms, please refer to the module lectures, seminars, and textbook.
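		As an illustration of the “Mastering the Data” step at the “going the extra mile” level, the sketch below joins two tables with Pandas after checking the primary and foreign keys. It is a minimal example under stated assumptions, not a required method: the tables `companies` and `results`, their columns, and the output file name `joined_dataset.xlsx` are all hypothetical.

```python
import pandas as pd

# Hypothetical normalised tables: one company has many annual results.
companies = pd.DataFrame({
    "company_id": [1, 2, 3],                 # primary key
    "sector": ["Retail", "Banking", "Energy"],
})
results = pd.DataFrame({
    "result_id": [10, 11, 12, 13],           # primary key
    "company_id": [1, 1, 2, 3],              # foreign key to companies
    "profit": [5.0, 6.5, 12.0, 3.2],
})

# A primary key must be unique and non-missing.
assert companies["company_id"].is_unique
assert companies["company_id"].notna().all()

# Every foreign key value should exist in the parent table.
orphans = results[~results["company_id"].isin(companies["company_id"])]
print(f"Rows with no matching company: {len(orphans)}")

# Join into one denormalised table. validate="many_to_one" makes pandas
# raise an error if company_id is not unique in the companies table.
joined = results.merge(
    companies,
    on="company_id",
    how="left",
    validate="many_to_one",
)
print(joined)

# Export for Excel (needs an Excel writer such as openpyxl installed):
# joined.to_excel("joined_dataset.xlsx", index=False)
```

		The `validate` argument is a convenient way of confirming the one-to-many relationship shown in your database model: the join fails loudly instead of silently duplicating rows.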
	
	
		Document all Python code that is used. A statement such as ‘we used Python’ is not sufficient. Liberally use screenshots to document your points.
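		As an example of the kind of Python code you might document for the “Performing test plan” step, the sketch below fits a simple linear regression and highlights unusual values. It is one possible approach only: the data are randomly generated, the column names `advertising` and `sales` are hypothetical, and the rule of flagging residuals more than three standard deviations from zero is an assumption rather than a module requirement.

```python
import numpy as np
import pandas as pd

# Hypothetical data: sales depends linearly on advertising plus noise,
# with one deliberately planted unusual value in the first row.
rng = np.random.default_rng(seed=1)
x = rng.uniform(0, 50, size=200)
y = 3.0 + 2.0 * x + rng.normal(0, 4, size=200)
y[0] = y[0] + 80
df = pd.DataFrame({"advertising": x, "sales": y})

# Fit sales = intercept + slope * advertising by ordinary least squares.
slope, intercept = np.polyfit(df["advertising"], df["sales"], deg=1)
df["fitted"] = intercept + slope * df["advertising"]
df["residual"] = df["sales"] - df["fitted"]

# R squared: share of the variance in sales explained by the model.
ss_res = (df["residual"] ** 2).sum()
ss_tot = ((df["sales"] - df["sales"].mean()) ** 2).sum()
r_squared = 1 - ss_res / ss_tot

# Flag observations whose residual is more than 3 standard deviations from zero.
df["unusual"] = df["residual"].abs() > 3 * df["residual"].std()

print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r_squared:.3f}")
print(df[df["unusual"]])
```

		In your report you would still need to explain in words what the slope and R squared mean for your questions, and why the flagged rows are unusual.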
	
	
		All screenshots should be full-screen screenshots. We do not accept partial or strategically cropped screenshots.
	
	
		Group dynamics 
	
	
		You are expected to produce this report in pairs. We will not accept groups of one, or of three or more. Any report not produced in pairs would normally be capped at 40%.
	
	
		If you have reasonable adjustments in place for this module, and these adjustments cover your ability to function in a group, please contact the module convenor, and exceptionally you will be able to produce this report on your own.
	
	
		Each report must contain the following statement: “Both authors contributed equally to the final project report”. Any report without this statement would normally be capped at 40%. 
	
	
		Each member of the group will receive the same mark.
	
	
		Please make sure each project member contributes equally to the project report. This doesn’t mean that each project member needs to write exactly 1,000 words, because contributions can also be made in analysis and data modelling. However, it does mean that hours spent to produce the final deliverable should be more or less equal.
	
	
		In case of a dispute that cannot be resolved amicably and in time for the deadline, please submit the report individually and document clearly the source of the dispute and any proposed resolutions that have not helped (outside of the 2,000 word limit).
	
	
		If you cannot find a student partner through no fault of your own, and you have exhausted all reasonable options, please get in touch with the module convenor. You will then be assigned another student who is in the same position. You will be expected to work together as a pair in the same way as other pairs. Such manual assignment will normally be on a first-come, first-served basis.
	
	Learning Outcomes being Assessed 
	The following two course learning outcomes are being assessed with this instrument:
	•    LO2 Work effectively independently and collaboratively
	•    LO4 Communicate information, ideas, problems, and solutions to specialist and non-specialist audiences using a variety of technologies
	The following two module learning outcomes are being assessed with this instrument:
	•    LO2 Develop and correctly interpret core data management concepts that are fundamental to the design of modern information systems in accounting and finance
	•    LO3 Extract, visualise, and communicate key trends and insights from large datasets in the context of accounting and finance