N1550 Data Analytics for Accounting & Finance
Assessment Instrument: Group Project (assessment type PRJ)
Your Assessment at a glance.
The aim of this assessment is to analyse a dataset of your choice using the techniques covered in the module.
Number of group members | Two
Number of words | 2,000 +/- 10% (1,800 to 2,200 words) as per Sussex policy. The word count includes tables and charts that are part of the main body (i.e. not part of any optional appendices). The word count excludes optional references and appendices. Please supply tables and charts inline (not at the end). Screenshot all Python code. References are optional in this assignment (apart from a reference to the dataset); if you include them, please use the Harvard referencing style.
Percentage of total mark | 40%
Deadline | End of Week 10. Please check Sussex Direct for the definite date and time.
Choice of dataset
You may choose any dataset that meets the following criteria:
1. It must be a public-domain, freely available dataset.
2. The dataset should ideally contain at least two tables connected by primary keys and foreign keys. If the dataset contains just one table, it should be clear that it has been denormalised.
3. The dataset must contain a metric variable which can realistically serve as a dependent variable (for example, a performance score of some kind).
4. The dataset must contain another metric variable which can realistically serve as an independent variable.
5. The dataset must contain at least one categorical variable (to assist with analysis). You could create a categorical variable from a metric variable using Python.
6. The dataset’s main table must contain at least 500 datapoints (check with the module convenor if you are very keen on a dataset that meets every criterion except this one).
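Criteria 5 and 6 can be checked or satisfied with a few lines of Python. The following is a minimal sketch, assuming a made-up list of scores; the names `scores`, `meets_size_criterion`, and `to_category` are hypothetical and not part of any module material. It bins a metric variable into three categories using tercile cut points.

```python
from statistics import quantiles

# Hypothetical metric values (e.g. a performance score); not from a real dataset.
scores = [float(i % 97) for i in range(600)]

MIN_DATAPOINTS = 500  # criterion 6 in the list above


def meets_size_criterion(values, minimum=MIN_DATAPOINTS):
    """Return True if the main table has at least `minimum` datapoints."""
    return len(values) >= minimum


def to_category(value, low_cut, high_cut):
    """Map a metric value to one of three labels using two cut points."""
    if value <= low_cut:
        return "low"
    if value <= high_cut:
        return "medium"
    return "high"


# quantiles(..., n=3) returns the two boundaries that split the data into thirds.
low_cut, high_cut = quantiles(scores, n=3)
categories = [to_category(s, low_cut, high_cut) for s in scores]

print(meets_size_criterion(scores))
print({label: categories.count(label) for label in ("low", "medium", "high")})
```

If the data is already in a pandas DataFrame, `pandas.cut` or `pandas.qcut` could serve the same purpose; the choice of three bins here is only an illustration.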
A good place to look for suitable datasets is Kaggle (https://www.kaggle.com), although you are not required to use it. The textbook has a list of suitable sites in Chapter 2, Exhibit 2-1, p. 55.
To ensure there is no duplication, each dataset must be approved by the module convenor before the report is submitted. We approve datasets on a first-come, first-served basis: if a dataset is already being used by other students, you can no longer use it for your project.
Approval does not necessarily mean that your dataset meets the above conditions: it remains your responsibility to ensure that it does.
Email your approval request to [email protected]. To avoid large emails, please include only a link to the dataset, not the dataset itself.
Any report with a dataset that does not meet the above criteria and is not pre-approved will normally be capped at 40%.
Marking criteria
We will assess your report on the basis of the standard criteria for projects at the Year 2 Undergraduate Level, which you can find on Canvas.
More specific marking guidance for this project is provided in the section “Structure of report” below.
Structure of report
Use the following structure to write your report:
IMPACT Step | Mark weighting | Minimum required (Mark guidance 40%-60%) | Going the extra mile (Mark guidance 60%-80+%)
1. Identifying the questions | 15% | Introduce the dataset and three potential questions you wish to investigate. Include the equal contribution statement (see below). | Introduce the dataset and three potential questions you wish to investigate. Include the equal contribution statement (see below).
2. Mastering the Data | 25% | Produce a database model for the dataset, either ERD or UML. Identify primary and foreign keys. (The model may contain only one table, but you can and should still identify how the table was constructed from normalised tables.) Use Excel VLOOKUP or DB Browser for SQLite to access and join the data into a denormalised table. | Produce a database model for the dataset, either ERD or UML. There are multiple tables for the dataset, and one-to-many relationships are clearly identified. Identify primary and foreign keys. Use DB Browser for SQLite or Python to import the data. Join the datasets with Pandas and export the final dataset to Excel.
3. Performing test plan | 25% | Perform a regression analysis using Excel. Document the outcome. The regression result may relate to your questions. | Perform a regression analysis using Excel or Python. Use Python to import the dataset and highlight some unusual values. Document the outcome. The regression result should relate to your questions.
4. Address and Refine Results | 25% | Answer the three questions about your dataset, and use three appropriate visualisations to illustrate your answers. Provide a clear and concise narrative. | Answer the three questions about your dataset, and use three appropriate visualisations to illustrate your answers. Include traditional and non-traditional charts to illustrate your points (something other than pie charts, bar charts, or line charts).
5. Communicate Insights | 10% | Wrap up your report. Write in plain English what you have found. | Wrap up your report.
6. Optional References | | |
7. Optional Appendices | | |
For a definition of some of the terms, please refer to the module lectures, seminars, and textbook.
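The join described under "Mastering the Data" can be sketched as follows. This is an illustration only, assuming two invented tables, `customer` and `invoice`, with made-up rows; your own dataset will have different tables and keys. It uses Python's built-in `sqlite3` module, so the same SQL could also be run against a SQLite file.

```python
import sqlite3

# In-memory database holding two hypothetical normalised tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,               -- primary key
        region      TEXT NOT NULL
    );
    CREATE TABLE invoice (
        invoice_id  INTEGER PRIMARY KEY,               -- primary key
        customer_id INTEGER NOT NULL
                    REFERENCES customer(customer_id),  -- foreign key
        amount      REAL NOT NULL
    );
    INSERT INTO customer VALUES (1, 'North'), (2, 'South');
    INSERT INTO invoice VALUES (10, 1, 250.0), (11, 1, 99.5), (12, 2, 400.0);
    """
)

# One customer has many invoices, so the join repeats the customer's
# attributes on every invoice row: this is the denormalised table.
rows = conn.execute(
    """
    SELECT i.invoice_id, i.customer_id, c.region, i.amount
    FROM invoice AS i
    JOIN customer AS c ON c.customer_id = i.customer_id
    ORDER BY i.invoice_id
    """
).fetchall()
conn.close()

for row in rows:
    print(row)
```

In pandas, `DataFrame.merge` with `on="customer_id"` plays the same role as the SQL join, and `DataFrame.to_excel` can write the joined table to an Excel file.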
Document all Python code that is used. A statement such as ‘we used Python’ is not sufficient. Liberally use screenshots to document your points.
All screenshots should be full-screen screenshots. We do not accept partial or strategically cropped screenshots.
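The regression and unusual-value steps under "Performing test plan" might look like the sketch below. It assumes invented `x` and `y` values and two hypothetical helper functions, `simple_ols` and `iqr_outliers`. The 1.5 x IQR rule is one common rule of thumb for flagging unusual values, not a method prescribed by this brief.

```python
from statistics import mean, quantiles

# Hypothetical data: x is the independent variable, y the dependent variable.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 40.0]  # last value is deliberately odd


def simple_ols(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    sxx = sum((a - x_bar) ** 2 for a in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # R squared: share of the variation in y explained by the fitted line.
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(xs, ys))
    ss_tot = sum((b - y_bar) ** 2 for b in ys)
    return slope, intercept, 1 - ss_res / ss_tot


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    return [v for v in values if v < q1 - k * spread or v > q3 + k * spread]


slope, intercept, r_squared = simple_ols(x, y)
unusual = iqr_outliers(y)
print(f"slope={slope:.3f} intercept={intercept:.3f} R^2={r_squared:.3f}")
print("unusual values:", unusual)
```

Whichever tool you use, the report should still document the outcome and relate it to your questions, as set out in the table above.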
Group dynamics
You are expected to produce this report in pairs. We will not accept groups of one, or of three or more. Any report not produced in pairs would normally be capped at 40%.
If you have reasonable adjustments in place for this module, and these adjustments cover your ability to function in a group, please contact the module convenor, and exceptionally you will be able to produce this report on your own.
Each report must contain the following statement: “Both authors contributed equally to the final project report”. Any report without this statement would normally be capped at 40%.
Each member of the group will receive the same mark.
Please make sure each project member contributes equally to the project report. This doesn’t mean that each project member needs to write exactly 1,000 words, because contributions can also be made in analysis and data modelling. However, it does mean that the hours each member spends on the final deliverable should be more or less equal.
In case of a dispute that cannot be resolved amicably and in time for the deadline, please submit the report individually and clearly document the source of the dispute and any proposed resolutions that did not help. This documentation does not count towards the 2,000-word limit.
If you cannot find a student partner through no fault of your own, and you have exhausted all reasonable options, please get in touch with the module convenor. You will then be assigned another student who is in the same position. You will be expected to work together as a pair in the same way as other pairs. Such manual assignment will normally be on a first-come, first-served basis.
Learning Outcomes being Assessed
The following two course learning outcomes are being assessed with this instrument:
• LO2 Work effectively independently and collaboratively
• LO4 Communicate information, ideas, problems, and solutions to specialist and non-specialist audiences using a variety of technologies
The following two module learning outcomes are being assessed with this instrument:
• LO2 Develop and correctly interpret core data management concepts that are fundamental to the design of modern information systems in accounting and finance
• LO3 Extract, visualise, and communicate key trends and insights from large datasets in the context of accounting and finance