
Data Science & Machine Learning in Finance (ACCFIN5246)

Course Project – Spring 2025


1   Instruction

(I)  Deadline: 4 March, noon.

(II)  This course project contributes to the overall course grade in two parts: (i) 35% via a quiz format and (ii) 50% via the reflective report.  This is an individual assessment.  Answer all questions. Both submissions (i)-(ii) are to be made electronically via the course Moodle page. Each part specifies further instructions.

(III)  Results should be reported in a clear format. Avoid reporting numbers in scientific notation, e.g. 7.2031e-06.  All reported numbers should be rounded to two decimal places. For example, report 0.00 in place of 7.2031e-06.
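The reporting rule in (III) can be illustrated with a short Python sketch. The helper name `fmt2` is hypothetical, and the handling of a negative zero is an assumption about how such values should be displayed rather than something the brief specifies.

```python
def fmt2(x: float) -> str:
    """Format a number in fixed-point notation with two decimal places."""
    text = f"{x:.2f}"
    # Tiny negative values would otherwise print as "-0.00" (assumed undesirable).
    return "0.00" if text == "-0.00" else text


print(fmt2(7.2031e-06))  # fixed-point, not scientific notation
print(fmt2(0.2075))
```

Fixed-point formatting via the `.2f` format specifier avoids the exponent form that Python uses by default for very small floats.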

(IV)  The grading in (i) is carried out strictly based on the precision of results (with a minor tolerance for variation).  The reflective component (ii) is graded according to the clarity of arguments, connectedness of the interpretations to the underlying quantitative framework, clarity of the visualisations, and financial implications.  The final part is graded based on the relevance of the finance analysis supported by the methodologies and empirical results.

2   Data: Acquisition and Description

All data series are to be researched thoroughly to ensure consistency with other variables, in terms of economic interpretation, units, frequency, and other characteristics.  The study period is 2000-2023.  The cleaned dataset should be arranged in both daily and weekly frequencies in preparation for the various results requested in Section (3.2).

2.1   Dataset 1 (DS1)

The first dataset (DS1) is provided in the course project. It includes 63 variables and summarises MSFT public stock transactions and market information. The data is acquired from the WRDS-CRSP universe. DS1 only includes data entries; the data description and data dictionary are provided separately via WRDS: Data Description Guide. All computations and analyses should explicitly take into account the instructions provided in the data description guide.

The dataset provides part of the variables needed to gather descriptive and inferential statistics. The data covers 2000-01-03 to 2023-12-29, on a daily basis. The project set-up below refers to several variables (not all) included in DS1, for example:

• Calendar Date: Trading day (date)

• Microsoft stock price (PRC)

• S&P500 composite market index return (sprtrn)

• Outstanding shares (SHROUT)

• Ask, Bid Quotes (ASK, BID)

• Market index returns data on a value-weighted market portfolio including dividends reinvested (variable: vwretd) and excluding dividends (variable: vwretx). This is based on the US Total Market Index produced by the CRSP that comprises nearly 4,000 constituents across mega, large, small and micro capitalizations, representing nearly 100% of the U.S. investable equity market.

The dataset includes additional variables. The course project will not require using those variables where there are no entries across an entire column.

2.2   Risk-free Rate (DS2)

Interest rates, associated with US 1-, 2-, and 10-year maturity treasuries.

2.3   US CPI (DS3)

The US consumer price index may be used to transform nominal data to real terms.

3   Data Preparation

3.1   Data Cleaning and Arrangement

• Construction of the financial dataset should take into account the possibility of sporadic observations.  When multiple series of disparate frequencies are used within the same model, variable timestamps must be aligned.

• The combined dataset, including all variables alongside a common timeline, may contain missing values and other irregularities.

• The definition of (daily) log-return provides a measurement for value changes between consecutive observation points, which may not necessarily be adjacent points in time (days, weeks, etc.) as a result of discarding missing values or synchronising an unbalanced set of observations.

• Assume the following when required:

  - A ‘calendar year’ comprises exactly 52 weeks. This introduces a minor discrepancy, since a year is approximately 52.17 weeks; disregard this discrepancy.

  - A trading year comprises 250 days; disregard any variations such as leap years or public holidays affecting the number of trading days.

  - A trading month is 25 days and a trading week is 5 days; disregard variations beyond these settings.

  - Weekly log-returns are defined as the percentage value change from a week’s first trading day to the next week’s first trading day.
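The two conventions above (log-returns between consecutive available observations, and weekly returns measured from one week's first trading day to the next week's first trading day) can be sketched in plain Python. The function names and the prices are hypothetical. Treating non-positive prices as missing is an assumption of this sketch; the WRDS guide should be checked for how PRC entries are meant to be handled.

```python
import math
from datetime import date


def log_returns(observations):
    """Log-returns between consecutive non-missing observations.

    observations: list of (date, price or None), sorted by date.
    Each return is stamped with the date of the later observation.
    """
    # Assumption: None or non-positive prices are treated as missing.
    clean = [(d, p) for d, p in observations if p is not None and p > 0]
    return [(d1, math.log(p1 / p0)) for (_, p0), (d1, p1) in zip(clean, clean[1:])]


def first_obs_per_week(observations):
    """Keep the first available observation of each ISO week."""
    seen, out = set(), []
    for d, p in observations:
        if p is None or p <= 0:
            continue
        week = tuple(d.isocalendar())[:2]  # (ISO year, ISO week number)
        if week not in seen:
            seen.add(week)
            out.append((d, p))
    return out


# Hypothetical prices; 2000-01-05 is missing and one week starts on a Tuesday.
daily = [
    (date(2000, 1, 3), 100.0),
    (date(2000, 1, 4), 102.0),
    (date(2000, 1, 5), None),
    (date(2000, 1, 6), 101.0),
    (date(2000, 1, 10), 110.0),
    (date(2000, 1, 11), 111.0),
    (date(2000, 1, 18), 121.0),
]
daily_r = log_returns(daily)
weekly_r = log_returns(first_obs_per_week(daily))
```

Here the return stamped 2000-01-06 spans two calendar days because the intervening observation was dropped, which is exactly the non-adjacent case described in the list.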

3.2   Main Variables

Based on the datasets and instructions in Sections (2)-(3.1), construct the following variables:

• Construct the daily MSFT log-returns (rt) using the price (PRC) as the basis variable.

• Denote the simple market net return, given by the variable (sprtrn), as (m,t).

• Construct log-returns using (m,t). Denote the market log-return as (rm,t). You may need to use the S&P index value at 2000-01-03, which was equal to 1455.22.

• Construct the net risk-free rate, using DS2 (variable: DGS2), denoted as (rf,t).

• Construct daily excess MSFT log-returns: xrt = rt − rf,t.

• Construct daily excess market log-returns: xrm,t = rm,t − rf,t.

• Construct the net inflation rate, using DS3 (variable: CPILFESL), by applying the log-return method, denoted by πt.

At this stage, the dataset is arranged to have daily observations {rt, m,t, rm,t, rf,t, xrt, xrm,t, πt}, and is also extended to include other variables provided in DS1 and DS2. If there are occasional missing values, investigate whether these are due to coding issues (for example, πt must always be populated on all days in the dataset above) or whether there have been unreported values among the asset, market and risk-free returns.
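A minimal sketch of the market and risk-free constructions follows. It rests on several assumptions that the brief does not confirm: that 1455.22 is the index level on the first sample date and simple returns are applied from the second date onward, that DGS2 is quoted as an annualised yield in percent, and that a daily log rate is obtained by dividing the annual log yield by the 250-day trading year. All function names are hypothetical.

```python
import math

BASE_LEVEL = 1455.22  # S&P index value on 2000-01-03, as stated in the text
TRADING_DAYS = 250    # trading-year convention from Section 3.1


def index_levels(simple_returns, base=BASE_LEVEL):
    """Rebuild index levels from simple net returns.

    Assumption: simple_returns[0] belongs to the base date and is not applied.
    """
    levels = [base]
    for r in simple_returns[1:]:
        levels.append(levels[-1] * (1.0 + r))
    return levels


def log_returns_from_levels(levels):
    """Log-return between each pair of consecutive levels."""
    return [math.log(b / a) for a, b in zip(levels, levels[1:])]


def daily_log_rf(annual_yield_percent, days=TRADING_DAYS):
    """Daily log risk-free rate from an annualised percent yield (assumed convention)."""
    return math.log(1.0 + annual_yield_percent / 100.0) / days


def excess(log_ret, rf):
    """Excess log-returns: element-wise difference of returns and risk-free rates."""
    return [r - f for r, f in zip(log_ret, rf)]


# Hypothetical inputs for three trading days.
sprtrn = [0.0, 0.01, -0.02]
dgs2 = [5.0, 5.0]  # percent per annum, aligned with the two computed returns
rm = log_returns_from_levels(index_levels(sprtrn))
rf = [daily_log_rf(y) for y in dgs2]
xrm = excess(rm, rf)
```

Working through levels makes the role of the 1455.22 base value explicit; the log-return of each day equals ln(1 + simple return) whichever base is used, so the base only matters when the level itself is needed.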

4   Objective Component (35%)

Each question carries equal weight and is graded according to the University Objective Grading Scale. Provide the answers to the following questions via the Objective Component Section on the Course Moodle page.

Note   For all of the questions (Q1-Q20), any numerical answer must be rounded to two decimal places.  The computation is carried out at daily frequency and for the whole sample timespan.

Q1. Average stock price (as recorded by the PRC variable).

Q2. Average outstanding shares (as recorded by the SHROUT variable).

Q3. Final stock price on 2023-12-29 (as recorded by the PRC variable).

Q4. Final outstanding shares balance on 2023-12-29 (as recorded by the SHROUT variable).

Q5. Average ask-bid spread (as recorded by the ASK−BID variables).

Q6. Final ask-bid spread (as recorded by the ASK−BID variables) on 2023-12-29.

Q7. Construct the log-returns based on the PRC variable, and similarly, percentage changes in the outstanding shares (constructed following a similar method as the log-returns of the prices). Run a simple linear regression using OLS where the outcome variable is the log-returns on the prices, on an intercept and an independent variable which is the percentage change of the outstanding shares. Report the slope coefficient.

Q8. Select the units of the coefficient reported in the previous part.

Q9. The lowest stock price value in the whole sample.

Q10. The highest stock price value in the whole sample occurred in which year? Enter the year value only (4 digits, e.g. 2000 without any month or day).

Q11. Construct the log-returns based on the PRC variable on the original frequency, and similarly, include the percentage changes in the S&P market index (sprtrn, noting the value under this variable is already in percentage changes).  Run a simple linear regression using OLS where the outcome variable is the log-returns on the prices, on an intercept and an independent variable which is the percentage changes in the S&P market index. Report the slope coefficient.

Q12. Construct the log-returns based on the PRC variable on the original frequency, and similarly, include the percentage changes in the S&P market index (sprtrn, noting the value under this variable is already in percentage changes).  Run a simple linear regression using OLS where the outcome variable is the log-returns on the prices, on an intercept and two independent variables which are the percentage changes in the S&P market index. Report the slope coefficient on the percentage changes in the S&P market index.

Q13. Construct the dividend-yield (DP ratio). Report the sum of DP ratios.
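For the simple regressions in Q7 and Q11 (one regressor plus an intercept), the OLS slope is the sample covariance of the two series divided by the variance of the regressor. The sketch below shows that calculation in plain Python; the function name and the numbers are hypothetical and are not taken from DS1.

```python
def ols_simple(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0.0:
        raise ValueError("x has no variation, so the slope is undefined")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical data lying exactly on the line y = 0.001 + 1.2 * x.
market = [0.01, -0.02, 0.005, 0.015]
stock = [0.001 + 1.2 * m for m in market]
intercept, slope = ols_simple(market, stock)
```

Both series must be aligned on the same dates before the regression is run, with rows containing a missing value on either side removed.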

Compute the following financial returns. Assume an investment value of $1, invested at the start of 2000-01-03 and liquidated on 2023-12-29. Net returns should be reported as decimals rounded to two places; for example, 20.75% is 0.2075, entered as 0.21.

Q14. Total nominal net return associated with the S&P market index.

Q15. The average (annual) nominal net return associated with the S&P market index (account for compounding).

Q16. Nominal net return associated with the MSFT stock.

Q17. Real net return associated with the S&P market index.

Q18. Real net return associated with the MSFT stock.

Q19. The average (annual) real net return associated with the MSFT stock (account for compounding).

Q20. Total real excess net return for the MSFT stock.
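The return definitions behind Q14-Q19 can be sketched as three small functions: a total net return over the holding period, its compounded annual equivalent, and a real return obtained by deflating with the inflation accumulated over the same period. The function names are hypothetical, the numbers are illustrative only, and treating the sample as 24 years long is an assumption of this sketch.

```python
def total_net_return(start_value, end_value):
    """Net return over the whole holding period."""
    return end_value / start_value - 1.0


def annualised_net_return(total, years):
    """Average annual net return that compounds to `total` over `years`."""
    return (1.0 + total) ** (1.0 / years) - 1.0


def real_net_return(nominal, inflation):
    """Deflate a nominal net return by the inflation over the same period."""
    return (1.0 + nominal) / (1.0 + inflation) - 1.0


# Hypothetical example: $1 grows to $4 while the price level rises by 80%.
nominal_total = total_net_return(1.0, 4.0)
nominal_annual = annualised_net_return(nominal_total, years=24)  # assumed horizon
real_total = real_net_return(nominal_total, inflation=0.80)
```

Dividing gross returns, rather than subtracting inflation from the nominal net return, keeps the real return exact over long horizons, where the approximation error of the subtraction becomes large.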

5   Reflective Component (50%)

1. Investigate the ASK-BID variable and comment if the behaviour of the data series over the full sample is reasonable. Do you observe any irregularities? Explain.

2. Consider the model characterised by the following specification:

$$xr_t = \alpha_w + \beta_w \, xr_{m,t} + u_t \qquad (1)$$

Note that the object of interest is the time-varying feature of the coefficients $\alpha_w$ and $\beta_w$. In particular, $\beta_w$ summarizes the conditional relationship, given a rolling window incorporating a consecutive but limited span of data, between the market excess log-return and an individual investment excess log-return. The diagram below provides an illustration to describe overlapping windows ($w$), each including a calendar year of data:

Figure 1: The timeline illustrates a rolling window set-up, where each iteration includes 52 consecutive weekly datapoints, W1, W2, ... refer to week numbers throughout the entire sample, and wi refers to a rolling window identifier.

Focus on the abnormal returns $\hat{\alpha}_w$ as the outcome variable, and the remaining variables included in DS1-DS3. Provide a shrinkage and regularisation methodology to explain the variations of the abnormal returns on the weekly frequency.

• The proposed methodology and estimation should not introduce new data; however, the existing variables may be combined or transformed (lagged data is acceptable).

• Provide a financial rationale for why the variables are selected.

• Provide a quantitative evaluation of how much the model set up to predict the abnormal returns may improve financial performance.  Benchmark this output against the value in Q20 established under Section (4).
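The rolling-window estimation of equation (1) and one possible shrinkage step can be sketched as follows. The brief does not name a specific regularisation method, so the lasso used here (fitted by coordinate descent) is only one candidate; the function names, the penalty values, and all data are hypothetical.

```python
import math


def ols_simple(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def rolling_alpha_beta(xr, xrm, window=52):
    """Estimate (alpha_w, beta_w) on every overlapping window of `window` weeks."""
    return [
        ols_simple(xrm[s:s + window], xr[s:s + window])
        for s in range(len(xr) - window + 1)
    ]


def soft_threshold(z, lam):
    """Shrink z towards zero by lam, returning exactly 0.0 inside [-lam, lam]."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso(X, y, lam, n_iter=200):
    """Coordinate-descent lasso for (1/2n) * ||y - a - Xb||^2 + lam * ||b||_1.

    X is a list of rows. Returns (intercept, coefficients).
    """
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    ybar = sum(y) / n
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]  # centred features
    resid = [yi - ybar for yi in y]  # residuals when all coefficients are zero
    b = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            zj = sum(Xc[i][j] ** 2 for i in range(n)) / n
            if zj == 0.0:
                continue
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(Xc[i][j] * (resid[i] + Xc[i][j] * b[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / zj
            delta = new_bj - b[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= Xc[i][j] * delta
                b[j] = new_bj
    intercept = ybar - sum(m * bj for m, bj in zip(means, b))
    return intercept, b


# Hypothetical weekly excess returns lying exactly on xr = 0.002 + 1.5 * xrm.
xrm = [0.01 * math.sin(t) for t in range(60)]
xr = [0.002 + 1.5 * m for m in xrm]
estimates = rolling_alpha_beta(xr, xrm)  # 60 - 52 + 1 = 9 windows

# Hypothetical outcome and two candidate predictors; only the first one matters.
X = [[float(k), 1.0 if k % 2 else -1.0] for k in range(1, 9)]
y = [0.5 + 2.0 * row[0] for row in X]
a_ols, b_ols = lasso(X, y, lam=0.0)        # no penalty: ordinary least squares
a_mid, b_mid = lasso(X, y, lam=1.0)        # moderate penalty
a_big, b_big = lasso(X, y, lam=100.0)      # penalty large enough to drop everything
```

In an application along these lines, the outcome passed to `lasso` would be the series of window estimates of alpha, and the columns of `X` would be (possibly lagged) variables from DS1-DS3 aligned to the same windows. Standardising the predictors and choosing the penalty by out-of-sample validation are further steps that this sketch leaves out.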

6   Background Reading

• Lecture slides

• WRDS: Data Description Guide

• S&P U.S. Indices Methodology

• Microsoft Annual Report (2023)

• Frank A. Wolak. An exact test for multiple inequality and equality constraints in the linear regression model.  Journal of the American Statistical Association, 82(399):782–793, 1987.

• Chong Kiew Liew. Inequality Constrained Least-Squares Estimation.  Journal of the American Statistical Association, 71(355):746–751, 1976.  ISSN 01621459.  URL http://www.jstor.org/stable/2285614




