BUSI4528-E1
A LEVEL 4 MODULE, AUTUMN SEMESTER 2023-2024
QUANTITATIVE RESEARCH METHODS FOR FINANCE AND INVESTMENT
1. (a) We used data from one of the OECD countries for the sample period 1971 to 2007 to examine the predictability of consumption growth. The variables are defined as follows:
csumptn: growth in real per capita private consumption.
hours: growth in per capita hours worked.
gov: growth in real per capita government consumption.
r: the real interest rate.
inc: growth in per capita real disposable labor income.
We obtained the following estimation results:
Write down the regression model, and interpret the meaning and significance of each coefficient. Are the results consistent with economic intuition? [25 marks]
(b) Explain what the Central Limit Theorem is. [10 marks]
(c) Explain what time series data, cross-sectional data, and panel data are. [15 marks]
(d) The Linear Probability Model (LPM) is a commonly used tool in finance and economics.
(i) Using standard mathematical notation, outline the LPM and its main features. Discuss its interpretation and assumptions. [20 marks]
(ii) What are some of the key limitations of the LPM? In your discussion, consider both the statistical and real-world financial implications. [15 marks]
(e) Define 'spurious regression' and explain why it presents a problem in the context of time series analysis. [15 marks]
2. (a) Consider the simple linear regression model, y = β1 + β2 x + e.
(i) Discuss the assumptions under which the ordinary least squares estimators are the best linear unbiased estimators of β1 and β2. [20 marks]
(ii) Explain how one can test the hypothesis that β2 = 1. [15 marks]
(b) Explain the difference between a population and a sample, and why sampling is needed. [15 marks]
(c) Serial correlation often presents a challenge when estimating linear regression models such as Autoregressive Distributed Lag (ARDL) models.
(i) Define serial correlation and explain why it can be problematic in the context of ARDL models. [15 marks]
(ii) What methods can be used to detect serial correlation in the context of ARDL regression models, and how can it be corrected? [20 marks]
(d) Explain the use of time series graphs in preliminary analyses for detecting a unit root in time-series data. [15 marks]
3. (a) Consider an experiment on whether adding one tutorial class after school improves student performance in the final exam. Suppose that we choose one class to attend the additional tutorial, denoted as TREAT_i = 1, and another class not to attend it, denoted as TREAT_i = 0. Let AFTER_t = 1 for the two semesters after the introduction of the new tutorial class, and AFTER_t = 0 for the two semesters before it. Let Finmark_{i,t} be the average final exam mark of each group in each semester. We obtained the estimation below:
What is the treatment effect in the above equation? Is the estimated treatment effect significant at the 5% level? [15 marks]
(b) Provide an explanation of Type I and Type II errors in hypothesis testing and suggest some strategies for minimizing these errors. [15 marks]
(c) Provide a detailed comparison of pooled OLS and fixed effects methods as they apply to panel data estimation, highlighting their differences and similarities. [20 marks]
(d) The Engle-Granger test is a common approach to testing for cointegration in time-series data.
(i) Explain the basic premise of the Engle-Granger test. [15 marks]
(ii) How is the test conducted and what are the underlying assumptions? [20 marks]
(iii) Discuss the interpretation of the test results and the implications of these results in the context of financial data analysis. [15 marks]
4. (a) Define the term multicollinearity and explain how it can be detected. Also, list the consequences of multicollinearity and explain how they can be mitigated. [50 marks]
(b) The Logit regression model is a powerful tool for predicting binary dependent variables.
(i) Explain the formulation of the Logit regression model and discuss how it can be used to predict binary outcomes. [10 marks]
(ii) What are the key assumptions of the model, and how can its results be interpreted? [10 marks]
(iii) Discuss any potential limitations or challenges in using the Logit regression model in financial data analysis. [10 marks]
(c) Generalized Least Squares (GLS) estimation is a method that can effectively address the issue of serial correlation in regression models.
(i) Explain how GLS estimation works and how it helps to mitigate serial correlation. In particular, discuss the role of the Cochrane-Orcutt method in this context. [10 marks]
(ii) What are the key assumptions and potential limitations of using GLS estimation to deal with serial correlation? [10 marks]