BUSI4528-E1
A LEVEL 4 MODULE, AUTUMN SEMESTER 2022-2023
QUANTITATIVE RESEARCH METHODS FOR FINANCE AND INVESTMENT
1. a) We use data on rice production from a sample of 88 Philippine rice farmers in 1994 to estimate the production function. The variables are defined as follows:
lnprod: logarithm of tons of freshly threshed rice.
lnarea: logarithm of hectares planted.
lnlabor: logarithm of person-days of hired and family labor.
lnfert: logarithm of kilograms of fertilizer.
We obtained the following estimation results:
(i) Write down the regression model and interpret the meaning and significance of each coefficient. [20 marks]
(ii) What could be a potential problem of this estimation? [10 marks]
(iii) Explain how to test the explanatory power of the regression. [10 marks]
b) Explain what time series data, cross-sectional data, and panel data are. [10 marks]
c) Describe two indicators that are suitable for evaluating the goodness-of-fit of binary choice models (e.g., Probit and Logit regression models). Explain why these are to be preferred to the conventional R-squared index. [20 marks]
d) Consider the following time series model:
y_t = β_0 + β_1 x_t + β_2 x_{t−1} + γ y_{t−1} + ε_t
It is suspected that the model suffers from serial correlation in the error term of the form:
ε_t = ρ ε_{t−1} + u_t
where u_t is an independently and identically distributed error term and t denotes the t-th time period. Describe in detail a test for the presence of serial correlation of the above form in the model. [30 marks]
2. a) Answer the following questions in relation to the linear regression model:
y_i = β_1 + β_2 x_{i2} + β_3 x_{i3} + ε_i
(i) Which assumptions are needed for the ordinary least squares (OLS) estimator, b, to be the best linear unbiased estimator (BLUE) of the parameters β? [20 marks]
(ii) Explain how a 95% confidence interval for β_2 can be constructed. What additional assumptions are needed? [10 marks]
(iii) Suppose that the R^2 from the regression x_{i2} = α_2 + α_3 x_{i3} + v_i is 0.75. What will happen if you try to estimate the model y_i = β_1 + β_2 x_{i2} + β_3 x_{i3} + ε_i? [20 marks]
b) What are the main limitations of the Linear Probability Model (LPM) as compared to the Probit and Logit regression models? [20 marks]
c) Describe how graphs of time series can be used in a preliminary analysis to detect the presence of a unit root. [10 marks]
d) Outline the Dickey-Fuller test for the null hypothesis (H0) of the presence of a unit root in an autoregressive time series model against the alternative hypothesis (HA) that the autoregressive time series model is stationary around a zero mean. [20 marks]
3. a) Answer the following questions about the concept of heteroscedasticity:
(i) Describe how heteroscedasticity can be detected. [25 marks]
(ii) What are the consequences of heteroscedasticity in linear regression? [15 marks]
(iii) What are the possible solutions to address this issue? [10 marks]
b) Consider the model
y_t = α + β_0 x_t + e_t
where the error term is AR(1): e_t = ρ e_{t−1} + v_t, |ρ| < 1, the v_t are i.i.d. with mean zero and variance σ_v^2, and t denotes the t-th time period. Show how the above model can be rewritten into a new model that has an error term uncorrelated over time. [20 marks]
c) Explain the Engle-Granger test for cointegration and write down the steps involved in the test. [30 marks]
4. a) We estimate two regressions describing the relationship between the cost per student and related factors at four-year colleges in the U.S., covering the period 1987 to 2011. Here tc is the logarithm of total cost per student, ftestu is the number of full-time equivalent students, ftgrad is the number of full-time graduate students, tt is the number of tenure-track faculty per 100 students, ga is the number of graduate assistants per 100 students, cf is the number of contract faculty per 100 students (who are hired on a year-to-year basis), staffsize is the number of staff, and benstaff is the benefit package for staff. Below are the estimation results from our empirical analysis:
Estimation I:
Estimation II:
What are the differences between the above two sets of results in terms of estimation methods? How would you determine which estimation method to use? [20 marks]
b) Discuss how the difference-in-difference (DID) estimator (specify the DID regression model) might be used to test for a potential treatment effect of a policy reform. Use graphs where necessary. [30 marks]
c) Explain the setup of the Logit regression model. What interpretation can be given to its regression coefficients? [20 marks]
d) Explain how to use the Durbin-Watson (DW) statistic to test for first-order autocorrelation, AR(1), of the error terms in static models of time series data. Discuss the limitations of the DW test for detecting serial correlation in the error terms. [20 marks]
e) When dealing with non-stationary time series, what is meant by ‘spurious regression’? [10 marks]