Pacific Business Review International

A Refereed Monthly International Journal of Management Indexed With THOMSON REUTERS(ESCI)

Macroeconomic Variables of India and Finite Sample Properties of OLS under Classical Assumptions

Author

Dr. Mohd Nayyer Rahman

Post-Doctoral Fellow-Indian Council of Social Science Research

Department of Commerce,

Aligarh Muslim University, Aligarh.

Abstract

Econometrics is one of the emerging branches of economic analysis. It is a conglomeration of mathematical economics, statistics, and economic theory. Researchers use macroeconomic variables for a wide range of analyses, most often through ordinary least squares (OLS) regression. However, a large section of researchers generally ignores the classical assumptions of regression. The present study empirically investigates selected macroeconomic variables of India in terms of the classical OLS assumptions. The objective is to understand the internal dynamics of the time series and to argue that checking the classical assumptions is a prerequisite for any time series analysis in econometrics. The study may be used for understanding the default properties of macroeconomic time series variables.

Keywords: Macroeconomic variables, Classical Assumptions, Gauss-Markov Theorem, Econometrics

Section 1: Introduction

Econometrics as a branch of study is the conglomeration of mathematical economics, economic theory and statistics. Its development is credited to Ragnar Frisch in the 1930s. Today, econometrics is widely used as a data analysis tool in most areas related to economics, international trade, behavioural economics, etc. Data are most commonly classified as time series, panel or cross-sectional. Although Ordinary Least Squares (OLS) regression is the basis of time series analysis, it is equally used in techniques for panel and cross-sectional data. Thus, OLS regression stands as a widely used tool in econometric analysis. The existing body of knowledge contains many studies that have applied OLS to macroeconomic variables of India without discussing the finite sample properties under classical assumptions. There is a vacuum to be filled regarding finite sample properties and macroeconomic variables of India such as exports, imports and gross domestic product (GDP). The objective of the present study is to highlight the prerequisites of OLS and to empirically evaluate the position of macroeconomic variables with respect to the finite sample properties. The study is divided into 7 sections. Section 2 discusses the approach followed to achieve the objective by highlighting the conceptual framework. Section 3 reviews past studies that are relevant to the study. Sections 4 and 5 deal with econometric models and data specification, respectively. Finally, Sections 6 and 7 present the results and conclusion of the study, respectively.

Section 2: Conceptual Framework

The question posed in the study is “whether macroeconomic variables of India for the sample period adhere to the finite sample properties of OLS under classical assumptions?” Answering it involves two primary steps: first, identifying the macroeconomic variables, and second, setting out the ambit of the finite sample properties along with their connotations. For easier understanding, these steps are taken in order. The Oxford dictionary of economics (Black, Hashimzade & Myles, 2012), while explaining macroeconometrics, highlights that it deals with macroeconomic data. The term macroeconomic data refers to what are, in common parlance, called macroeconomic variables, such as exports, imports, GDP and inflation. In the present study, exports, imports, current account balance, foreign direct investment and GDP of India are used.

OLS regression needs to be understood along with its assumptions, referred to here as finite sample properties. The finite sample properties vary with the type of data; here the discussion is developed along the lines of time series data. Opinion has differed regarding the number of finite sample properties, which may be due to differences between theoretical and applied econometrics. Whatever the reason, the standard assumptions today are six (Wooldridge, 2009). Each assumption is discussed below.

Finite Sample Property (now on referred to as FSP) 1: Linearity in Parameters

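The random process $\{(x_{t1}, x_{t2}, \ldots, x_{tk}, y_t): t = 1, 2, \ldots, n\}$ follows the linear model $y_t = \beta_0 + \beta_1 x_{t1} + \cdots + \beta_k x_{tk} + u_t$, where $\{u_t: t = 1, 2, \ldots, n\}$ is the sequence of errors or disturbances. Here, n is the number of observations.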

FSP 2: No perfect collinearity

In the time series, no independent variable is constant nor a perfect linear combination of the others. It allows the explanatory variables to be correlated but it rules out perfect correlation in the sample.

FSP 3: Zero conditional mean

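For each t, the expected value of the error $u_t$, given the explanatory variables for all time periods, is zero. Symbolically,

$$E(u_t \mid X) = 0, \quad t = 1, 2, \ldots, n$$

where X denotes the explanatory variables for all time periods. It means that the error in any period must not be correlated with the explanatory variables of any period. More clearly, it means that in the regression model $u_t$ and $x_t$ are uncorrelated, and $u_t$ is also uncorrelated with past and future values of the explanatory variables.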

FSP 4: Homoscedasticity

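Conditional on X (the explanatory variables for all time periods), the variance of the error $u_t$ is the same for all t. Symbolically,

$$Var(u_t \mid X) = Var(u_t) = \sigma^2, \quad t = 1, 2, \ldots, n$$

A sufficient condition is that $u_t$ and X are independent and that $Var(u_t)$ is constant over time.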

FSP 5: No serial correlation/ autocorrelation

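Conditional on X (the explanatory variables for all time periods), the errors in two different time periods are uncorrelated. Symbolically,

$$Corr(u_t, u_s \mid X) = 0 \quad \text{for all } t \neq s$$

or in simpler form, $Corr(u_t, u_s) = 0$ for all $t \neq s$.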

FSP 6: Normality

The errors $u_t$ are independent of X (the explanatory variables for all time periods) and are independently and identically distributed as Normal$(0, \sigma^2)$.

In the present study, the discussion revolves around FSP 1 to FSP 6, and the objective remains to verify the assumptions on raw data.

Section 3: Review of Literature

The review of literature in this study can be taken up in two forms. One is to examine studies that use regression on macroeconomic variables and their comments or observations on the FSPs. The other is to search specifically for studies on the FSPs: how many there are, what their status is, and which are necessary and which sufficient. However, taking the former route was impractical due to limited time, cost and access to research papers. It is also to be noted that, as far as the researcher could search and access, no study unequivocally addresses the issue of FSPs with respect to macroeconomic variables of India. Nonetheless, discussions of the FSPs and of how they are ignored are widely available in the existing body of knowledge. This is a positive argument in favour of the study, as it justifies the objective. A review of past studies indicates that the importance of checking the assumptions before using linear regression has been emphasised, with only minimal variations (e.g. Colenutt, 1968; Johnston, 1963; Campillo, 1993; Osborne & Waters, 2002). However, this is not the case with a large section of published studies. In several fields it has been found that researchers applying linear regression either do not check the FSPs or do not present the results relating to them (which, to the reader, amounts to the same thing). For example, a study of a sample of psychology researchers found that the FSPs were rarely checked and that knowledge about them was poor (Hoekstra, Kiers & Johnson, 2012). An important study relevant to the discussion is that of Ottenbacher, Ottenbacher, Tooth & Ostir (2004), which reviewed research papers published in two journals, the American Journal of Public Health and the American Journal of Epidemiology. Across the 348 articles published between 1970 and 1998, it was found that the FSPs of the regression were not checked by the researchers. Out of 99 studies selected, only 17% discussed FSP 2 (multicollinearity). Out of the 36 articles on logistic regression, 29 (81%) provided no information on the FSPs. These results have raised concern over the use of regression analysis while ignoring the FSPs. Similarly, in the field of geography, it was observed late in the 1970s that geographers had largely skipped the discussion of the assumptions, i.e. the FSPs. Historically, it was in 1968 that two researchers, J. B. Cole and C. A. M. King, warned about the use of regression without checking the FSPs (Poole & O’Farrell, 1971).

Researchers are unanimous that adherence to the FSPs matters for the stability of the model, though a few of the FSPs can be relaxed depending on the objective. For example, if the objective of the model is prediction, FSP 2 (multicollinearity) can be relaxed, but if the objective is quantification of the parameters (point estimation), it cannot. Thus, it would be befitting to develop a discussion about the FSPs with respect to selected variables. The present study is an attempt to identify the FSPs for macroeconomic variables of India so that future researchers may benefit from them and may assume the default nature of macroeconomic variables for India.

Section 4: Econometric Models and Estimation Methods

In this section, the models are specified and the estimation methods are elaborated for FSP-1 to FSP-6.

FSP-1: Linearity in Parameters

This property concerns linearity in the parameters. The question is how to estimate and check whether a univariate or multivariate series is linear in parameters. As Williams, Grajales and Kurkiewicz (2013) report, “it is not possible to investigate these (FSPs) assumptions without estimating the actual regression model”; simple default models are therefore used, on the argument of parsimony, to check the assumptions. In the present venture five variables are used, namely Current Account Balance (CAB), Exports (EXT), Foreign Direct Investment Inflows (FDI), Imports (IMT) and Gross Domestic Product (GDP), with GDP as the explained variable and the others as explanatory variables (Appendix II). This is based on theoretical considerations and empirical justifications (Narayan & Prasad, 2008; Iqbal, Ahmad, Haider & Anwar, 2014).

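Symbolically,

$$GDP_t = \beta_0 + \beta_1 CAB_t + \beta_2 EXT_t + \beta_3 FDI_t + \beta_4 IMT_t + u_t \qquad (4.1)$$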

The linearity restrictions are imposed on the parameters using the Wald test and interpreted on the basis of the t-statistic and F-statistic. Of the 4 explanatory variables, CAB takes negative values in most years (all except 2001 to 2004; see Appendix I), while the rest are positive throughout. To test the linearity assumption, the opposite route is taken: instead of checking for linearity, the researcher checks for a non-linearity condition. If that condition is fulfilled, the parameters are truly non-linear. At this stage, any non-linear restriction on the parameters will suffice for inference. The following conditions are checked simultaneously using a single set of p-values.

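Condition 1: C(1) = C(2)/C(3)

Condition 2: C(2) = C(3)/C(4)

Condition 3: C(3) = C(4)/C(1)

Condition 4: C(4) = C(1)/C(2)

Here C(1), C(2), C(3) and C(4) denote the coefficients of CAB, EXT, FDI and IMT, respectively, in the order in which they appear in the estimation output (Appendix III).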

The null hypothesis will be “The parameters are non-linear”.

FSP-2: No perfect multicollinearity

According to the Oxford dictionary of economics, “perfect multicollinearity occurs when some of the explanatory variables are perfectly correlated” (Black, Hashimzade & Myles, 2012). There are multiple tools available to identify multicollinearity between independent variables. The most commonly used technique is the Variance Inflation Factor (VIF). The VIF for the kth predictor is specified as follows:

$$VIF_k = \frac{1}{1 - R_k^2}$$

where $R_k^2$ is the $R^2$ value obtained by regressing the kth predictor on the remaining predictors. As a rule of thumb, a VIF value below 10 is considered acceptable, meaning there is no major problem of multicollinearity. This is a liberal view; a more conservative view puts the bar at four. However, O’Brien (2007) has objected to such rules of thumb and has argued that further model specification is required to identify the problem, as in certain cases even values above 10, 20 and 40 can have no implications for inference. To follow the objective of parsimony and adhering to the liberal approach, the rule of thumb of VIF 10 is selected to judge the magnitude of multicollinearity.
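The VIF computation described above can be sketched in Python. This is only an illustration under stated assumptions: it takes the standard definition VIF_k = 1/(1 - R_k^2), assumes NumPy is available, and uses a hypothetical helper `vif` with made-up toy data. The study itself obtained its values from EViews, not from this code.

```python
import numpy as np

def vif(X):
    """Return the centered VIF of every column of X (rows = observations)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    values = []
    for j in range(k):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        # Auxiliary regression: predictor j on an intercept and the other predictors.
        Z = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(Z, target, rcond=None)
        resid = target - Z @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        values.append(1.0 / (1.0 - r2))
    return values

# Hypothetical toy data: two orthogonal predictors, then two nearly collinear ones.
orthogonal = [[1, 1], [-1, 1], [1, -1], [-1, -1]]
collinear = [[1, 2.1], [2, 3.9], [3, 6.2], [4, 7.8], [5, 10.1]]

print(vif(orthogonal))  # values close to 1: no collinearity
print(vif(collinear))   # values far above the rule-of-thumb threshold of 10
```

Including an intercept in each auxiliary regression corresponds to the centered VIF, which is the relevant variant when the main model has a constant term.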

FSP-3: Zero conditional mean of the error term

This property is concerned with the conditional mean of the error term in a given model. Using model 4.1, the residual series is generated, and from it the conditional mean of the series is calculated with reference to the mean dependent variance. The command code used is important, as conventionally it is seldom used.

Command code: where y is the name of series and x is the mean conditional variance of y.

FSP-4: Homoscedasticity/ No Heteroscedasticity

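To test this assumption, White’s test (1980) is employed on account of both its popularity and its simplicity. Its null hypothesis is “no heteroscedasticity”, tested through an auxiliary regression in which the squared residuals are regressed on the regressors, their squares and all possible cross products. According to the baseline model 4.1, White’s model of heteroscedasticity is specified in the following manner:

$$\hat{u}_t^2 = \alpha_0 + \sum_{j=1}^{4} \alpha_j x_{tj} + \sum_{j=1}^{4} \sum_{k=j}^{4} \gamma_{jk} x_{tj} x_{tk} + v_t$$

where $\hat{u}_t$ are the residuals of model 4.1, $x_{t1}, \ldots, x_{t4}$ denote CAB, EXT, FDI and IMT, and $v_t$ is the error of the auxiliary regression. With four regressors this gives 14 auxiliary terms, which matches the degrees of freedom reported in the test output.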
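White's auxiliary regression can be sketched as follows. The sketch assumes NumPy is available; the helper names `ols_residuals` and `white_lm` and the synthetic data are hypothetical and serve only to show the mechanics. The statistic returned is the Obs*R-squared form, to be compared with a chi-square distribution whose degrees of freedom equal the number of auxiliary regressors.

```python
import numpy as np
from itertools import combinations_with_replacement

def ols_residuals(y, Z):
    """Residuals from regressing y on the columns of Z (Z already holds the intercept)."""
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return y - Z @ coef

def white_lm(y, X):
    """Return (n * R^2, degrees of freedom) for White's test with cross products."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    const = np.ones((n, 1))
    e2 = ols_residuals(y, np.hstack([const, X])) ** 2
    # Squares and pairwise cross products of the regressors.
    pairs = combinations_with_replacement(range(k), 2)
    products = [X[:, i] * X[:, j] for i, j in pairs]
    aux = np.column_stack([X] + products)
    v = ols_residuals(e2, np.hstack([const, aux]))
    r2 = 1.0 - (v @ v) / np.sum((e2 - e2.mean()) ** 2)
    return n * r2, aux.shape[1]

# Synthetic example: the error spread grows with x1, so it is heteroscedastic by design.
t = np.arange(1, 41)
x1, x2 = t / 10.0, np.cos(t)
errors = x1 * (-1.0) ** t
y = 1.0 + x1 + x2 + errors
lm, df = white_lm(y, np.column_stack([x1, x2]))
print(lm, df)  # a large statistic relative to the chi-square(df) critical value
```

With two regressors the auxiliary regression has 5 terms; with the four regressors of the baseline model the same construction yields 14 terms.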

FSP-5: No serial correlation/ autocorrelation

To test for autocorrelation, the Breusch-Godfrey Serial Correlation LM Test is used on account of parsimony and the good results it gives. The baseline specification of the error term used in the B-G test is as follows:

$$u_t = \rho_1 u_{t-1} + \rho_2 u_{t-2} + \cdots + \rho_h u_{t-h} + \varepsilon_t$$

The null hypothesis is $H_0: \rho_1 = \rho_2 = \cdots = \rho_h = 0$, read as “no serial correlation up to order h”. The order h can be specified at the time of analysis.
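A minimal sketch of the Breusch-Godfrey LM statistic follows, assuming NumPy is available. The helper name `breusch_godfrey_lm`, the zero-padding of pre-sample residuals and the synthetic data are assumptions made for illustration, not the procedure documented by the study. The residuals of the main regression are regressed on the original regressors and h lagged residuals, and n times the R-squared of that regression is compared with a chi-square distribution with h degrees of freedom.

```python
import math
import numpy as np

def breusch_godfrey_lm(y, X, h):
    """Return n * R^2 from regressing OLS residuals on X and h lagged residuals."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    n = len(y)
    Z = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    e = y - Z @ coef
    # Lagged residuals, with missing pre-sample values set to zero (an assumption).
    lags = [np.concatenate([np.zeros(j), e[:-j]]) for j in range(1, h + 1)]
    A = np.column_stack([Z] + lags)
    g, *_ = np.linalg.lstsq(A, e, rcond=None)
    v = e - A @ g
    r2 = 1.0 - (v @ v) / (e @ e)  # e has mean zero because Z contains an intercept
    return n * r2

# Synthetic example: a slow sine wave left in the errors creates strong autocorrelation.
t = np.arange(1, 41)
x = t.astype(float)
y = 1.0 + 2.0 * x + 5.0 * np.sin(t / 4.0)
lm = breusch_godfrey_lm(y, x, h=2)
p_value = math.exp(-lm / 2.0)  # exact chi-square upper tail for 2 degrees of freedom
print(lm, p_value)
```

A small p-value leads to rejection of the null of no serial correlation; a p-value above the chosen level, such as 0.05, does not.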

FSP-6: Normality

The normality assumption of the error term is the most widely checked property, but with a deviation: researchers have sometimes checked the normality of the variables instead of the normality of the error terms (residuals). The normality of the error term can be checked through Q-Q plots or simply with the help of a histogram and the Jarque-Bera statistic. The study uses the Jarque-Bera statistic.
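For illustration, the Jarque-Bera statistic can be computed from a residual series as sketched below, using only the Python standard library. The function name `jarque_bera` and the five sample residuals are hypothetical. The formula assumed is the usual JB = n/6 * (S^2 + (K - 3)^2 / 4), with S the skewness and K the kurtosis, compared with a chi-square distribution with 2 degrees of freedom.

```python
import math

def jarque_bera(residuals):
    """Return (JB statistic, p-value) using the chi-square(2) reference distribution."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((r - mean) ** 2 for r in residuals) / n
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    m4 = sum((r - mean) ** 4 for r in residuals) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # For 2 degrees of freedom the chi-square upper-tail probability is exp(-x / 2).
    return jb, math.exp(-jb / 2.0)

jb, p = jarque_bera([-2.0, -1.0, 0.0, 1.0, 2.0])  # hypothetical residuals
print(round(jb, 4), round(p, 4))
```

A p-value above 0.05 means that normality of the residuals is not rejected at the 5% level.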

Section 5: The Data

The study uses five macroeconomic variables of India expressed in US$ millions, with data taken from the UNCTAD database. The five variables are Current Account Balance (CAB), Exports (EXT), Imports (IMT), Gross Domestic Product (GDP) and Foreign Direct Investment Inflows (FDI). The data cover the period from 1980 to 2013. The UNCTAD database had not been updated for 2014 and 2015 for one or more of the variables in the study. Thus, to keep the series of equal length, data until 2013 are used for inference. The data set is given in Appendix I.

Section 6: Results

The analysis proceeds from FSP-1 to FSP-6 on the basis of baseline model 4.1. The output of the OLS regression for model 4.1 is shown in Appendix III. On the basis of that model, the individual results pertaining to the finite sample properties are presented.

FSP-1 result: The Wald test is used for testing linearity in the parameters, with the null hypothesis of non-linear parameters. The output is presented in Table 1. When all four conditions on the coefficients (parameters) are imposed jointly, the null hypothesis of non-linearity is rejected, as the probability values of both the F-statistic and the Chi-square statistic are less than 0.05 (0.0000, 0.0000).

Table 1: Wald Test Statistics for Linearity

| Test Statistic | Value | df | Probability |
|---|---|---|---|
| F-statistic | 3961.712 | (4, 29) | 0.0000 |
| Chi-square | 15846.85 | 4 | 0.0000 |

Null Hypothesis: C(1)=C(2)/C(3), C(2)=C(3)/C(4), C(3)=C(4)/C(1), C(4)=C(1)/C(2)

Null Hypothesis Summary:

| Normalized Restriction (= 0) | Value | Std. Err. |
|---|---|---|
| C(1) - C(2)/C(3) | 5.659183 | 2.557029 |
| C(2) - C(3)/C(4) | -1.806183 | 3.976922 |
| C(3) - C(4)/C(1) | -4.270869 | 2.197172 |
| -C(1)/C(2) + C(4) | 8.615776 | 0.775793 |

Source: Output generated through EViews 9.5 by the researcher

On the basis of Table 1, it is clear that by default the macroeconomic variables of India are linear in parameters, as supported by the Wald test.
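As a quick arithmetic check, the normalized restriction values in Table 1 can be reproduced from the coefficients reported in Appendix III. The short Python sketch below assumes that C(1) to C(4) correspond to CAB, EXT, FDI and IMT in the order in which they are listed in the output; this mapping is an inference from the tables, not something the output states explicitly.

```python
# Coefficients of model 4.1 as reported in Appendix III (assumed order: CAB, EXT, FDI, IMT).
c1, c2, c3, c4 = 6.368638, -2.369604, -3.340036, 5.928138

# Normalized restrictions of the Wald null hypothesis, each written as "expression = 0".
restrictions = {
    "C(1) - C(2)/C(3)": c1 - c2 / c3,
    "C(2) - C(3)/C(4)": c2 - c3 / c4,
    "C(3) - C(4)/C(1)": c3 - c4 / c1,
    "C(4) - C(1)/C(2)": c4 - c1 / c2,
}
for name, value in restrictions.items():
    print(f"{name}: {value:.6f}")

# With q restrictions, the Wald chi-square statistic equals q times the F-statistic.
q, f_stat = 4, 3961.712
chi_square = q * f_stat
print(f"Chi-square from F: {chi_square:.2f}")
```

The computed values agree with the Value column of the restriction summary, and four times the F-statistic agrees with the reported Chi-square statistic up to rounding.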

FSP-2 result: Earlier in section 4, it was explained that a liberal approach towards VIF value will be used so as to check for “no perfect multicollinearity”. In line with that commitment VIF values are calculated and presented in table 2.

Table 2: Variance Inflation Factors for multicollinearity

| Variable | Coefficient Variance | Uncentered VIF | Centered VIF |
|---|---|---|---|
| CAB | 10.64757 | 92.24232 | 69.35394 |
| EXT | 14.92668 | 6818.624 | 4173.886 |
| FDI | 4.674395 | 16.50489 | 11.42620 |
| IMT | 12.57888 | 8815.163 | 5432.172 |
| C | 1.55E+08 | 2.267666 | NA |

Source: Output generated through EViews 9.5 by the researcher

In Table 2, the centered VIF is the relevant column because the baseline model includes an intercept. The liberal approach accepts a VIF of up to 10 as indicating no problem of multicollinearity. It is clear from the table that all the explanatory variables have a VIF of more than 10. Therefore, there is a problem of multicollinearity between the macroeconomic variables (liberal approach criterion).
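The decision rule can be made explicit with a few lines of Python. The sketch below takes the centered VIF values from Table 2 and, assuming the standard relation VIF = 1/(1 - R^2), backs out the R-squared of each auxiliary regression. The threshold names `LIBERAL` and `CONSERVATIVE` are labels introduced here for the two rules of thumb discussed in Section 4.

```python
# Centered VIF values as reported in Table 2.
centered_vif = {"CAB": 69.35394, "EXT": 4173.886, "FDI": 11.42620, "IMT": 5432.172}

LIBERAL, CONSERVATIVE = 10.0, 4.0  # rule-of-thumb thresholds discussed in Section 4

for name, vif in centered_vif.items():
    implied_r2 = 1.0 - 1.0 / vif  # R^2 of the auxiliary regression implied by the VIF
    print(f"{name}: VIF={vif:.2f}, implied R2={implied_r2:.4f}, "
          f"exceeds liberal={vif > LIBERAL}, exceeds conservative={vif > CONSERVATIVE}")

all_exceed = all(v > LIBERAL for v in centered_vif.values())
print("All regressors exceed the liberal threshold:", all_exceed)
```

Even the smallest value, that of FDI, implies that more than 90% of its variation is explained by the other regressors.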

FSP-3 result: The assumption pertains to the zero conditional mean of the error term. The conditional mean of the disturbance term is calculated with the help of the command given in Section 4. The outcome is that the conditional mean of the residuals with respect to the mean conditional variance is -3.42E-11, which is not exactly zero. Thus, FSP 3 is treated as not fulfilled. For the residual series refer to Appendix IV.

FSP-4 result: To check the assumption of no heteroscedasticity (homoscedasticity), White’s test (1980) is used, and the output of the test is shown in Table 3.

Table 3: White’s Test for Heteroskedasticity

| Statistic | Value | Reference distribution | Probability |
|---|---|---|---|
| F-statistic | 6.861793 | Prob. F(14,19) | 0.0001 |
| Obs*R-squared | 28.38579 | Prob. Chi-Square(14) | 0.0126 |
| Scaled explained SS | 33.86424 | Prob. Chi-Square(14) | 0.0022 |

Source: Output generated through EViews 9.5 by the researcher

In White’s test the null hypothesis is “there is homoscedasticity”, and as the probability values are less than 0.05 (0.0001, 0.0126, 0.0022), the null hypothesis is rejected. This means that the macroeconomic variables are not homoscedastic (there is heteroscedasticity). Thus, FSP-4 is rejected for the macroeconomic variables of India.

FSP-5 result: The fifth finite sample property is about no autocorrelation in the specified model. The output is shown in Table 4.

Table 4: Breusch-Godfrey Serial Correlation LM Test

| Statistic | Value | Reference distribution | Probability |
|---|---|---|---|
| F-statistic | 2.095741 | Prob. F(2,27) | 0.1425 |
| Obs*R-squared | 4.568888 | Prob. Chi-Square(2) | 0.1018 |

Source: Output generated through EViews 9.5 by the researcher

The null hypothesis under the B-G test is “there is no serial correlation”. As the probability value is more than 0.05 for both the F-statistic and the Chi-square statistic (0.1425, 0.1018), the null of no serial correlation cannot be rejected. This means that the macroeconomic variables have no problem of autocorrelation.

FSP-6 result: To check the normality of the error term, the residual series is generated and the decision regarding normality is taken using the histogram and the Jarque-Bera statistic. If the probability of the Jarque-Bera statistic is more than 0.05, the residuals are taken to be normally distributed. The histogram is shown as Figure 1 and the Jarque-Bera statistic in Table 5.

Figure 1. Histogram of residuals from the specified model

Source: Prepared by researcher through EViews 9.5

Table 5: Descriptive statistics of residuals

| Statistic | Value |
|---|---|
| Mean | -3.42E-11 |
| Median | 5563.326 |
| Jarque-Bera | 3.104849 |
| Probability | 0.211734 |
| Observations | 34 |

Source: Output generated through EViews 9.5 by the researcher

The probability value of the Jarque-Bera statistic is 0.2117, which is more than 0.05, indicating that the residuals (error term) are normally distributed. This verifies the assumption. The summarized results are shown in Table 6.
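The chi-square probabilities reported for the White, Breusch-Godfrey and Jarque-Bera statistics, and the 5% decision rule applied to them, can be re-derived with a short standard-library Python sketch. The helper `chi2_sf_even` is a hypothetical name. It uses a closed form that is valid only for an even number of degrees of freedom, which covers the 14 and 2 degrees of freedom used here.

```python
import math

def chi2_sf_even(x, df):
    """Upper-tail probability of a chi-square variable; closed form valid for even df only."""
    if df % 2 != 0:
        raise ValueError("this closed form requires an even number of degrees of freedom")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

# Test statistics and degrees of freedom as reported in the EViews output.
reported = {
    "White Obs*R-squared": (28.38579, 14),
    "White scaled explained SS": (33.86424, 14),
    "Breusch-Godfrey Obs*R-squared": (4.568888, 2),
    "Jarque-Bera": (3.104849, 2),
}
p_values = {name: chi2_sf_even(x, df) for name, (x, df) in reported.items()}
for name, p in p_values.items():
    decision = "reject null" if p < 0.05 else "do not reject null"
    print(f"{name}: p = {p:.4f} -> {decision} at the 5% level")
```

The nulls of homoscedasticity are rejected, while the nulls of no serial correlation and of normality are not, which is the pattern recorded for FSP-4, FSP-5 and FSP-6.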

Table 6: Summarized Results

| S.No. | Finite Sample Property/ Assumption | Symbol | Status |
|---|---|---|---|
| 1 | Linearity in parameters | FSP-1 | Accepted |
| 2 | No perfect collinearity | FSP-2 | Rejected |
| 3 | Zero conditional mean | FSP-3 | Rejected |
| 4 | Homoscedasticity | FSP-4 | Rejected |
| 5 | No serial correlation/ autocorrelation | FSP-5 | Accepted |
| 6 | Normality | FSP-6 | Accepted |

Source: Summarized by the researcher

Section 7: Conclusion

For the sample of macroeconomic variables of India, three of the finite sample properties were accepted (FSP-1, FSP-5, FSP-6) while three were rejected (FSP-2, FSP-3, FSP-4). It can be inferred that the macroeconomic variables of India are linear in parameters, have no autocorrelation, and yield normally distributed residuals. However, the other finite sample properties are not satisfied. In this sense the study is conclusive, although it remains inconclusive to the extent of the limitations of the baseline model. Still, it will be helpful in cautioning researchers against applying regression without checking the finite sample properties. The results of the study can also be used by researchers as default characteristics of the macroeconomic variables of India.

References

Black, J., Hashimzade, N., & Myles, G. (Eds.). (2012). A dictionary of economics. OUP Oxford.

Campillo, C. (1993). Standardizing criteria for logistic regression models. Annals of Internal Medicine, 119(6), 540.

Colenutt, R. J. (1968). Building linear predictive models for urban planning. Regional Studies, 2(1), 139-143. DOI: http://dx.doi.org/10.1080/09595236800185111

Hoekstra, R., Kiers, H., & Johnson, A. (2012). Are assumptions of well-known statistical techniques checked, and why (not)? Frontiers in Psychology, 3, 137. DOI: http://dx.doi.org/10.3389/fpsyg.2012.00137

Iqbal, N., Ahmad, N., Haider, Z., & Anwar, S. (2014). Impact of foreign direct investment (FDI) on GDP: A case study from Pakistan. International Letters of Social and Humanistic Sciences, 5, 73-80.

Johnston, J. (1963). Econometric methods. McGraw-Hill Book Company Inc., New York.

Narayan, P. K., & Prasad, A. (2008). Electricity consumption–real GDP causality nexus: Evidence from a bootstrapped causality test for 30 OECD countries. Energy Policy, 36(2), 910-918.

O’Brien, R. M. (2007). A caution regarding rules of thumb for variance inflation factors. Quality & Quantity, 41(5), 673-690. DOI: 10.1007/s11135-006-9018-6

Osborne, J., & Waters, E. (2002). Four assumptions of multiple regression that researchers should always test. Practical Assessment, Research & Evaluation, 8(2), 1-9.

Ottenbacher, K. J., Ottenbacher, H. R., Tooth, L., & Ostir, G. V. (2004). A review of two journals found that articles using multivariable logistic regression frequently did not report commonly recommended assumptions. Journal of Clinical Epidemiology, 57(11), 1147-1152. DOI: http://dx.doi.org/10.1016/j.jclinepi.2003.05.003

Poole, M. A., & O'Farrell, P. N. (1971). The assumptions of the linear regression model. Transactions of the Institute of British Geographers, 145-158. DOI: 10.2307/621706

White, H. (1980). A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica: Journal of the Econometric Society, 817-838. DOI: 10.2307/1912934

Williams, M. N., Grajales, C. A. G., & Kurkiewicz, D. (2013). Assumptions of multiple regression: correcting two misconceptions. Practical Assessment, Research & Evaluation, 18(11), 2.

Wooldridge, J. M. (2009). Introductory econometrics: A modern approach. Nelson Education.

Appendices

Appendix I: Data Set (in US$ millions)

| Year | FDI | EXT | IMT | GDP | CAB |
|---|---|---|---|---|---|
| 1980 | 79.16 | 11274.4 | 16927.95 | 181116.2 | -1785.13 |
| 1981 | 91.92 | 11234.71 | 17397.43 | 193190.8 | -2698.33 |
| 1982 | 72.08 | 12159.03 | 17517.74 | 197258.9 | -2523.54 |
| 1983 | 5.64 | 13059.98 | 17572.63 | 215224.9 | -1936.94 |
| 1984 | 19.24 | 13423.63 | 17857.8 | 213177.6 | -2311.07 |
| 1985 | 106.09 | 12849.2 | 18984.13 | 221993.5 | -4140.58 |
| 1986 | 117.73 | 13476.23 | 19631.83 | 243226.2 | -4567.7 |
| 1987 | 212.32 | 15247.4 | 22290.08 | 269161.7 | -5171.17 |
| 1988 | 91.25 | 17301.08 | 25412.6 | 297762.5 | -7143.23 |
| 1989 | 252.1 | 20283.7 | 28127.95 | 294788.2 | -6812.77 |
| 1990 | 236.69 | 22911.06 | 29526.65 | 320349.7 | -7035.65 |
| 1991 | 75 | 23020.36 | 27031.88 | 283967.7 | -4291.73 |
| 1992 | 252 | 24953.49 | 29665.6 | 285176.4 | -4485.22 |
| 1993 | 532 | 27122.92 | 30604.96 | 278384 | -1875.8 |
| 1994 | 974 | 31560.65 | 37872.37 | 318925.1 | -1676.28 |
| 1995 | 2151 | 38013.22 | 48225.1 | 361957.2 | -5563.23 |
| 1996 | 2525 | 40975.69 | 54960 | 381492.8 | -5956.14 |
| 1997 | 3619 | 44812.71 | 58172.8 | 414237.5 | -2965.2 |
| 1998 | 2633 | 45766.8 | 59367.9 | 416885.4 | -6903.11 |
| 1999 | 2168 | 51386.3 | 62827.5 | 444434.8 | -3228.02 |
| 2000 | 3587.99 | 59931.7 | 73075.2 | 458561.1 | -4601.25 |
| 2001 | 5477.638 | 62130.2 | 71311.2 | 473441.7 | 1410.18 |
| 2002 | 5629.671 | 70619.3 | 75741.5 | 494986.7 | 7059.5 |
| 2003 | 4321.076 | 84795 | 92959.1 | 579668.7 | 8772.51 |
| 2004 | 5777.807 | 116219.6 | 131179.9 | 701347.4 | 780.196 |
| 2005 | 7621.769 | 154703.3 | 181978.5 | 820980 | -10283.5 |
| 2006 | 20327.76 | 193498.1 | 225268.1 | 929215.2 | -9299.06 |
| 2007 | 25349.89 | 240712.9 | 279416.3 | 1182321 | -8075.69 |
| 2008 | 47102.42 | 305729 | 380088.5 | 1268588 | -30972 |
| 2009 | 35633.94 | 260847.5 | 328257.5 | 1311852 | -26186.4 |
| 2010 | 27417.08 | 348035 | 439059 | 1668768 | -54515.9 |
| 2011 | 36190.46 | 446375 | 553062 | 1892420 | -62517.6 |
| 2012 | 24195.77 | 443629.5 | 579405.919 | 1869210 | -91471.2 |
| 2013 | 28199.45 | 464187.7 | 559767.3941 | 1936088 | -49226 |

Source: UNCTAD database; http://unctad.org/en/Pages/Statistics.aspx

Appendix II: Variable Description

| Name | Measurement | Symbol |
|---|---|---|
| Current Account Balance | US$ millions | CAB |
| Exports | US$ millions | EXT |
| Imports | US$ millions | IMT |
| Gross Domestic Product | US$ millions | GDP |
| Foreign Direct Investment Flows | US$ millions | FDI |

Source: Prepared by the researcher

Appendix III: Baseline Model OLS Output

| Variable | Coefficient | Std. Error | t-Statistic | Prob. |
|---|---|---|---|---|
| CAB | 6.368638 | 3.263062 | 1.951737 | 0.0607 |
| EXT | -2.369604 | 3.863507 | -0.613330 | 0.5444 |
| FDI | -3.340036 | 2.162035 | -1.544857 | 0.1332 |
| IMT | 5.928138 | 3.546671 | 1.671465 | 0.1054 |
| C | 192934.4 | 12451.83 | 15.49446 | 0.0000 |

| Statistic | Value | Statistic | Value |
|---|---|---|---|
| R-squared | 0.993055 | Mean dependent var | 630004.7 |
| Adjusted R-squared | 0.992097 | S.D. dependent var | 542359.5 |
| S.E. of regression | 48215.12 | Akaike info criterion | 24.53979 |
| Sum squared resid | 6.74E+10 | Schwarz criterion | 24.76425 |
| Log likelihood | -412.1764 | Hannan-Quinn criter. | 24.61634 |
| F-statistic | 1036.657 | Durbin-Watson stat | 1.275324 |
| Prob(F-statistic) | 0.000000 | | |

Source: Output generated through EViews 9.5 by the researcher

Appendix IV: Actual, Fitted and Residual series plotted

Source: Prepared by researcher through EViews 9.5