AN ALGORITHM TO FIND SIGNIFICANT COMMODITIES IN STOCHASTIC LASPEYRES REGRESSION MODEL

Arfa Maqsood 1, S. M. Aqil Burney 2 and Suboohi Safdar 1.
1. Department of Statistics, University of Karachi, Karachi.
2. Institute of Business Management, Karachi.
Manuscript History: Received: 11 February 2019; Final Accepted: 13 March 2019; Published: April 2019

Several methods have been proposed for detecting leverage points and influential observations in regression analysis with one or more regressors. A detailed description of unusual observations is given by Barnett and Lewis (1994). Belsley et al. (1980) and Chatterjee and Hadi (1986) classified the types of such abnormal observations for the regression analysis of two variables X and Y, with the help of graphical tools on two-dimensional plots. The role of the hat matrix (H) in both regression analysis and analysis of variance (ANOVA), using the calibration point $2p/N$, is given by Hoaglin and Welsch (1978), who found the diagonal elements $h_{ii}$ to be a very worthwhile diagnostic for detecting leverage points in simple as well as multiple regression. The properties of the hat matrix are described and used by many authors, including Hoaglin and Welsch (1978), Cook and Weisberg (1982), and Draper and Smith (1998). The larger the value of $h_{ii}$, the more weight the corresponding regressor value carries in estimating the y value. We use the hat values to find the significant commodities in Laspeyres index numbers.
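As a minimal sketch of how the hat values and the calibration point $2p/N$ work together, the Python fragment below computes the diagonal of the hat matrix for an ordinary regression and flags the rows above the cutoff. The data and the function names `hat_values` and `high_leverage` are hypothetical and chosen only for illustration; NumPy is assumed to be available.

```python
import numpy as np

def hat_values(X):
    """Diagonal of H = X (X'X)^{-1} X', computed from a thin QR factorization."""
    Q, _ = np.linalg.qr(X)            # X = QR, hence H = QQ'
    return np.sum(Q**2, axis=1)

def high_leverage(X):
    """Return the row indices whose hat value exceeds 2p/N, and all hat values."""
    N, p = X.shape
    h = hat_values(X)
    return np.flatnonzero(h > 2 * p / N), h

# Hypothetical data: nine clustered regressor values and one remote value.
x = np.array([1.0, 1.1, 0.9, 1.2, 1.0, 0.8, 1.1, 0.9, 1.0, 5.0])
X = np.column_stack([np.ones_like(x), x])   # intercept plus one regressor, p = 2
flagged, h = high_leverage(X)
print(flagged, h.round(3))
```

The hat values always sum to p, so $2p/N$ is twice their average; only the remote tenth value rises above it here.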
One useful measure of change is the difference in regression coefficients (DFBETA) between the estimates calculated from the full sample and from the reduced sample with the ith observation deleted. It is thus defined as $\mathrm{DFBETA}_i = \hat{\beta} - \hat{\beta}_{(i)}$ (see Miller (1974) for the derivation of the DFBETA formula). A scaled version of DFBETA, proposed by Welsch and Kuh (1977), is obtained by dividing by the standard error of $\hat{\beta}$. The scaled values are judged relative to the sample size N: if they exceed, in absolute value, a size-adjusted cutoff (conventionally $2/\sqrt{N}$), the corresponding observations are considered unusual. Özkale and Açar (2015) used these influence measures to find unusual observations in the linear regression model with more than one regressor. Maqsood and Burney (2014) and Burney and Maqsood (2014) used the hat matrix and the DFBETA measure to find the influential commodities in the Laspeyres index number model with autocorrelated errors.
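The full-sample versus reduced-sample comparison can be sketched directly by deleting one observation at a time. The fragment below is an illustration under stated assumptions, not the derivation in the cited references: the scaling uses the residual standard deviation of the reduced fit, the cutoff is taken as the conventional $2/\sqrt{N}$, and the data and the name `dfbetas` are hypothetical.

```python
import numpy as np

def dfbetas(X, y):
    """Scaled DFBETA by explicit deletion. Row i holds
    (b - b_(i)) / (s_(i) * sqrt(diag((X'X)^{-1}))), with s_(i) from the reduced fit."""
    N, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y                          # full-sample estimate
    scale = np.sqrt(np.diag(XtX_inv))
    out = np.empty((N, p))
    for i in range(N):
        keep = np.arange(N) != i                   # drop observation i
        Xi, yi = X[keep], y[keep]
        bi, *_ = np.linalg.lstsq(Xi, yi, rcond=None)
        resid = yi - Xi @ bi
        s_i = np.sqrt(resid @ resid / (N - 1 - p)) # reduced-sample residual s.d.
        out[i] = (b - bi) / (s_i * scale)
    return out

# Hypothetical data: a clean linear trend with one disturbed final observation.
x = np.arange(1.0, 11.0)
noise = np.array([0.1, -0.1, 0.05, -0.05, 0.1, -0.1, 0.05, -0.05, 0.1, 8.0])
y = 1.0 + 2.0 * x + noise
X = np.column_stack([np.ones_like(x), x])

D = dfbetas(X, y)
cutoff = 2.0 / np.sqrt(len(y))                     # assumed calibration point
flagged = np.flatnonzero((np.abs(D) > cutoff).any(axis=1))
print(flagged)
```

Explicit deletion costs N refits; it is used here because it mirrors the definition, whereas closed-form expressions avoid the loop.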
The paper is organized as follows. Section 2 presents the stochastic Laspeyres regression model, with a brief description of how the Laspeyres index number estimators are obtained. The formulae for the hat values and DFBETA values are also recalled in that section. An algorithm to find significant commodities is presented in Section 3 and displayed as a flow chart. Lastly, Section 4 gives the conclusion.

Stochastic Laspeyres Regression Model
The stochastic simple model of the Laspeyres price index number is defined as model (1), in which the errors follow an autoregressive process with its own parameters, each ith commodity carries a weight, and $\delta_{ij}$ is the Kronecker delta, which takes the value one for i = j and zero otherwise. For a stationary autoregressive process, the roots of the autoregressive polynomial must lie outside the unit circle.

Model (1) is written in matrix form as model (4), where Q is a lower triangular matrix. Under these assumptions, the best linear unbiased estimator (BLUE) of the parameter vector in model (4) can be obtained by the method of generalized least squares (GLS). The transformed model is obtained by multiplying both sides of equation (4) by Q; applying the ordinary least squares (OLS) estimator to the transformed data then gives the estimated generalized least squares (EGLS) estimator. The transformation matrix Q was obtained by Burney and Maqsood (2014) for errors generated from an autoregressive process of order p = 1, and by Maqsood and Burney (2014) for p = 2. With these assumptions, Maqsood and Burney (2014) obtained the estimator of the index parameter, the familiar Laspeyres index number, given in equation (6). The order of the autoregressive process has no impact on the estimator of the Laspeyres price index number: we have confirmed that p = 1 and p = 2 yield the same Laspeyres formula, equation (6). The standard errors of the estimated Laspeyres index numbers are derived by Maqsood and Burney (2017).

To find the influential observations and their impact on the Laspeyres regression model, we consider two familiar influence measures: the hat matrix and the difference in the parameter vector beta (DFBETA). The hat matrix is computed for the transformed data, and DFBETA is the second measure used to determine the influential observations. For convenience we write D for DFBETA, with the subscript 'itj', where 'i' denotes commodities, 't' time periods, and 'j' parameters.
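For orientation, the textbook GLS relations behind this construction can be written in generic notation. The symbols $y$, $X$, $\beta$, $\varepsilon$ and $\Omega$ below are assumptions made for illustration and are not the notation of equations (1) to (6); only Q, the lower triangular transformation matrix, is taken from the text.

```latex
\begin{align}
  y &= X\beta + \varepsilon, \qquad
      \operatorname{Cov}(\varepsilon) = \sigma^{2}\Omega, \qquad
      Q'Q = \Omega^{-1}, \\
  Qy &= QX\beta + Q\varepsilon, \qquad
      \operatorname{Cov}(Q\varepsilon) = \sigma^{2} I, \\
  \hat{\beta} &= (X'Q'QX)^{-1} X'Q'Qy
      = (X'\Omega^{-1}X)^{-1} X'\Omega^{-1} y, \\
  H &= QX\,(X'Q'QX)^{-1} X'Q', \qquad h_{ii} = [H]_{ii}.
\end{align}
```

The third line shows why OLS on the transformed data reproduces the GLS estimator, and the fourth is the usual hat matrix written for the transformed regressors $QX$.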
DFBETA depends on the order of the autoregressive process, and a different estimator is obtained for p = 1 and p = 2. The DFBETA estimator is obtained by Burney and Maqsood (2014) for the AR(1) error process and by Maqsood and Burney (2014) for the AR(2) process. Burney and Maqsood (2014) also derived the DFBETA estimator for the Divisia index number, which is the same as the one we obtain for the Laspeyres index number. The estimators and their explanations are given in these references.
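How the error process enters the influence measures can be seen in a small numerical sketch for the AR(1) case. The code assumes a single series of length T, a known autoregressive coefficient `phi`, and the standard Prais-Winsten form of the lower triangular transformation; it is not the matrix Q derived in the cited references, and all names and data are hypothetical.

```python
import numpy as np

def ar1_transform(T, phi):
    """Lower triangular Q with Q @ Omega @ Q.T = I for AR(1) errors
    (unit innovation variance assumed)."""
    Q = np.eye(T)
    Q[0, 0] = np.sqrt(1.0 - phi**2)
    Q[np.arange(1, T), np.arange(T - 1)] = -phi    # subdiagonal entries
    return Q

def egls_with_hat(X, y, phi):
    """OLS on Q-transformed data: returns the estimate and the hat values."""
    Q = ar1_transform(len(y), phi)
    Xs, ys = Q @ X, Q @ y
    G = np.linalg.inv(Xs.T @ Xs)
    beta = G @ Xs.T @ ys
    h = np.einsum("ij,jk,ik->i", Xs, G, Xs)        # x_i' G x_i for each row
    return beta, h

# Hypothetical series with a linear trend.
T, phi = 8, 0.6
t = np.arange(T, dtype=float)
X = np.column_stack([np.ones(T), t])
y = 2.0 + 0.5 * t + np.array([0.3, 0.1, -0.2, -0.4, 0.0, 0.2, 0.5, 0.1])

beta, h = egls_with_hat(X, y, phi)
_, h_ols = egls_with_hat(X, y, 0.0)                # phi = 0 gives Q = I, plain OLS

# Cross-check against the direct GLS formula with the AR(1) covariance.
Omega = phi ** np.abs(np.subtract.outer(t, t)) / (1.0 - phi**2)
Oi = np.linalg.inv(Omega)
beta_gls = np.linalg.solve(X.T @ Oi @ X, X.T @ Oi @ y)
Q = ar1_transform(T, phi)
```

The hat values of the transformed data differ from the OLS hat values, which is one reason an influence measure built on the transformed model depends on the assumed error process.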

Algorithm to Find Significant Commodities
To achieve our objective, we follow a step-by-step procedure to discover the influential commodities in the Laspeyres index number. For convenience, we present an algorithm for applying the methods described in Section 2. Figure 1 displays the algorithm as a flow chart consisting of two phases. The first phase comprises estimating the parameters of the Laspeyres regression model, finding the error series, verifying the stationarity of the series, and fitting an AR(p) model to the error series. The second phase applies the methods that extract the influential commodities. A commodity is influential in the sense that it disturbs the regression estimates of the index numbers and their related aspects. We use two widely used measures, the hat matrix and the DFBETA matrix, and the decision is made on the calibration points described in the flow chart.
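The first phase can be sketched in a few lines. This is a simplified illustration, not the procedure of Figure 1: it assumes the familiar Laspeyres formula (current prices valued at base quantities, divided by the base-period value), takes the error series as each commodity's price relative minus the index, and fits only an AR(1) coefficient by the lag-1 sample autocorrelation. The price and quantity data and all function names are hypothetical.

```python
import numpy as np

def laspeyres_index(prices, q0):
    """prices: (n, T+1) array with the base period in column 0; q0: base quantities."""
    base_value = prices[:, 0] @ q0
    return (q0 @ prices) / base_value              # one index value per period

def error_series(prices, q0):
    """Price relatives minus the index, one row per commodity."""
    relatives = prices / prices[:, [0]]
    return relatives - laspeyres_index(prices, q0)

def ar1_yule_walker(e):
    """Lag-1 sample autocorrelation, used as an AR(1) coefficient estimate."""
    c = e - e.mean()
    return (c[1:] @ c[:-1]) / (c @ c)

# Hypothetical basket: three commodities, base period plus three periods.
prices = np.array([[10.0, 11.0, 12.0, 14.0],
                   [20.0, 20.0, 22.0, 23.0],
                   [5.0, 6.0, 6.0, 7.0]])
q0 = np.array([3.0, 1.0, 4.0])

index = laspeyres_index(prices, q0)
errors = error_series(prices, q0)
w0 = prices[:, 0] * q0 / (prices[:, 0] @ q0)       # base-period budget shares
phi_hat = ar1_yule_walker(errors[0])               # first commodity's error series
is_stationary = abs(phi_hat) < 1.0                 # AR(1) stationarity condition
print(index.round(4), round(float(phi_hat), 4), is_stationary)
```

The budget-share-weighted errors sum to zero in every period, because the Laspeyres index is the share-weighted mean of the price relatives. For p greater than 1, the stationarity check would instead inspect the roots of the autoregressive polynomial.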
The hat values alone cannot provide an accurate picture of the pattern that the commodities in the consumer basket reflect, and relying on them alone may mislead the analyst, because commodities with high hat values do not necessarily have large DFBETA values too. To resolve this, we establish the following hypothesis:
H0: the ith commodity is not influential in estimating the underlying index number.
Decision outcomes by scenario: Suspecting Influential; Suspecting Influential; Influential.
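A decision rule of this kind can be sketched as a small function. The mapping below is an assumption inferred from the outcome labels: a commodity is called influential when both measures exceed their calibration points, suspected when only one does, and not influential otherwise. The cutoffs $2p/N$ and $2/\sqrt{N}$ and the name `classify_commodity` are likewise illustrative assumptions.

```python
import math

def classify_commodity(h_ii, dfbeta, hat_cut, dfbeta_cut):
    """Combine the two influence measures into one decision about H0."""
    high_hat = h_ii > hat_cut
    high_dfbeta = abs(dfbeta) > dfbeta_cut
    if high_hat and high_dfbeta:
        return "influential"              # reject H0
    if high_hat or high_dfbeta:
        return "suspecting influential"   # only one measure is large
    return "not influential"              # do not reject H0

# Hypothetical setting: N = 25 observations, p = 1 parameter.
N, p = 25, 1
hat_cut = 2 * p / N                       # assumed calibration point, 0.08
dfbeta_cut = 2 / math.sqrt(N)             # assumed calibration point, 0.4

decisions = [
    classify_commodity(0.20, 0.90, hat_cut, dfbeta_cut),
    classify_commodity(0.20, 0.10, hat_cut, dfbeta_cut),
    classify_commodity(0.02, -0.75, hat_cut, dfbeta_cut),
    classify_commodity(0.02, 0.10, hat_cut, dfbeta_cut),
]
print(decisions)
```

Requiring both measures to agree before rejecting H0 reflects the point made above that a high hat value by itself does not imply a large DFBETA value.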

Conclusion:-
In this paper, we considered the general form of the hat matrix and the DFBETA measure to detect the influential commodities in estimating the stochastic Laspeyres index number when the errors are serially correlated with an AR(p) process. For this purpose, we defined a step-by-step procedure consisting of two phases. The first phase computes the estimates of the Laspeyres index numbers. The second phase determines the influential commodities using hat values and DFBETA values. Both phases are explained by a flow chart. In Section 3, a hypothesis is stated for a specific commodity, namely whether it has a significant impact on the estimated index number. To check this hypothesis, a figure representing the three zones of acceptance, suspicious, and critical is shown. The decision is made using the given calibration points for both hat values and DFBETA measures. One can reach a decision easily by looking at the different scenarios in Table 1 for acceptance or rejection of the null hypothesis. The proposed algorithm is useful for researchers working on influential commodities using these measures.
Res. 7(4), 792-797