MODELLING THE TREND OF FLOWS WITH RESPECT TO RAINFALL VARIABILITY USING VECTOR AUTOREGRESSION

Wahab A. Iddrisu*, Kaku S. Nokoe and Isaac Akoto. Department of Mathematics and Statistics, University of Energy and Natural Resources, P.O. Box 214, Sunyani, Ghana. ...................................................................................................................... Manuscript Info Abstract ......................... ........................................................................ Manuscript History


126
Bui dam (at the border of the Northern Region (Bole) and Brong-Ahafo Region (Wenchi)), quantifying changes of water flows in time is fundamental in addressing issues of power generation and survival of ecosystem downstream.
A number of studies have been conducted the world over concerning rainfall and flow modelling. Examples include; Top-down and data-based mechanistic modelling of rainfall -flow dynamics at the catchment scale (Young, 2003); Simulating rainfall and river flow dynamics in Ghana (Ampadu, 2007); Statistical modelling of rainfall and river flow in Thailand (Boochabun et al., 2004); Comparative analysis of several conceptual rainfall-runoff models (Franchini & Pacciani, 1991) and Rainfall-runoff modelling and water resources assessment in north western Ivory Coast (Servat & Dezetter, 1993).
In Ghana, Awotwi et al. (2015) conducted a study on the prediction of hydrologic response to climate change in the White Volta catchment in West Africa. An ensemble of Regional Climate Model (REMO) was used to simulate and project the climate at local scale in order to investigate the hydrological impact of possible future climate change in White Volta Catchment (West Africa). They found that, with a small increase of 8% and 1.7% of the mean annual precipitation and temperature respectively, annual surface runoff, annual base flow and evapotranspiration recorded increment of 26%, 24% and 6% respectively. In a study conducted by the CSIR Water Research Institute (CSIR -WRI) in 2000 (CSIR & WRI, 2000), under the United Nations Framework Convention on Climate Change (UNFCC) and coordinated by the Environmental Protection Agency (EPA), it was found that there is a decrease and highly variable rainfall pattern, and frequent and pronounced dry spells in Ghana. The study also estimated a general reduction in annual river flows in Ghana by 15-20 % for the year 2020 and 30-40 % for the year 2050.
VAR models have been extensively developed and used in econometrics. Park (1990) used a set of rigorous diagnostic techniques to evaluate the forecasting performance of some multivariate time-series models for the U.S. cattle sector, including Bayesian vector autoregression (BVAR), unrestricted vector autoregression (UVAR), restricted vector autoregression (RVAR), and vector autoregressive moving-average (VARMA) models. He found using the root-mean-squared-error criterion along with an evaluation of the rankings of forecast errors that the Bayesian vector autoregression (BVAR) and the unrestricted VAR (UVAR) models generate forecasts which are superior to both a restricted VAR (RVAR) and a vector autoregressive moving-average (VARMA) model. Bessler (1984) used VAR models to study agricultural prices, money supply and industrial prices in Brazil; Kaylen (1988) used VAR together with other models to forecast the U.S Hog market; McCarty & Schmidt (1997) used VAR models to study State-Government expenditure; Andersson (2007) compared the forecast performance of random walk (RW), autoregressive (AR) and VAR models to forecast Swedish real GDP growth in which VAR emerged as the best in terms of forecasting.
In relation to natural phenomena, especially rainfall forecasting, use of VAR models has been rather scarce and virtually nonexistent in hydrological modelling. Saputro et al. (2011) studied the correlation between rainfall in a region and rainfall in other nearby regions. Adenomon et al. (2013) examined the dynamic relationship between rainfall and temperature in Niger State, Nigeria using monthly data from January 1981 to December 2010 and found a bi-directional causation from rainfall to temperature and from temperature to rainfall. Nugroho et al. (2014) applied VAR models to forecast future rainfall in Semerang, Central Java, Indonesia using monthly data on rainfall, humidity and temperature from 2001 to 2013 collected from 5 measurement stations. The VAR method was found to perform better than ARIMA method regarding forecasting as having smaller Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE) and was also quite accurate in forecasting rainfall in their study area.  Ampadu et al. (2013) presented a review of some of the approaches employed in rainfallriverflow modelling highlighting on the rationale and structure of the modelling approaches, their strengths and weaknesses which may assist in making an informed choice of a modelling approach for hydrological studies. In this paper, vector autoregression (VAR) for multivariate time series is used to model the trend of water flow at Bui with respect to rainfall variability and investigates whether there is any causality (Granger or instantaneous) between rainfall and flow in relation to forecasting.

Materials and methods:-Study area:-
The Bui Dam as shown in Figure 1 is located at the border of the Bole and Wenchi districts, approximately 150km upstream of Lake Volta. With portions of the Dam site falling within Bui National Park, the entire Dam lies within Ghana. The terrain of the area is gently undulating with some deeply incised stream valleys. Deep and wide valleys of the Black Volta River, inselbergs and ridges, and the Banda Hills are some of the major relief and geomorphologic features of the area. Eroded granites corresponding to the Gonja Plateau and greywackes of the Birrimian and Tarkwaian formations are the two distinct geological features underlying the area. Upland soils and alluvial soils are the two major soil types occurring in the area (Ministry of Energy/Bui Development Committee, 2007).
The area is characterized by double-peak wet season, maxima in May-June and October. Average yearly rainfall in the area is around 1140mm. Whiles monthly temperatures range from around 26 0 C in August to around 30 0 C in March; the area has a mean annual relative humidity of 75% and a mean annual pan evaporation of 1781mm. The dam is drained into the Volta Lake by the Black Volta which has its headwaters in Burkina Faso, where it is called the Mohoun River (Ministry of Energy/Bui Development Committee, 2007). Vector autoregressive (VAR) model:-Vector autoregressive (VAR) model as one of the most widely used and flexible models for analyzing multivariate time series is a natural extension of the univariate autoregressive model to dynamic multivariate time series and has superior forecast ability compared with those from univariate time series models and theory-based simultaneous equations models.
The basic vector autoregressive (VAR) model of order p, as suggested by Sims (1980)  The parameters in equation (1) are estimated by generalized least squares (GLS). This is done by first estimating the individual equations of the system by ordinary least squares (OLS). The residuals are then used to estimate the white noise covariance matrix ∑ as which is used to compute the GLS estimator.
Bayesian vector autoregressive (BVAR) model:-VAR models generally uses equal lag length for all variables in the model which leads to the estimation of many parameters, some of which may not be significant. This has become a major drawback of the model resulting in possibly large out-of-sample forecasting errors. Doan et al. (1984) proposed the BVAR model which involves the use of Bayesian prior information to impose certain restrictions on the coefficients of the VAR so as to overcome the problem of over parameterization associated with VAR models. The prior means and variances they suggested, also known as the Minnesota priors take the following form; Where represents the coefficients associated with the lagged dependent variable and denotes any other coefficient in each equation of the VAR model. Whiles the prior means for lagged dependent variables are set to one, a prior mean of zero is assigned to all other coefficients in the equation. This is done in belief that lagged dependent variables are important whiles all other variables in each equation of the VAR model are viewed as less important. The prior variances indicate uncertainty about the prior means of one, for lagged dependent variables and prior means of zero, for all other variables. Doan et al. (1984) proposed a formula to generate the standard deviations which is a function of a few hyperparameters: , , and a weighting matrix ( , ), to deal with the problem of over parameterization associated with VAR models. The standard deviation of the prior imposed on variable j at lag k in equation i is: Where and are the estimated standard errors from a univariate autoregression involving variables i and j respectfully. is labelled as the 'overall tightness' to reflect the standard deviation of the prior on the first lag of the dependent variable. − is a lag decay function with 0 ≤ ≤ 1 which has the effect of imposing the prior means of zero more tightly as the lag length increases. The tightness of the prior for variable j in equation i relative to the tightness of the own-lags of variable i in equation i, is indicated by the function , . The weighting matrix used in the standard Minnesota prior is given in equation (4) while the overall tightness and lag decay hyperparameters used are = 0.1 = 1.0 respectively.
BVAR models with the standard Minnesota prior are usually estimated using the mixed estimation method in Theil and Goldberger (Theil & Goldberger, 1961).
Optimal lag length selection criteria:-For a range of lag orders n the individual equations of the system are estimated by OLS. The optimal lag order is selected by minimizing one of the following information criteria: Hannan-Quinn criterion, = log det ∑ ( ) + 2 log log 2 129 Schwarz Criterion, = log det ∑ ( ) + log 2 Final Prediction Error criterion, = , * is the total number of parameters in each equation of the model when is the lag order of the endogenous variables.
Structural analysis:-Despite the fact that VAR coefficients capture the anticipated impact of a variable, there are often a lot of coefficients to interpret. It is usually more common to examine the model's residuals which represent unforeseen contemporaneous events. The next subsections provide relatively non-technical explanations of some of the common techniques used for structural analysis of VAR models.
Causality analysis:-Both the Granger-causality and instantaneous causality were investigated. For both tests, the vector of endogenous variables is divided in two sub vectors, 1 2 , with dimensions 1 2 , respectively, so that = 1 + 2 . The sub vector 1 is said to be Granger-causal for 2 if the past of 1 significantly help predicting the future of 2 via the past of 1 alone (Granger, 1969). For testing this property, a model of the form Therefore this null hypothesis is tested against the alternative that at least one of the 21, is nonzero. An F-test statistic which is distributed as ( 1 2 , − * ) is used for testing the restrictions. Here * is the total number of parameters in the system including the parameters of the deterministic term (Lütkepohl, 1991). The role of 1 2 can be reversed to test Granger-causality from 2 to 1 .
Instantaneous causality is characterized by nonzero correlation of 1 2 . Thus the null hypothesis 0 : 1 2 ′ = 0 is tested against the alternative of nonzero covariance between the two error vectors in testing for instantaneous causality. A Wald test statistic is used to test this hypothesis.

Impulse response analysis:-
In impulse response analysis the exogenous and deterministic variables are treated as fixed and may therefore be dropped from the system. The adjusted endogenous variables are now denoted by . If the process is stationary (I(0)), it has a Wold moving average (MA) representation Where Φ 0 = and the Φ can be computed recursively as is the usual residual standard deviation estimator based on the full sample.
It is plotted for = + 1, … , together with the lines ± − + 2( − )/ − , where depends on the desired significance level of the resulting test. If the CUSUMs wander beyond these lines, then there is evidence against structural stability of the underlying model.

Results and Discussion:-
Descriptive statistics:-At an initial step, descriptive statistics for the data on rainfall and flow are produced in Table 1. The results contains among others, the minimum, maximum, and average values of rainfall and flow for the period considered. Whiles the minimum, maximum, and average values of rainfall are 0.00mm, 407.50mm, and 95.09mm respectively, the minimum, maximum, and average values of flow are 0.01cms, 2417.30cms, and 206.42cms respectively.  Figure 2 shows the time series plot for both rainfall and flow. The figure does not show any level of trend which is an indication of stationarity even though the Augmented Dickey Fuller (ADF) test will still be used formally to test for stationarity.

Stationary test and optimal lag selection:-
To examine the stationarity (a necessary condition for VAR modeling) of the two time series data, the Augmented Dickey Fuller (ADF) unit root test was employed. The ADF test results for rainfall and flow are shown in Table 2.
The results reveal stationarity for both series confirming the results from the time series plot. This indicates that VAR is suitable for modelling the time series data. 132 According to Lütkepohl & Saikkonen (1997), when the fitted VAR model order ( ) is assumed to vary as the cube root of the size of the time series, then ( + 1) is fitted to the data such that goes to infinity with sample size. Hence for our dataset, we considered VAR models from lag 1 to lag 10 and computed AIC, HQ, SC, and FPE values for each. By choosing the fitted candidate model corresponding to the minimum values of these measures, one is attempting to select the candidate model which is rendered most plausible by the data at hand. Therefore, the smaller the AIC, HQ, SC, and FPE value, the better the model. We observe from Table 3 that, while the AIC, the HQ and the FPE criterion favour VAR(10), the SC information criterion rather favoured VAR(4). However for purposes of forecasting, economists' wisdom suggest it is better to use the model favoured by the Schwartz (SC) information criterion (Galvao, 2012). Therefore, based on SC values we selected VAR (4) for model fitting.  Table 4 shows the VAR(4) results for rainfall and flow. The parameter estimates together with their respective standard errors, t-values and p-values are presented. Without exceptions, all parameter estimates are significant at the 0.05 level of significance. It is however imperative to note that interpreting the impulse response function for the model is much more informative than interpreting the individual parameter estimates.  Figure 3 is the diagram of fit and residuals for the rainfall model. The standardized residuals plot in the figure shows the residuals are distributed randomly around zero which is an indication that the residuals from the model have a constant variance. The Auto-Correlation Function (ACF) and the Partial Auto-Correlation Function (PACF) residual plots in the figure suggest that there is no autocorrelation in the residuals of the model which indicate no lost of information by the rainfall model.   Table 5. For the rainfall model, only flow at lag 1 is insignificant among the lags for flow whiles the only significant lag among the lags for rainfall is lag 1. On the other hand, rainfall at lags 1 and 2 are significant in the flow model whiles flow at lag 3 is insignificant.   Table 6 presents the mean absolute errors (MAEs) and coefficients of determination for both flow and rainfall under the VAR and BVAR models. The VAR model accounts for a little over seventy percent of the variability in flow and a little over forty percent of the variability in rainfall whereas the BVAR model only accounts for about twenty-nine and sixty-one percent of the variability in rainfall and flow respectively. Also, by inspecting the MAE values which is a measure of prediction accuracy, we observe that MAE values for VAR models are smaller than those for the BVAR models. Hence the VAR model is selected for structural analysis and forecasting. Results of the structural analysis:-As indicated earlier, it is often more informative and common to conduct structural analysis after fitting a VAR model by examining the model's residuals. The next subsections provide results of the structural analysis of the fitted VAR model.  135 Impulse response analysis:-Impulse response analysis was utilized to analyze the dynamic interactions between rainfall and flow of the VAR (4) process. The orthogonal impulse response of rainfall to flow is presented in Figure 5. The response of rainfall has an obvious fluctuation; there is a highest positive effect of flow on rainfall in the fifth month (May) and lowest negative effect of flow on rainfall in the ninth month (September). In Figure 6, the orthogonal impulse response of flow to rainfall is presented, which shows another obvious fluctuation. Whiles the highest positive effect of rainfall on flow was recorded in the second month (February), the lowest negative effect of rainfall on flow was recorded in the fifth month (May). Forecast error variance decomposition:-Forecast error variance decompositions (FEVDs) are also popular in interpreting VAR models. Results for the FEVDs for both rainfall and flow are presented in Figure 7. The results reveal that on the average, whiles about 81% of the variability in the trend of rainfall has been explained by past innovations in rainfall figures at Bui, a significant proportion (about 19%) of the variability in the trend of rainfall have been explained by past innovations in flow values at Bui. The results further reveal that on the average, whiles about 78% of the variability in the trend of flow has been accounted for by past innovations in flow values at Bui, about 22% (which is very significant) of the variability in the trend of flow has been explained by past innovations in rainfall figures at Bui. This confirms the results of the two-way relationship between rainfall and flow obtained from the granger-causality test, and indicate that modelling rainfall and flow together will further improve the forecast of each of them. Stability results:-Parameter stability throughout the study period is an important assumption in VAR modelling. If the parameters of the model are different during the forecast period than they were during the sample period, then the estimated model will not be very useful irrespective of how well it was estimated. Furthermore, if the parameters of the model were unstable over the sample period, then the model was not even a good representation of how the series evolved over the sample period. Visual examinations of the graphs of the recursive residuals are useful in evaluating the stability of the model. The ordinary least squares cumulative sum (OLS-CUSUM) graphs for the fitted models in Figure 8 indicate structural stability of the underlying models since the cumulative sums wander within the control lines for both models.

Conclusion:-
The implications from this study are quite clear. First of all, we investigated to determine between unrestricted and Bayesian VAR models the one that is best suited for forecasting flows with respect to rainfall variability at Bui. The unrestricted VAR model proved to have superior forecasting abilities than the Bayesian VAR model based on MAE values. We also investigated to determine whether past rainfall figures at Bui significantly helps in predicting future trends of flows via past flow values alone. The Granger causality test conducted revealed a two-way causality from rainfall to flow and from flow to rainfall which was confirmed by results of the forecast error variance decomposition. Finally, we modelled the trend of flow with respect to rainfall variability at Bui. Results from the fitted VAR(4) model revealed that modelling rainfall and flow together improves the forecast of each of them at Bui, which is an improvement in our understanding and knowledge of hydrological (Flows) variability in the context of climate (Rainfall) variability.