STATISTICAL REGIONALIZATION OF THE ANNUAL MAXIMUM DAILY RAINFALL IN ANDALUSIA

The annual maximum daily rainfall (Amdr) is an important variable for modeling runoff, which justifies its study. This work aimed to regionalize the Amdr for the region of Andalusia through the spatialization of the parameters of a probability distribution. Daily rainfall data from 2001 to 2015 from 56 meteorological stations in Andalusia were evaluated and fitted to ten probability distributions, and the possibility of trends in the data was also explored. The Gumbel II distribution gave the best fit for 20 of the 56 meteorological stations. Significant decreasing trends were verified at three meteorological stations. The Amdr was estimated for return periods of 10, 20, 50, 100 and 1000 years, obtaining average values of 76.76, 90.75, 110.79, 127.69 and 201.23 mm, respectively. Maps of the distribution of the parameters α and β of the Gumbel II distribution for Andalusia were obtained, as well as maps of the Amdr distribution for each return period.


INTRODUCTION
Rainfall is one of the climatic factors with the highest spatiotemporal variability, which justifies the study of extreme events of annual maximum daily rainfall (Amdr) and of their distribution in space and time. The study of these extreme events is one of the main inputs when designing urban and rural drainage systems, planning contour farming and straightening waterways, since the Amdr is related to severe damage to human activities, owing to its potential to cause soil saturation, surface runoff and soil erosion (IPCC, 2007; Tammets & Jaagus, 2013).
Past extreme climatic events are studied to estimate the probability that future events will equal or exceed them, which makes it necessary to estimate their return periods. The return period is the average time, in years, between occurrences of an event (Amdr) of a given magnitude or greater; equivalently, it is the inverse of the probability that the event is equaled or exceeded in any given year (Naghettini & Pinto, 2007).
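The relation between a return (payback) period T and the annual exceedance probability can be made concrete with a short sketch. It assumes independent years and uses the standard relation p = 1/T; the function names and the 15-year horizon are illustrative choices, not part of the original study.

```python
def exceedance_probability(return_period_years):
    """Annual probability that the event is equaled or exceeded (p = 1 / T)."""
    return 1.0 / return_period_years


def risk_over_horizon(return_period_years, horizon_years):
    """Probability of at least one exceedance within `horizon_years`,
    assuming that years are independent (an assumption of this sketch)."""
    p = exceedance_probability(return_period_years)
    return 1.0 - (1.0 - p) ** horizon_years


if __name__ == "__main__":
    # Return periods used in the study; the 15-year horizon is hypothetical.
    for T in (10, 20, 50, 100, 1000):
        print(T, exceedance_probability(T), round(risk_over_horizon(T, 15), 3))
```

Under these assumptions, an event with a long return period can still have a non-negligible chance of occurring within the service life of a drainage structure.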
Estimating the Amdr for different return periods requires a dataset of at least 15 years without gaps, a dataset of 30 years being preferable. Because data are difficult to obtain in certain regions, geostatistics is used to estimate extreme events in unmonitored areas from the data of nearby meteorological stations.
Kriging is the generic name adopted by geostatisticians for the family of generalized least-squares regression algorithms (Goovaerts, 1997). Kriging methods use the spatial dependency between nearby samples, expressed in the semivariogram, to estimate values at any position in the region without bias and with minimum variance, which makes them very good estimators in the study of the spatial distribution of rainfall (Machado et al., 2010).
Many studies have used elevation as an auxiliary variable in the estimation of different main variables, with good results (Brito et al., 2010; Viola et al., 2010; Di Piazza et al., 2011). However, its contribution to the interpolators still needs to be better quantified.
The purpose of this study is the regionalization of the annual maximum daily rainfall (Amdr) in the autonomous community of Andalusia. It also investigates the contribution of longitude, latitude and the covariate altitude as auxiliary variables in estimating the parameters of the chosen probability distribution, allowing the distribution of extreme events for different return periods to be represented through maps.

MATERIAL AND METHODS
Andalusia is the second largest autonomous community of Spain by area, with approximately 87,600 km², lying between the longitudes of 07° 31' W and 01° 38' W and between the latitudes of 38° 44' N and 36° 00' N. Around 31.42% of the territory lies between 0 and 200 m of altitude; 39.38% between 200 and 600 m; 25.94% between 600 and 1400 m; and 3.27% above 1400 m.
According to the Köppen & Geiger (1928) climate classification, the Csa climate prevails in the region of Andalusia, characterized as temperate with a hot and dry summer. The exceptions are one area of the province of Almeria, which presents a BWh climate (hot desert), and the region of Sierra Nevada, which presents a Dsc climate (cold with a dry and cool summer). The dataset consists of the annual maximum daily rainfall (Amdr) of 56 meteorological stations (Figure 1 and Table 1), each series covering 15 years (from 2001 to 2015, inclusive).
The Amdr data series of each station was fitted to ten probability distributions: (i) Lognormal, (ii) Weibull, (iii) Gamma, (iv) Cauchy, (v) Normal, (vi) Logistic, (vii) Birnbaum-Saunders, (viii) Gumbel II, (ix) Gumbel and (x) generalized Rayleigh. The parameters of the distributions were fitted by the maximum likelihood (ML) method, and the Akaike Information Criterion (AIC) was used to choose the distribution that best fitted the data, namely the one with the lowest AIC value. The distribution that gave the best fit to the Amdr data at the largest number of meteorological stations was chosen to represent the whole study region; the parameters of this distribution were then obtained and the Amdr was estimated for the return periods (T) of 10, 20, 50, 100 and 1000 years.
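The selection rule described above (fit each candidate by maximum likelihood, then keep the one with the lowest AIC) can be sketched as follows. This is a minimal illustration, not the authors' R workflow: it uses only two of the ten candidates (Normal and Lognormal) because their ML estimates have closed forms, and the 15-value sample is hypothetical.

```python
import math


def normal_loglik(data):
    """Maximized log-likelihood of a Normal distribution (closed-form ML)."""
    n = len(data)
    mu = sum(data) / n
    var = sum((x - mu) ** 2 for x in data) / n  # ML variance divides by n
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)


def lognormal_loglik(data):
    """Maximized log-likelihood of a Lognormal distribution.
    It equals the Normal log-likelihood of log(x) minus sum(log(x))."""
    logs = [math.log(x) for x in data]
    return normal_loglik(logs) - sum(logs)


def aic(loglik, n_params):
    """Akaike Information Criterion: AIC = 2k - 2 ln(L)."""
    return 2.0 * n_params - 2.0 * loglik


# Hypothetical 15-year series of annual maximum daily rainfall (mm).
sample = [42.0, 55.3, 38.1, 61.7, 47.2, 90.4, 35.6, 52.8,
          44.9, 73.5, 40.3, 58.6, 49.1, 120.2, 46.0]

scores = {
    "normal": aic(normal_loglik(sample), 2),
    "lognormal": aic(lognormal_loglik(sample), 2),
}
best = min(scores, key=scores.get)  # lowest AIC wins
print(scores, best)
```

Distributions without closed-form estimators, such as the Gumbel II, would require a numerical optimizer, which is the role played by the fitting package in the study.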
The geostatistical analysis was performed on the parameters of the distribution that best fitted the Amdr data, with the aim of regionalizing the Amdr for the study area.
Initially, an exploratory analysis of the variables to be interpolated (the parameters of the best distribution for most of the stations) was performed to check some assumptions of the geostatistical model, among them: (i) normality, (ii) absence of spatial trend, and (iii) the removal of outliers.
The parameters of the chosen distribution function were analyzed following the model-based geostatistics approach (Diggle & Ribeiro Junior, 2007). The parameters of the model (Equation 1) were fitted by the maximum likelihood method.
$$Y(x_i) = \beta + S(x_i) + \varepsilon_i \qquad (1)$$
in which Y(xi) is the annual maximum daily rainfall at the location given by row i of the coordinate matrix X; β is the global mean of a specific area; S(xi) is a Gaussian process whose covariance follows a mathematical model with variance parameter σ² and range parameter φ; and εi is random noise, normally distributed with mean zero and variance τ².
Different trend models were tested, defined by linear and quadratic relations among the covariates X, Y and elevation at the locations of the meteorological stations. The parameters of each model were fitted by the maximum likelihood (ML) method, and the performance of each model, with and without the spatial component, in the interpolation of the Amdr and of the parameters of the chosen distribution was assessed by the AIC.
The best trend-removal model, characterized by the lowest AIC value, was then chosen, and six covariance function models were tested: (i) exponential; (ii) Gaussian; (iii) spherical; (iv) circular; and Matérn with smoothness parameter equal to (v) 1.5 and (vi) 2.5. The performance of each model in estimating the parameters of the chosen distribution was evaluated by the AIC.
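For reference, the six candidate covariance models can be written as correlation functions of the distance h and the range parameter φ. The sketch below assumes the usual closed forms with u = h/φ (the Matérn cases without additional rescaling of φ); the exact parameterization used by the software in the study is not stated in the text, so these forms should be read as an assumption.

```python
import math


def correlation(h, phi, model):
    """Correlation rho(h) for distance h and range parameter phi (u = h / phi)."""
    u = h / phi
    if model == "exponential":
        return math.exp(-u)
    if model == "gaussian":
        return math.exp(-u * u)
    if model == "spherical":
        return 1.0 - 1.5 * u + 0.5 * u ** 3 if u < 1.0 else 0.0
    if model == "circular":
        t = min(u, 1.0)
        return 1.0 - (2.0 / math.pi) * (t * math.sqrt(1.0 - t * t) + math.asin(t))
    if model == "matern_1.5":
        return (1.0 + u) * math.exp(-u)
    if model == "matern_2.5":
        return (1.0 + u + u * u / 3.0) * math.exp(-u)
    raise ValueError("unknown model: " + model)


def semivariance(h, tau2, sigma2, phi, model):
    """Semivariogram gamma(h) = tau2 + sigma2 * (1 - rho(h)) for h > 0."""
    if h == 0:
        return 0.0
    return tau2 + sigma2 * (1.0 - correlation(h, phi, model))


MODELS = ("exponential", "gaussian", "spherical",
          "circular", "matern_1.5", "matern_2.5")

if __name__ == "__main__":
    # Hypothetical range of 50 km, evaluated at a few distances.
    for name in MODELS:
        print(name, [round(correlation(h, 50.0, name), 3) for h in (0, 10, 25, 50)])
```

The spherical and circular models reach zero correlation exactly at h = φ, whereas the other four approach zero only asymptotically, which is why φ is interpreted differently across models.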
Considering the components of the geostatistical model (Equation 1), model selection according to the AIC followed two steps: (i) for the trend component β, the S(xi) component was first held constant and the best of the tested trend models was chosen; (ii) for the S(xi) component, with the trend component fixed as modeled, the best of the tested covariance functions was chosen. After the model was chosen and its parameters estimated, ordinary kriging was used to interpolate the studied variables.
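The ordinary kriging step can be illustrated with a small self-contained sketch. It assumes an exponential covariance without nugget and solves the kriging system (station covariances bordered by the unbiasedness constraint) with a plain Gauss-Jordan routine; the four station coordinates, the values and the covariance parameters are hypothetical, and the code is not the implementation used in the study.

```python
import math


def exp_cov(h, sigma2, phi):
    """Exponential covariance (no nugget), assumed here for illustration."""
    return sigma2 * math.exp(-h / phi)


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ordinary_kriging(coords, values, target, sigma2, phi):
    """Return the ordinary kriging prediction at `target` and the weights."""
    n = len(coords)

    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    # Covariances between stations, bordered by the constraint sum(w) = 1.
    a = [[exp_cov(dist(coords[i], coords[j]), sigma2, phi) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    # Covariances between each station and the target location.
    b = [exp_cov(dist(c, target), sigma2, phi) for c in coords] + [1.0]
    weights = solve(a, b)[:n]  # the last unknown is the Lagrange multiplier
    prediction = sum(w * v for w, v in zip(weights, values))
    return prediction, weights


# Hypothetical stations (km) and values of the variable to interpolate.
stations = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
observed = [60.0, 75.0, 70.0, 90.0]
print(ordinary_kriging(stations, observed, (5.0, 5.0), 1.0, 20.0))
```

The weights always sum to one, which is what makes the predictor unbiased under a constant unknown mean; when a trend model is selected, the constant mean is replaced by the fitted trend.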
Data reading and handling to build the Amdr data series (package hydroTSM, "Hydrologic Time Series Management"), the fitting of the distribution models (package fitdistrplus, "Parametric Distribution to Non-Censored or Censored Data"), the goodness-of-fit test (package ADGofTest), other statistical and geostatistical calculations (packages geoR, MASS, rgdal and raster), the plots (packages RColorBrewer, maptools and SDMTools) and the trend analysis of the data series (package Kendall) were performed in the open-source statistical software R 3.1.2 (R Core Team, 2014).

RESULTS
The trends of the extreme events at each meteorological station were assessed by the Mann-Kendall test (Table 2). Of the 56 trend values, 36 were negative (a negative value indicates a decreasing trend of the Amdr over time) and 20 were positive (increasing trend). However, only three stations presented statistically significant values (p-value < 0.05; emphasized in bold in Table 2), all with a decreasing trend (negative values). The trends at the other stations, which were not statistically significant, are explained by the natural variability of rainfall. Therefore, global and regional climate changes that could be associated, for example, with urban heating did not significantly affect the rainfall records.
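A minimal sketch of the Mann-Kendall trend test is given below to make the sign and significance of the trend values concrete. It uses the normal approximation with continuity correction and no correction for ties, which is an assumption of this sketch and may differ from the algorithm of the package used in the study; the two 15-value series are hypothetical.

```python
import math


def mann_kendall(series):
    """Return (tau, z, p_value) of the Mann-Kendall trend test.
    Normal approximation, two-sided p-value, no tie correction."""
    n = len(series)
    # S counts concordant minus discordant pairs (later value vs earlier value).
    s = sum((later > earlier) - (later < earlier)
            for i, earlier in enumerate(series)
            for later in series[i + 1:])
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    tau = s / (n * (n - 1) / 2.0)
    return tau, z, p_value


# Hypothetical 15-year series: one strictly decreasing, one without a clear trend.
decreasing = [90.0 - 3.0 * k for k in range(15)]
noisy = [52.0, 47.5, 60.1, 44.3, 58.8, 49.9, 55.2, 46.7,
         61.0, 50.4, 45.8, 57.3, 48.6, 59.5, 51.1]

for name, data in (("decreasing", decreasing), ("noisy", noisy)):
    tau, z, p = mann_kendall(data)
    print(name, round(tau, 3), round(z, 3), round(p, 4))
```

A negative tau with p-value below 0.05 corresponds to the significant decreasing trends reported for three stations; a tau of either sign with a larger p-value is compatible with natural variability.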
Table 2 shows the parameters obtained for the chosen probability distribution at each station; all ten probability distributions were accepted according to the Anderson-Darling goodness-of-fit test at the 5% significance level. The distributions that gave the best fit for the largest number of Amdr series were Gumbel II (20 stations) and Birnbaum-Saunders (14 stations). The Normal and Cauchy distributions did not give the best fit at any station. The Gumbel II distribution, having the best fit for the largest number of Amdr series, was therefore chosen to represent the whole study region, and its scale (α) and shape (β) parameters were obtained for the remaining stations. The chosen distribution was evaluated by the Anderson-Darling test at the 5% significance level; the values obtained for this test were much lower than the critical values, so the distribution was considered adequate at all 56 meteorological stations. Since one of the assumptions of geostatistics is data normality, the Box-Cox family test was performed for the Gumbel II parameters, obtaining values of 0.97 and 0.99, both close to 1, which indicates that the parameters tend toward a normal distribution.
Twelve types of spatial trend were considered, combining longitude, latitude and elevation algebraically. Among the tested spatial trend models, those that presented the greatest prediction precision are shown in Table 3, together with the estimates of the parameters (τ², σ² and φ).
For β, the circular covariance model combined with a trend in latitude and the covariate elevation can be used for estimation. For α, it was not necessary to include the covariance function S(xi); it was fitted using only the spatial trend in latitude. These were the models that best explained the spatial variability of the Gumbel II parameters.
The semivariogram range parameter indicates the maximum distance over which sampling points are correlated with one another; in other words, points located within an area whose radius is the range parameter are more similar to one another than to points separated by greater distances. The range of 68.63 km for the β parameter shows that all stations within this radius can be used to estimate values at shorter distances (Table 3). To estimate the Gumbel II parameters at unsampled locations, maps of their spatial distribution were generated (Figure 2) by kriging interpolation, using the parameters of the models fitted to the semivariograms (Table 3).
From the parameters of the Gumbel II distribution (Table 3) and the maps of Figure 2, maps of the Amdr for each return period were obtained (Figure 3).
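The step from distribution parameters to an Amdr value for a given return (payback) period T amounts to evaluating the quantile with non-exceedance probability 1 - 1/T. The sketch below assumes the Gumbel II (Fréchet-type) form F(x) = exp(-(x/α)^(-β)), with α as scale and β as shape; the text does not state the exact parameterization, and the parameter values used here are hypothetical.

```python
import math


def gumbel2_cdf(x, alpha, beta):
    """Assumed Gumbel II (Frechet-type) CDF: F(x) = exp(-(x / alpha) ** (-beta))."""
    return math.exp(-((x / alpha) ** (-beta)))


def gumbel2_quantile(return_period, alpha, beta):
    """Value equaled or exceeded on average once every `return_period` years,
    obtained by inverting F(x) = 1 - 1 / T."""
    p = 1.0 - 1.0 / return_period
    return alpha * (-math.log(p)) ** (-1.0 / beta)


# Hypothetical scale (mm) and shape values at one map cell.
alpha, beta = 40.0, 4.0
for T in (10, 20, 50, 100, 1000):
    print(T, round(gumbel2_quantile(T, alpha, beta), 2))
```

Applying the same function cell by cell to interpolated α and β surfaces is one way to produce a map per return period, under the stated assumption about the distribution's form.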

DISCUSSION
Regarding the probability distributions with the highest number of best fits among the meteorological stations, Almeida et al. (2014) obtained similar results when studying the distribution of extreme daily rainfall events in the state of São Paulo, Brazil. These authors verified that the Gumbel II distribution represented the rainfall conditions of the region well. The Gumbel II distribution is the one most frequently used in the literature. José et al. (2014) obtained the spatial distribution of the parameters of Gumbel II distributions associated with the Amdr in the southeast region of Brazil.
Concerning the trend models, as discussed by Diggle & Ribeiro Junior (2007), a trend model should ideally have a natural physical interpretation. Thus, the ideal choice is a simple model that explains most of the spatial variability, since more complex models are generally more difficult to interpret.
Figure 3 shows a tendency toward higher rainfall in the region of the Guadalquivir Valley (Northeast-Northwest) and on the Mediterranean and Atlantic coasts, covering elevations from 0 to 150 meters. The highest values are on the Mediterranean coast, the Cadiz coastline (Southwest) and the Almeria coastline (Southeast). This pattern can be explained by the orographic effect caused by the orientation of the Sierra de Roda in the Cadiz region and of the Cordillera Penibética in Almeria.
According to Muñoz-Díaz & Rodrigo (2004), this orographic region is influenced by the Atlantic and Mediterranean frontal systems, with a significant rainfall gradient in the North-South direction.

CONCLUSION
The results revealed that the Amdr, associated with its respective return periods and with the parameters of the Gumbel II distribution, presents a spatial trend, from which Amdr maps were generated for the Andalusia autonomous community, Spain.

Fig. 1: Digital Terrain Model (DTM) and distribution of the meteorological stations studied, in the Andalusia Autonomous Community, Spain

Table 3 :
Figure 2: Maps of the scale (α) and shape (β) parameters of the Gumbel II distribution in the Andalusia autonomous community, Spain (2001-2015)

Figure 3: Maps of annual maximum daily rainfall associated with the return periods of 10 (a), 20 (b), 50 (c), 100 (d) and 1000 (e) years in the Andalusia autonomous community, Spain

Table 2: Parameters of the most appropriate distribution models in the determination of the probability of occurrence of the maximum daily rainfall in