11/14/07

'Selfish' HOF response curves

Generally speaking, when searching for the most suitable model to describe a relationship between variables, two types of criteria have to be taken into consideration: the criterion of 'goodness' from a statistical point of view and the criterion of interpretability from an ecological point of view. When these criteria conflict, we face a decision: which of them should be preferred?
Modeling species response curves (which describe the shape of a species' response along a particular ecological gradient) is a rather special statistical problem. These curves are usually examined in order to answer the following questions: Where is the optimum of the species (if any) along the gradient? What is the species' tolerance (or niche width) along this gradient? And what is the shape of the species response: is it monotonic, symmetric or asymmetric?
Not all modeling strategies are suitable for answering these questions. The most commonly used models, GLM and GAM, often give quite inappropriate and unrealistic response curves (see Oksanen & Minchin, 2002a). A promising option seems to be the Huisman-Olff-Fresco (HOF) models, which were designed specifically to express species response along a gradient flexibly, yet in an ecologically meaningful and interpretable way (Huisman et al., 1993). They constitute a hierarchical set of five models of increasing complexity in terms of the number of parameters and the shape of the resulting response curve (see Fig. 1, taken from Huisman et al., 1993). To fit these models, Jari Oksanen wrote the program HOF and also developed the library 'gravy' for the R statistical environment (for more information, see the website of Jari Oksanen). In these routines, model fitting is based on the maximum likelihood method, and selection among models can be done using the deviance or one of the more complex criteria (AIC or BIC).
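To make the hierarchy concrete, the following is a minimal Python sketch of the five response functions. The equations are written as I understand them from Huisman et al. (1993); they are not reproduced in this text, so treat them as an assumption. The function name `hof`, the helper `inv_logit_neg`, the default maximum `M = 1` and the rescaling of the gradient to the interval [0, 1] are illustrative choices, not part of the HOF program or the 'gravy' package.

```python
import math

def inv_logit_neg(z):
    """Return 1 / (1 + exp(z)), written to avoid overflow for large |z|."""
    if z > 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))

def hof(x, model, a=0.0, b=0.0, c=0.0, d=0.0, M=1.0):
    """Expected response at gradient position x (assumed rescaled to [0, 1]).

    model is one of "I".."V"; M is the maximum attainable response.
    """
    first = inv_logit_neg(a + b * x)
    if model == "I":    # flat response, parameter a only
        return M * inv_logit_neg(a)
    if model == "II":   # monotone response, parameters a and b
        return M * first
    if model == "III":  # monotone response with a plateau, parameters a, b, c
        return M * first * inv_logit_neg(c)
    if model == "IV":   # symmetric unimodal response, parameters a, b, c
        return M * first * inv_logit_neg(c - b * x)
    if model == "V":    # skewed unimodal response, parameters a, b, c, d
        return M * first * inv_logit_neg(c - d * x)
    raise ValueError("model must be one of 'I', 'II', 'III', 'IV', 'V'")
```

With these definitions, model 'IV' is symmetric around x = (c - a) / (2b), while model 'V' lets the two slopes b and d differ, which is what allows the optimum to move.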



However, the flexibility of HOF models goes too far, in some cases causing the resulting curves to lose their ecological interpretability. The problem lies in the 'sharp shapes' that these models can produce (see Fig. 2, which compares the response curve of Alchemilla vulgaris along a pH gradient as drawn by GAM and HOF). HOF model 'V' was selected because it had the lowest AIC of all HOF models, which means that from a statistical point of view it is the most suitable model. However, the species optimum is unrealistically shifted to an extreme position.



A more detailed inspection of this problem reveals the reason. Fig. 3 shows cumulative deviances for the particular curves; at the end of the gradient, the HOF 'V' model has the lowest deviance, which results from the flat section of its cumulative curve in the final stage. This flatness is caused by the ability of the response curve to decline rapidly from its maximum to zero, which stops any further increase of deviance. This sharp shape is made possible by the parametric nature of the models and occurs when some of the four equation parameters reach extreme values (over 100). This is the selfish behavior of the HOF response curve: it lowers the deviance even at the cost of an unrealistic shift of the optimum and a loss of ecological interpretability.
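The cumulative deviance argument can be illustrated with a small Python sketch. It assumes presence/absence data with a Bernoulli likelihood, which is an assumption of this example and not stated in the text; the function name `cumulative_deviance` and the clipping constant `eps` are hypothetical.

```python
import math

def cumulative_deviance(x, y, p, eps=1e-12):
    """Running sum of Bernoulli deviance contributions, ordered along x.

    x: gradient values, y: observed 0/1 responses, p: fitted probabilities.
    Fitted values are clipped to [eps, 1 - eps] to keep the logarithms finite.
    """
    order = sorted(range(len(x)), key=lambda i: x[i])
    total = 0.0
    running = []
    for i in order:
        pi = min(max(p[i], eps), 1.0 - eps)
        # Contribution of one plot: -2 * log-likelihood of the observation
        total += -2.0 * (y[i] * math.log(pi) + (1 - y[i]) * math.log(1.0 - pi))
        running.append(total)
    return running
```

Where the fitted probability is practically zero and the species is absent, each plot adds almost nothing to the sum, so the cumulative curve stays flat. A curve that drops sharply to zero just behind the last occurrence therefore gains an advantage in deviance over a curve that declines gradually.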




'Correction for sharp shapes', as introduced in the JUICE procedure for species response models
The 'gravy' package is prone to producing these 'sharp shapes' in HOF models 'III' and 'V'. According to my personal communication with Jari Oksanen, this behavior results from an improvement of the fitting mechanism: fitting of model 'V' starts from the symmetric model 'IV' and moves the position of the optimum so as to maximize the likelihood, and this improvement makes it possible to shift the optimum to a more extreme position than with the previous, less effective algorithm published in Oksanen & Minchin (2002b). This means that the older algorithm is far less prone to producing 'sharp shapes', and from this point of view it seems to me more appropriate for fitting HOF models.
The 'correction for sharp shapes', as introduced here, is a simple mechanism for removing sharp shapes. The 'gravy' package is used to fit the curves and to select the best model (based on the AIC criterion). If any of the four parameters (a, b, c, d) in the model equation exceeds the value 100, the curve is regarded as too sharp. In this case, the particular model is fitted again using the older algorithm. If even the older algorithm results in a sharp curve, the correction procedure searches among the other HOF models for the one with the nearest higher AIC value and a non-sharp shape. It must be said that the way the correction is done is not too 'pretty', but it works.
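The decision logic of the correction can be sketched in Python as follows. This is only one reading of the procedure described above, not the actual JUICE code: the names `is_sharp`, `select_with_correction` and `refit_old` are hypothetical, the comparison on absolute values is my assumption (the text only says 'exceeds value 100'), and the fallback models are checked as fitted, without refitting.

```python
SHARP_LIMIT = 100.0  # threshold taken from the text

def is_sharp(params, limit=SHARP_LIMIT):
    """True if any of the parameters a, b, c, d is extreme.

    Using the absolute value is an assumption of this sketch.
    """
    return any(abs(value) > limit for value in params.values())

def select_with_correction(fits, refit_old):
    """Pick a HOF model by AIC while avoiding sharp shapes.

    fits:      dict mapping model name -> (aic, params) from the newer algorithm
    refit_old: hypothetical callable, model name -> params from the older algorithm
    Returns (model name, params), or None if every candidate is sharp.
    """
    ranked = sorted(fits, key=lambda m: fits[m][0])  # lowest AIC first
    best = ranked[0]
    if not is_sharp(fits[best][1]):
        return best, fits[best][1]
    # Step 1: refit the same model with the older algorithm
    old_params = refit_old(best)
    if not is_sharp(old_params):
        return best, old_params
    # Step 2: fall back to the nearest higher AIC with a non-sharp shape
    for model in ranked[1:]:
        if not is_sharp(fits[model][1]):
            return model, fits[model][1]
    return None
```

For example, if model 'V' has the lowest AIC but b = 250, and refitting it with the older algorithm still gives an extreme parameter, the function returns the next model in AIC order whose parameters all stay within the limit.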
The following graphs show an example of how this correction works. They also include models calculated from bootstrap samples of the original data (black is used for the curve based on the original data, different colors for the different models based on the bootstrap data). The idea of bootstrapping the data was suggested to me by J. Oksanen.
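The bootstrap step itself can be sketched in a few lines of Python. The sketch assumes ordinary resampling of plots with replacement, which the text does not spell out; the generator name `bootstrap_samples` and the fixed `seed` are illustrative. Each resampled data set would then be passed to the model fitting and correction procedure.

```python
import random

def bootstrap_samples(x, y, n_boot, seed=0):
    """Yield n_boot resampled copies of the paired data (x, y).

    Plots are drawn with replacement, so each copy has the original size
    but may contain some plots several times and others not at all.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    n = len(x)
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        yield [x[i] for i in idx], [y[i] for i in idx]
```

If the curves fitted to the bootstrap samples scatter widely in the position of the optimum, or switch between model types, the shape fitted to the original data should be interpreted with caution.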






For more information about species response curves in JUICE see http://sci.muni.cz/botany/zeleny/hof.php


References
Huisman, J., Olff, H., Fresco, L.F.M., 1993. A hierarchical set of models for species response analysis. J. Veg. Sci. 4, 37-46.
Oksanen, J., Minchin, P.R., 2002a. Continuum theory revisited: what shape are species responses along ecological gradients? Ecological Modelling 157, 119-129.
Oksanen, J., Minchin, P.R., 2002b. Non-linear maximum likelihood estimation of Beta and HOF response models. URL: http://cc.oulu.fi/~jarioksa/softhelp/hof3.pdf