Abstract
There is a growing need to heed some caveats in numerical data analysis. In 2013, I set out some issues regarding multiple regression frameworks. Here, I use both hypothetical data sets and real data sets collected in Brazil to discuss the implications of, and offer suggestions for, statistical issues concerning collinearity and the spatial structure of ecological data. For example, a weak treatment of collinearity might lead to important variables being discarded from the model; this can be avoided by dealing with collinearity properly before model selection. Moreover, studies have demonstrated that spatial structure needs to be addressed when it occurs in both predictor and response variables, not only when it is present in the residuals. To make it easier to control for such bias, I provide two fully explained scripts for the R language. Given the seriousness of spatial structure, my opinion is that no article presenting confirmatory analysis should be considered for publication unless its authors heed that caveat; to address this issue, I strongly suggest performing variance partitioning.
Acknowledgment
I am grateful to CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico) for financial support. I am also indebted to the reviewers for their insightful comments on an earlier version of this article; to MSc. Pamela Moser (Universidade de Brasília), who allowed me to use her dissertation data; to Dr. Danilo R.M. Neves (University of Leeds), who prepared earlier versions of the scripts presented in the online supplementary material; to MSc. Mário J.M. Azevedo (Universidade Estadual de Campinas), who provided important suggestions for these scripts; to Professor Pierre Legendre (Université de Montréal), who resolved essential questions; and to Professor Ary T. de Oliveira Filho (Universidade Federal de Minas Gerais), other colleagues, and my graduate students, who engaged in fruitful discussions and gave me important feedback on the first manuscript (from 2013).
Electronic supplementary material
Below is the link to the electronic supplementary material.
40415_2014_64_MOESM2_ESM.docx
S2 Variance partitioning as a tool for controlling the type I error rate in regression/canonical routines: a suggested R code. (DOCX 17 kb)
40415_2014_64_MOESM3_ESM.docx
S3 Variance partitioning as a tool for controlling the type I error rate in ANOVA/MANOVA routines: a suggested R code. (DOCX 17 kb)
Cite this article
Eisenlohr, P.V. Persisting challenges in multiple models: a note on commonly unnoticed issues regarding collinearity and spatial structure of ecological data. Braz. J. Bot 37, 365–371 (2014). https://doi.org/10.1007/s40415-014-0064-3