Volume 89, Issue 9 p. 2623-2632
Article

FORWARD SELECTION OF EXPLANATORY VARIABLES

F. Guillaume Blanchet

Corresponding Author

Present address: Department of Renewable Resources, University of Alberta, 751 General Service Building, Edmonton, Alberta T6G 2H1 Canada. E-mail: [email protected]
Pierre Legendre

Département de Sciences Biologiques, Université de Montréal, C.P. 6128, Succursale Centre-ville, Montréal, Québec H3C 3J7 Canada

Daniel Borcard

Département de Sciences Biologiques, Université de Montréal, C.P. 6128, Succursale Centre-ville, Montréal, Québec H3C 3J7 Canada

First published: 01 September 2008

Corresponding Editor: F. He.

Abstract

This paper proposes a new way of using forward selection of explanatory variables in regression or canonical redundancy analysis. The classical forward selection method presents two problems: a highly inflated Type I error and an overestimation of the amount of explained variance. Correcting these problems will greatly improve the performance of this very useful method in ecological modeling. To prevent the first problem, we propose a two-step procedure. First, a global test using all explanatory variables is carried out. If, and only if, the global test is significant, one can proceed with forward selection. To prevent overestimation of the explained variance, the forward selection has to be carried out with two stopping criteria: (1) the usual alpha significance level and (2) the adjusted coefficient of multiple determination ($R^2_a$) calculated using all explanatory variables. When forward selection identifies a variable that brings one or the other criterion over the fixed threshold, that variable is rejected, and the procedure is stopped. This improved method is validated by simulations involving univariate and multivariate response data. An ecological example is presented using data from the Bryce Canyon National Party, Utah, USA.
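
The two-step procedure described in the abstract can be illustrated with a minimal sketch for a single (univariate) response variable. This is not the authors' implementation. The function names (`rss`, `adj_r2`, `partial_F`, `perm_pvalue`, `forward_select`) are hypothetical, and two choices are assumptions the abstract does not specify: the tests use permutation of the residuals of the reduced model, and the candidate at each step is the variable that most reduces the residual sum of squares. The multivariate (redundancy analysis) case is not covered.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares of an OLS fit of y on X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ beta
    return float(r @ r)

def adj_r2(X, y):
    """Adjusted R^2: 1 - (1 - R^2)(n - 1)/(n - m - 1), for m predictors."""
    n, m = X.shape
    r2 = 1.0 - rss(X, y) / float(((y - y.mean()) ** 2).sum())
    return 1.0 - (1.0 - r2) * (n - 1) / (n - m - 1)

def partial_F(X_red, X_full, y):
    """F statistic for the columns of X_full that are not in X_red."""
    q = X_full.shape[1] - X_red.shape[1]
    df = len(y) - X_full.shape[1] - 1
    rss_full = rss(X_full, y)
    return ((rss(X_red, y) - rss_full) / q) / (rss_full / df)

def perm_pvalue(X_red, X_full, y, rng, n_perm=199):
    """Permutation p-value (assumed scheme: permute reduced-model residuals)."""
    A = np.column_stack([np.ones(len(y)), X_red])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    fitted = A @ beta
    resid = y - fitted
    f_obs = partial_F(X_red, X_full, y)
    hits = sum(
        partial_F(X_red, X_full, fitted + rng.permutation(resid)) >= f_obs
        for _ in range(n_perm)
    )
    return (hits + 1) / (n_perm + 1)

def forward_select(X, y, alpha=0.05, n_perm=199, seed=0):
    """Return the indices of the selected columns of X, in selection order."""
    rng = np.random.default_rng(seed)
    m = X.shape[1]
    # Step 1: global test with all explanatory variables.
    if perm_pvalue(X[:, []], X, y, rng, n_perm) > alpha:
        return []  # global test not significant: no selection is attempted
    # Second stopping threshold: adjusted R^2 of the model with all variables.
    r2a_global = adj_r2(X, y)
    selected, remaining = [], list(range(m))
    # Step 2: forward selection with the two stopping criteria.
    while remaining:
        best = min(remaining, key=lambda j: rss(X[:, selected + [j]], y))
        trial = selected + [best]
        p = perm_pvalue(X[:, selected], X[:, trial], y, rng, n_perm)
        if p > alpha or adj_r2(X[:, trial], y) > r2a_global:
            break  # candidate exceeds a threshold: reject it and stop
        selected = trial
        remaining.remove(best)
    return selected
```

Calling `forward_select(X, y)` with an `n × m` array `X` and a length-`n` vector `y` returns an empty list when the global test fails. Otherwise it returns the selected column indices, stopping at the first candidate whose p-value exceeds `alpha` or whose inclusion would push the adjusted R² above that of the full model.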