Volume 84, Issue 1 p. 45-67
Article

Rarefaction and extrapolation with Hill numbers: a framework for sampling and estimation in species diversity studies

Anne Chao

Institute of Statistics, National Tsing Hua University, Hsin-Chu 30043 Taiwan

Nicholas J. Gotelli

Department of Biology, University of Vermont, Burlington, Vermont 05405 USA

T. C. Hsieh

Institute of Statistics, National Tsing Hua University, Hsin-Chu 30043 Taiwan

Elizabeth L. Sander

Department of Biology, University of Vermont, Burlington, Vermont 05405 USA

K. H. Ma

Institute of Statistics, National Tsing Hua University, Hsin-Chu 30043 Taiwan

Robert K. Colwell

Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut 06269 USA

University of Colorado Museum of Natural History, Boulder, Colorado 80309 USA

Aaron M. Ellison

Harvard University, Harvard Forest, 324 North Main Street, Petersham, Massachusetts 01366 USA

First published: 01 February 2014
Citations: 2,183

Abstract

Quantifying and assessing changes in biological diversity are central aspects of many ecological studies, yet accurate methods of estimating biological diversity from sampling data have been elusive. Hill numbers, or the effective number of species, are increasingly used to characterize the taxonomic, phylogenetic, or functional diversity of an assemblage. However, empirical estimates of Hill numbers, including species richness, tend to be an increasing function of sampling effort and, thus, tend to increase with sample completeness. Integrated curves based on sampling theory that smoothly link rarefaction (interpolation) and prediction (extrapolation) standardize samples on the basis of sample size or sample completeness and facilitate the comparison of biodiversity data. Here we extend previous rarefaction and extrapolation models for species richness (Hill number $^{q}D$, where q = 0) to measures of taxon diversity incorporating relative abundance (i.e., for any Hill number $^{q}D$, q > 0) and present a unified approach for both individual-based (abundance) data and sample-based (incidence) data. Using this unified sampling framework, we derive both theoretical formulas and analytic estimators for seamless rarefaction and extrapolation based on Hill numbers. Detailed examples are provided for the first three Hill numbers: q = 0 (species richness), q = 1 (the exponential of Shannon's entropy index), and q = 2 (the inverse of Simpson's concentration index). We develop a bootstrap method for constructing confidence intervals around Hill numbers, facilitating the comparison of multiple assemblages of both rarefied and extrapolated samples. The proposed estimators are accurate for both rarefaction and short-range extrapolation. For long-range extrapolation, the performance of the estimators depends on both the value of q and the extrapolation range. We test our methods on simulated data generated from species abundance models and on data from large species inventories. We also illustrate the formulas and estimators using empirical data sets from biodiversity surveys of temperate forest spiders and tropical ants.
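
To make the three Hill numbers named in the abstract concrete, the following is a minimal sketch that computes them directly from observed species counts, using the standard definition of a Hill number of order q: the sum of the q-th powers of the relative abundances, raised to the power 1/(1 - q), with the q = 1 case taken as the limit (the exponential of Shannon entropy). The function name `hill_number` and the sample counts are illustrative assumptions, not taken from the article. The values it returns are the empirical (plug-in) estimates that the abstract describes as sensitive to sampling effort; this is not the rarefaction, extrapolation, or bootstrap machinery the article develops.

```python
import math


def hill_number(abundances, q):
    """Empirical Hill number of order q from observed species counts.

    Illustrative helper (hypothetical name): a plug-in calculation from
    relative abundances, not the article's estimators.
    """
    # Species with zero observed individuals do not contribute.
    counts = [n for n in abundances if n > 0]
    total = sum(counts)
    if total == 0:
        raise ValueError("sample contains no individuals")

    # Relative abundance of each observed species.
    p = [n / total for n in counts]

    if q == 1:
        # Limit as q -> 1: exponential of Shannon entropy.
        return math.exp(-sum(pi * math.log(pi) for pi in p))

    # General case: (sum of p_i^q)^(1 / (1 - q)).
    # q = 0 gives species richness; q = 2 gives inverse Simpson concentration.
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))


# Made-up counts for three species, for illustration only.
sample = [50, 30, 20]
for q in (0, 1, 2):
    print(q, round(hill_number(sample, q), 3))
```

For a perfectly even assemblage all orders coincide with the number of species, which is why Hill numbers are read as an "effective number of species"; as q increases, rare species carry less weight and the value falls for uneven assemblages.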