DISCUSSION
It is extremely important to emphasise that the two new distribution models disagree with the log series because they explicitly assume a hard limit to species richness. No conflict among theories could be more basic. Indeed, each novel distribution comes with an elementary species richness estimator that follows directly from its single governing parameter (eqns. 10 and 16).
Therefore, we now have within reach a real solution to the problem of estimating diversity that plagues the ecological literature (Colwell & Coddington, 1994; Bunge et al., 2014). One could argue nonetheless that a distribution-free approach to estimating richness is preferable. This is beside the point because fitting distributions is a straightforward process, so identifying the right one is often easy.
Regardless, the most common so-called non-parametric richness estimators are all designed to produce lower bounds instead of accurate, unbiased values. Examples include the jackknife estimators (Burnham & Overton, 1978), the bootstrap estimator (Smith & van Belle, 1984), Chao 1 (Chao, 1984), the abundance coverage estimator (Chao & Lee, 1992), and interpolation and extrapolation based on sample coverage (iNEXT: Hsieh et al., 2016).
The reason for their systematic error is that they implicitly assume uniform abundance distributions, as I have briefly noted before (Alroy, 2017). Otherwise, for example, the Poisson sampling theory that can be used to easily derive Chao 1 (Alroy, 2017) would make no sense. The fact that real-world distributions are virtually never uniform renders all of the many lower-bound estimators non-starters.
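For readers unfamiliar with it, the classic Chao 1 estimator (Chao, 1984) uses only the observed richness and the numbers of singletons and doubletons. The following is a minimal Python sketch and not code from this study: the function name `chao1` and the example counts are hypothetical, and the fallback used when there are no doubletons is the commonly used bias-corrected form, assumed here for illustration.

```python
from collections import Counter

def chao1(counts):
    """Classic Chao 1 lower-bound richness estimate from per-species counts."""
    counts = [c for c in counts if c > 0]   # ignore unobserved species
    s_obs = len(counts)                     # observed richness
    freq = Counter(counts)
    f1, f2 = freq[1], freq[2]               # singletons and doubletons
    if f2 > 0:
        return s_obs + f1 * f1 / (2 * f2)
    # Assumed fallback: bias-corrected form, defined when f2 == 0
    return s_obs + f1 * (f1 - 1) / 2

# Hypothetical sample: 8 observed species, 4 singletons, 2 doubletons
example_counts = [10, 5, 2, 2, 1, 1, 1, 1]
print(chao1(example_counts))
```

Because the estimate depends only on the two rarest abundance classes, it adds nothing to the observed richness when there are no singletons.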
Computing joint evenness-richness measures such as Hill numbers (Hill, 1973; Chao et al., 2014) is a widespread alternative approach to the diversity problem, and it is motivated by the difficulty of extrapolating richness from incomplete survey data (Chao et al., 2014). After all, the key Hill number called the Simpson index (Simpson, 1949) has no sampling bias if properly formulated (Hurlbert, 1971), and the classic Shannon index (Shannon, 1948) is also robust. However, the odds and expe models produce no sample size bias – even when their assumptions aren’t met (Figs. 4B, C). Thus, the undersampling problem that drives the Hill number and extrapolation literature is now moot.
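As a point of reference for the Hill number framework, the sketch below computes Hill numbers of arbitrary order from raw counts, together with an unbiased form of Simpson's index based on sampling without replacement. It is a hedged Python illustration only: the function names are hypothetical, and the unbiased formula shown, the sum of n_i(n_i - 1) divided by N(N - 1), is assumed to be the "properly formulated" version intended in the text.

```python
import math

def hill_number(counts, q):
    """Hill number of order q from per-species counts (plug-in proportions)."""
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    props = [c / total for c in counts]
    if q == 1:
        # Limit as q -> 1: exponential of Shannon entropy
        return math.exp(-sum(p * math.log(p) for p in props))
    return sum(p ** q for p in props) ** (1 / (1 - q))

def simpson_unbiased(counts):
    """Probability that two individuals drawn without replacement are conspecific."""
    total = sum(counts)
    return sum(c * (c - 1) for c in counts) / (total * (total - 1))

# Hypothetical, perfectly even sample of four species
even_counts = [5, 5, 5, 5]
print(hill_number(even_counts, 0))   # order 0 is observed richness
print(hill_number(even_counts, 2))   # order 2 is the inverse Simpson index
print(simpson_unbiased(even_counts))
```

For a perfectly even sample, every order of Hill number equals the observed richness; the orders diverge only as abundances become unequal.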
As for evenness, it does not exist as a separate concept in cases where any of the three one-parameter models hold: there is no role for distribution “shape” in these scenarios, so evenness is always fixed. Specifically, the x parameter of the log series, the µ parameter of the odds distribution, and the λ parameter of expe are all instead sampling intensity measures. The evenness concept is fundamental to ecology (Pielou, 1966; Hill, 1973; Tuomisto, 2012) and is routinely taught to undergraduate students as a core part of the discipline’s theory. Ironically, the existence of “evenness” as an attribute of real-world communities can now be called into question. In cases where evenness is a non-concept, the Hill number approach is even less relevant: differences amongst Hill numbers reduce to differences in how richness and evenness are balanced (Hill, 1973). The only Hill number entirely representing richness is richness itself.
The simulations undergirding the three models make completely conflicting assumptions about birth and death processes: the rates are either entirely invariant (log series), variable both taxonomically and temporally (odds), or variable taxonomically but not temporally (expe). The latter two distributions both follow from assuming that birth and death rates jointly track random uniform variates. The only difference is that expe holds the rates constant through time, so they result from the invariant ecological traits of individual species.
Testing the models with empirical data could be straightforward. For example, a study of tree community ecology could examine counts of seedlings or saplings sorted by species. The counts might follow a geometric distribution because this is assumed of birth rates in both of the new models. Just as interestingly, repeated censuses across several years could be used to show whether variation is temporally consistent: do species with high counts in a given year still have high counts in following years? If so, then the expe model should hold for counts of adult individuals – meaning that traits matter. If not, then the odds model might be sufficient.
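One simple way to frame the temporal-consistency question is as a rank correlation between per-species counts in two censuses. The sketch below is a hypothetical Python illustration, not a procedure specified in this paper: Spearman's rank correlation is an assumed choice of statistic, and the census counts are invented for the example.

```python
import math

def average_ranks(values):
    """Rank values from 1 upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks.

    Undefined (division by zero) if either input is entirely tied.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(vx * vy)

# Hypothetical seedling counts for the same six species in two census years
year_1 = [120, 45, 30, 12, 5, 1]
year_2 = [98, 60, 22, 15, 2, 3]
print(spearman_rho(year_1, year_2))
```

Under this framing, a consistently high correlation across successive years would point toward the expe scenario of fixed species-level rates, whereas correlations near zero would be more in line with the odds scenario.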
The published empirical data suggest that the expe dynamic is actually the most common when sampling is intense, richness is high, and a coherent, non-saturated pattern is present (Fig. 2A). The database is so highly eclectic that this is unlikely to reflect a bias in the primary literature or sampling of that literature – trees (Fig. 2B), birds (Fig. 2C), and certain insect groups (Fig. 2D) have little in common, but all of them often drift into the expe range of the diagram. There is some additional interesting variation across ecological categories, but either expe or odds is often the strongest model. This fact is not very surprising considering that their assumptions are biologically basic: there is every reason to think that reproductive rates do vary amongst species in the same communities, regardless of whether that variation is consistent through time.
By contrast, the log series is usually not the best model for well-sampled terrestrial data (Fig. 2A; Antão et al., 2021). It is widely believed that most empirical SADs do have a “hollow curve” pattern, with long tails of rare species that loosely fit the log series (McGill et al., 2007). But such RADs very often fit expe.
The fact that the log series isn’t dominant has very broad implications. The description of the model by Fisher et al. (1943) is a classic in the field, and the deeply related neutral model of biodiversity (Hubbell, 2001) has been profoundly influential (Rosindell et al., 2011). Ironically, a large body of literature has been devoted to altering Hubbell’s original model by adding yet more biological assumptions: variation through dispersal limitation and habitat preference (Zillio & Condit, 2007), conspecific frequency dependence (Jabot & Chave, 2011), and much more (e.g., Al Hammal et al., 2015). The neutral model was complex from the start, explicitly assuming a major role for varying immigration rates and steady speciation rates in addition to assuming complete ecological equivalence among species. Its variants are even more complex, and elaborate models are unlikely to have fidelity to the world (Finocchiaro, 2021).
The odds and expe models succeed in capturing large differences among communities that one might sensibly expect to exist, but they sidestep all the complexity. It is literally not possible for simpler models to exist. As a result, ecologists may now have the tools to discern biologically important patterns by contrasting highly testable and distinct theories of community assembly – and to finally estimate species richness in a robust and justifiable manner.