Saturated models
Intuitively, certain SAD shapes might be so unusual that shoehorning
them into any of the models under consideration would be uninformative.
A good test is to show whether a distribution generated directly from
the raw data fits better. By this I mean what might be expected to be
found in a new data set highly resembling the one under consideration.
Specifically, I assume that the individual counts are random outcomes of
a geometric sampling process. The maximum likelihood estimate of the
governing parameter of the geometric series p is just 1/(n+ 1) where n is an individual count. I propose finding pfor each n , computing the PMF of the geometric series based on
each p , and averaging the PMFs that result. A smooth PMF lacking
gaps results. It is necessary to assume there are no zero counts for the
calculation to replicate observed richness. Thus, each geometric series
PMF needs to be rescaled before summation, specifically by multipling
the entire range of probabilities by n /(n + 1).
The alternative would be to assume a Poisson process (Bulmer, 1974). But
Poisson variation around high integers is small, making it difficult to
fill the gaps at the high end of a predicted SAD shape. Furthermore, a
direct comparison of the two possible saturated models, which I lack
space to illustrate, shows that the one based on the geometric series
yields better log likelihoods (LLs) for the current empirical data set.
Assessing LLs here is non-trivial because any saturated model overfits
the data by preserving too much fine-scale detail. A good solution is to
randomly vary the species list across many trials. During each one, the
counts are sampled with replacement (i.e., bootstrapped), the averaged
PMF is recomputed, and the LL is found. The LLs are then averaged across
trials. Consistent with the logic of all bootstrapping methods, this
procedure renders the PMF more of a prediction of future data then a
description of existing data, which is appropriate.