INTRODUCTION
Taxonomy has been the focus of debate since the nineteenth century, and even recently the recognition of taxonomic research has been a subject of discussion (Packer et al., 2018, Zeppelini et al., 2021). The global biodiversity crisis exposes the urgent need for investment in taxonomy to reveal the largely unknown species diversity. Taking Collembola as an example, only about 20% of the estimated diversity is known (Hopkin 1997) and between 100 and 120 new species are described each year; at this rate, it would take taxonomists more than 400 years to uncover and describe all the unknown species diversity (Potapov et al., 2020). To understand the diversification processes in Collembola, we need to increase the rate of species description. This is a matter of concern in every area of entomology and, to some extent, in zoology as a whole.
Collembola Lubbock, 1870 are minute wingless arthropods, basal hexapods found in every terrestrial habitat on the planet, including soil, leaf litter, tree canopies and caves (Bellinger et al., 1996-2023, Hopkin 1997). There are about 9000 described species, and their diversity is greatly underestimated and poorly known (Bellinger et al., 1996-2023, Hopkin 1997). They play an important role in the food web and the global metabolism (Bardgett & van der Putten 2014, Filser et al., 2016, Potapov et al., 2023, Rusek 1998).
As in many other taxonomic groups of meso- and microfauna, Collembola taxonomy is largely based on morphological analysis, that is, on observing and describing discrete variations in diagnostic characters. The most abundant morphological source of information for species definition in Collembola is the number, distribution, and shape of the cuticular chaetae, known as chaetotaxy. The current morphological approaches for inferring homology, the chaetotaxic systems for chaetal identification, often leave room for great subjectivity, depending on what is visible under an optical microscope, and different chaetotaxy systems are often hardly comparable (Betsch 1997, Betsch & Waller 1994, Bretfeld 1990, 1999, Potapov et al., 2020). The challenges and perspectives for Collembola taxonomy have been discussed in detail, and the need for an integrative taxonomy and for international efforts to direct financial support and expertise recognition to face the global biodiversity crisis has also been the focus of debate (Potapov et al., 2020, Zeppelini et al., 2021).
Recent technologies such as high-resolution imaging, molecular sequencing and machine learning will contribute greatly to taxonomic techniques that can improve the recognition of new and known taxa (Potapov et al., 2020). Integrative taxonomy, combining morphological and molecular data to define species limits, is likely to become a trend for most taxonomic groups, not only Collembola.
There is, however, a particular aspect of Collembola (and of nearly every taxon of the microfauna) that, in many if not most cases, affects the viability of including molecular sequences in new species descriptions. It is essentially a logistical problem, but often there is no alternative. Almost all new species are discovered under the light microscope, which means that the specimen has already been mounted on a slide after being cleared with one of several chemical treatments, which destroy the tissues and, consequently, the genetic material.
It is only after taxonomic identification that a species is recognized as new to science or undescribed. More often than not, the material analyzed is a limited set of specimens, and no material remains available for molecular analysis after the taxonomic identification and morphological study. Even assuming that molecular analysis facilities are available, the biological specimens needed for molecular sequencing may often become available only in the future, after the species has been described. Even when Scanning Electron Microscopy (SEM) is possible, it is hard, depending on the structure, to obtain images of all diagnostic features, and light microscopy may be needed as well. Nevertheless, high-resolution imaging and molecular data are powerful tools and may be indispensable for accurate taxonomic research and species delimitation. Therefore, morphological descriptions must be dynamic, open to easy amendment and to the insertion of additional data. Furthermore, they must be presented in an interchangeable language, to allow the information to flow across different disciplines.
Among all methods applied to the study of the external morphology of Collembola, chaetotaxy is certainly the most complex and extensively detailed (Betsch & Waller 1994, Cassagnau 1974, Deharveng 1983, Fjellberg 1999, Jordana & Baquero 2005, Nayrolles 1988, 1990a, 1990b, Potapov 2001, Szeptycki 1972, 1979, Yosii 1960). Many chaetae and groups of chaetae vary in position and shape in ways that allow a great many homology inferences. However, the most advanced approaches are also very complex, which makes interpretation difficult and increases ambiguity. These aspects restrict in-depth taxonomic research to small groups of experts and hinder comparative studies, even among different orders of Collembola. In addition, traditional descriptive texts with morphological and chaetotaxic information are difficult to integrate with machine learning and other new computational tools, which could greatly speed up phylogenetic analysis, big data comparison, biogeography, and their various applications (Potapov et al., 2020).
Despite all the advances in technological instruments and methods, taxonomic descriptions are still written in essentially the same format as two centuries ago, in a hermetic language that makes the texts nearly incomprehensible to non-experts. This is often a greater barrier to communication among different areas of science than access to high-tech equipment and analytical facilities.
A coded and illustrated description of new species that can be easily imported, transformed, amended, corrected, or expanded is proposed here as an alternative to the traditional descriptive taxonomic method.
The strength of the coded description is that new characters, whether morphological, molecular, or ecological, can be easily added to the list, improving the descriptive matrix as new information is produced. These matrices can be uploaded to public libraries, kept up to date with all available information about the species, and linked to databases such as GBIF and ZooBank and to electronic taxonomic catalogs available in different parts of the world, e.g., fauna.jbrj.gov.br/fauna/listaBrasil (Zeppelini et al., 2023) and www.collembola.org (Bellinger et al., 1996-2023).
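To make the idea of an amendable coded description more concrete, the following minimal Python sketch shows one way such a record could be stored, extended with a new character, and combined into a taxon-by-character matrix. It is only an illustration under stated assumptions, not the coding scheme proposed here: the class `CodedDescription`, the function `to_matrix`, the taxon names, and the character names (`antenna_segments`, `eyes_per_side`, `coi_sequence_available`) and their states are all hypothetical placeholders.

```python
import json
from dataclasses import dataclass, field


@dataclass
class CodedDescription:
    """A species description stored as character -> coded state pairs."""
    taxon: str
    characters: dict = field(default_factory=dict)

    def amend(self, character, state, source):
        """Add a new character or overwrite an existing one, recording its source."""
        self.characters[character] = {"state": state, "source": source}

    def to_json(self):
        """Serialize the description so it can be shared or imported elsewhere."""
        payload = {"taxon": self.taxon, "characters": self.characters}
        return json.dumps(payload, sort_keys=True, indent=2)


def to_matrix(descriptions):
    """Build a taxon-by-character matrix; characters not yet coded become '?'."""
    columns = sorted({c for d in descriptions for c in d.characters})
    rows = {
        d.taxon: [
            str(d.characters[c]["state"]) if c in d.characters else "?"
            for c in columns
        ]
        for d in descriptions
    }
    return columns, rows


# Hypothetical initial descriptions based on light microscopy only.
sp_a = CodedDescription("Hypothetical species A")
sp_a.amend("antenna_segments", 4, "light microscopy")
sp_a.amend("eyes_per_side", 8, "light microscopy")

sp_b = CodedDescription("Hypothetical species B")
sp_b.amend("antenna_segments", 4, "light microscopy")
sp_b.amend("eyes_per_side", 0, "light microscopy")

# Later amendment: molecular data become available for species A only.
sp_a.amend("coi_sequence_available", 1, "molecular")

columns, rows = to_matrix([sp_a, sp_b])
```

In this sketch, amending one description with a new character does not invalidate the others: the matrix simply gains a column, and taxa that have not yet been coded for that character are marked as missing data (`?`). The JSON output is one possible interchange form; any structured, machine-readable format would serve the same purpose.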