Efficacy Rates Across Contexts

When assessing the efficacy of CDD, one must be consistent in which measures are considered. However, it can be unclear what a study is measuring, and terms like ‘detection rates’ may be used without stating what they quantify regarding the search and dog performance . recommend sensitivity (i.e., proportion of target samples found out of total available), also known as ‘accuracy’, and precision (i.e., proportion of alerts that are true positives), also known as ‘reliability’ or ‘positive predictive value’, as measures to be used for evaluating CDD performance. Sensitivity can be measured during training and testing and then used to help predict the probability of detection during operational searches; sensitivity in the field is difficult to ascertain because it requires an estimate of the total number of targets in an area, often obtained using techniques with high margins of error like playback . Precision aids in determining the ability of the CDD to distinguish and discriminate the target scent from other odours. propose measuring sensitivity and ‘specificity’ (i.e., proportion of non-target samples that are correctly not alerted on) in tandem as key to scent detection work. However, they also acknowledge that specificity is often challenging to measure accurately due to the limitless number of distractor scents that may be available during field trials or operational searches, as well as the difficulty of ascertaining that the target scent is completely absent in a natural environment. As such, for this review, sensitivity and precision will be the measures of focus (see Table 1).
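The measures defined above can be made concrete with a short calculation. The following is a minimal sketch, assuming a trial has been tallied into counts of true positives, false positives, false negatives, and true negatives; the class name `TrialCounts`, the function names, and the example counts are all hypothetical and are not drawn from any study in this review. Specificity is computed in its standard form (true negatives out of all non-target samples) and is only meaningful where the number of non-target samples is actually known, which, as noted above, is rarely the case in the field.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrialCounts:
    """Hypothetical tally from a controlled trial (illustrative names only)."""
    true_positives: int       # alerts on a target that was present
    false_positives: int      # alerts where no target was present
    false_negatives: int      # targets present but not alerted on
    true_negatives: int = 0   # non-target samples correctly ignored


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    # Return None rather than dividing by zero when a measure is undefined.
    return numerator / denominator if denominator else None


def sensitivity(c: TrialCounts) -> Optional[float]:
    # Proportion of available targets that were found.
    return _ratio(c.true_positives, c.true_positives + c.false_negatives)


def precision(c: TrialCounts) -> Optional[float]:
    # Proportion of alerts that were true positives.
    return _ratio(c.true_positives, c.true_positives + c.false_positives)


def specificity(c: TrialCounts) -> Optional[float]:
    # Proportion of non-target samples that were correctly not alerted on.
    return _ratio(c.true_negatives, c.true_negatives + c.false_positives)


# Made-up counts: 24 targets laid out, 18 found, 2 false alerts, 40 distractors.
trial = TrialCounts(true_positives=18, false_positives=2,
                    false_negatives=6, true_negatives=38)

print(f"sensitivity = {sensitivity(trial):.2f}")  # 18 / 24
print(f"precision   = {precision(trial):.2f}")    # 18 / 20
print(f"specificity = {specificity(trial):.2f}")  # 38 / 40
```

Returning `None` when a denominator is zero reflects the reporting problem discussed in this section: a study that records no alerts, or that cannot count missed targets, simply has no defined value for the corresponding measure.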
In controlled training and testing trials in the CDD literature, the ability to find present targets accurately appears to vary greatly. For insects like beetles, bumble bees, and stoneflies, reported sensitivity has ranged from 55% to 100% with the use of targets like nest material, infested wood, or larvae . For plant species, rates were high with 81% to 100% sensitivity and 85% to 100% precision . Work with reptiles and amphibians reported rates of between 62% and 100% for sensitivity and between 50% and 100% for precision . CDD detecting carcasses of birds and bats on windfarms were reported to show sensitivities between 71% and 100% . Searching for bird species through scat, carcasses, or eggs has resulted in sensitivity rates between 66.7% and 100% with precision reported between 50% and 100% . However, in the study by , where dogs searched for rock ptarmigan (Lagopus muta) scat in lab conditions, three of the four dogs performed no better than chance, and none of the dogs or handlers had any previous experience of training for CDD work.
Mammals are by far the most common animals searched for in CDD work. For small mammals, sensitivity in training and testing contexts ranges from as low as 29% to as high as 100% with 44% to 100% precision . Regarding the 29% sensitivity in , this was recorded during a search for both natural bat roosts and suspended bags of guano, where guano was the original trained target. This could have caused the CDD to prefer the guano samples (i.e., on which they had been imprinted and trained ) over the bat roosts, which were novel. Indeed, sensitivity was 79% on guano bags alone, and 77% when only searching for bat roosts. The concepts of using different samples in training versus testing, generalisation of CDD to non-trained targets, and the effects of odour concentration on search performance are elaborated on further in the Training section.
For larger mammals, sensitivity rates during training and testing ranged from 23.8% for sheep remains to 93.3% for cheetah scat, with demonstrating 100% precision on cheetah scat. Although 23.8% sensitivity for CDD seems low, this was compared to 2.5% sensitivity of human searchers looking for the same carcasses . Even small improvements in detection can be hugely beneficial, as conservation projects often rely on methods with overall low detection rates . These examples suggest that there is little pattern by target species in success during training and testing, except for greater variation with mammal targets. This variation could be due to several factors: an inherent issue with the target odours, the quality of the studies, or simply the greater number of papers in that area (i.e., of the 67 studies reviewed: 44 on mammals, eight on reptiles, seven on birds with three of these overlapping with mammal studies, seven on invertebrates, three on plants, and one on amphibians (see Table 1)).
CDD efficacy should be evaluated during training and testing rather than waiting until operational searches to assess performance. However, many published studies simply investigate whether CDD can discriminate the target odour in a simple controlled trial and do not progress to testing the CDD in a field environment under operational conditions. Indeed, of the 67 studies examined in this review, 42% focus only on training and testing, 42% assess solely field performance, and 16% look at both. Of those studies that measure training and testing performance, 31% conduct their experiments in purely lab-based or controlled field conditions. Moreover, seemingly obvious statistics are sometimes stated, such as strong positive correlations between CDD alerts and true positives, which simply mean that the dog is doing what it has been trained to do: an unsurprising result given the decades of effective scent detection work performed by canines. Is there a question at present as to whether dogs can distinguish scents? Or should the literature have accepted this as a fact by now, given the longstanding history of scent detection dogs, and instead be moving towards assessing field work capabilities and cementing methodological practices?
Sensitivity and precision rates within field work vary similarly to those of training and testing. Although most operational windfarm mortality searches did not report precision, achieved rates of 100%, meaning all indications were true positives. Of studies assessing performance in the field, scat surveys of mammals are the most prevalent, with precision rates of between 28% and 100% . Low rates of precision may occur because it can be difficult for the handler to identify scats accurately by sight, which can lead them to accidentally reward indications on non-target scats (i.e., false positives), thereby reinforcing these indications and increasing their subsequent frequency. Additionally, the CDD may be alerting correctly while DNA barcoding and profiling of the scat gives the wrong result due to contamination from non-target species resulting from coprophagy, urination, and contact with saliva . Furthermore, both and used CDDs which had also been trained to indicate on other targets as part of previous work. Training CDD to detect multiple species with overlapping habitats can lead to indications on all targets. As such, most of the false positives in these studies were for the previously trained targets; although classified as false positives in the context of the study, these are not false positives in the context of the dog’s training.
Unfortunately, even when the ability of CDD is assessed using these set measures, not every study reports results clearly enough to make inferences. Sometimes, the number of targets found is the only measure reported due to budget constraints, an inability to verify true positives in the field (e.g., small mammals hiding or denning in inaccessible places ), or simply a lack of information given within the study itself . Although these results are still valuable for comparisons with other methods and for establishing species presence, without any information on error rates it cannot be determined whether the CDD is performing efficiently or whether the authors are merely reporting successes and ignoring mistakes.
Despite this, it is clear that across training, testing, and operational tasks, CDD generally perform well and much better than other methods. CDD outperformed humans, as well as other analytical tools, in 96% of the 24 studies analysed where comparisons were made (see Table 1, Columns 3 and 4). The exceptions were bumble bee nest detection, where performance was equivalent to humans, and rhinoceros scat searches, where the size of the scat means CDD do not provide a distinct advantage over the standard method . However, this review has established that sensitivity and precision rates still vary by a large margin across the literature regardless of target species and search context. So, the question remains: what drives the variation in CDD efficacy?