
Revista de Biología Tropical, ISSN: 2215-2075, Vol. 72: e59992, enero-diciembre 2024 (Publicado Ago. 13, 2024)
a species as an independent observation leads
to pseudoreplication (Hurlbert, 1984).
This issue is apparent in the case of H.
impatiens, where 22 of the samples appear to
share an identical sequence that matched
via BLASTn but not BOLD. The authors
counted this as 22 independent cases of
GenBank outperforming BOLD, whereas it
represents only one instance of differing
performance. Consequently, their approach has
inflated the values they report for
identification error rates (e.g., see Table 1 in
Chacón-Monge et al., 2024), obscuring the
real difference in performance between the two
platforms. Alternatively, the authors could have
taken advantage of these replicates to examine
intraspecific variation, as has been done in prior
studies (e.g., Layton et al., 2016). Contrasting
intraspecific and interspecific variation would
have yielded further insights into the
performance of COI barcodes for species delimitation
in Central American echinoderms.
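The effect of pseudoreplication on such counts can be illustrated with a minimal sketch. It assumes a hypothetical record layout (`sample_id`, `species`, `sequence`, `blast_match`, `bold_match`) and a placeholder sequence string; neither is taken from the original dataset. Identical sequences from the same species are collapsed into a single observation before tallying cases in which BLASTn returned a match and BOLD did not.

```python
def count_discordant(records, collapse=True):
    """Count cases where BLASTn returned a match but BOLD did not.

    If collapse is True, records sharing the same species label and the same
    sequence are treated as one observation (assumption: identical sequences
    yield identical platform outcomes).
    """
    seen = set()
    count = 0
    for rec in records:
        key = (rec["species"], rec["sequence"]) if collapse else rec["sample_id"]
        if key in seen:
            continue  # replicate of an observation already counted
        seen.add(key)
        if rec["blast_match"] and not rec["bold_match"]:
            count += 1
    return count


# Hypothetical data: 22 samples of one species sharing a placeholder sequence.
records = [
    {
        "sample_id": f"S{i:02d}",
        "species": "H. impatiens",
        "sequence": "ACGTACGTAC",  # placeholder, not a real barcode
        "blast_match": True,
        "bold_match": False,
    }
    for i in range(22)
]

naive_count = count_discordant(records, collapse=False)
collapsed_count = count_discordant(records, collapse=True)
print(naive_count, collapsed_count)
```

Under these assumptions, the naive tally treats every replicate as a separate case, whereas the collapsed tally counts the shared sequence once.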

Sampling Errors: The authors’ sweeping
interpretation of mismatched species identifications
as failures of the identification platforms
is flawed, as there are alternative, more
parsimonious explanations in several
cases. For example, sequence BMAR368-19 was
supposedly derived from an easily recognizable
sea star, Nidorellia armata (a member of
the order Valvatida), but had > 97 % sequence
similarity to records from Toxopneustes spp.
(sea urchins from the order Camarodonta) in
both GenBank and BOLD, which the authors
scored as an identification error. Conversely,
sequence BMAR369-19 was meant to represent
T. roseus, but was identified by the BOLD
Identification Engine as N. armata, which was
also interpreted as a misidentification. Neither
of these interpretations is likely to be correct;
rather, the most parsimonious explanation is that the
two samples were swapped during sampling or
subsampling, and that this field error was later
mistakenly attributed to the two platforms. We
found 13 such instances, in which a sequence
shared a high degree of similarity to an unrelated
species (i.e., different genus, order, or
class) that was included in the sampling effort,
suggesting sample mix-ups or contamination
(see SMT1). In another 17 instances, we noted
probable contamination or misidentification,
either because identifications were mismatched
at the rank of class or higher, or because the
query sequence failed to match an available
congener sequence in one or the other database.
We additionally noted eight instances of
possible mix-ups or contamination between
samples of related species of Holothuria, making
it difficult to determine whether the resulting
species identifications represented true
errors or false negatives. This reinforces the
importance of interpreting results carefully,
for example by considering the sequencing results
for each species holistically.
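A reciprocal mismatch of the kind described for BMAR368-19 and BMAR369-19 can be screened for programmatically. The sketch below is illustrative only: the record layout (`sample_id`, `label`, `top_hit`), the genus-level comparison, and the third sample (`EXAMPLE-01`) are assumptions introduced here, not part of the original analysis.

```python
def find_reciprocal_swaps(results):
    """Return pairs of sample IDs whose field labels and top-hit taxa mirror each other.

    A pair is flagged when sample A's label equals sample B's top hit and
    vice versa, which is the pattern expected from a sample swap.
    """
    swaps = []
    for i, a in enumerate(results):
        for b in results[i + 1:]:
            if (
                a["label"] != a["top_hit"]
                and a["label"] == b["top_hit"]
                and b["label"] == a["top_hit"]
            ):
                swaps.append((a["sample_id"], b["sample_id"]))
    return swaps


# Genus-level labels and top hits; the last record is a hypothetical concordant sample.
results = [
    {"sample_id": "BMAR368-19", "label": "Nidorellia", "top_hit": "Toxopneustes"},
    {"sample_id": "BMAR369-19", "label": "Toxopneustes", "top_hit": "Nidorellia"},
    {"sample_id": "EXAMPLE-01", "label": "Holothuria", "top_hit": "Holothuria"},
]

flagged = find_reciprocal_swaps(results)
print(flagged)
```

Flagged pairs are candidates for manual review, not confirmed swaps; contamination or genuine misidentification could produce similar patterns.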

Lack of Standardization Between Platforms:
There are fundamental differences in
the operation of the two molecular identification
platforms used by the authors, which were
not addressed in their study. The BLASTn
tool is not intended to provide a species-level
identification, but rather to align to the
most similar sequence(s) in the database. Thus,
when highly similar sequences are missing
from the database, matches can still be returned
from distantly related taxa. In contrast, BOLD
employs divergence thresholds to avoid returning
distantly related taxa as a “species match”
and will abort the identification algorithm if a
sequence match exceeding 97 % is not found.
Thus, quantifying the cases of “no match” does
not provide a meaningful comparison of the
two platforms because only one of them is likely
to yield this result.
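The asymmetry between the two reporting rules can be made concrete with a small sketch. It does not reproduce the actual BLASTn or BOLD algorithms; it only contrasts a rule that always returns the most similar hit with a rule that returns nothing unless identity exceeds 97 %. The function names, the hit layout, and the taxa and identity values are hypothetical.

```python
SPECIES_THRESHOLD = 97.0  # percent identity, following the threshold stated in the text


def best_hit_no_threshold(hits):
    """Return the most similar hit however distant it is (None only if there are no hits)."""
    return max(hits, key=lambda h: h["identity"]) if hits else None


def best_hit_with_threshold(hits, threshold=SPECIES_THRESHOLD):
    """Return the most similar hit only if its identity exceeds the threshold."""
    best = best_hit_no_threshold(hits)
    if best is None or best["identity"] <= threshold:
        return None  # reported as "no match"
    return best


# Hypothetical hits for a query whose close relatives are absent from the database.
distant_hits = [
    {"taxon": "Taxon A", "identity": 88.5},
    {"taxon": "Taxon B", "identity": 84.0},
]

unfiltered = best_hit_no_threshold(distant_hits)
filtered = best_hit_with_threshold(distant_hits)
print(unfiltered, filtered)
```

With the same underlying hits, the first rule reports a distant taxon and the second reports no match, so a tally of "no match" outcomes mostly reflects the rule applied and says little about database content.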

This distinction is especially important
when considering genus-level identifications
because the BOLD Identification Engine is not
designed for this purpose. To illustrate, consider
a case where the best available matches for
a query sequence are at most 96.9 % identical,
and these sequences are present in both databases:
these would appear in BLASTn results
and could readily be interpreted as genus-level
matches, whereas BOLD would simply return
“no match” (i.e., no species match). We note