Margaux Perhirin1*, Hannah
Gossner2, Jessica Godfrey2, Rodney
Johnson2, Leocadio Blanco-Bercial2,
Sakina-Dorothée Ayata1
1 Sorbonne Université, UMR 7159 CNRS-IRD-MNHN,
LOCEAN-IPSL, Paris, France
2 Bermuda Institute of Ocean Sciences, Arizona State
University, St. Georges, Bermuda
* corresponding author:
Abstract (250 words)
Mesozooplankton is a very diverse group of small animals ranging in size
from 0.2 to 20 mm not able to swim against ocean currents. It is a key
component of pelagic ecosystems through its roles in the trophic
networks and the biological carbon pump. Traditionally studied through
microscopes, recent methods have been however developed to rapidly
acquire large amounts of data (morphological, molecular) at the
individual scale, making it possible to study mesozooplankton using a
trait-based approach. Here, combining quantitative imaging with
metabarcoding time-series data obtained in the Sargasso Sea at the
Bermuda Atlantic Time-series Study (BATS) site, we showed that
organisms’ transparency might be an important trait to also consider
regarding mesozooplankton impact on carbon export, contrary to the
common assumption that just size is the master trait directing most
mesozooplankton-linked processes. Three distinct communities were
defined based on taxonomic composition, and succeeded one another
throughout the study period, with changing levels of transparency among
the community. A co-occurrences’ network was built from metabarcoding
data revealing six groups of taxa. These were related to changes in the
functioning of the ecosystem and/or in the community’s morphology. The
importance of Diel Vertical Migration at BATS was confirmed by the
existence of a group made of taxa known to be strong migrators. Finally,
we assessed if metabarcoding can provide a quantitative approach to
biomass and/or abundance of certain taxa. Knowing more about
mesozooplankton diversity and its impact on ecosystem functioning would
allow to better represent them in biogeochemical models.
Key words (4 to 6)
Transparency, carbon export, metabarcoding, quantitative imaging,
trait-based approaches, zooplankton
Introduction
Mesozooplankton is a group comprised of small metazoans of 0.2 to 20 mm
in size drifting with ocean currents. They constitute a critical link
between primary producers and higher trophic levels and play a major
role in the biological carbon pump. They contribute directly to
Particulate Organic Carbon (POC) export through the production of
sinking faecal pellets, moults and carcasses (Lomas et al., 2010;
Steinberg & Landry, 2017). Species in this group can also repackage
particles into larger and faster-sinking faecal pellets or can fragment
sinking POC, particularly by sloppy feeding, into smaller and
slower-sinking particles. Additionally, Diel Vertical Migrations (DVM)
realised by some mesozooplankton species between the surface and the
deep ocean increase the export of dissolved carbon and the
remineralisation of nutrients at depth (Kelly et al., 2019; Tarrant et
al., 2021). Hence, mesozooplankton is a highly diverse group in terms of
species, ecological strategies, and size, among others (Litchman et al.,
2013). Its biomass and community composition can impact both the
productivity and the biogeochemistry of the environment (Schnetzer &
Steinberg, 2002). Better understanding of how variations in
mesozooplankton diversity impact ecosystem functionalities will be a
fundamental step to improve the general knowledge of ocean ecosystems.
This would also contribute to improving the representation of
mesozooplankton diversity in biogeochemical models since this key
component of the marine ecosystem is usually represented by only one
variable (Quéré et al., 2005).
The Sargasso Sea has been identified as one of the highest zooplanktonic
diversity locations in the world (Rombouts et al., 2010). The Bermuda
Atlantic Time-series Study (BATS, Lomas et al., 2013) site is located in
the vicinity of Bermuda Islands (31°40‘N, 64°10‘W) in the North Atlantic
subtropical oligotrophic gyre and has been studied since 1988. Long-term
sampling and research have shown that environmental variables and
zooplanktonic biomass follow a classical seasonal cycle. The
phytoplankton bloom, stimulated by optimal nutrients and light
conditions, is followed by a peak in zooplankton biomass, usually in
March or April. In summer, thermal stratification is strong and leads to
nutrient depleted upper waters and thus, a decrease in zooplankton
biomass (Deevey, 1971; Madin, 2001). Secondary peaks can occur in late
summer or early fall, when the mixed layer deepens again. Zooplankton
biomass is minimal during winter (Ivory et al., 2019) when nutrient flux
fertilises the surface layer thanks to deep mixing and mode water
formation (Steinberg et al., 2001). Distinct taxa assemblages, based
exclusively on the community composition, are associated with each of
the four seasonal hydrographic periods (Blanco-Bercial, 2020).
Historically, zooplankton have been investigated by collecting samples
using plankton nets and identifying specimens at the species (or genus)
level under a microscope. However, this technique of visual
identification is time consuming and complex, partly due to the presence
of cryptic species, species with multiple life stages or with distinct
sexual dimorphism (Huo et al., 2020; Rey et al., 2020). Moreover,
taxonomic identification can be strongly biased by expert’s
specialisation (Harvey et al., 2017) and more generally, biased towards
the largest and most known plankton species. Molecular studies have
revealed that plankton knowledge is biased towards sampled or cultivable
species, representing less than 30% of the total estimated plankton
diversity, leading to a ”Mare Incognitum” (Chust et al., 2017).
Next-generation sequencing technologies, such as DNA metabarcoding,
could therefore be an alternative to overcome these microscopy-based
monitoring limitations (Abad et al., 2016). DNA metabarcoding allows for
the identification of taxa by identifying specific genes (Hebert et al.,
2003) from a bulk sample containing DNA from dozens of taxa. This
inexpensive and rapid method can detect rare and cryptic species (Huo et
al., 2020), and be used to explore the diversity and structure of marine
eukaryotic communities (Chen et al., 2021) as well as their ecosystem
dynamics (Matthews et al., 2021). However, it is not free of several
technical biases: barcode and primers selection, DNA amplification’s
efficiency, bioinformatics pipelines’ parameters choice, or reference
database quality are known sources of biases (Harvey et al., 2017). This
method produces compositional data, meaning that values inferred from
counts of reads for any taxon will be influenced by the ones from other
taxa (MacNeil et al., 2022; Brisbin et al., 2020). Hence, the
quantitative property of these data is still controversial and hard to
evaluate (Ershova et al., 2021), necessitating verification by imaging
methods (Harvey et al., 2017).
Over the past few decades, imaging systems (e.g., Underwater Vision
Profiler, Picheral et al., 2010; In Situ Ichthyoplankton Imaging System,
Cowen and Guigand, 2008 or Zooscan, Gorsky et al., 2010) have been
developed, along with machine learning algorithms, to acquire and
analyze large numbers of individual organisms’ images in a quantitative
way (Irisson et al., 2022). These systems are less or non (if in
situ ) invasive compared to classical sampling methods (Martini et al.,
2021), allowing for relatively rapid acquisition of a large quantity of
data, and offering new opportunities to study some previously neglected
organisms (Biard et al., 2016).
These images give consistent rich morphological information of
individual organisms, making it possible to use a trait-based approach
and thus a more ecosystem-focused point of view than a species or
taxonomic one (Martini et al., 2021). Trait-based approaches refer to
functional traits, i.e., any individual’s characteristic
(morphological, physiological) impacting its fitness (Violle et al.,
2007), and are based on the assumption that the fitness of an individual
is based on its success regarding feeding, reproduction and survival
which would depend on a few key traits (Brun et al., 2017). Size is
easily measured, and due to its influence on many physiological and
ecological aspects of an organism’s life, it has been defined, notably
in zooplankton, as a “master trait” (Barton et al., 2013; Litchman et
al., 2013). For example, larger organisms will produce larger faecal
pellets (Uye & Kaname, 1994) and/or have a stronger migration
behaviour, hence playing an important role in export flux out of the
euphotic zone (Stamieszkin et al., 2015; Steinberg et al., 2001).
Organisms’ transparency, although an important zooplankton morphological
characteristic (Orenstein et al., 2022), is however understudied
(Johnsen & Widder, 1998). Quantifying transparency might give
information about the organism’s ecology (reproduction with gonads or
eggs, feeding with guts content, Orenstein et al., 2022). Transparency
and colour can also be related to physiological functions (Vilgrain et
al., 2022), e.g., carotenoids effects against oxidative stress
(Brüsin et al., 2016), or possibly the utilization of pigmentation and
DVM in order to escape visual predators (Hays et al., 1994).
While both metabarcoding and imaging techniques can provide information
on the temporal and spatial distribution of zooplankton communities and
their environmental dynamics (Vilgrain et al., 2021), imaging data is
also able to quantify taxa’s abundance and biomass (e.g.,Monferrer et al., 2022). Hence, it would be ideal to combine high
throughput image acquisition methods with metabarcoding to improve
zooplanktonic community studies (Bucklin et al., 2019). Each of these
approaches has uncertainties (Monferrer et al., 2022) but when combined,
many resolution issues are resolved. For example, metabarcoding fills in
some of the limited size spectrum resolution of imaging techniques
(MacNeil et al., 2022) and allows a joint study of zooplankton
communities taxonomic and functional aspects. Previous studies have
tried to assess the quantitative property of metabarcoding by comparing
counts of sequence reads to (relative) biomass (e.g., Djurhuus et
al., 2018; Durkin et al., 2022 on particles) or to (relative) abundance
(e.g., Ibarbalz et al., 2019; Rey et al., 2020) for various
zooplankton taxonomic groups but results from each approach seldom agree
(Harvey et al., 2017), requiring further studies.
This study, therefore, seeks to use the complimentary methods of
metabarcoding and high throughput imaging in order to link variations in
mesozooplankton community composition throughout the year to
morphological changes in the highly diverse ecosystem of the Sargasso
Sea. Additionally, we evaluate how modifications in community
composition will impact the ecosystem’s export intensity. Finally,
utilising the potential of our “twin” morphological and molecular
datasets, we aim to test the correlations between abundance and biomass,
and relative read counts, for wide taxonomic categories.
Material & Methods
Environmental
data acquisition
Data were collected at BATS site (31°40′N; 64°10′W) on a monthly basis,
from March 2016 to May 2017, except in March 2017 due to rough weather.
Environmental samples were taken during the day, except in February-2017
when they were taken at night. Environmental data were collected
vertically using a SeaBird 911 CTD rosette equipped with temperature,
conductivity, fluorescence, oxygen and photosynthetic active radiation
(PAR) sensors. Density (σθ) was calculated from
temperature and conductivity. Depth of the Deep Chlorophyll Maximum
(DCM, in metre) was defined as the maximum of the fluorescence profile.
Mixed Layer Depth (MLD, in metre) was calculated as the depth for which
σθ is larger or equal to σθsurface +
0.125 kg m-3 (Levitus, 1982). Sediment traps were
allowed to drift for 72 hours at 200 m (Steinberg et al., 2001). The
total organic carbon (TOC) measured at 200 m (in mg
m-2 d-1) was used as a proxy of
carbon export to the mesopelagic layer. TOC data were averaged from
three replicates and blank corrected, following BATS protocol (Steinberg
et al., 2001).
Zooplankton data acquisition
Zooplankton sampling was conducted by employing a 1 m² rectangular net
with 202 µm mesh, towed at a speed of about 1 knot, paying out 250 m of
wire at 15 m per minute both ways (maximum depth sampled: 168.68 m ±
37.60 m, Supp. Table I). Three replicate day and night tows were made
for each cruise, except in October 2016 for which the day sample for
molecular analysis was not obtained due to rough weather. Samples from
two tows were preserved in formaldehyde at 4% for Zooscan analyses (see
following sections). Zooplankton from the last net was fixed in 95%
undenatured ethanol then stored at -20°C until returning to BIOS (1 to 3
days later) for molecular analyses. Ethanol was changed right after
arrival to BIOS.
Zooscan process and analyses
Zooscan process
A fraction of each half-split sample from two net replicates (obtained
with a Motoda splitter; Motoda, 1959) was scanned using the Zooscan
system (Gorsky et al., 2010). All individuals larger than 2 cm were
selected by eye and scanned separately. Remaining samples were sieved
through a 1000 µm sieve and at least 1500 particles from two size
classes (202-1000 µm and >1000 µm) were scanned.
Morphological variables (as in Vilgrain et al., 2021, Supp. Table II)
were calculated for each individual image of zooplankton organism using
the ZooProcess software (Gorsky et al., 2010). Individual images, their
metadata, and their morphological descriptors were stored in the EcoTaxa
web application ( Picheral et al., 2017); project ‘BATS_timeseries
[4236]’). Automatic taxonomic classification was done on these
images at a relatively high level using a random forest algorithm
(Irisson et al., 2022; 42 categories, Supp. Table III). Taxonomic
identification of each individual image was then validated by a human
expert. In total, the dataset gathered 133,389 images. Only the images
from 29 taxonomic categories (in bold in Supp. Table III) and with a
major axis length higher than 500 µm were studied, to focus on
mesozooplankton. Moreover, Actinopterygii images with a major axis
length longer than 2 cm were removed to consider only larvae that cannot
swim against the flow, leading to a total of 69,949 mesozooplankton
images. “Oikopleura” category was renamed Larvacea to take into
account all the Appendicularia’s orders but not be mistaken with the
genus name Appendicularia . For each image, a density factor \(d\)was computed to account for the water volume sampled as follows:
\(d\ =\ \frac{\text{Motoda\ fraction\ scanned}}{\text{Total\ volume\ sampled}}\).
Biovolume and dry weight
computations from images
Area, major and minor axes from Ecotaxa were converted from pixels to
millimetres (1 pixel = 0.0053 mm). The biovolume was computed for all
the 69,949 mesozooplankton images using the Equivalent Spherical
Diameter (ESD).
\begin{equation}
ESD=2\times\sqrt[]{\frac{\text{area}}{\pi}}\nonumber \\
\end{equation}\begin{equation}
Biovolume\ =\ \frac{4}{3}\times\pi\times({\frac{\text{ESD}}{2})}^{3}\nonumber \\
\end{equation}This computation was chosen to prevent overestimation of the
individual’s biovolume, due to unfolded copepods’ antennae for example
(as done in Monferrer et al., 2022). Then, each individual biovolume was
multiplied by its density factor \(d\) to account for the water volume
sampled, and a taxonomic-specific factor \(t\) to get the individual dry
weight from individual biovolume value. These taxonomic-specific values
were obtained and adapted from Maas et al. (2021) who defined linear
relationships between biomass and biovolume for various taxa present at
BATS (in bold, Supp. Table III).
\begin{equation}
Biomass\ =\ Biovolume\ \times\ d\ \times\ t\nonumber \\
\end{equation}Relative biomasses and abundances
For comparison with relative read counts from metabarcoding (see below),
normalised biomasses were transformed into relative biomasses for each
taxonomic category \(i\), going from 1 to \(a\ =\ \)29, and per sample\(j\), the sample rank going from 1 to \(b\ =\ \)27, as follows:
\begin{equation}
\text{Relative\ Biomass}_{\text{ta}x_{\text{i\ j}}}\ =\ \frac{\sum_{i\ =\ 1}^{i\ =\ a}{\text{Biomas}s_{\text{ta}x_{\text{i\ j}}}}}{\sum_{j\ =\ 1}^{j\ =\ b}{\sum_{i\ =\ 1}^{i\ =\ a}\text{Biomass}_{\text{ta}x_{\text{i\ j}}}}\ }\nonumber \\
\end{equation}For the same purpose, relative abundances for each taxonomic category
were computed per sample, using the same formula with\(Abundance\ =\#ind\ \times d\ \) instead of Biomass.
Data are presented in Supp. Table IV for relative biomasses, expressed
as a biomass per volume of water, normalised by the volume of water
sampled (mg m-3); and in Supp. Table V for relative
abundances (ind m-3).
Construction of a multivariate
morphological space from Zooscan images
To summarise the morphological characteristics of each organism in the
mesozooplankton community and identify the main criteria of
morphological variations, a morphological space was defined as in
Vilgrain et al. (2021) by performing a weighted PCA (Legendre &
Legendre, 2012) on a selection of 18 morphological variables (Supp.
Table II) for the 69,949 mesozooplankton images, weighted by their
density factor \(d\) to give more weight to the most abundant organisms.
The 18 morphological descriptors could be divided into four main
categories, describing the shape, the size, the transparency, and the
complexity of the organism itself, at the individual scale. The weighted
PCA then allows to define a reduced morphological space (i.e. , a
morphological space with reduced dimensions) in which each image is
located depending on its main morphological characteristics. By doing
so, we can identify a few significant axes that summarise the main
morphological traits that vary among images. 0.5% of the extreme
variables were transformed into NAs to avoid being influenced by
outlying points. They were all then normalised by the Yeo-Johnson
transformation to satisfy the conditions for PCA application. Axes were
considered significant only if their associated eigenvalue was greater
than the average of all the eigenvalues (Kaiser-Guttman criterion;
Cattell, 1966). The position of each mesozooplankton image in the PCA
space is thus a function of its morphology: this PCA space will then be
named morphospace. To ease the visualisation of the morphospace, example
images were mapped to their position in the morphological space as
follows: for each pair of principal components (i.e., PC1 vs PC2
or PC3 vs PC4) a 15-by-15 grid was defined and five images were randomly
chosen and superposed at each node of the grid, to represent a
morphotype for each node. Individuals, meaning each of the 69,949
images, can also be plotted with dots in the morphospace. Supplementary
variables were added to this PCA and represented in the morphological
space as follows. Carbon export at 200 metres was added as a
quantitative variable and represented by arrows that pointed towards the
morphospace area positively correlated to their variations. Pearson’s
correlation coefficients and their associated p-values were computed
based on the coordinates of each individual in the morphospace and the
carbon export value obtained during the matching sample. The three
molecular-based clusters (see below), the sample time (Day or Night) and
taxonomic categories were also added as qualitative supplementary
variables.
Linear relationship between size and transparency traits and carbon
export
Pearson’s correlation coefficients were computed between carbon export
(ln-transformed to meet the normality assumption), opacity, and size
variables. Each individual’s trait value was multiplied by its density
factor to account for the water volume sampled. Data were averaged by
sample dates (no Day/Night distinction) since there was only one carbon
export measurement per cruise. For significant correlations (p
< 0.05), linear relationships were represented in the graphs.
Metabarcoding process and analyses
DNA extraction and 18S amplification
Half of each ethanol-preserved sample (obtained with a Motoda splitter;
Motoda, 1959) was used for DNA extraction. The other half was kept as
archival material at BIOS. Since we focused the study on
mesozooplankton, macroplankton (> 2 cm) were removed from
the sample (mostly fish larvae). Samples were filtered through a 53 μm
mesh, and gently washed with milliQ water to remove as much ethanol as
possible. They were then transferred to a 50 mL falcon tube and
centrifuged at 3,500 g for 10 minutes to remove as much water and
ethanol as possible. The pellet is then weighed and transferred to a new
50 mL falcon tube and 15 or 25 ml SDS buffer (10 mM Tris-HCl, 100 mM
EDTA pH 8.0, 200 mM NaCl, 1% SDS) depending on the total mass of the
pellet. Samples were then vortexed to ensure rapid mixing of the
plankton with the buffer and thus degradation of the DNA. They were then
ground using a Fisher Scientific FSH125 homogenizer for 5 minutes, at
1/3 of maximum speed to avoid excessive DNA fragmentation and rapid
heating of the sample. After the grinding step, the homogenizer was
rinsed with 10 mL SDS buffer (or 15 mL depending on sample mass) to
minimise sample loss. Between samples, the homogenizer is washed and
sterilised. Proteinase K (Sigma-Aldrich; St. Louis, MO, USA) was added
to the buffer (0.2 mg mL-1 final concentration), with
5 mL sterilised stainless steel Shot, 3/32′′ Ellipse (Rio Grande, CA,
USA). Samples were then placed in an oven at 60°C for 4 h, vortexing at
maximum speed for 30 s every 30 min. Tubes were centrifuged at 3,500 g
for 15 min and three 400 μL replicates of supernatant were transferred
to 1.5 mL Eppendorf tubes. DNA was extracted using the E.Z.N.A. Mollusc
DNA Kit (Omega Bio-Tek, Norcross, GA, USA), with some modifications
explained in Blanco-Bercial (2020). Quality control of extracted DNA was
performed by fluorimetry using a NanoDropTM One instrument. Extracted
DNA was stored at -20°C until amplification.
Amplification of the V1V2 region of the 18S (about 350 bp) was done
following Fonseca et al. (2010). Samples were then sent to the Genomics
Research Center of Rochester University (NY, USA) for sequencing.
Sequencing was performed in an Illumina MiSeq using the MiSeq Reagent
Kit v3 (600-cycles; 2 × 300) V3 chemistry. To test for replicability
within the procedure, two samples were submitted using different indexes
in separate batches.
Bioinformatic pipeline
Demultiplexed samples were analysed in MOTHUR ver. 1.39.5. (Schloss et
al., 2009). The entire annotated pipeline is accessible at the following
link: . Unique sequences were aligned against a V1V2 region of the SILVA
138 database (Quast et al., 2013). Sequences were trimmed to the length
of the V1V2 region, and only complete ones (starting at the first base
and ending at the last base of the alignment) were kept in order to
avoid bias due to unfinished amplifications. After gap removal,
chimaeras were removed using UCHIME (Edgar et al., 2011) as implemented
in MOTHUR, and unique sequences, called thereafter Amplicon Sequence
Variant (ASV) were obtained after denoising using UNOISE2 (Edgar, 2016)
as implemented in MOTHUR (diffs = 1). ASVs taxonomy was retrieved from a
database developed from the complete SILVA 138 release database for SSU,
modified to define clade levels (kingdom, phylum class, order, family,
genus and species) to fixed columns in the taxonomy file (Bucklin et
al., 2021). Using this classification and to focus only on
mesozooplankton, only metazoans were kept for the downstream analyses
(44,441 ASVs removed).
Metabarcoding standardised analyses
ASVs belonging to Vertebrata taxa were eliminated, meaning that fish
ASVs were removed to not be affected by cells from fish larvae able to
swim against currents and thus not belonging to mesozooplankton (17,479
ASVs removed). Moreover, ASVs with less than 2 reads were removed. All
samples were standardised randomly to a minimum number of reads,i.e. , 92,150 reads. Percentages of relative abundance in the
entire sampling period were computed per ASV, as follows, with \(j\) the
sample rank going from 1 to \(b\ =\ \)27 and \(k\) the ASV rank going
from 1 to \(c=\ \)92,150.
\begin{equation}
{\%RA}_{\text{ASV}_{\text{TOTk}}}\ =\ \frac{\sum_{j\ =\ 1}^{j\ =\ b}\text{reads}_{\text{ASV}_{\text{jk}}}}{\sum_{j\ =\ 1}^{j\ =\ b}{\sum_{k\ =\ 1}^{k\ =\ c}\text{reads}_{\text{ASV}_{\text{jk}}}}\ }\times 100\nonumber \\
\end{equation}A minimum percentage of relative abundance threshold of 0.05% was then
applied for all the following analyses. Indeed, metabarcoding reads are
a viable proxy of taxon relative abundances for the major taxonomic
groups, but it should be interpreted cautiously for the less abundant
ones (Matthews et al., 2021). Hence, only 225 ASVs are considered in
this study out of 22,814 ASVs. Finally, relative abundances (in
percentage) estimated per ASV and per sample were computed as follows,
with \(j\) the sample rank going from 1 to \(b\ =\ \)27 and \(k\) the
ASV rank going from 1 to \(c=\ \)225.
\begin{equation}
{\%RA}_{\text{ASV}_{\text{jk}}}\ =\ \frac{\text{reads}_{\text{ASV}_{\text{jk}}}}{\sum_{j\ =\ 1}^{j\ =\ b}{\sum_{k\ =\ 1}^{k\ =\ c}\text{reads}_{\text{ASV}_{\text{jk}}}}\ }\times 100\nonumber \\
\end{equation}These were then Hellinger-transformed (square-root transformation on
relative abundances) to reduce the weight of very abundant ASVs
(Legendre & Gallagher, 2001).
ASVs relative counts of read per Zooscan categories and correlation with
abundance and biomass
For direct comparison with image analyses, 22,814 ASVs were grouped into
the taxonomic levels used in Zooscan/EcoTaxa systems. For each sample,
the relative number of reads per coarse taxonomic category (in bold,
Supp. Table III) was computed. Pearson’s correlation coefficients were
then computed between relative counts of read and relative normalised
abundances and biomasses from Zooscan data.
Succession of ASVs assemblies
throughout the sampling period
To explore the differences in mesozooplankton community throughout the
sampling period, a Principal COordinates Analysis (PCoA, Legendre and
Legendre, 2012) based on Bray-Curtis dissimilarity index (Bray &
Curtis, 1957) was performed. Sample dates were clustered by hierarchical
classification carried out on their coordinates along the first seven
axes of the PCoA to explain more than 70% of the variance with at least
4% in each axis (72.59% of variance explained), using a Euclidean
distance and a synoptic Ward linkage (Murtagh & Legendre, 2014). A
threshold (-0.55, Figure 3) was set to minimise the total within-cluster
variations, defining three meaningful groups of stations with
homogeneous values for the variables used. The significance of the PCoA
clustering pattern was tested using an ANalysis Of SIMilarity (ANOSIM,
1000 permutations, Clarke, 1993) and permutational multivariate analysis
of variance (ADONIS, 1000 permutations, Anderson, 2001) Holm corrected
(Holm, 1979).
Co-occurrences network and
“modules” identification
A network of co-occurring ASVs was built on the relative
Hellinger-transformed abundances. Nodes represented ASVs and edges were
the positive correlations between those, computed with Spearman’s
correlation coefficient (rho, Spearman, 1904). Cut-offs of rho and
p-values were respectively 0.6 and 0.01 to consider the most
statistically robust correlations, meaning only links between pairs of
ASVs that often appeared at the same time in the samples and reducing
the risks to consider by-chance co-occurrences (Barberán et al., 2012).
Modularity estimates the tendency of a network to subdivide in densely
connected modules (or clusters) (Bellisario et al., 2019). Networks with
high modularity have dense connections between the nodes within modules
but limited connections between nodes in different modules. If
modularity is higher than 0.40, then we consider the network can be
divided into modules, i.e. , groups of ASVs there are more densely
connected between each other than with the rest of the network (Newman,
2006). The Leading Eigenvector community detection algorithm was used
(Newman, 2006) to identify the modules. In a few words, the network was
splitted into two parts that would be kept if the separation leads to a
significant increase in the modularity, and each part would be divided
again into two parts at the next step of the algorithm. If the
separation did not lead to a significant increase in the modularity, the
network was splitted into two other parts, and the process started
again. The algorithm stopped when no further separation led to a
significant increase in the modularity.
Identification of taxa composing the modules
A more precise and trustable taxonomy of ASVs belonging to modules was
assigned via BLASTN from the NCBI website (, Altschul et al., 1990)
followed by Batch Entrez analyses to retrieve species identities from
Accession Hit numbers (). Then, obtained files were formatted in R.
Taxonomy was obtained from LifeWatch Belgium services (). Unique best
hits, meaning that an ASV matches with a unique sequence from the NCBI
database at 100% of similitude, were firstly chosen. When multiple
species had equal identity (and when one was not a clear
misidentification), the lowest common taxonomic rank was kept,e.g . if an ASV matches at 95% of similitude with two sequences
in NCBI database belonging to Candacia truncata andCandacia elongata , then it was characterised as Candaciasp. The complete taxonomy of the 225 most abundant ASVs analysed in this
study can be found in Supplementary Table VII.
For each module \(m\) (between 1 and \(e\ =\ \)6), their relative
abundances were computed per sample as follows, with \(j\) the sample
rank going from 1 to \(b\ =\ \)27.
\begin{equation}
{\%RA}_{\text{Mod}_{\text{jm}}}\ =\ \frac{\sum_{m\ =\ 1}^{m\ =\ e}\text{reads}_{\text{ASV}_{\text{jm}}}}{\sum_{j\ =\ 1}^{j\ =\ b}{\ \sum_{m\ =\ 1}^{m\ =\ e}\text{reads}_{\text{ASV}_{\text{jm}}}}}\times 100\nonumber \\
\end{equation}To test the Day/Night effect on the modules’ composition, a
Wilcoxon-Mann-Whitney test (Mann and Whitney, 1947) was conducted on the
percentages of relative abundance per sample, for all the modules.
Finally, a Kruskal-Wallis analysis (Kruskal and Wallis, 1952) helped to
test the impact of the three clusters obtained from the PCoA on the
modules’ percentages of relative abundance, followed by Dunn pairwise
test (Dunn, 1961) with Holm p-values correction for multiple comparisons
to identify which cluster affected the most the modules’ percentages of
relatives abundance. In both cases, non-parametric statistical tests
were realised because Shapiro (Shapiro and Wilk, 1965) and Levene’s
(Brown and Forsythe, 1974) tests showed that not all modules’
percentages of relative abundance were normally distributed with similar
variances among groups. Moreover, Pearson’s correlation coefficients
were computed between percentages of relative abundance per module and
per sample, and their coordinates along each morphological dimension per
sample to have information about the overall morphology of the module;
and carbon export per cruise to see if some modules were related to
higher or lower carbon export.
Numerical tools
All statistical analyses were conducted in the programming environment R
4.1.2 (R Core Team 2021). The package tidyverse (Wickham et al., 2019)
was used for data manipulations, lubridate to deal with temporal
variables (David James and Kurt Hornik, 2020), vegan (Oksanen et al.,
2009) and FactoMineR (Lê et al., 2008) for multivariate analysis,
especially the function dimdesc was used to compute correlations
between carbon export and morphospace axes. pairwiseAdonis () was used
for permutational multivariate analyses of variance, and igraph (Gabor
Csardi & Tamas Nepusz, 2006) for network and modularity analyses.
rstatix (Kassambara, 2021), NbClust (Charrad et al., 2014) and Hmisc
(Harrell Jr & Dupon, 2022) were used for statistical analyses. The
packages RColorBrewer (Neuwirth, 2022), ggplot2 (Wickham et al., 2021),
ggpubr (Kassambara, 2020), ggrepel (Slowikowski et al., 2021) and
cowplot (Wilke, 2020) were used to produce general graphics, morphr
(https://github.com/jiho/morphr ) was used to represent individual
images in the morphological space.
Results
Seasonal
dynamics of the BATS ecosystem
Temporal changes in environmental variables followed a classical
seasonal cycle. As expected, temperatures reflected a deep winter
mixing, from December 2016 to April 2017. Summer stratification was
strong and deep (about 40 m) from July 2016 to September 2016 (Supp.
Figure 1). MLD increased in autumn and allowed the re-injection of
nutrients in the surface layer, creating ideal conditions for an
autumnal bloom. Fluorescence was low in surface waters except during the
deep mixing in January 2017 and April 2017 (Supp. Figure 1). DCM was
around 100 metres. Carbon export was the highest in April 2017 and in
May 2016 and nearly absent in September 2016 (0.75 mgC
m-2 day-1). Otherwise, it fluctuated
between 6 mgC m-2 day-1 and 25 mgC
m-2 day-1.
Carbon export is more
transparency-dependant than size-dependant
The four first axes of the PCA based on the morphological variables were
significant. They explained about 87.68% of the morphological variance
observed in mesozooplankton samples (Figure 1). PC1 (42.87%) was mostly
influenced by size and shape related variables (e.g., Perimeter,
Circularity, FerretDiameter, or Symmetry) with smaller and more circular
organisms for PC1 < 0 and larger and more elongated ones for
PC1 > 0. PC2 (23.89%) characterised the grey level of
mesozooplankton and their size, with PC2 < 0 corresponding to
larger, less transparent organisms and PC2 > 0
corresponding to smaller, more transparent ones. PC3 (14.53%) referred
to the elongation of the organisms and how their appendages were
positioned when organisms were scanned. Negative PC3 < 0
corresponded to organisms with visible appendages while PC3
> 0 represented elongated organisms with a well-defined
body (folded or no appendages). Finally, PC4 (6.59%) described the
heterogeneity of grey colour within each organism, with organisms with
more heterogeneous grey colour for PC4 < 0 and organisms with
homogeneous grey colour for PC4 > 0. Morphology of
mesozooplankton was quite similar between Day and Night samples, except
along PC4 (homogeneity in grey colour) with organisms presenting less
homogeneous grey levels during the Day (Figure 1).
Correlations between carbon export and principal components were
significant for PC1, PC2, and PC3 and reached -0.04, -0.30 and +0.08,
respectively (p < 0.01, Figure 1). Thus, according to the
visible signal in morphological space (Figure 1), confirmed by the
correlations just mentioned, carbon export seems to be more strongly
linked to PC2, the transparency of organisms, than to the other axes. To
validate this signal, linear relationships were computed between carbon
export values and size-related morphological descriptors (to test the
assumption that size is a master-trait), and also transparency-related
morphological descriptors (correlated to PC2). The results are visible
in Figure 2, no significant linear relationships (p-values
> 0.05) were found between carbon export and size-related
morphological descriptors. However, significant linear relationships
(p-values < 0.05) were found between morphological descriptors
related to transparency (correlated to PC2) and carbon export. From
these linear relationships, it is possible to infer the relationships
between imaged organism’s transparency and carbon export rates. First,
lower organisms’ transparency, signified by low values of the three grey
intensity variables (Figure 2. F, G, H) was associated with higher
carbon export. Second, carbon export was higher when StdDevGrey values
were high (Figure 2. I), i.e. , when mesozooplankton individuals
presented larger ranges of grey level values. Third, carbon export was
higher when SkewnessGrey values were higher (Figure 2. J). This
descriptor indicates how skewed the normal distribution of the grey
level values is per image. Therefore, a more normal distribution of grey
values is associated with higher carbon export.
Three communities with different levels of transparency succeed one
another at BATS
A PCoA was conducted to analyse the temporal evolution of the taxonomic
composition from metabarcoding data (Figure 3). The first PCoA axis
explained about 22.99% of the variance in the taxonomic composition,
the 2nd one explained 14.84%. Hierarchical clustering
from the first seven PCoA axes (72% of variance explained) coordinates
revealed that three communities could be distinguished from their ASV
composition (Figure 3). Molecular-based cluster 1 corresponds to all the
samplings done between June 2016 and October 2016. The second contains
day and night samplings from May 2016 and April 2017. Molecular-based
cluster 3 is composed of all the other samplings. Based on their
timeline, these groups represent communities present during the summer
stratification period (cluster #1), the zooplankton bloom period
(cluster #2) and the deep winter mixing season, except samplings done
in April 2016 and May 2017 (cluster #3). These patterns were confirmed
by the identification of ASVs specific to each of the molecular-based
clusters whose identity and functional traits match with environmental
conditions (Supp. Analysis I). Clustering pattern is significant (p =
0.001) and communities are all significantly different from each other
(Holm p-adjusted = 0.003 between all pairs of clusters).
Fitting the three molecular-based communities in the morphospace, it
appears that they were quite similar along PC1 (size and circularity)
(Figure 4). However, individuals belonging to the bloom community tended
to be less transparent (lowest coordinate along PC2 axis) than winter
mixing cluster ones. Individuals from the molecular-based summer cluster
were the most transparent ones (lowest coordinate along PC2 axis). This
signal of transparency changes, along PC2, matches for carbon export
intensity, as explained in the previous paragraph. A similar trend is
observed along PC3 (Supp. Figure II) with taxa from the bloom community
tending to be more elongated with fewer or no visible appendages. On the
opposite end, taxa from the summer stratification community were less
elongated with more visible appendages. The winter mixing community was
between both. This succession of communities along PC2 and PC3 also
matches for carbon export intensity.
Metabarcoding read counts and abundance and biomass values from Zooscan
are positively correlated, for some taxonomic groups
Correlations between relative number of reads and relative abundance or
relative biomass derived from Zooscan images were computed for each
taxonomic category (Table 1). Cyclopoida, Doliolida, Harpacticoida and
Larvacea relative abundance and biomass were positively correlated to
relative read count. Only relative biomass was positively correlated to
relative read count for Cephalopoda (Table 1), while relative abundance
was positively correlated to relative read count for Chaetognatha and
Cladocera. It was negatively correlated for Ostracoda.
Migrating and seasonal of groups’ effects on carbon export
To better identify the degree to which specific taxa drove morphological
variations and carbon export changes described above, a co-occurrence
network was built on the 225 ASVs. It contained 413 edges (average of
3.67 co-occurrences per ASV). It had a modularity coefficient of 0.72,
meaning that the network can be divided into modules. A total of 15
modules were found. Six modules were composed of at least 10 ASVs, named
hereafter A to F. They contained between 11 (module F) and 36 (module D)
ASVs.
Taxa were quite different from one module to another. Module A contained
many Euphausiacea, Ostracoda from three genus, and some gelatinous taxa
(Supp. Table VI). Module B was exclusively composed of Arthropoda (Supp.
Table VI). Module D included some Appendicularia and Thaliacea, among
other organisms (Supp. Table VI). Modules’ entire taxonomic composition
can be found in Supp. Table VII.
Relative abundances trends were highly variable among modules (Figure
5). They were very similar between day and night samples, except for
Module A (Wilcoxon-Mann-Whitney test; p < 0.001, Figure 6 &
Table 2), especially during the winter period (from November 2016 to
January 2017). All modules except A there were significant changes in
their relative abundances between molecular-based cluster samplings
(Kruskal-Wallis test, p < 0.05 except for module A: p =
0.182 ; Figure 6). Compared to the overall study period, module B was
relatively more abundant during the bloom period (molecular-based
cluster 2) while module D was relatively less abundant during the same
period (Dunn pairwise comparison test; Figure 6 & Table 2). Taxa
included in module C were relatively more abundant in BATS
mesozooplankton community during the summer stratification period
(molecular-based cluster 1) (p < 0.001, Figure 6 & Table 2).
Percentages of relative abundance from module F were steady and low
throughout the study period. Morphology differed also between modules
(Table 2). When ASVs from module E were more present, they tended to
increase the size and reduce the circularity (PC1 > 0) of
the overall mesozooplankton community. Sampling positions along the
second axis were significantly and positively correlated to percentages
of relative abundances of modules C and F (respectively +0.74 and
+0.55). Hence, when ASVs from these modules are more abundant, the
overall mesozooplankton community tended to be more transparent (PC2
> 0) than usual. Finally, Pearson’s correlation coefficient
was significantly positive between relative abundances of module D and
sampling positions along the third axis of the morphospace (+0.67). It
was negative between relative abundances of module F, and sampling
positions along the third axis of the morphospace (-0.54). Thus, the
overall mesozooplankton community appeared with folded appendices, or
was more composed of taxa without appendices (PC3 > 0) when
ASVs from module D are relatively more abundant, and inversely when
module F is relatively more present. Relative abundances of module E and
sampling positions along the first axis of the morphospace were
positively correlated (+0.56, Table 2). Finally, carbon export was
positively correlated to relative abundance of module D (+0.87) and
negatively correlated to modules C (-0.51) and F (-0.60).