Characterizing cyanopeptides and transformation products in freshwater: integrating targeted, suspect, and non-targeted analysis with in silico modeling

Quantification of cyanopeptides

A quantitative screening was initially conducted for 28 cyanopeptides across seven families to detect the most common compounds produced by planktonic and benthic cyanobacterial blooms in various water sources across Québec, Canada. Positive samples then underwent further suspect screening and NTA, focusing on samples more likely to contain cyanobacterial secondary metabolites (CSMs). The sampling locations showed distinct cyanopeptide profiles (Tables 1 and 2) and varying dilution factors, which reflect the hydrological characteristics of small water bodies, large lakes, and rivers.

Table 1 Quantification results for microcystins (MCs) (µg L−1)Table 2 Quantification results for aeruginosin 98 A (AG-98A) and anabaenopeptins (APs) (µg L−1)

In Lake Saint-Pierre (Sites 1 to 5), four microcystin congeners (microcystin-YR, microcystin-HtyR, [Asp3]microcystin-LR, and microcystin-LR) were detected at concentrations approaching 1 µg L−1. This marks the first detection of significant microcystin concentrations in a lake historically dominated by Lyngbya wollei proliferations [46]. In the Yamaska River (Sites 6 to 10), a tributary of Lake Saint-Pierre, anabaenopeptins (anabaenopeptin A, anabaenopeptin B, and oscillamide Y) were detected at concentrations up to 1 µg L−1. This is the first report of anabaenopeptins in the Yamaska watershed at levels near World Health Organization's guidelines for microcystins in potable water, indicating a significant cyanobacterial presence. Notably, Site 9, near the mouth of Lake Saint-Pierre, exhibited only microcystin-LR, suggesting a mix of water from both the lake and river.

The Yamaska River showed a dominance of anabaenopeptins during cyanobacterial blooms, while Lake Saint-Pierre primarily featured microcystins. Although cyanobacteria species were not identified, the varying cyanopeptide profiles suggest environmental conditions are driving different cyanobacteria or CSM production [16, 17, 23, 47]. Sample 11 revealed a greater diversity of cyanopeptides, with five anabaenopeptins (-A, -B, -C, FA-A, and OC-Y) and three microcystins (-YR, -HtyR, and -[Asp3]LR) ranging from 0.019 to 5.68 µg L−1. This finding highlights the presence of less common cyanopeptide families, some of which are newly detected in some locations within the lake.

At site 12, primarily featured microcystins, seven congeners (-RR, -YR, -HtyR, -[Asp3]LR, -LR, -HilR, -LA, and -LY) were detected at concentrations ranging from 0.01 to 1.91 µg L−1. Additionally, a rare aeruginosin 98 A congener was detected at 0.035 µg L−1—its first report in Québec. These findings highlight the diversity of cyanotoxin profiles, with newly reported cyanopeptides in the area, influenced by local environmental conditions and cyanobacterial prevalence. The presence of a broad range of cyanopeptides, many at toxicologically significant levels when measured as microcystin equivalents, underscores the potential risks to human health and the environment. Often, these compounds go undetected, leaving gaps in understanding their full ecotoxicological and public health impacts.

Suspect screening and in silico on CSMs from CyanoMetDB

A total of 26 cyanopeptides were detected across six sites, including 22 congeners: twelve microcystins, six anabaenopeptins, one microginin, one aeruginopeptin, one cyanopeptolin, and one aeruginosin. The structures are illustrated in Fig. S3. Table 3 provides detailed data on the identified cyanopeptides, including precursor ions (M + H+), theoretical Log P, retention times (confirming polarity), signal intensities (indicating relative abundance), and the in silico coverage showing the percentage of similarity (range?) between theoretical and experimental spectra. The following sections break down the identification and spectral interpretation for each potential cyanopeptide, grouped by family. Since this is a suspect screening, thorough characterization remains essential given the limited spectral information available. CyanoMetDB only provides exact precursor masses, and many compounds have been reported only once in the literature or lack available mass spectra.

Table 3 Cyanopeptides identification using in silico fragmentation through FISh scoring experimentMicrocystins

Twelve distinct microcystin congeners were identified in samples 2, 11, and 12, all of which were already dominated by the presence of microcystins (Tables 1 and 3). Detailed fragment interpretation for each microcystin found by CyanoMetDB is presented in Table S7. Notably, the search for the key fragment ion m/z 135 assists the initial screening process of microcystins. This ion represents a characteristic fragment ion resulting from the cleavage of Adda moiety, a unique amino acid present in all microcystin variants [41].

In sample 2, three potential microcystins were identified. [Ser7]microcystin-LR was confirmed based on the spectra in Fig. S4, supported by the absence of a thiol group at position 7, which is replaced by a serine. This congener is close to [Asp3]microcystin-LR and microcystin-LR, which were also detected in the sample. [DMAdda5, GluOMe6]microcystin-LHty, a newly discovered congener, shares the same molecular formula as the CyanoMetDB-proposed microcystin-LHty but includes DMAdda at position 5 and GluOMe at position 6 as shown in Fig. 1 and detailed in SI-2. The presence of LHty in the structure is unusual and indicates a potentially greater structural diversity within cyanobacterial blooms, typically dominated by more common variants and their close analogs. Such diversity can affect how we assess the ecological and health risks associated with blooms, as different variants may have varying levels of toxicity, stability, or environmental persistence. Linear [seco-4/5][D-Asp3]microcystin-HtyR, with the structure Adda5-Glu6-Mdha7-Ala1-Hty2-Asp3-Arg4, was also identified, with [D-Asp3]microcystin-HtyR hydrolyzed between Arg4 and Adda5 (Fig. S5). Linear microcystin congeners are often non-toxic due to the importance of the cyclic structure for bioactivity, but more toxicological data is needed [48, 49].

Fig. 1figure 1

Structure characterization of new [DMAdda5, GluOMe6]microcystin-LHty. A Extracted ion chromatogram of ion m/z 1016.5314 and B isotopic pattern of most intense precursor ion. C Extracted ion chromatogram of thiol derivative ion m/z 1094.5599 and D isotopic pattern of thiol derivative. E Fragmentation spectrum and in silico matching with FISh coverage

In sample 11, five microcystins were proposed. [D-Asp3]microcystin-M(O)R, an oxidized form of [D-Asp3]microcystin-MR (Figs. S6 and S7), exhibited a signal three times stronger than the unoxidized version, likely due to degradation within environmental or laboratory conditions. Methionine-containing microcystins are rarely reported, and this is their first detection in Québec watersheds [50]. [DMAdda5]microcystin-YR was identified as a demethylated form of microcystin-YR at the Adda function, supported by its spectral similarity to microcystin-YR, which is the dominating congener of sample 11 (Fig. S8). The demethylation may result from incomplete methylation during biosynthesis or ADMAdda instability [41]. [D-Ser1, D-Asp3]microcystin-HtyR (Fig. S9) was proposed as a candidate closely related to the common microcystin-HtyR, a congener detected in this sample. The last microcystin detected in the sample, [D-Asp3, DMAdda5]microcystin-LR, is closely related to [Asp3]microcystin-LR, with spectra detailed in Fig. S10. The abundance of [Asp3]microcystin-LR in this sample suggests that incomplete methylation or ADMAdda artifacts from environmental transformation may explain the formation of this congener.

In sample 12, four uncommon microcystins were proposed. The [Mdha-GSH7]microcystin-LR variant was identified by the absence of a thiol derivative, as glutathione (GSH) bound to the α,β-unsaturated carbonyl of the Mdha moiety (Fig. S11) [51]. This GSH conjugate, part of the cellular detoxification pathway for microcystin-LR, retains toxicity [51]. [epoxyAdda5]microcystin-LR was found to be likely an oxidized variant of microcystin-LR generated via metabolism or autooxidation (Fig. S12) [41]. [DMAdda5]microcystin-LR was proposed with spectra (Fig. S13), and given the abundance of microcystin-LR, it may result from incomplete methylation during biosynthesis. Lastly, [seco-1/2] microcystin-LR (Fig. S14), a linear form of microcystin-LR hydrolyzed between positions 1 and 2, is a non-toxic biogenetic precursor [48, 49]. Sample 12 exemplifies microcystin diversity, with microcystin-LR predominating alongside a precursor, and metabolized, oxidized, and demethylated variants. Semi-quantitative calculations, relative to microcystin-LR and excluding [seco-1/2], yielded a total concentration of 0.32 µg L−1, which significantly contributes to the total concentration of 1.91 µg L−1 for microcystin-LR. Failing to monitor important TPs and microcystin-LR analogs with similar toxicity can have serious consequences for public health, as it may lead to the omission of drinking water and recreational water advisories, such as consumption bans or swimming restrictions.

The distinct profiles of microcystin congeners across these microcystin-rich samples (2, 11, and 12) reveal insights into the dynamics of cyanobacterial blooms. Sample 2 emphasizes uncommon variants, which may suggest localized environmental conditions favoring their production, while sample 11 showed the occurrence of oxidation and methylation modifications of different congeners, and sample 12 showcased metabolized and transformed forms of microcystin-LR. This suggests that focusing solely on the dominant congener may underestimate the total toxicity, as other variants and degradation products can also contribute to harmful effects.

Anabaenopeptins

Six different anabaenopeptin (AP) congeners were detected in samples 7, 10, and 11, where this cyanopeptide family was prevalent (Tables 2 and 3). A key feature in all identified anabaenopeptins is the presence of fragments at m/z 84 and 129, representing the immonium ion and residue of lysine, respectively. These fragments are essential markers for identifying AP and consistently appear in all intact anabaenopeptins (Fig. S3). A detailed analysis of AP fragments is provided in Table S8.

In samples 7 and 11, anabaenopeptin F and anabaenopeptin E were identified. Although they share the same molecular formula, they differ by a methyl shift between positions 3 and 4 (Ile3-Hty4 for anabaenopeptin F in Fig. S15 and SI-3 and Val3-MeHty4 for anabaenopeptin E in Fig. S16). The co-occurrence of anabaenopeptin F and anabaenopeptin E has been linked with the presence of anabaenopeptin B and oscillamide Y, which were also found in the samples, both common in freshwater blooms dominated by anabaenopeptins [17, 29, 47]. Additionally, anabaenopeptin H was detected in sample 7, with the structure Arg1-CO-Lys2-Ile3-Hty4-MeHty5-Ile6 (Fig. S17). This congener is not detected as frequently as its B and F counterparts, but its similar structure and association with Oscillatoria suggest that its presence is plausible [52].

In sample 10, three anabaenopeptins were identified. Anabaenopeptin HU892 features the structure Arg1-CO-Lys2-Val3-Hph4-MeHty5-Ile6 (Fig. S18). This congener belongs to the presence of an unusual subgroup, characterized by an aliphatic amino acid at the carboxyl end and an N-methyl-homoaromatic amino acid at position 2 [53]. Anabaenopeptin SA3 and anabaenopeptin 679 were found in samples 10 and 11, which show similar fragmentation patterns (Figs. S19 and S20). The proposed structure for anabaenopeptin SA3 is Lys1-CO-Lys2-Val3-Hty4-MeAla5-Phe6. Regarding anabaenopeptin 679, its biosynthesis remains unclear, but the absence of the Lys-CO side chain at position 1 compared to anabaenopeptin SA3 suggests a potential degradation pathway or a different biosynthesis route by cyanobacteria, possibly the Planktothrix species [54, 55]. Further research is needed to confirm their production mechanisms.

Other cyanopeptides

In addition to the dominant identified cyanopeptide families like microcystins and anabaenopeptins, four cyanopeptides rarely documented in the literature were tentatively identified: microginin 690 and aeruginopeptin 228 A in sample 3, and cyanopeptolin 1081 and aeruginosin A in sample 11. Detailed spectral interpretations are provided in Table S9.

Microginin 690 is a linear oligopeptide featuring a decanoic acid derivative (Adha) at its N-terminus, with the structure Adha1-Tyr2-MeMet(O)3-Tyr4 (Fig. S3, spectra in Fig. S21). Although rarely documented, Zervou et al. linked microginin 690 to Microcystis aeruginosa, making its presence in sample 3 highly probable [56]. Aeruginopeptin 228 A, also identified in sample 3, is a depsipeptide known for its cytotoxic properties. It has been associated with Microcystis aeruginosa, strengthening its identification in this sample [57]. Its structure, Hpla1-Gln2-Thr3-Tyr4-Ahp5-Thr6-MePhe7-Ile8, is detailed from spectra in Fig. S22. To our knowledge, this marks the first detection of an aeruginopeptin variant in a Canadian water source.

Cyanopeptolin 1081 is a cyclic depsipeptide with Thr at the first position, linked through an ester bond on a beta-hydroxyl group, and contains an Ahp residue within the cycle (Fig. S3). A few studies indicate that cyanopeptolins cause neurotoxic effects in model organisms, demonstrating a similar toxic impact to that of microcystins [58, 59]. Its structure, Tyr1-GA-Gln2-Thr3-Leu4-Ahp5-Thr6-NMe-Phe7-Ile8, was confirmed through spectra analysis (Fig. S23) and aligns with the findings of Bober et al., who recently identified this cyanopeptolin congener [60]. To our knowledge, this is a rare report of a cyanopeptolin variant in river systems.

Finally, aeruginosin A proved difficult to confirm, as it was detected repeatedly in all samples with four closely related hits between retention times 8.3 and 9.7 min, corresponding to ions m/z 617.3439, 617.3434, 617.3443, and 617.3443, with in silico scores ranging from 49 to 55%. The ion at 9.7 min was finally proposed as aeruginosin A against other signals (Fig. S24). Aeruginosin A is a linear tetrapeptide (Fig. S3) featuring the Choi moiety (2-carboxy-6-hydroxy-octahydroindole), a common component of all aeruginosin congeners. This moiety is linked to a hydrophobic amino acid, a 4-hydroxyphenyl lactate derivative at the N-terminus, and an arginine derivative at the C-terminus [61]. Aeruginosin A is recognized for its inhibitory effects on fVIIa and thrombin [37].

Level of confidence in identification for suspect screening

Tables 1 and S6 provide retention times and Log P values for known standards as well as tentatively identified compounds. While Log P is not automatically correlated with retention times due to structural properties influencing chromatographic behavior, it serves as a useful indicator to assess whether the retention time of tentatively identified compounds falls within a plausible range. For instance, [DMAdda5]microcystin-YR (Log P − 4.37) at 9.6 min can be compared with MC-LR (Log P − 3.64) at 11.17 min, where the more hydrophobic MC-LR shows a longer retention time. Similarly, AP-F (Log P − 0.97) at 9.1 min aligns with AP-B (Log P − 1.47) at 8.73 min, where the lower Log P of AP-B results in a shorter retention time. Such comparisons help increase confidence in identifications during the suspect screening process.

Additionally, according to the Schymanski classification system, cyanopeptides identified through suspect screening using CyanoMetDB are assigned to Level 2b because their identification relies on partial fragmentation coverage and comparisons with a database of known compounds and published mass spectra [42]. In suspect screening, the exact identity of the compound is not predetermined, and the identification is based on mass spectra, fragmentation patterns, and database matches rather than complete fragmentation data. As a result, the identification of cyanopeptides at this level of confidence remains tentative.

NTA and in silico on compound classes scoring

A new NTA strategy was developed and employed, using a compound class coverage method, to search for uncommon cyanopeptide congeners and TPs. Seven TPs were identified, linked to eight microcystins and anabaenopeptins, as summarized in Table 4. This table lists the proposed TP’s chemical formula, precursor ion (M + H+), retention time, class coverage, in silico coverage, related degradation compounds, and the sample in which the TPs were detected. Additionally, the enviPath model was used to cross-check the identity of parent compounds and draw potential degradation pathways, providing additional confidence in the proposed transformations [44].

Table 4 Transformation products (TPs) identification using non-targeted analysisTransformation products (TPs) elucidation

The first TP (m/z 637.3703) in sample 11 likely results from the formation of carbamates (carbamatization) of anabaenopeptin B and anabaenopeptin C, involving the loss of the arginine chain and CO from the cyclic structure, with the configuration Lys2-Val3-HTyr4-Ala5-Phe6 (Fig. 2). The spectra (Fig. 3) share key fragments when compared to anabaenopeptin B used as standard in quality control, framing its structure within amino acid positions 2 to 6. However, key AA1 ions at m/z 129, 158, 175, and 201 are missing, suggesting the Arg group has been lost. TP2 (m/z 680.3755) is a proposed ketonization product common to anabaenopeptin B, -C, -E, and -SA3, found in samples 10 and 11 (Figs. S25 and S26). This TP likely corresponds to anabaenopeptin 679, a degradation product of anabaenopeptin SA3, based on the characteristic fragmentation pattern and retention time. Based on enviPath, this TP can originate from multiple anabaenopeptins with the cyclic structure Lys2-Val3-Hty4-MeAla5-Phe6, supporting the hypothesis that anabaenopeptin 679 is a degradation product of anabaenopeptin SA3. TP3 (m/z 695.3759), found in sample 11, is a putative carbamatization product of oscillamide Y, resulting in the loss of the AA1 side chain (Figs. S27 and S28). The spectrum shares the same key fragments as oscillamide Y, used as a standard in quality control, framing its structure within amino acid positions 2 to 6. However, the spectra lack key fragments at m/z 129, 158, 175, and 201, suggesting the loss of Tyr at AA1. TP4 (m/z 711.3710) is proposed as a product of two consecutive degradation reactions of oscillamide Y, identified in samples 7 and 11 (Figs. S29 and S30). First, hydroxylation at the tertiary aliphatic group was proposed in Ile at AA3, followed by carbamatization, leading to the loss of AA1. Spectra of TP4 closely resemble those of TP3, also lacking Tyr fragments.

Fig. 2figure 2

Proposed transformation product 1 (TP1) formation by carbamatization of anabaenopeptin B

Fig. 3figure 3

Example of structure characterization for the transformation product 1 (TP1). A Extracted ion chromatogram of ion m/z 637.3703, B isotopic pattern of most intense precursor ion, C fragmentation spectrum and compound class coverage compared to anabaenopeptin B, and D fragmentation spectrum and in silico matching with FISh coverage

TP5 (m/z 911.5712), found in samples 2 and 11, is proposed as a degradation product of [Asp3]microcystin-LR (Fig. S31). It forms through a three-step process (Fig. S32), including ring opening between AA6 and AA7. Only fragments below m/z 300 were detected, aligning with decarboxylation at AA3 and AA6 and demethylation at AA7. TP6 (m/z 999.5491) in sample 11 is a proposed linearization product of [Asp3]microcystin-LR, shown in Figs. S33 and S34. It represents a linear [seco-1/7][D-Asp3]microcystin-LR, which could be considered a potential new microcystin congener. The degradation pathway mirrors that described by Ding et al. for microcystin-LR [62]. TP7 (m/z 1061.5301), detected in samples 3 and 11 (Figs. S35 and S36), is a proposed hydroxylation product of microcystin-YR. Tyr at AA2 is hydroxylated to form dopamine, while the rest of the structure remains intact. The detection of dopamine formation from microcystin-YR is novel, suggesting a previously unrecognized transformation pathway or a new microcystin congener. Although enviPath suggests this compound as a TP, its identification remains uncertain due to the lack of information linking dopamine to the structure of microcystins. Therefore, this compound will be identified as a potential TP or metabolite, and further studies are needed to investigate the emergence of this form.

Level of confidence in identification for NTA

According to the Schymanski classification system, the TPs identified through this NTA method are assigned to Level 3, as their identification relies solely on MS2 data interpretation and context [42]. In NTA, the exact identity of the compound is not predetermined, and the identification relies on mass spectra, fragmentation patterns, and comparisons to existing databases or theoretical fragmentations. The identification of TPs at this level remains therefore tentative. However, this identification remains a preliminary proposal for the study of the transformation of these compound families, which are still very poorly understood. Further experimental validation and additional contextual evidence are necessary to confirm the structure of the identified TPs.

Cyanopeptides discovery and methodology challengesCyanopeptides diversity

The discovery of a wide diversity of cyanopeptides in investigated waters uncovered both uncommon and novel congeners, as well as new TPs. Using the quantitative method with standards, a total of 14 cyanopeptides, including 8 microcystins, were identified across samples. By exploiting CyanoMetDB and in silico modeling for suspect screening and NTA, 22 uncommon cyanopeptides were detected, and 7 new TPs were identified through enviPath and compound class matching. Two new microcystin congeners, [DMAdda5, GluOMe6]microcystin-LHty and linear [seco-1/7][D-Asp3]microcystin-LR, were tentatively identified. Identifying new TPs, especially from less studied cyanopeptides like anabaenopeptins, highlights the need to study TPs to better understand their fate and ecological impact. This finding underscores the risks of focusing exclusively on well-characterized cyanopeptides in toxicological assessments. Furthermore, the identification of uncommon cyanopeptide families such as aeruginosins, cyanopeptolins, microginins, and anabaenopeptins challenges the conventional focus on microcystins. For instance, the notable prevalence of anabaenopeptins in the Yamaska River, compared to microcystins in Lake Saint-Pierre, suggests a shift in the distribution of cyanopeptides where the two watercourses differ in physico-chemical characteristics, likely due to various environmental factors. This discrepancy highlights the need for further investigation through expanded sampling and detailed cyanobacteria characterization (taxonomy and genomics) to refine our understanding of cyanopeptide distribution.

Methods comparison for cyanopeptides discovery

Our developed targeted and suspect screening methods are powerful tools for ensuring accurate quantification and confident identification of known and predicted cyanopeptides. Indeed, this study utilized one of the most extensive targeted methods for cyanopeptide analysis to date, quantifying 28 different compounds across seven families, in addition to a robust suspect screening approach that integrated in silico modeling to enhance the detection of predicted compounds within CyanoMetDB. Using this suspect screening methodology, we could indirectly identify a new congener, [DMAdda5, GluOMe6]microcystin-LHty. Unique spectral features that did not match the database were revealed when combined with manual interpretation. However, both approaches are limited by predefined compound lists; therefore, our developed NTA method complemented both targeted and suspect screening methods by broadening the scope of compound detection, discovering novel TPs and potential new congeners including [seco-1/7][Asp3]microcystin-LR and dopamine-modified microcystin-YR that would have otherwise gone undetected.

Challenges of in silico modeling

In silico modeling was employed to streamline and assist data interpretation but increased the risk of false positives without deeper spectral interpretation, due to similar chromatographic and spectral profiles. A minimum match score of 30% was required for in silico fragments; however, not all candidates met this threshold. The limitations were highlighted by the frequent absence of characteristic fragments. For example, in Tables S9 to S11, ions marked with an asterisk were not generated by FISh scoring but were key contributions to structural confirmation. This absence challenges accurate identification in complex matrices where overlapping signals may mask specific fragment ions. As a result, relying solely on computational predictions can yield incomplete or misleading results. This limitation was addressed using experimental data from known cyanopeptides, combined with expert interpretation to ensure better certainty in identification. In a previous study by Zervou et al., the use of in silico approaches has been explored without including standards, relying instead on a spectral database (mzCloud™) limited to a few microcystins congeners [34]. This strategy was successful for microcystins elucidation but restricted the exploration of CSMs due to the lack of available experimental data. Compound class matching with integrated experimental data also aided in discovering new cyanopeptides and studying previously unexplored TPs. Each cyanopeptide family shares a similar substructure, enabling the targeting of characteristic low- and medium-mass fragments, increasing confidence in structural identification within analogs. Future research on CSMs identification should focus on improving in silico algorithms and developing experimental open databases. Also, integrating databases with machine learning could improve identification accuracy, automate large dataset processing, and support toxicological studies and monitoring programs. This would expand geographic coverage and enable better risk assessments and management of newly identified compounds.

Comments (0)

No login
gif