Fish diversity monitoring in Maninjau Lake, West Sumatra, using eDNA with the next-generation sequencing (NGS) technique

Various natural phenomena and human activities have resulted in the loss of biodiversity; for example, the diversity of freshwater fish in Maninjau Lake, West Sumatra, has declined from year to year. A new method for monitoring biodiversity quickly and efficiently is the NGS technique applied to environmental DNA (eDNA). This study aimed to determine the ability and effectiveness of the NGS technique to detect multiple species at one time from water samples of Maninjau Lake. One liter of water was taken from the surface, with two replications. Species were identified by shotgun metagenomic sequencing with universal primers. The results showed that 92 individuals were identified, comprising 56 species from 24 genera, 16 families, and 12 orders. As many as 25% of the fishes could not be assigned to a valid taxon (unclassified), which is presumably related to the limited reference sequences available in the database (NCBI) compared to the sequences obtained. NGS of eDNA detected two families (Cyprinidae and Cichlidae), three genera (Oreochromis, Cyprinus, and Xiphophorus), and two species of fish (Oreochromis niloticus and Cyprinus carpio) in Maninjau Lake that were also previously reported using the conventional method. The native species of Maninjau Lake were not detected by the eDNA method, presumably because their DNA was not captured in the collected water samples or because the DNA concentration was too low to proceed to the PCR process. Thus, several efforts are needed to reduce the limitations of monitoring with the eDNA method and to obtain the best possible results. The eDNA method can be a useful tool for monitoring biodiversity, and the results can inform conservation strategies, especially for the fishes in Maninjau Lake, West Sumatra.


Introduction
Globally, biodiversity has declined under the impact of various natural phenomena and anthropogenic stressors, including climate change, ocean acidification, habitat degradation, pollution, and over-exploitation [1][2][3]. The Living Planet Index (LPI) reported that biodiversity has declined by 52% in the last 40 years, with the highest loss occurring in freshwater ecosystems (76%) compared to marine or terrestrial ecosystems [4]. Continued loss of biodiversity will negatively affect the balance of ecosystems. Biodiversity monitoring, as part of ecosystem management, is needed to prevent further decline [5].

Sampling
Water samples were collected from Maninjau Lake, West Sumatra, following the protocol of [52,53]. All equipment was sterilized before sampling to avoid possible contamination. During sample collection, sterile gloves were worn at all times, and contact with anything other than the sample bottle was avoided to prevent contamination. One liter of lake water was taken from the surface, with two replications. The water samples were put into sterile bottles, placed in a cooling box, and carried to the filtration laboratory. The samples must be kept at low temperatures during field collection because DNA degradation occurs at moderate temperatures. Ecological data (such as date, time, GPS position, temperature, and pH) were recorded at the sampling locations as additional data for analysis. The results of previous monitoring at Maninjau Lake using conventional methods are shown in Table 1. [30,45]. During the filtration process, DNA particles in the water sample are captured by the cellulose membrane at the bottom of the filter funnel. After filtration, the filter membrane was removed from the funnel using sterile forceps and folded in four, with the outer side of the filter facing inward. The folded membrane was kept from unfolding and placed into a sterile microtube containing ethanol to keep the DNA stable. The filter membrane was stored at -20 °C until used for DNA extraction [52,53]. DNA was extracted from the filter membrane using the ZymoBIOMICS™ DNA Miniprep Kit. The manufacturer's protocol was followed, which consists of several steps using the solutions provided, including Lysis Buffer, Binding Buffer, Wash Buffer 1, Wash Buffer 2, and DNase/RNase-free water (elution buffer). DNA from the water samples was eluted in a final volume of 100 μl. The DNA isolate was checked by electrophoresis on a 1.2% agarose gel to assess DNA quality and potential contamination.
The purity of the DNA isolate was measured using the NanoPhotometer® spectrophotometer (IMPLEN, CA, USA), and the DNA concentration was measured using the Qubit® dsDNA Assay Kit in a Qubit® 2.0 Fluorometer (Life Technologies, CA, USA).

Library construction and sequencing
The DNA isolate was used as input material for the PCR process (library preparation). Sequencing libraries were constructed by amplifying the DNA using the NEBNext® Ultra™ DNA Library Prep Kit for Illumina (NEB, USA), and index codes were added to attribute sequences to each sample before the PCR process. DNA fragments of several hundred base pairs were end-polished and ligated with the full-length adaptor for Illumina sequencing. Illumina technology (Solexa) utilizes a sequencing-by-synthesis approach. All four nucleotides are added simultaneously to the flow cell channels, along with DNA polymerase, to be incorporated into the oligo-primed cluster fragments. DNA polymerase produces multiple DNA copies, or clusters, each of which represents the single molecule that initiated the cluster amplification. Each cluster contains approximately one million copies of the original fragment. PCR products were purified with the AMPure XP system, and their size was determined using the Agilent 2100 Bioanalyzer and real-time PCR. The index-coded samples were clustered on a cBot Cluster Generation System according to the manufacturer's instructions. After cluster generation, the library preparations were sequenced on an Illumina NovaSeq 6000 System platform, and paired-end sequences were obtained.

Bioinformatic Analysis
The quality of the sequencing data was checked to obtain clean sequence data. Clean sequence data were assembled with SOAPdenovo [54] and mapped to the scaftigs using SoapAligner [55]. Scaftigs were used for ORF (open reading frame) gene prediction with MetaGeneMark [56][57][58]. BLAST searches against the MicroNR database were performed to obtain annotation information for the gene catalog. The taxonomic annotation of each unigene was assigned using the LCA (lowest common ancestor) algorithm [59]. BLAST searches were also run against three databases: KEGG [60,61], eggNOG [62], and CAZy [63]. Statistical analyses included PCA [64], Anosim, cluster analysis, Metastats [65], pathway analysis, LEfSe, CCA/RDA, and NMDS.
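The idea behind LCA-based taxonomic annotation can be illustrated with a minimal sketch. It assumes each database hit for a unigene is available as a lineage listed from the broadest to the narrowest rank; the function name and the example lineages are hypothetical and are not taken from the pipeline used in this study.

```python
from typing import List


def lowest_common_ancestor(lineages: List[List[str]]) -> str:
    """Return the deepest taxon shared by all lineages, or 'unclassified'.

    Each lineage is ordered from the broadest rank to the narrowest rank.
    """
    if not lineages:
        return "unclassified"
    shared = []
    # Walk down the ranks in parallel and stop at the first disagreement.
    for ranks in zip(*lineages):
        if all(rank == ranks[0] for rank in ranks):
            shared.append(ranks[0])
        else:
            break
    return shared[-1] if shared else "unclassified"


# Hypothetical hits for one unigene (class, order, family, genus).
hits = [
    ["Actinopterygii", "Cypriniformes", "Cyprinidae", "Cyprinus"],
    ["Actinopterygii", "Cypriniformes", "Cyprinidae", "Carassius"],
]
print(lowest_common_ancestor(hits))  # Cyprinidae
```

When the hits disagree at the genus level, the unigene is assigned only to the family, which is one way a sequence can end up classified above the species level.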

Result
This research is a biodiversity monitoring study that used the eDNA method with the NGS technique to determine its ability and effectiveness in detecting species compared with conventional methods. Monitoring with the eDNA method successfully detected fishes from the water samples of Maninjau Lake, West Sumatra. In total, 92 individuals were identified, comprising 56 species from 24 genera, 16 families, and 12 orders. Of the 92 identified individuals, 23 (25%) were unclassified, and 13 were classified only to the family level. The fishes identified using the eDNA method with the NGS technique are shown in Table 2. Among the 92 fish individuals, the highest numbers of detected species were found in Cyprinidae and Salmonidae, with four species each, while most other families were represented by a single species. However, Salmonidae is known as a family that lives only in America and Europe, so it cannot be expected to occur in Indonesia. As many as 25% of the fishes could not be assigned to a valid taxon (unclassified), presumably because of the limited reference sequences available in the database (NCBI) compared to the sequences obtained. It should be noted that 35% of the results are fishes not found in Indonesia, especially in West Sumatra. These species are Astyanax, This monitoring study detected groups of fishes that do not live in Maninjau Lake and that were not previously found using the conventional method. The eDNA method detected only two families (Cyprinidae and Cichlidae), three genera (Oreochromis, Cyprinus, and Xiphophorus), and two species (Oreochromis niloticus and Cyprinus carpio) in Maninjau Lake that were also previously reported using the conventional method. Other fish groups, mainly native fishes of Maninjau Lake such as Rasbora, Hampala, Cyclocheilichthys, and Tor, which were found in the previous study (Roesma 2010), were not detected by the eDNA method.
Other genera distributed in different regions of Indonesia (Oryzias, Carassius, Gambusia, and Scleropages) were also detected, although they have never been reported in Maninjau Lake. Oreochromis niloticus and Cyprinus carpio originate from Africa and Europe, respectively; both are widely consumed and used in aquaculture in many countries. Both species have long been cultivated in Maninjau Lake in floating net cages.
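The proportions reported above can be rechecked with simple arithmetic. The sketch below uses only the counts stated in the text (92 individuals, 23 unclassified, 13 classified only to family level); the variable and function names are illustrative.

```python
# Counts taken from the text of this section.
total_individuals = 92
unclassified = 23
family_level_only = 13


def percent(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Individuals resolved below the family level (genus or species).
resolved_below_family = total_individuals - unclassified - family_level_only

print(percent(unclassified, total_individuals))       # 25.0
print(percent(family_level_only, total_individuals))  # 14.1
print(resolved_below_family)                          # 56
```

The remainder of 56 equals the number of species reported, which suggests that the 92 "individuals" correspond to taxon records in Table 2 rather than to counted specimens; this reading is an interpretation, not a statement from the source.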

Discussion
The present study is the first monitoring of fish diversity using the eDNA method conducted in Maninjau Lake, West Sumatra. The method succeeded in identifying previously reported species. About 75% of the detected fishes were assigned to a valid taxon, whereas 25% could not be classified. This percentage shows that monitoring using the eDNA method with the NGS technique can be an alternative for identifying ecosystem biodiversity. The success of biomonitoring using the eDNA method was also reported in previous studies [41,42,44,45,66], which showed that the eDNA method was able to identify more species than the conventional method. For example, [42] reported that monitoring using the eDNA method (Cytb and 12S rRNA) was more effective because it detected 70%-90% of the fish, whereas the conventional method detected only 25% of the fish from the previous report.
NGS is a sequencing technique that can read sequences from multiple species in parallel. This study used shotgun metagenomic sequencing with universal primers, thus amplifying sequences from various taxa. However, because of the limited reference sequence database, 25% of the sequences from the Actinopterygii group still could not be classified. A study using the eDNA method would be able to identify all species if a complete reference database were available, such as the database for the freshwater fishes of Bacalar Lake, Mexico, which contains records for 93% of the 70 species found in the lake [67][68][69].
As previously reported, 33 fish species were found in Maninjau Lake in 1916, whereas only 14 species were found in 2008 [6]. Meanwhile, monitoring using the eDNA method identified 56 fish species, of which only two families, three genera, and two species had also been reported in studies using the conventional method in Maninjau Lake. In addition, native fishes of Maninjau Lake such as Rasbora, Hampala, Cyclocheilichthys, and Tor were not detected, and almost 35% of the identified fishes are known to occur only in America, Europe, and Africa, with the highest number of detected species belonging to Salmonidae. The eDNA monitoring study by [45] also detected fish families never previously reported in its study area (Monterey Bay National Marine Sanctuary). The study by [66], using the same eDNA method, also detected several species that had not been reported in previous monitoring.
Species detection data are essential for determining biodiversity richness in monitoring studies. However, detection can be biased by false-positive and false-negative results. A false positive is a detection in which the DNA found is not the target or is not present in the system. In contrast, a false negative occurs when the target DNA is present in the system but is not detected during monitoring.
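One simple way to screen for candidate false positives and false negatives is to compare the taxa detected by eDNA with a reference list from conventional surveys. The sketch below is illustrative only: the function name is hypothetical, and the two lists contain just the genera named in the text of this paper, not the complete contents of Table 1 or Table 2.

```python
from typing import Dict, Iterable, List


def compare_detections(edna: Iterable[str], conventional: Iterable[str]) -> Dict[str, List[str]]:
    """Split taxa into shared, eDNA-only, and conventional-only groups."""
    edna_set, conventional_set = set(edna), set(conventional)
    return {
        "shared": sorted(edna_set & conventional_set),
        # Candidates for false positives (or for new records) to be checked.
        "edna_only": sorted(edna_set - conventional_set),
        # Candidates for false negatives.
        "conventional_only": sorted(conventional_set - edna_set),
    }


# Partial genus lists assembled from the text (an assumption, not Table 1 or 2).
conventional_genera = ["Rasbora", "Hampala", "Cyclocheilichthys", "Tor",
                       "Oreochromis", "Cyprinus", "Xiphophorus"]
edna_genera = ["Oreochromis", "Cyprinus", "Xiphophorus",
               "Oryzias", "Carassius", "Gambusia", "Scleropages"]

result = compare_detections(edna_genera, conventional_genera)
for group, genera in result.items():
    print(group, genera)
```

A taxon in the eDNA-only group is not automatically a false positive; it still has to be evaluated against its known distribution, as is done for the non-Indonesian taxa discussed in this section.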
The detection in Maninjau Lake of species native to Europe, America, and Africa, which made up nearly 35% of the results, was classified as false-positive detection. False positives were also found in other monitoring studies, at 8.3% in [30] and 3% in [45], as well as in [42]. Meanwhile, [44] reported no false positives when comparing detection by the eDNA method and conventional methods. False positives may arise when the collected samples are contaminated from other sources, such as wastewater, dead animals, or feces of predatory animals [42,70]. According to previous authors [42,70,71,72,73], false positives also have several other causes, including contamination in the laboratory, low DNA concentration for the PCR process, low specificity of primers and probes, competition between target and non-target DNA, cross-contamination between samples, barcode misassignment (unavailable), and misidentification of species deposited in the NCBI database. [66] also reported that the appearance of non-target DNA in Bacalar Lake (Lachnolaimus maximus, a commercial fish in Huay Pix) was almost certainly mediated by human activity, because the species is consumed in that area.
The failure to detect native fishes of Maninjau Lake, such as Rasbora and other genera reported in monitoring with the conventional method [6], can be regarded as false-negative detection. Other studies have also reported false negatives, at rates of 0%-8.2% [23,42,74]. The results of [42] showed the absence of two lamprey species that had previously been reported in Windermere Lake, England. According to several authors [12,70,73,75], false negatives in monitoring with the eDNA method may occur for three reasons. First, the water samples collected are not sufficient to represent all eDNA targets. Second, contamination occurs during sample collection or the molecular process. Third, no complete database is available for identification. Based on these references, the native fishes of Maninjau Lake were presumably not detected by the eDNA method because their DNA was not carried in the collected water samples or because the DNA concentration was too low to proceed to the next process. Studies by [42,45] revealed that species detection using the eDNA method is more representative when samples are collected from many sites at different depths with several replications. According to [45], water sampling without repetition does not represent all the DNA in lake waters because the eDNA may not be mixed homogeneously throughout the water. The eDNA found differs between sites and depths, potentially because of physical properties of eDNA such as density or the association among particles. Maninjau Lake is one of the largest lakes in West Sumatra. The small number of sampling sites is thought to be why not all species' DNA in Maninjau Lake was represented.
Thus, increasing the number of sampling sites and replications potentially increases the likelihood of capturing target DNA and reduces false-negative detection.
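The effect of replication on detection can be illustrated with a standard probability argument. If each water sample independently captures a given species' DNA with probability p, the chance that at least one of n samples captures it is 1 - (1 - p)^n. The sketch below assumes independence between samples, and the value p = 0.3 is purely hypothetical; it was not estimated from Maninjau Lake data.

```python
def cumulative_detection_probability(p_single: float, n_samples: int) -> float:
    """Probability that at least one of n independent samples detects the target."""
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if n_samples < 0:
        raise ValueError("n_samples must be non-negative")
    return 1.0 - (1.0 - p_single) ** n_samples


# Hypothetical per-sample detection probability (an assumption, not measured).
p = 0.3
for n in (2, 5, 10):
    print(n, round(cumulative_detection_probability(p, n), 2))
```

Under this assumption, two replicates leave about half of such species undetected, while ten samples detect them almost always. In practice, samples from the same site are not fully independent, so the real gain from replication is likely smaller than this sketch suggests.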
Oreochromis niloticus and Cyprinus carpio were detected by both monitoring methods. Oreochromis niloticus is a species native to Africa that has been introduced in many countries. It is also one of the fish cultivated in floating net cages (KJA) in Maninjau Lake. However, the rapid increase of the Oreochromis niloticus population has made it an invasive species, which is believed to have contributed to the decline of native fish in Maninjau Lake. This condition makes native species difficult to find with either direct capture or the eDNA method. According to [76], the introduction of invasive species is the third leading cause of global biodiversity loss, and [77] stated that invasive species threaten almost 40% of North America's freshwater fish. A study using eDNA [66] also reported that the increase of invasive species and the decrease in water quality affected the native fish population. Thus, studies using the eDNA method are needed for more effective monitoring. The false-positive and false-negative detections in this study can be attributed to 1) the collection of water samples that were not sufficient to represent all eDNA targets, so that more sampling sites, depths, and replications are needed to cover all the fish biodiversity in Maninjau Lake; 2) contamination during sample collection, such as the presence of non-target DNA from the feces of predatory animals or other sources; 3) possible contamination in the molecular process such as; the This study confirmed the ability of the eDNA method with the NGS technique as a tool for rapid biodiversity monitoring. Although the results do not show 100% success, many eDNA metabarcoding studies in aquatic systems have shown effectiveness characterized by the absence of false positives and false negatives.
Both the present and previous studies indicate that the main challenges and limits of the eDNA method are as follows: 1) the risk of false-positive and false-negative detections (because of DNA degradation or bias in the PCR process), which leads to errors in determining species richness [21]; 2) the method does not provide information on the size, developmental stage, or sex of the target organism; 3) it cannot distinguish hybrids from their maternal species; and 4) it does not provide quantitative estimates (density or biomass) for the target species, which are often required to determine species status [44]. Nevertheless, the eDNA method with the NGS technique is an effective solution for biodiversity monitoring, for several reasons: 1) a higher ability to detect species per site compared to conventional methods; 2) greater efficiency in terms of time, cost, and labor; 3) no disturbance to the target species or the ecosystem; 4) detection of almost all species without prior knowledge of their presence in the environment; and 5) the use of universal primers, which can identify all species and can therefore be used for monitoring biodiversity globally. However, efforts to reduce the limitations of the eDNA method should be made before monitoring to obtain the best possible results [44,70].

Conclusion
This study demonstrated the ability of the eDNA method with the NGS technique to monitor fish biodiversity in Maninjau Lake. The results detected 92 fish individuals, comprising 56 species from 24 genera, 16 families, and 12 orders, with the highest numbers of species detected in Cyprinidae and Salmonidae (four species each). The native species of Maninjau Lake were not detected by the eDNA method, presumably because their DNA was not captured in the collected water samples or because the DNA concentration was too low to proceed to the PCR process. Thus, several efforts are needed to reduce the limitations of monitoring with the eDNA method and to obtain the best possible results. The eDNA method can be an effective tool for monitoring biodiversity, and the results can inform conservation strategies, especially for the fishes in Maninjau Lake, West Sumatra.